Hybrid Fuzzy First Principles Modeling
(2.5)
which denotes the sensitivity of model output y with respect to parameter $\theta$. The calculation
can be done numerically using Euler's method. The sensitivity can be approximated by:
$$S = \frac{y(\theta + \Delta\theta) - y(\theta)}{\Delta\theta} \tag{2.6}$$
where $y(\theta)$ is the output y that results from the parameter value $\theta$, and $\Delta\theta$ is a small change in
the value of $\theta$. A time plot of S can be useful to determine whether $\theta$ can be assumed constant.
This test can only be used for steady-state models. If dynamic models are used, average,
integral or final-time variants of the test can be used, although interpretation of the results is
more difficult.
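A minimal sketch of the forward-difference approximation in equation 2.6 is given below. The function names and the quadratic example model are hypothetical and only serve to illustrate the calculation.

```python
def sensitivity(model, theta, delta=1e-6):
    """Forward-difference approximation of dy/dtheta (equation 2.6)."""
    return (model(theta + delta) - model(theta)) / delta


# Hypothetical steady-state model: y = theta**2, so dy/dtheta = 2*theta.
def example_model(theta):
    return theta ** 2


S = sensitivity(example_model, 3.0)
```

For this example model the exact sensitivity at a parameter value of 3 is 6, so the approximation can be checked directly. The step size `delta` is a trade-off: too large gives truncation error, too small gives round-off error.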
Structure tests
In modeling, the idea is to keep the model as simple as possible. If one model contains more
parameters than another, its performance can be better than the latter model, but it is also
more complex. If desired, the extra complexity can be penalized by incorporating the num-
ber of parameters in an error criterion for the model. This way, a trade-off between a low
modeling error and model complexity can be made. This kind of test is especially useful in
black box modeling. See Sohlberg (1998) for more information.
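As an illustration of penalizing the number of parameters, the sketch below uses Akaike's information criterion (AIC) in its least-squares form. The text does not name a specific criterion, so the choice of AIC, the function name and the residual values are assumptions made for this example.

```python
import math


def aic(residuals, n_params):
    """AIC for a least-squares fit with Gaussian errors: N*ln(SSE/N) + 2*d."""
    n = len(residuals)
    sse = sum(r * r for r in residuals)
    return n * math.log(sse / n) + 2 * n_params


# Hypothetical residuals of a simple model (2 parameters) and a more
# complex model (5 parameters) on the same four data points.
aic_simple = aic([0.5, -0.4, 0.6, -0.5], n_params=2)
aic_complex = aic([0.45, -0.4, 0.55, -0.5], n_params=5)
```

The complex model has the smaller sum of squared errors, but its AIC is higher because the small gain in fit does not outweigh the three extra parameters.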
Transparency and interpretability
Structure requirements in the form of transparency and interpretability criteria are hard to
formulate concretely. The modeler can form ideas about this beforehand, but the actual analysis
can best be done a posteriori, because these criteria are subjective and can be subject to
change. For example, if the model turns out to be less physically interpretable
than anticipated, but the extra modeling effort required to improve this is very high, the
interpretability criterion could be weakened. So, with the model available, the modeler
can decide whether these kinds of criteria are met. These issues are especially important
in white and grey box modeling. Examples will be given in other chapters in this thesis.
Statistical tools
There are several statistical tests available to judge the structure of a model. Examples are the
likelihood ratio test (where the ratio of the likelihood functions of two models provides information
about the loss of performance with respect to model complexity), Lagrange Multiplier
tests (for nonlinear models) and Bayesian tests. These tests provide possibilities to compare
two models. Since the other tools presented here provide sufficient information for the purposes
of this research, statistical tests are not discussed further. For a general overview of
statistical structural tests, see Holst et al. (1992).
2.5.2 Model validation
Validation means that model output is compared with measurements from the actual sys-
tem in order to determine if the model describes the real system accurately enough. This can
be done using various error criteria.
Cross parameter validation
Cross parameter validation can be seen as a form of parameter estimation in duplicate. If possible,
the experiment that yields the data for identification is divided into two sub-experiments,
both of which are executed under the same circumstances. It should then be possible
to reproduce the estimates, so the parameter estimates from the two sub-experiments are compared with each other.
If one of the estimates resulting from the second experiment differs substantially from the
corresponding estimate in the first experiment, the part of the model that uses this estimate
should be investigated further.
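The comparison can be sketched as follows. The relative tolerance of 20% is an arbitrary assumption standing in for "differs substantially", and the parameter names and values are hypothetical.

```python
def flag_inconsistent(est_a, est_b, rel_tol=0.2):
    """Return names of parameters whose two estimates differ by more than rel_tol."""
    flagged = []
    for name in est_a:
        a, b = est_a[name], est_b[name]
        scale = max(abs(a), abs(b))
        # Relative difference with respect to the larger of the two estimates.
        if scale > 0 and abs(a - b) / scale > rel_tol:
            flagged.append(name)
    return flagged


# Hypothetical estimates from the first and second sub-experiment.
first = {"k": 1.00, "tau": 5.0}
second = {"k": 1.05, "tau": 8.0}
flagged = flag_inconsistent(first, second)
```

Here only `tau` is flagged, so the part of the model that uses `tau` would be the candidate for further investigation.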
Residual tests
The residuals are represented by the innovation of the process (or the model error). The
innovation is defined as:
$$i = y - \hat{y} \tag{2.7}$$
in which i is the innovation vector, y is the process measurement vector and $\hat{y}$ is the estimated
process output vector. The innovation is a suitable index to investigate the mismatch between
the real process and the model. The residuals represent unmodeled parts and disturbances.
For the ideal model the innovation should be white (its values independent of each other) and
normally distributed. The easiest way to analyze the innovation is to simply plot it, which can reveal
trends, peaks or other undesired deviations.
The most common tests to check the behaviour of the innovation are the calculation of the
auto-correlation and the cross-correlation with the input. The auto-correlation of a single
innovation signal i is calculated as:
$$r_{ii}(\tau) = \frac{1}{N} \sum_{k=1}^{N} \left(i(k) - \bar{i}\right)\left(i(k+\tau) - \bar{i}\right) \tag{2.8}$$
where $\tau$ is the time lag, N is the number of innovation values, $\bar{i}$ is the mean of the innovation and $r_{ii}(\tau)$ is the auto-correlation
of the innovation at lag $\tau$. Often, the normalized auto-correlation is used:
$$\bar{r}_{ii}(\tau) = \frac{r_{ii}(\tau)}{r_{ii}(0)} \tag{2.9}$$
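If the auto-correlation is plotted as a function of the lag and a confidence interval is set, the
performance can be analyzed. The confidence interval is usually set at the 99% confidence
interval for the asymptotic distribution, calculated as $[-2.58/\sqrt{N},\ 2.58/\sqrt{N}]$.
The cross-correlation between the input u and the innovation i is calculated as:
$$r_{ui}(\tau) = \frac{1}{N} \sum_{k=1}^{N} \left(u(k) - \bar{u}\right)\left(i(k+\tau) - \bar{i}\right) \tag{2.10}$$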
The normalized form is given by:
$$\bar{r}_{ui}(\tau) = \frac{r_{ui}(\tau)}{\sqrt{r_{uu}(0)\, r_{yy}(0)}} \tag{2.11}$$
where $r_{uu}(0)$ and $r_{yy}(0)$ are the auto-correlations of the input and the output at lag 0, respectively
(calculated with equation 2.8). The cross-correlation can be evaluated for positive
and negative lag. A peak at negative lag indicates feedback in the system: either
feedback is present in the process itself, or the experiments that were used to generate
the data were carried out with a control signal that is based on the process output.
The cross-correlation is compared with the 99% confidence interval, in the same way as the
auto-correlation.
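The correlation tests can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the implementation used in this work: the function names are hypothetical, only non-negative lags are handled, both signals are assumed to have the same length, and the sum runs only over the index pairs that exist in the data.

```python
import math


def cross_correlation(u, i, lag):
    """r_ui(lag) as in equation 2.10, for lag >= 0 and len(u) == len(i)."""
    n = len(i)
    u_mean = sum(u) / n
    i_mean = sum(i) / n
    # Only k with k + lag inside the record contribute to the sum.
    total = sum((u[k] - u_mean) * (i[k + lag] - i_mean) for k in range(n - lag))
    return total / n


def auto_correlation(i, lag):
    """r_ii(lag) as in equation 2.8."""
    return cross_correlation(i, i, lag)


def normalized_auto_correlation(i, lag):
    """Auto-correlation divided by its value at lag 0 (equation 2.9)."""
    return auto_correlation(i, lag) / auto_correlation(i, 0)


def confidence_bound(n):
    """Half-width of the 99% confidence interval, 2.58 / sqrt(N)."""
    return 2.58 / math.sqrt(n)


# Hypothetical innovation that alternates in sign: clearly not white.
innovation = [1.0 if k % 2 == 0 else -1.0 for k in range(100)]
rho = normalized_auto_correlation(innovation, 1)
bound = confidence_bound(len(innovation))
```

For this alternating signal the normalized auto-correlation at lag 1 is -0.99, far outside the bound of 0.258, which indicates that the model has left structure in the residuals.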
Another method is the normal distribution test. In this test, a histogram of the magnitudes of
the residuals is plotted. In addition, a Gaussian curve based on the standard deviation of the
innovation is plotted. This way, the distribution of the histogram can be compared with the
normal distribution curve, which provides information about the residual distribution. Ide-
ally, this distribution should also be normal.
Tracking index
A convenient performance index to check if the model describes the measurements accurately
enough is the so-called tracking error index. The index is also often used in optimization
methods and for control optimization and is essentially an integrated squared error. The tracking
index for the continuous case is based on the innovation i as a function of time and is
defined as (Ramirez, 1994):
$$e_t = \int_{0}^{t_f} \left(i^{T} Q\, i\right) dt \tag{2.12}$$
with $[0, t_f]$ the time interval for which the index is evaluated. Q is a weighting matrix. The
discrete form of this equation is given by:
$$e_t = \sum_{k=0}^{N} \left(i(k)^{T} Q\, i(k)\right) \Delta t \tag{2.13}$$
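A minimal sketch of the discrete tracking index in equation 2.13 follows. The function name, the innovation samples, the weighting matrix and the sample time are hypothetical values chosen for illustration.

```python
def tracking_index(innovations, Q, dt):
    """Discrete tracking index: sum over k of i(k)^T Q i(k) * dt (equation 2.13)."""
    total = 0.0
    for i_k in innovations:
        size = len(i_k)
        # Quadratic form i(k)^T Q i(k) for a vector i(k) and a square matrix Q.
        quad = sum(i_k[r] * Q[r][c] * i_k[c]
                   for r in range(size) for c in range(size))
        total += quad * dt
    return total


# Three samples of a two-dimensional innovation; Q weights the second
# output twice as heavily as the first.
innovations = [[1.0, 0.0], [0.5, -0.5], [0.0, 0.2]]
Q = [[1.0, 0.0], [0.0, 2.0]]
e_t = tracking_index(innovations, Q, dt=0.1)
```

The three quadratic forms are 1.0, 0.75 and 0.08, so the index is 1.83 times the sample time of 0.1.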
Root mean squared error
One of the most commonly used error measures is the Root Mean Squared Error, or RMSE.
It is defined as (Sneddon, 1976):
$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{k=1}^{N} i(k)^{2}} \tag{2.14}$$
where i is a single innovation signal.
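Equation 2.14 translates directly into code. The sketch below uses a hypothetical innovation signal of four samples.

```python
import math


def rmse(innovation):
    """Root mean squared error of a single innovation signal (equation 2.14)."""
    return math.sqrt(sum(v * v for v in innovation) / len(innovation))


value = rmse([3.0, -4.0, 0.0, 0.0])
```

The sum of squares is 25, the mean is 6.25, and the RMSE is therefore 2.5.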
Parameter identifiability
The identifiability criterion can be used to determine whether the measurements that were
used provide sufficient information for parameter estimation. This evaluation can be done
by using the covariance matrix:
$$\mathrm{Cov}(\hat{\theta}) \geq M^{-1}\big|_{\theta = \theta_0} \tag{2.15}$$
where M is Fisher's information matrix and $\theta_0$ is the value of the estimated parameter vector.
Fisher's matrix is determined by:
$$M = E\left(H(\theta_0)\right) \tag{2.16}$$
with $H(\theta_0)$ the Hessian evaluated at the parameter vector $\theta_0$. The Hessian is defined in this context as
the second order derivative of the likelihood function. The likelihood function L is defined as
the functional relationship between the observed values of the variables of the system and the
estimate of the parameter vector $\hat{\theta}$, determined from these observations. This functional relation
is governed by a probability function, which also incorporates the model equations. If the
probability function is not known, a uniform distribution can be chosen as an approximation
(Eykhoff, 1974). The Hessian then becomes:
$$H(\theta) = \left[\frac{\partial}{\partial\theta}\right]\left[\frac{\partial}{\partial\theta}\right]^{T} L(\theta) \tag{2.17}$$
If the variances of some parameters, found on the diagonal of the covariance matrix, are very large,
then those parameters are only weakly determined by the given measured data. This
means that the experiment is not sufficiently informative or that the model may be overparameterized.
More about statistical validation measures can be found in Box et al. (1978)
(experimental) or Kendall and Stuart (1979) (more fundamental).
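The effect of an uninformative experiment can be illustrated with a small sketch. It assumes independent Gaussian measurement noise with known variance, for which the Fisher information matrix reduces to a sum of outer products of the output sensitivities. This is a standard special case and not necessarily the computation used in this work. The two-parameter straight-line model and all function names are hypothetical.

```python
def fisher_matrix(sens_rows, noise_var):
    """M = (1/noise_var) * sum_k s_k s_k^T for a two-parameter model."""
    m = [[0.0, 0.0], [0.0, 0.0]]
    for s in sens_rows:
        for r in range(2):
            for c in range(2):
                m[r][c] += s[r] * s[c] / noise_var
    return m


def covariance_bound(m):
    """Inverse of a 2x2 Fisher matrix: lower bound on the estimate covariance."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    if abs(det) < 1e-12:
        raise ValueError("Fisher matrix is singular: parameters not identifiable")
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]


def sensitivities(xs):
    """Hypothetical model y = a*x + b: dy/da = x and dy/db = 1."""
    return [[x, 1.0] for x in xs]


# Same model, two experiments: widely spread inputs versus nearly equal inputs.
wide = covariance_bound(fisher_matrix(sensitivities([0.0, 5.0, 10.0]), 1.0))
narrow = covariance_bound(fisher_matrix(sensitivities([4.9, 5.0, 5.1]), 1.0))
```

With widely spread inputs the variance bound on the slope is 0.02. With nearly equal inputs it grows to about 50, which signals that this experiment carries too little information to estimate both parameters.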
2.6 Remarks
Within the spectrum of model structures, the hybrid models that are discussed in this
work fall in the category of grey box models: part of the model structure will be based on
first principles, part will be black box. Although they can be static, the focus will be on
dynamic models, which makes them suitable for applications that require dynamic models
(such as process control applications) as well as for applications that require static models.
The wide range of model structures used in chemical engineering is the result of
the different requirements that applications demand. Even within classes of models, many
variations and different substructures may exist. Based on the requirements, the modeler
needs to select the model structure with the properties that, in his or her view, can solve the modeling
problem.
As a result, the focus will be on the development and properties of hybrid fuzzy-first principles
models instead of some application of a hybrid model. The general structure, design
and analysis of hybrid models will be discussed and illustrated for different processes. This
will provide modelers with a general basis to make the decision whether hybrid models can
be used in their application.
3 Hybrid fuzzy-first principles models
The combination of fuzzy logic with first principles models as proposed in chapter 1
results in hybrid models. The term hybrid is used here to denote the combination of
white box and black box modeling techniques. In the literature, such models are also called
grey box, semi-mechanistic or polytopic models. Some research has been done in this area
(most notably by Psichogios and Ungar (1992) and Thompson and Kramer (1994)), though little
research has been presented in which fuzzy logic is used in a similar context.
This chapter will present a detailed definition of hybrid fuzzy-first principles models and will
discuss different model structures. Subsequently, a framework for evaluating model quality
will be derived. Using this framework, the applicability of hybrid models is discussed. In
addition, a structured modeling approach, which will serve as the basis for the development
of hybrid models, will be presented. The main sources of information for building these
models and guidelines for applying them will also be discussed.
3.1 What is a hybrid fuzzy-first principles model?
In hybrid modeling, a distinction can be made between a modular approach and a semiparametric
approach. The latter approach is subdivided into a serial and a parallel approach
(Thompson and Kramer, 1994).
In modular design approaches, several blocks of fuzzy logic submodels are combined to
constitute the process model. The structure of the overall model is determined using prior
knowledge, while every block calculates one specific variable or parameter. Advantages of
this approach are that it may improve interpretability and that it may reduce the number
of model parameters. For example, the hierarchical configuration of such models reduces
the number of input variables, which results in a less complex fuzzy block and thus a fuzzy
block that contains fewer parameters. A major disadvantage is that good output behavior is
not guaranteed, because the combination of the blocks could generate divergent overall
behavior (Thompson and Kramer, 1994). The fuzzy submodels are only valid in the input-output
domain for which data was present during identification. If the state of such a model
is outside this domain, unpredictable divergent behavior can occur, which is much less likely
to happen with physical equations.
With semiparametric modeling, a fuzzy logic submodel is placed in tandem with a physical
model. The physical model structure is fixed and derived from first principles. In the serial
approach, fuzzy logic submodels calculate model variables which the physical part of the
model requires. The input of these fuzzy submodels is provided by the physical part of the
model. Figure 3.1 gives an example of a serial semiparametric hybrid model with one fuzzy
block.
In the parallel approach, the outputs of the fuzzy logic block and the physical model are
combined to determine the total model output (figure 3.1). The model serves as a best estimate
of the process. The fuzzy logic submodel is implemented such that it is able to compensate
for any discrepancy between the physical model output and measurements. A disadvantage
of this approach is that desired behavior is not guaranteed, especially if the model is used
under process conditions that were not included in the identification process.
[Figure 3.1: block diagrams. (a) physical model and fuzzy block, with input u, output y and internal signal x; (b) physical model and fuzzy block in parallel between input u and output y, with their outputs summed.]
Figure 3.1: Serial (a) and parallel (b) semiparametric design approaches. In both cases the hybrid model
consists of a physical and fuzzy part, but in the parallel approach they are not connected
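The parallel structure can be sketched as follows. The linear physical model, the two-rule fuzzy correction and all numerical values are hypothetical. The fuzzy block is written as a zero-order Takagi-Sugeno system, which is one common choice and an assumption of this example.

```python
def physical_model(u):
    """Hypothetical first-principles part: a linear gain."""
    return 2.0 * u


def fuzzy_correction(u):
    """Hypothetical two-rule fuzzy block fitted to the physical model's residual.

    Rule 1: IF u is small THEN correction = 0.0
    Rule 2: IF u is large THEN correction = -0.4
    """
    w_large = min(1.0, max(0.0, u))  # membership of 'large', clipped to [0, 1]
    w_small = 1.0 - w_large
    return w_small * 0.0 + w_large * (-0.4)


def parallel_hybrid(u):
    """Total output: physical model output plus fuzzy correction."""
    return physical_model(u) + fuzzy_correction(u)


y = parallel_hybrid(0.5)
```

At an input of 0.5 the physical part gives 1.0 and the correction is -0.2, so the total output is 0.8. For inputs above 1 the correction saturates at -0.4 regardless of the true discrepancy, which illustrates why behavior outside the identification range is not guaranteed.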
If first principles models are preferred over black box models, it is proposed to leave the physical
model structure intact as much as possible and to model only those phenomena about which
uncertainty exists (regarding model equations) with fuzzy submodels. The physical model
structure is formed by dynamic mass and energy balances, while the fuzzy submodels describe
production rates, heat and mass transfer, equilibria, growth rates, etc. This way, hybrid
fuzzy-first principles models are obtained which combine a high level of interpretability with
the expectation of good extrapolation properties. Therefore, a serial semiparametric modeling
approach is used. Thus, in this work, hybrid models are defined as a framework of dynamic
mass and energy balances, formulated in state-space form and supplemented with algebraic
and fuzzy equations.
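A minimal sketch of such a serial hybrid model is given below, assuming a single mass balance for a concentration c in a stirred tank. The balance equation, the two-rule fuzzy rate submodel, the parameter values and the use of the explicit Euler method are all assumptions made for illustration.

```python
def membership_low(c, c_max=1.0):
    """Degree to which concentration c is 'low' (linear, clipped to [0, 1])."""
    return min(1.0, max(0.0, 1.0 - c / c_max))


def fuzzy_rate(c):
    """Zero-order Takagi-Sugeno submodel: weighted average of two rule outputs.

    Rule 1: IF c is low  THEN rate = 0.0
    Rule 2: IF c is high THEN rate = 0.5
    """
    w_low = membership_low(c)
    w_high = 1.0 - w_low
    return (w_low * 0.0 + w_high * 0.5) / (w_low + w_high)


def simulate(c0, c_in, dilution, dt, steps):
    """Integrate dc/dt = dilution*(c_in - c) - rate(c) with the explicit Euler method."""
    c = c0
    trajectory = [c]
    for _ in range(steps):
        # The physical balance supplies c to the fuzzy block and uses its output.
        dc_dt = dilution * (c_in - c) - fuzzy_rate(c)
        c += dt * dc_dt
        trajectory.append(c)
    return trajectory


trajectory = simulate(c0=0.0, c_in=1.0, dilution=1.0, dt=0.01, steps=2000)
```

The balance provides the model structure and the dynamics, while the fuzzy block only supplies the uncertain rate term. With these values the rate equals 0.5c on the interval from 0 to 1, so the concentration settles at 2/3.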
3.2 General properties and applicability
Because of their general form, hybrid fuzzy-first principles models can be applied in most
research and engineering fields in chemical engineering. The structure of the model depends
on the type of application and the process that is modeled. The focus here is on dynamical
hybrid models, which can be used to describe the dynamical behavior of the process. Fields of
application include process simulation, process control system design, process optimization
and process behavior prediction.
Whether a hybrid model is suitable for a process depends on the application of the model and
is the choice of the modeler. A hybrid model can be developed for any process, but it may
not always be the best choice. This work only presents the construction and the properties
of hybrid models, based on which the modeler can make his or her choice. It is possible,
however, to give some guidelines for deciding whether hybrid models can be a solution to a
modeling problem.
3.2.1 Model quality
Usually, the most important quality of a model is its performance. However, model performance
does not provide sufficient information about the general properties of a model,
certainly with respect to transparency and interpretability. A discussion about model properties
should therefore not only include performance, but the different qualities of a model as a
whole.
It is difficult to define the term quality. This may be attributed to several reasons (Kan,
1994). First, quality is a multidimensional concept. Second, there are different levels of abstraction;
one can refer to it in the broadest sense or to its specific meaning. Third, the term
quality is a part of our daily language and the popular views of the term may be very different
from its use in professions in which it is approached from the engineering or management
perspective.
Quality is often viewed as intangible; it cannot be weighed or measured. Many people
comment on quality that they "know it when they see it". In addition, luxury, class and
taste are often associated with quality. These terms are rather vague. In engineering, more
workable definitions have been formulated. Crosby (1979) defines quality as conformance
to requirements. This implies that requirements must be clearly stated. This definition does
not take customers' requirements into account. A final product may conform to requirements,
but it may not have been what the customers wanted. Therefore, the role of the customers
should be explicitly incorporated in the definition of quality: conformance to customers'
requirements (Kan, 1994).
In order to measure the quality of a model, it should be divided into several more tangible
aspects. In other words, in order to measure quality, metrics should be defined. In software
design, quality analysis is more common than in process modeling. Boehm (1973) formulates
a hierarchy of software characteristics that results in a characteristics tree. The general
utility of a software product is determined by its as-is utility, its maintainability and its portability.
These three characteristics are then divided into more specific characteristics such as
device independence, completeness, accuracy, accessibility, etc. In analogy to Boehm's characteristics
tree for software quality, a characteristics tree can be designed as a basis to map
the different aspects of model quality. This is shown in figure 3.2.
To be applicable, model performance has to meet predefined requirements. This means that
the model error has to be within acceptable boundaries. Error criteria can be defined to
judge static performance as well as dynamic performance. In addition, model applicability
can be determined by investigating interpolation and extrapolation properties. This provides
information about the performance of the model outside its working area.
The second aspect of model quality is maintainability. Maintainability in this context can be
evaluated by investigating model transparency. If a model is transparent, the model can easily
be maintained by improving or replacing specific parts of the model. Model transparency is
defined by its complexity and interpretability. The complexity and interpretability of the
model equations as well as the model structure can be analyzed.
[Figure 3.2: tree diagram. Model quality branches into as-is utility (static performance, dynamic performance), maintainability (complexity, interpretability) and portability (process independence).]
Figure 3.2: Model quality characteristics tree
Portability provides information about the extent to which a model is built for a specific instance
of an installation. Portability may be important for applications in the field of research
and development or design. High levels of process independence yield transferable models.
Process independence can be investigated by analyzing the types of information that are used
to build the model.
The three aspects are not completely independent. Good extrapolation properties may result
in high levels of process independence but may yield models that are more complex than
desired. Model interpretability is important for design purposes and affects model utility and
maintainability. In addition, not all aspects are equally important for a given application.
This has to be taken into account when analyzing a hybrid model. Therefore, requirements
with respect to each aspect of model quality should be determined as well as their relative
importance.
3.2.2 Model applicability
It is infeasible to provide a complete list of properties of hybrid models since each model
has a unique purpose. However, aspects for each of the general areas of application (research
and development, design and control) can be formulated. This will enable a comparison
between hybrid models and models at the extremes of the grey box modeling spectrum:
first principles models and black box models. Based on this comparison, the modeler can
decide if hybrid modeling can provide a solution to the modeling problem.
                       Model                              Application
Property               First Prin.  Hybrid  Black box     R&D  Design  Control
Static performance     ++           +       +
Dynamic performance    ++           ++      0
Complexity             0            +       ++
Interpretability       ++           +
Process independence   ++           0
Table 3.1: Comparison of general model quality aspects
Research and development applications are the most demanding. The model usually has to
provide detailed information about the process behavior. The model needs to be built in such
a way that elements can be improved or changed without affecting the overall structure. For
offline analysis, the model often needs to be process independent. Dynamic and static performance
can be less important; the qualitative behavior is more important than the quantitative
behavior.
Models built for design studies require accurate performance and need to be portable, certainly
when the process has not been built yet. In addition, if optimization procedures are
applied, extrapolation properties need to be good. Complexity and interpretability are less
important, as long as the required information is provided.
The most important quality aspect of process models for control is dynamic performance.
The model needs to provide accurate input-output information in order to be able to design
a control structure. If the model is to be used in model-based control approaches, the model
needs to be as simple as possible so that it can be solved quickly online. Portability is less
important; controllers are often designed for a specific installation.
The most relevant quality requirements for each application area are shown in table 3.1. For
every application, the as-is utility of the model will be important; this is the basic aspect by
which to judge model quality. If the model cannot be applied, it is not useful. The importance
of maintainability and portability depends more on the application.
In general, hybrid fuzzy-first principles models are useful when a physical model is required
but difficult to construct. In such cases, a physically interpretable model is needed for the
application (interpretable in the sense that, by analyzing the model equations, the phenomena
that are modeled and their mutual relations are clear). When there is a lack of understanding
of the observed behavior of the process, which is often highly nonlinear, fuzzy logic
can provide a means of describing this behavior in an accurate and transparent way. In addition,
because the hybrid model's structure is based on first principles, a certain level of
interpretability is guaranteed.
However, if a complete physical understanding of the process behavior is required, hybrid
modeling should not be applied. Although hybrid models are transparent, the fuzzy equations
do not have a physical meaning in the first principles sense. Fuzzy logic is a black box
technique and not based on first principles. It may be possible to give a physical interpretation
to the rules of the fuzzy model a posteriori, but this only gives qualitative insight, not insight
into the actual physical processes that are taking place. First principles models can provide
more detailed information, although this results in a higher level of complexity.
If a physical interpretation is not relevant for the application, black box models may be better
suited than hybrid models. Black box models are easier and faster to build than hybrid models.
They are less complex than hybrid models. They are usually also solved faster than hybrid
models (black box models usually calculate model outputs directly, while hybrid models in
state-space form require numerical solvers to calculate model output), which may be important
in online applications.
Hybrid fuzzy-first principles models are identified more quickly and more easily than first
principles models, since they are less complex. In addition, there are modeling tools available
which can determine the fuzzy equation structure and parameters quickly and automatically.
The identification effort is far less than is the case with complicated physical or empirical
equations. If the development time for the model is limited and a complete first principles
model is not required, hybrid modeling can be a good alternative.
As with first principles models, hybrid models need fairly detailed information about the
behavior of the process. Process variables or parameters need to be available or it must
be possible to estimate them (the system needs to be observable). If this is not possible,
black box models can be a better alternative to hybrid models, because for black box models,
input-output data of the phenomena under study is usually sufficient. However, to model
the process dynamics, black box models need large amounts of dynamical data. In hybrid
models, the dynamics are described by first principles. This means that less dynamical data
is required and that hybrid models will perform better dynamically than black box models,
certainly with respect to extrapolation.
The level of process independence is largely determined by the sources of information that
are used to build the model. If process data is the main source, the model will be process
dependent. This is the case for black box models. The use of first principles information
increases the level of process independence. If, in hybrid modeling, the static properties
are derived from process data or expert knowledge and these properties are process specific,
the hybrid model will be more process dependent. Good static performance of the hybrid
model will then be limited to the process or the operating conditions that were used during
identification.
One of the main advantages of the use of fuzzy equations is that nonlinearities can be described
in a simple way. These nonlinearities become apparent if the process has a large
operating regime. A model for such a process is thus required to describe these nonlinearities
adequately. In this case, hybrid models can be suitable. If the process is nonlinear, but
these nonlinearities are not apparent during normal operating conditions, linear approximations
may suffice. Examples of processes with large operating regimes are batch processes,
distributed parameter processes or cyclic processes.
The differences between the types of models discussed, in relation to model quality, are given
in table 3.1.
[Figure 3.3: flow diagram with three phases. Problem definition: model objective formulation; key variables and requirements formulation. Design phase: basic modeling, data acquisition, subprocess behavior estimation, submodel identification, submodel integration, model adjustment. Evaluation phase: model analysis, evaluate, accept/reject. Inputs shown: external objectives, modeling tools, first principles, process expertise, process data, requirements.]
Figure 3.3: Hybrid model building approach
3.3 Modeling approach
Most literature about modeling focuses mainly on the parameter identification step. Relatively
little is written on how to design a specific strategy for model development. This
section will present such a strategy, based on the approaches presented in Sohlberg (1998)
and Edgar and Himmelblau (1988). The two approaches are integrated and adapted
for hybrid fuzzy-first principles models.
The modeling approach is shown in figure 3.3. The approach consists of several sequential
steps, performed independently of each other. Other research on hybrid modeling promotes a
global approach (Psichogios and Ungar, 1992; Roubos et al., 1999; Thompson and Kramer,
1994). A global approach is usually based on training the black box relations within the
hybrid model using error feedback. The advantage of this approach is that it can reduce the
number of steps that have to be taken during model development. The disadvantage of this
approach is that one is easily inclined to judge only the overall model fit, regardless of the
complexity and number of fuzzy relations. This is detrimental to model transparency. The
advantage of independent steps is that the modeling problem is reduced to several smaller
and simpler problems. The solutions of these problems are then combined to form the overall
model.
Three main sources of information are generally available when constructing hybrid models.
Physical understanding generally forms the basis of the model and is the result of fundamental
research. The modeler has to acquire relevant first principles knowledge with respect to the
modeling problem, which can be found in the general literature.
Process measurements are the most important source of information about a specific process.
While first principles provide general information about the behavior of the process, process
measurements are required to identify a suitable process model.
In addition to process measurements, human experience is an important source of information
because it can be used to learn more about dependencies of relevant phenomena of the process
and thus the structure of the model. A human can, based on his or her experience, indicate
whether certain effects are important or negligible. Based on this information, the modeler
can decide whether these effects have to be accounted for in the model. In addition, human
experience can be used to design fuzzy relations which quantify the human experience.
The modeling approach consists of three phases. In the first phase, the problem is defined,
based on external objectives (the application of the model) and modeling tools and approaches
that are available. In the second phase the model is built, which is evaluated in the third phase.
3.3.1 Problem definition
At the basis of model building lies the formulation of the objectives of the model. On the
one hand, these objectives concern the specific process for which the model is being built.
Objectives of models for product or process design will be different from objectives of models
that are developed for process simulation or fault diagnosis. In addition, the domain in which
the model must be valid is to be determined.
On the other hand, objectives can be formulated that apply to hybrid models in general. These
are:
• Hybrid models need to consist of a framework of accumulation balances, supplemented
  with algebraic equations and fuzzy functions. The most important dynamics of the
  process need to be incorporated in the structure. Fuzzy functions only need to be
  applied if no suitable physical relation (or other acceptable empirical relation) can be
  found.
• Hybrid models need to be transparent. This means that the models need to contain as
  little detail as possible within the constraints of the objectives. The modeler should start
  with a global form and apply more detail if the objectives are not met.
• A more subordinate objective is that the fuzzy functions need to be interpretable. This
  means that the fuzzy functions need to be kept as transparent as possible (a limited number
  of rules, for example). More complex relations are preferably represented by hierarchical
  fuzzy systems, which consist of a cascade of several simple fuzzy systems.
The modeling tools or techniques that are available provide some of the context in which
the model is being built. These have to be taken into account when the model objective is
formulated. For example, a model that is to be used for online model-based control but that
describes the behavior on a molecular level is impractical.
The objective results in a set of requirements. The key variables are distinguished and per-
formance requirements that these key variables have to meet are set. In addition, other model
quality requirements are formulated, for example the amount of information that the model
has to provide or the level of transparency that is needed.
3.3.2 Hybrid model design
The aim of the design phase is to analyze the process and build the hybrid model, based
on the requirements. This is done by dividing the modeling probleminto smaller subproblems
and dealing with them independently. The division is performed in two dimensions. First,
the process is partitioned into so-called subprocesses, which represent different parts of
the process, such as heat transfer, mass transformation or accumulation. Subsequently, a
submodel is developed for each subprocess. The development of the submodels consists of
sequential steps: structure design, behavior estimation, identification and optimization.
The design phase consists of the following steps:
Basic modeling. The process is analyzed and the model structure is designed using
first principles and process expertise. In this step, a physical framework is designed that
describes the key variables, and the mathematical dependencies for nonlinear model
parameters and additional variables are determined. In addition, the parameters that
will be described with fuzzy logic are distinguished. The result can be interpreted as a
number of subprocesses, for which submodels can be developed. The relation between
these subprocesses is represented by a data flow diagram or DFD.
Data acquisition. To be able to perform the identification of the various model
parameters, sufficient process data needs to be available. This data can be gathered by
doing experiments specifically designed for obtaining process behavior information if
this information is not readily available. Parameter or state estimation techniques can
also be used to obtain the correct information.
Subprocess behavior estimation. Based on the model structure, the modeling prob-
lem is reduced to several subprocess behavior estimation problems, since the subpro-
cess behavior often cannot be measured directly. See for example (Eykhoff, 1974; Luy-
ben, 1990; Seinfeld and Lapidus, 1974; Sohlberg, 1998; Van Lith et al., 2001).
Submodel identification. Using the obtained data, the model parameters and fuzzy
blocks can be identified.
Submodel integration. The submodels are integrated to form the hybrid model, which
involves connecting the submodels and optimization of the hybrid model performance.
Model adjustment. It may be necessary to adjust the model structure, based on the
results of the evaluation phase. Usually, this means that a larger level of detail is
needed. The choice of a more detailed description will depend on the experience and
skill of the modeler.
The design phase will be discussed in more detail in chapter 4.
Hybrid fuzzy-first principles models
3.3.3 Hybrid model evaluation
Questions that need to be answered in this phase are: "Has the correct model been built?"
and "Has the model been built correctly?" The quality of the model needs to be analyzed with
respect to the requirements. Based on the quality analysis, the decision needs to be made if
the model is really suitable or if it needs to be adjusted. The adjustment can take place
in two ways: feedback through renewed identification (based on new identification data) or
feedback through model adjustment. Based on the definition of model quality (section 3.2.1),
some general pointers for model quality analysis can be given.
Model performance
Model performance can be divided into static and dynamic performance. Depending on
the model application, a suitable error criterion needs to be defined, as well as performance
requirements. Interpolation and extrapolation provide additional information about model
validity.
Static performance involves calculating the model error under steady-state conditions. A
distinction is made between model validation under conditions within the working area
(amplitude interpolation) and validation under conditions outside the working area (amplitude
extrapolation) (Van Can et al., 1998). The working area can be predefined or imposed by
process data that is available for identification; the model is most likely to be valid for the
range of the identification data. An extension of amplitude extrapolation is dimension
extrapolation, in which an input that is kept constant during identification is varied. A convenient
way of visualizing interpolation and extrapolation properties is to plot the error criterion as a
function of the variable that is changed during the experiments.
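The distinction between amplitude interpolation and extrapolation can be sketched as follows. The steady-state process and model characteristics, the operating points and the working area are hypothetical; the root mean square error is used as one possible error criterion.

```python
import math
from typing import Callable, List, Tuple

Profile = List[Tuple[float, float, str]]


def static_error_profile(
    process: Callable[[float], float],
    model: Callable[[float], float],
    operating_points: List[float],
    working_area: Tuple[float, float],
) -> Profile:
    """Return (input, absolute error, region) for each steady-state point."""
    low, high = working_area
    profile = []
    for u in operating_points:
        region = "interpolation" if low <= u <= high else "extrapolation"
        profile.append((u, abs(process(u) - model(u)), region))
    return profile


def rmse(profile: Profile, region: str) -> float:
    """Root mean square error over all points in the given region."""
    errors = [error for _, error, label in profile if label == region]
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def process(u: float) -> float:
    """Hypothetical steady-state process characteristic."""
    return u ** 2


def model(u: float) -> float:
    """Hypothetical model that matches the process at u = 1 and u = 2."""
    return 3.0 * u - 2.0


POINTS = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
PROFILE = static_error_profile(process, model, POINTS, (1.0, 2.0))

if __name__ == "__main__":
    for u, error, region in PROFILE:
        print(f"u={u:.1f}  error={error:.3f}  {region}")
```

Plotting the second column against the first gives the kind of error profile described above; in this example the error grows quickly outside the assumed working area.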
Dynamic performance can be evaluated by applying a varying input signal to the process
and analyzing the model results. Again, the dynamic model performance can be validated
under conditions within the working area (frequency interpolation) and outside the working
area (frequency extrapolation). The working area is defined by the range of frequencies of
change of the input variables for which the model is valid. Frequency extrapolation can
also be investigated by changing variables that were kept constant during identification. The
dynamic performance can be visualized by a Bode diagram.
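A Bode-type comparison can be sketched under the assumption that both the process and the model can be approximated by linear transfer functions around an operating point. The two transfer functions below are hypothetical: the model is a first order lag and the process contains an additional fast lag that the model neglects.

```python
import cmath
import math
from typing import Tuple


def first_order(omega: float, gain: float = 1.0, tau: float = 10.0) -> complex:
    """Frequency response of the hypothetical model gain / (tau s + 1)."""
    return gain / (1j * omega * tau + 1.0)


def with_fast_lag(omega: float, tau_fast: float = 1.0) -> complex:
    """Hypothetical process: the model dynamics plus an unmodeled fast lag."""
    return first_order(omega) / (1j * omega * tau_fast + 1.0)


def bode_point(response: complex) -> Tuple[float, float]:
    """Return (magnitude in dB, phase in degrees) of a frequency response."""
    return 20.0 * math.log10(abs(response)), math.degrees(cmath.phase(response))


def gain_mismatch_db(omega: float) -> float:
    """Model magnitude minus process magnitude, in dB, at frequency omega."""
    return bode_point(first_order(omega))[0] - bode_point(with_fast_lag(omega))[0]


if __name__ == "__main__":
    for omega in (0.01, 0.1, 1.0, 10.0):
        print(f"omega={omega:<5} mismatch={gain_mismatch_db(omega):.3f} dB")
```

In this example the mismatch is negligible at low frequencies and grows beyond the corner frequency of the neglected lag, which would mark the edge of the frequency working area of the model.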
Model transparency
Model complexity and interpretability indicate how transparent a model is. Complexity
is determined by the number of elements that (a part of) a model consists of, while interpretability
is determined by the meaning of those elements. Complexity and interpretability of the model
structure and the model equations are taken into account.
The complexity of the model structure indicates how well the model is built. An important aid
in analyzing model structure complexity is the data flow diagram. It illustrates the
connections between the different modeled phenomena (subprocesses) in a clear manner. Points
of interest are the number of hierarchical layers in the model, the number of model equations
and the connectivity of the equations. A transparent model shows maximum cohesion with a
minimum number of connections between the equations.
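Some of these points of interest can be counted directly from a data flow diagram. The sketch below assumes the diagram is stored as a mapping from each submodel to the submodels whose outputs it uses; the submodel names and the summary measures are hypothetical and only illustrate the idea.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical DFD: each submodel lists the submodels whose outputs it uses.
DFD: Dict[str, Set[str]] = {
    "mass_balance": {"reaction_rate"},
    "energy_balance": {"reaction_rate", "heat_transfer"},
    "reaction_rate": {"mass_balance", "energy_balance"},
    "heat_transfer": {"energy_balance"},
}


def connections(dfd: Dict[str, Set[str]]) -> List[Tuple[str, str]]:
    """Return all (source, destination) links between submodels."""
    return sorted((src, dst) for dst, sources in dfd.items() for src in sources)


def complexity_summary(dfd: Dict[str, Set[str]]) -> Dict[str, float]:
    """Count equations and connections as simple structure measures."""
    links = connections(dfd)
    return {
        "equations": len(dfd),
        "connections": len(links),
        "connections_per_equation": len(links) / len(dfd),
    }


if __name__ == "__main__":
    print(complexity_summary(DFD))
```

Comparing such counts between two candidate structures gives a rough, quantitative handle on the connectivity aspect of transparency; it does not measure interpretability.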
Model equations are complex if they have a large input-output dimension, are highly
nonlinear or contain many parameters and variables. In addition, the mathematical structure
influences equation complexity; a second order polynomial is less complex than a high
order partial differential equation. Equation complexity and model structure complexity are
related; a simple model structure obtained by combining several relations can increase the
complexity of model equations and vice versa.
Interpretability involves physical and non-physical evaluation. A model can be interpretable
in the sense that it is understandable how the model output is derived from the inputs, but this
may have no physical meaning. Again, interpretability of the model structure and the model
equations are taken into account.
Interpretability of the model structure mainly concerns physical interpretability. The structure
is physically interpretable if the elements (the equations) represent a physical process or
phenomenon. This can be analyzed using the data flow diagrams. Interpretability of the
model equations is based on the physical understanding of the mathematical structure of the
equation (the influence of first order over second order effects, for example). In addition,
equations can be interpretable while they have no physical basis; a fuzzy model may be
interpretable in terms of qualitative classification but it is not based on physical principles.
Process independence
The level of process independence depends mainly on the sources of information that are
used during model building. If the main source is process dependent, the model will also
be process dependent. Three main sources are available: first principles, process data and
human expertise.
First principles knowledge makes the model process independent. However, most models
contain a certain level of empirical information which is based on observed process behavior,
which makes the model more process dependent. The level of process dependence is therefore
closely related to the amount of process data that is used during model building. If this is the
main source, the behavior of the model will be based on the observed behavior of that specific
instance of the process. It is then likely that the model will not perform as well for another
instance of the process.
This is also the case for the use of human expertise. Categorizing knowledge is based on the
behavior that is observed from the process and thus process dependent. Process descriptive
or structural knowledge may also be process dependent, but usually to a lesser extent. A
relation between two variables may be present in several instances of a process, but it may
differ quantitatively.
In general, the best way to evaluate model quality is to use a top-down approach. Given the
model application, first determine the relative importance of the as-is utility, maintainability
and portability. Subsequently, set requirements with respect to static and dynamic perfor-
mance, complexity, interpretability and process independence. Analyze only those elements
that are relevant for the application. The requirements have to be incorporated in the model
objectives and setting them is part of the problem definition phase.
3.4 The use of human expertise
Human expertise is used on different levels during hybrid modeling. In the first place, it
is used to analyze the problem and the process and to design the model structure. In addition,
human experience can be used during the design phase to provide quantitative information.
This information can be captured using fuzzy logic. With the use of human expertise, three
aspects need to be considered: the type of knowledge that is available, the eliciting of that
knowledge and using the elicited knowledge for hybrid modeling.
3.4.1 Types of knowledge
Before knowledge processing is discussed, it is insightful to investigate different types of
and views on knowledge. Nonaka and Takeuchi (1995) attribute part of the economic success
of Japan about 20 years ago to the Japanese view on knowledge, which was unique at the
time. According to their view, knowledge can be divided into two types:
Explicit knowledge. Knowledge that can easily be expressed in words or numbers.
Implicit or tacit knowledge. This type of knowledge can consist of segmented expertise and
of schemata, mental models, beliefs and perceptions so ingrained that they are taken for granted.
Explicit knowledge can be stored easily; implicit knowledge cannot. Dealing with implicit
knowledge includes learning something by mind and body. Heavy reliance is placed on
figurative language and symbolism. Some investigation has been done into how implicit and
explicit knowledge express themselves in the process industry (Venkatasubramanian and Rich,
1988). A distinction between deep knowledge and compiled knowledge is made.
Deep knowledge is used to denote knowledge that is generic and process-independent in-
volving concepts, constraints and behaviors of process units that are applicable to a variety
of situations:
Restrictions based on the laws of conservation of mass and energy.
Confluence equations, which represent the influence of one state variable on another
state variable.
A library of fault models of different process units which attempt to explain the local
cause of a low or high state variable value.
Causal models of process units that would generate local effects given some cause.
Deep knowledge can be seen as qualitative knowledge. It gives information
about relations between process units or variables and is not necessarily process specific. This
knowledge can be very useful for designing model structures, and it is the kind of
knowledge that process engineers are likely to have.
Compiled knowledge is essentially a compilation of experiences typically restricted to a given
process. Such knowledge can represent magnitudes of observed behavior and, to a lesser
extent, relations between variables that are process specific. Compiled knowledge can be seen
as quantitative knowledge and can be useful for determining model equations and parameters.
Process operators are likely to have compiled knowledge about the process.
3.4.2 Knowledge structures
For knowledge elicitation, an overview of the forms in which deep and compiled
knowledge are available provides a good basis. Different approaches can be found, general
as well as specific to process knowledge.
A well known approach for developing knowledge-based systems is KADS¹. To capture
expert knowledge, KADS proposes to divide knowledge into four hierarchical layers (Tansley and
Hayball, 1993): the domain layer (basic facts and concepts), the inference layer (the process
of thinking), the task layer (inference sequences) and the strategy layer (selecting and plan-
ning of tasks). KADS aims to be a general approach, so it can also be suited for knowledge
processing for hybrid modeling. However, since the type of knowledge that is available and
its application are fairly clear, KADS may be too extensive.
Sestito and Dillon (1994) claim that the following structures of knowledge are frequently
employed by the problem solver:
Heuristics or rules of thumb.
Stereotypes, that are used to designate typical examples of some objects or situations.
Solution hierarchies. These are frequently associated with the level of detail the prob-
lem solver wishes to deal with at one time.
Procedures, that represent explicitly defined solution strategies and algorithms.
1 KADS may have originally started life as an acronym, but uncertainty appears to have developed
about what it is an acronym of, and it is generally used as a proper name
Pattern matching to check if conditions are satisfied.
Building a model and/or reasoning with that model.
Reasoning with primary case material. This knowledge cannot be reduced to a set of
heuristics because the knowledge is highly context dependent. This reasoning is often
associated with legal domains, but it also appears in other areas such as patient care.
A more recent approach from the field of process technology, and fault diagnosis in
particular, divides knowledge into three categories. First, there is categorizing knowledge, where
(mostly numerical) data is divided into categories ("low", for example) that provide a basis
for further interpretation. Second, there is subdividing knowledge. Subdividing knowledge is
knowledge about subdividing a process into autonomous units. These can be physical process
units, but also specific phenomena. The final category is hypothesis testing knowledge. This
is knowledge about how process variables are related to the behavior of the process.
These views all have some aspects in common. The different interpretations are caused by
the different backgrounds and fields of application. About these similarities, the following
can be said. Categorizing knowledge corresponds to Sestito's pattern matching knowledge
and has some overlap with stereotyping. These are specific examples of compiled knowledge.
Furthermore, subdividing knowledge can be interpreted as the ability to distinguish
hierarchies, such as with Sestito's solution hierarchies. Reasoning with primary case material and
hypothesis testing are both variants of the same compiled knowledge structure.
3.4.3 Knowledge contents
While information about the type and the structure of knowledge is important because
it provides the context in which the knowledge must be elicited, the actual contents of the
knowledge are obviously also important. The final model will be based on the contents of this
knowledge. With the model structure in mind, care must be taken to elicit useful knowledge
from the right person.
The contents of the knowledge a person has are highly dependent on the job of the person.
Process engineers are likely to have knowledge that is deep and which is complemented with
compiled knowledge about process behavior that is appropriate for the given process. The
knowledge is mainly process descriptive. They have the ability to subdivide the process into
logical units or distinguish structures. This deep knowledge is very useful during
problem definition, process analysis and model structure design.
Compiled knowledge that operators may have is quantitative and thus is useful for
identification. The knowledge that will be available will mainly concern process control. The
knowledge an operator has must be viewed in context with the proposed model structure and
the task the operator has. For example, if an operator performs mainly supervisory control
tasks, the knowledge must be viewed with regard to the process including its control struc-
ture. If an operator controls a variable manually, the operator acts as the controller and then
is part of the control loop. The knowledge then also needs to be viewed with regard to the
process including its control structure, but the knowledge in this case is much closer to the
actual process. The knowledge and its application in hybrid model identification need to be
approached in a different way.
3.4.4 Knowledge elicitation
The actual elicitation of the knowledge concerns the acquisition of the knowledge from
the human and recording this knowledge into an appropriate format. In knowledge elicitation,
several phases can be distinguished (Jansen van der Sligte, 1999):
Problem definition.
Problem analysis.
Knowledge acquisition and recording.
Evaluation.
Problem definition and analysis
The goal of the first phase is to determine the context of the problem and to gain initial
information about the process. This information is general in nature and provides information
about the process structure and the modeling problem. This phase also investigates which
human knowledge is available. A plant visit may also be beneficial to gain insight into the
problem.
An appropriate knowledge acquisition technique needs to be used with respect to the knowl-
edge. This is especially important for compiled knowledge. With hybrid modeling, a fuzzy
model is the most appropriate structure to record compiled knowledge. This means that the
choice of the knowledge acquisition technique is limited to those that are suited for recording
the knowledge in fuzzy systems.
Knowledge acquisition and recording
Common knowledge acquisition techniques are based on interviews between a knowledge
engineer (who builds the knowledge based system) and the expert and have been
discussed extensively (see, for example, Hart, 1992; Tansley and Hayball, 1993). Different
interviewing techniques are used; well known examples are the focused interview (where the
interview is prepared in detail by the knowledge engineer), the structured interview (where a
few topics are probed in depth by continuously asking for clarification and justification) and
the tutorial interview (where the expert outlines the main themes and ideas of his knowledge
domain).
Different variations aside, interviewing has been widely used as a basic knowledge acquisi-
tion mechanism (Sestito and Dillon, 1994) and it has been found by some researchers to be
among the most effective. For acquiring deep knowledge, a focused interview usually will
suffice. Based on the initial knowledge the knowledge engineer has obtained, an interview
can be prepared to acquire information about variables, subprocesses, relations and strate-
gies. The knowledge can simply be recorded by making notes or by designing a behavioral
diagram.
More elaborate approaches have been developed for acquiring compiled knowledge. For
recording this knowledge into fuzzy systems, the repertory grid and its variations are
appropriate techniques. The expert's view of the problem and its domain is represented in a
grid, where the first row is filled with elements that represent conclusions or solutions to the
problem. The first column contains constructs (properties) that usually are bipolar (low/high,
big/small). The rest of the grid is filled with a level of truth for a construct, given an element.
The level of truth is given by a number between 1 and 5, where 1 and 5 represent the two
poles of the construct. The grid provides a means to reveal patterns and derive decision rules.
Two variations on the repertory grid that have been designed to build fuzzy systems are
the Knowledge Acquisition for Fuzzy Expert Systems approach or KAFES (Hwang, 1995) and
the Fuzzy Associative Memory or FAM (Marsh, 1994). In the KAFES grid, the constructs
consist of variables and their corresponding bipolar values, described by linguistic values.
The grid is filled with a number representing the linguistic value, given an element. The
number denotes a linguistic value between the two poles. For example, if the bipolar value of
a variable is low/high, -2 denotes "very low", 1 denotes "high" and 0 denotes "normal". In
addition, a degree of certainty can be introduced. The grid forms the rule base; each column
in the grid represents a fuzzy rule. A drawback of the approach is that all the premises for a
conclusion are captured in one rule. In other words, each consequent part of the fuzzy model
appears only once in the model, which may result in restrictions in flexibility that are not
desirable.
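How a KAFES-style grid maps to rules can be sketched as follows. The variables, conclusions and grid values are hypothetical; the labels for -1 and 2 are assumed by symmetry with the values named in the text, and using `None` for a construct that plays no role in a conclusion is a convention of this sketch, not of KAFES itself.

```python
from typing import List, Optional

# Assumed mapping from grid numbers to linguistic values on a low/high scale.
LABELS = {-2: "very low", -1: "low", 0: "normal", 1: "high", 2: "very high"}


def kafes_rules(
    constructs: List[str],
    elements: List[str],
    grid: List[List[Optional[int]]],
) -> List[str]:
    """Turn each grid column (one element) into a single fuzzy rule.

    grid[i][j] holds the value of constructs[i] for elements[j];
    None marks a construct that is not used for that element.
    """
    rules = []
    for j, conclusion in enumerate(elements):
        premises = [
            f"{constructs[i]} is {LABELS[grid[i][j]]}"
            for i in range(len(constructs))
            if grid[i][j] is not None
        ]
        rules.append("IF " + " AND ".join(premises) + f" THEN {conclusion}")
    return rules


# Hypothetical grid: rows are constructs, columns are elements.
CONSTRUCTS = ["temperature", "pressure"]
ELEMENTS = ["open cooling valve", "close feed valve"]
GRID = [
    [1, 0],
    [None, 2],
]

if __name__ == "__main__":
    for rule in kafes_rules(CONSTRUCTS, ELEMENTS, GRID):
        print(rule)
```

Because every conclusion appears in exactly one column, it also appears in exactly one rule, which is the restriction in flexibility mentioned above.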
For simple problems, the Fuzzy Associative Memory is a good alternative. All possible
relations between input and output variables are shown in the grid, two inputs and one
output at a time. The possible (linguistic) values of the two input variables are listed in the
first column and the first row. The grid is filled with a linguistic value of the output variable,
given the combination of the values of the two input variables. The grid is not very suitable
for systems with more than two input variables. However, the class of systems with two input
variables is very common and it is likely that an expert usually does not monitor more than
two input variables at a time. When using more than two input variables, a FAM needs to
be constructed for two inputs at a time, while keeping the other variables constant. Also, a
hierarchical model could be built to describe the multiple input single output system.
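A FAM for two inputs and one output can be written down as a small table and unfolded into rules. The variables, linguistic values and grid entries below are hypothetical and serve only to show the structure.

```python
from typing import List

# Hypothetical linguistic values of the two input variables.
ERROR_LABELS = ["negative", "zero", "positive"]
TREND_LABELS = ["falling", "steady", "rising"]

# FAM grid: rows follow ERROR_LABELS, columns follow TREND_LABELS,
# and each cell holds the linguistic value of the output variable.
FAM_GRID = [
    ["high", "high", "medium"],
    ["high", "medium", "low"],
    ["medium", "low", "low"],
]


def fam_to_rules(
    grid: List[List[str]],
    row_labels: List[str],
    col_labels: List[str],
    input_1: str,
    input_2: str,
    output: str,
) -> List[str]:
    """Unfold the grid into one rule per combination of input values."""
    rules = []
    for i, row_value in enumerate(row_labels):
        for j, col_value in enumerate(col_labels):
            rules.append(
                f"IF {input_1} is {row_value} AND {input_2} is {col_value} "
                f"THEN {output} is {grid[i][j]}"
            )
    return rules


RULES = fam_to_rules(
    FAM_GRID, ERROR_LABELS, TREND_LABELS,
    "temperature error", "temperature trend", "heating",
)

if __name__ == "__main__":
    for rule in RULES:
        print(rule)
```

Since every cell of the grid is filled in, the expert reasons forwards from premises to conclusion and gaps in the knowledge are immediately visible as empty cells.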
The disadvantage of using the KAFES approach is that the expert needs to reason back-
wards: given a conclusion the expert needs to distinguish premises that result in this con-
clusion. Most process operators or engineers are not used to reasoning this way, which may
result in inaccuracy in the knowledge that is elicited. In addition, it is difficult to see from the
KAFES grid whether the obtained knowledge is complete. For problems where several
conclusions can be drawn from similar premises (such as medical diagnosis, where a disease
is characterized by one set of different symptoms, whereas one symptom can be the result of
many different diseases), KAFES may be a practical approach. For elicitation problems such
as presented in this work, however, the FAM approach is better suited and simpler to
apply.
Evaluation
When the knowledge based system is built, its performance can be judged by verification
and validation. This is more important for systems based on compiled knowledge than for
systems based on deep knowledge. Verification checks whether the system is built
right; it concerns the proof of certain logical properties. Validation checks whether the right
system is built; it concerns the determination of a homomorphism between a system and its
representation.
Gonzalez and Dankel (1993) describe validation and verification of expert systems in more
detail. Their discussion is also relevant for fuzzy systems. In their view, verification concerns
investigating the consistency and the completeness of the rule base, while
validation deals with the role of the expert system and determining the measure of performance
for that system. In addition, more detailed information on the evaluation of fuzzy systems can
be found in Marsh (1994) and Rausis (1998). A detailed discussion, however, goes beyond
the scope of this work.
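As a minimal sketch of what such a verification could look like for a fuzzy rule base, the code below takes one simple reading of the two properties: a rule base is inconsistent if the same premise leads to different consequents, and incomplete if some combination of input labels has no rule. These definitions and the example rules are assumptions of the sketch; the criteria in the cited works are more detailed.

```python
from itertools import product
from typing import Dict, List, Sequence, Set, Tuple

Rule = Tuple[Tuple[str, ...], str]  # (premise labels, consequent label)


def find_conflicts(rules: Sequence[Rule]) -> Dict[Tuple[str, ...], Set[str]]:
    """Return premises that occur with more than one distinct consequent."""
    seen: Dict[Tuple[str, ...], Set[str]] = {}
    for premise, consequent in rules:
        seen.setdefault(premise, set()).add(consequent)
    return {p: c for p, c in seen.items() if len(c) > 1}


def find_gaps(
    rules: Sequence[Rule], labels_per_input: Sequence[Sequence[str]]
) -> List[Tuple[str, ...]]:
    """Return all combinations of input labels not covered by any rule."""
    covered = {premise for premise, _ in rules}
    return [c for c in product(*labels_per_input) if c not in covered]


# Hypothetical rule base with one conflict and one missing combination.
RULES: List[Rule] = [
    (("low", "low"), "small"),
    (("low", "high"), "medium"),
    (("high", "low"), "medium"),
    (("low", "high"), "large"),  # conflicts with the second rule
]
LABELS = [["low", "high"], ["low", "high"]]

if __name__ == "__main__":
    print("conflicts:", find_conflicts(RULES))
    print("gaps:", find_gaps(RULES, LABELS))
```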
3.4.5 Using knowledge during identification
The use of compiled (quantitative) knowledge for identification concerns recording the
knowledge into a fuzzy model. This model can be used in hybrid model design to determine
subprocess behavior or can be used for hybrid model parameter identification. In doing so,
under the assumption that a fuzzy model that contains the quantitative knowledge is available,
a distinction is made between model matching and model embedding.
Model matching
In model matching, the fuzzy model that contains the expert knowledge (denoted as the
"expert model") is transformed in such a way that it matches the designed hybrid model
structure (figure 3.4). If the hybrid model structure is designed in advance, it is likely that
the expert model will not describe the input-output behavior that is required. There will
be some form of model overlap, which has to be eliminated. Two different approaches for
accomplishing this have been distinguished, based on the form of the expert model.
[Figure 3.4: Model matching. The expert model is transformed to match the physical framework. Diagram blocks: expert model, model matching, transformed expert model, physical model framework, hybrid model.]
In the case that the expert model describes (part of) the process behavior as opposed to control
actions, parallel model matching is applied. The expert model is placed in parallel with the
model and is used as a data generator that describes part of the process behavior. This data
then can be used the same way as ordinary process measurements are used. To eliminate
model overlap or to obtain the correct input-output mapping that is required for identification
of the fuzzy submodel, the data needs to be transformed. This can be done using estimation
techniques, which will be discussed in chapter 4. Example 3.1 illustrates a case where parallel
model matching can be applied.
Example 3.1 Consider a system for which the following hybrid model structure has been designed:
\[ \dot{x}_1 = f_1(x_1, x_4, u, \theta_1) \tag{3.1} \]
\[ \dot{x}_2 = f_2(x_1, x_2, x_4, u, \theta_2) \tag{3.2} \]
\[ \dot{x}_3 = f_3(x_1, x_3, x_4, u, \theta_3) \tag{3.3} \]
\[ \dot{x}_4 = f_4(x_4, u) \tag{3.4} \]
The parameters are described by the following algebraic equations:
\[ \theta_1 = f_5(x_1, \theta_4) \tag{3.5} \]
\[ \theta_2 = f_6(x_1, x_2) \tag{3.6} \]
\[ \theta_3 = f_7(x_1, x_2) \tag{3.7} \]
\[ \theta_4 = f_8(x_1, x_2) \tag{3.8} \]
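[Figure 3.5: Serial model matching. The expert model $C_e$ is placed in series with the hybrid model, consisting of a known part $P$ and unknown part $P_u$ (a). Panel (a) shows the control loop with setpoint sp, expert model $C_e$, input u, the blocks $P_u$ and $P$, and output y; panel (b) shows the block $H$ between sp and y.]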
Function $f_5$ is assumed to be unknown and will be described by a fuzzy equation. An expert model
$f_e$ is available, which describes the change in $x_1$, $\dot{x}_1$, as a function of $x_1$, $x_2$, $x_4$ and $u$:
\[ \dot{x}_1 = f_e(x_1, x_2, x_4, u) \tag{3.9} \]
If this expert model is placed in parallel with the hybrid model structure, it can be seen that the
model overlap is given by equations 3.8 and 3.1. The expert model can be transformed to provide
a mapping of the form given by equation 3.5, by using parallel model matching. □
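The data transformation in this example can be sketched in code. Everything concrete below is an assumption made for illustration: the form of the expert model, the form of the known relations for equations 3.1 and 3.8, and the grid of operating points. Algebraic inversion stands in for the estimation techniques of chapter 4, and it only works here because the assumed balance is linear in the unknown parameter.

```python
from itertools import product
from typing import List, Sequence, Tuple

Sample = Tuple[Tuple[float, float], float]  # ((x1, theta4), theta1)


def f_e(x1: float, x2: float, x4: float, u: float) -> float:
    """Hypothetical expert model for dx1/dt (stand-in for a fuzzy model)."""
    return -(0.5 + 0.1 * x1 * x2) * x1 + x4 + u


def f_8(x1: float, x2: float) -> float:
    """Hypothetical known relation for the fourth parameter (equation 3.8)."""
    return x1 * x2


def theta1_from_f1(dx1: float, x1: float, x4: float, u: float) -> float:
    """Invert the assumed balance dx1/dt = -theta1 * x1 + x4 + u."""
    if x1 == 0.0:
        raise ValueError("theta1 cannot be recovered at x1 = 0")
    return (x4 + u - dx1) / x1


def generate_training_data(
    x1_values: Sequence[float],
    x2_values: Sequence[float],
    x4_values: Sequence[float],
    u_values: Sequence[float],
) -> List[Sample]:
    """Use the expert model as data generator and map its output to theta1."""
    samples = []
    for x1, x2, x4, u in product(x1_values, x2_values, x4_values, u_values):
        dx1 = f_e(x1, x2, x4, u)            # expert model in parallel
        theta1 = theta1_from_f1(dx1, x1, x4, u)
        samples.append(((x1, f_8(x1, x2)), theta1))
    return samples


DATA = generate_training_data([1.0, 2.0], [0.5, 1.5], [0.0, 1.0], [0.0, 2.0])
```

The resulting samples have the input-output form of equation 3.5 and could be passed to any identification routine for the fuzzy submodel.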
If the expert model is a controller model $C_e$, where the expert is part of the control loop, serial
model matching can be applied. In this case, the expert model is in series with the process
model. In serial model matching, the model structure is converted to transfer functions. The
fuzzy parts of the hybrid model are concentrated to form an unknown block $P_u$ that is placed
in series with the known part. If the closed loop transfer function $H$ is available or can be
identified, this block is the only unknown block of the system and the problem can be solved.
Figure 3.5 illustrates this.
If the closed loop transfer function $H$ is defined as the transfer function between the setpoint
and the process output, the unknown part can be calculated as:
\[ P_u = C_e^{-1} P^{-1} \frac{H}{1 - H} \tag{3.10} \]
where $P$ is the known part of the model, $C_e$ is the expert model and $P_u$ is the unknown part
of the model.
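The relation in equation 3.10 can be checked by writing out the loop. The derivation below is a sketch that assumes a unity negative feedback loop with the three blocks in series, and linear single-input single-output blocks so that the order of multiplication does not matter; $y$ denotes the process output and $sp$ the setpoint.

```latex
\begin{align*}
y &= P \, P_u \, C_e \, (sp - y) \\
H = \frac{y}{sp} &= \frac{C_e P_u P}{1 + C_e P_u P} \\
C_e P_u P &= \frac{H}{1 - H} \\
P_u &= C_e^{-1} P^{-1} \frac{H}{1 - H}
\end{align*}
```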
This requires inversion of the expert model as well as the known process model. Several
techniques for inverting fuzzy models are available (both exact (Baranyi et al., 1997; Baranyi
et al., 1998; da Costa Sousa, 1998; da Costa Sousa et al., 1997) and inexact (da Costa Sousa
et al., 1997; Fischer and Isermann, 1996; Jordan and Rumelhart, 1992)), but inversion of the
expert model can be omitted if the closed loop transfer function is defined as the transfer
function between the process input disturbances and the process output:
\[ P_u = P^{-1} \frac{H}{1 - C_e H} \tag{3.11} \]
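Both expressions for the unknown part can be checked numerically at a single frequency. The sketch assumes linear single-input single-output blocks, unity negative feedback and a disturbance that is added to the controller output; the complex numbers stand for hypothetical frequency responses of the three blocks.

```python
def closed_loop_setpoint(c_e: complex, p_u: complex, p: complex) -> complex:
    """Closed loop response from setpoint to process output."""
    loop = c_e * p_u * p
    return loop / (1.0 + loop)


def closed_loop_disturbance(c_e: complex, p_u: complex, p: complex) -> complex:
    """Closed loop response from process input disturbance to output."""
    process = p_u * p
    return process / (1.0 + process * c_e)


def p_u_from_setpoint(h: complex, c_e: complex, p: complex) -> complex:
    """Equation 3.10: needs the inverse of both C_e and P."""
    return h / (1.0 - h) / (c_e * p)


def p_u_from_disturbance(h: complex, c_e: complex, p: complex) -> complex:
    """Equation 3.11: needs the inverse of P only."""
    return h / (1.0 - c_e * h) / p


# Hypothetical frequency responses at one frequency.
C_E = complex(2.0, 0.5)
P_U = complex(0.8, -0.3)
P = complex(1.5, -1.0)

if __name__ == "__main__":
    h_sp = closed_loop_setpoint(C_E, P_U, P)
    h_d = closed_loop_disturbance(C_E, P_U, P)
    print(p_u_from_setpoint(h_sp, C_E, P))
    print(p_u_from_disturbance(h_d, C_E, P))
```

Under these assumptions both functions return the response that was used to build the closed loop, up to rounding.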
Example 3.2 illustrates a case where serial model matching can be applied.
Example 3.2 Consider the system from example 3.1. Assume that instead of $f_5$, $f_4$ is unknown.
The connection with the known part of the model is serial. Assume that an expert model $C_e$ is
available that describes the following behavior:
\[ u = C_e(e_{x_1}) \tag{3.12} \]
in which $e_{x_1}$ is the error of $x_1$ with respect to its setpoint $x_{1,sp}$.
The unknown function $f_4$ can be determined using serial model matching. Interpret $x_4$ as a system
input to equation $f_1$. The known process model $P$ consists of a combination of $f_1$, $f_5$ and $f_8$. $u$
and $x_2$ are interpreted as additional inputs. The controller $C$ is given by the expert model $f_e$. The
closed loop transfer function is defined as:
\[ x_1 = H(x_{1,sp}) \tag{3.13} \]
while in addition
\[ x_1 = P(x_4, x_2, u) \tag{3.14} \]
and
\[ x_4 = P_u(u) \tag{3.15} \]
The unknown model part can be determined using equation 3.11, if $P^{-1}$ can be determined with
respect to $x_4$. □
Although the approach can successfully be applied to simple problems, the flexibility is
limited. First of all, the approach is only useful if the unknown part is dynamic. If the unknown
part is static, simpler approaches can be used. In addition, the connection between the known
and unknown parts needs to be linear. If this is not the case, the system can be linearized,
but this may result in a relatively complex model structure and limited model validity. Fur-
thermore, the inversion of the known part can be cumbersome or impossible, certainly when
dead time is present.
Model embedding
With model embedding, the expert model is the starting point for hybrid model develop-
ment. The expert model is supplemented with the physical model framework, which needs
to be transformed if model overlap is present. This way, the expert model is embedded in
the physical model framework, instead of transforming it to match the framework structure.
Model embedding is illustrated in figure 3.6.
If the expert model describes control actions, it needs to be inverted prior to embedding. In
this case, it is required that the controller contains model information. This is the case for an
internal model control structure, but may not always be true. Example 3.3 illustrates model
embedding.
[Figure 3.6: Model embedding. The expert model is embedded by transforming the physical framework. Diagram blocks: expert model, model embedding, physical model framework, transformed physical model framework, hybrid model.]
Example 3.3 Consider the system presented in example 3.1 and also assume that $f_5$ is unknown.
In addition, assume that the same expert model is available:
\[ \dot{x}_1 = f_e(x_1, x_2, x_4, u) \tag{3.16} \]
This model can be embedded in the physical framework by replacing equations 3.1, 3.5 and 3.8 by
equation 3.16. □
3.4.6 Remarks
Although deep or qualitative knowledge can easily be applied in hybrid model design,
the use of compiled or quantitative knowledge is complex. Serial model matching has
limited applicability, while model embedding is not always desired if a specific hybrid model
structure is designed in advance.
Since in modern plants most of the quantitative information is registered, it is recommended
to use this registered information instead of eliciting it from humans. Humans base their
experience on the measurements that are available. In addition, information may be lost
during the elicitation process which may result in inaccuracy.
This makes the hybrid modeling approach mostly data-driven, but it should be emphasized
that using deep knowledge to determine the model structure and dependencies is valuable and
that it is worthwhile to investigate this knowledge when solving a hybrid modeling problem.
3.5 Mathematical considerations
For process simulation or online process behavior prediction, the model needs to be available
in dynamical form. Usually, such systems are solved numerically. A mathematical
advantage of using a dynamical physical framework with algebraic fuzzy equations (which
results in a model in DAE form) is that there are no restrictions with respect to sampling
time or solution method when solving the model. If autoregressive fuzzy submodels are used
to describe process dynamics, these restrictions do apply, resulting in a less flexible model
structure.
The type of fuzzy model used here is the Sugeno (TSK) type (Takagi and Sugeno, 1985).
This type of model can be interpreted as a collection of local linear submodels, which makes
it well suited to describing highly nonlinear relations based on process data. It
has been shown that TSK fuzzy functions are universal function approximators (Ying, 1994).
Furthermore, because of its local linear properties, simple algorithms can be used for parameter
estimation, such as the least squares approach (Babuška, 1996). Many good algorithms for
automatic identification of TSK models from data sets are available, including algorithms
with structure optimization.
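The local linear interpretation and the least squares estimation mentioned above can be illustrated with a minimal sketch. It assumes a single input, Gaussian membership functions with fixed centers and widths, and first-order consequents; these choices and all function names are illustrative assumptions, not the identification algorithms referred to in the text.

```python
import numpy as np

def gaussian_memberships(x, centers, widths):
    """Degree of fulfilment of each rule (assumed Gaussian antecedents)."""
    return np.exp(-0.5 * ((x - centers) / widths) ** 2)

def normalized_weights(x, centers, widths):
    """Normalized degrees of fulfilment, shape (n_samples, n_rules)."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    w = gaussian_memberships(x[:, None], centers, widths)
    return w / w.sum(axis=1, keepdims=True)

def tsk_predict(x, centers, widths, slopes, offsets):
    """TSK output: membership-weighted mean of the local linear submodels."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    w = normalized_weights(x, centers, widths)
    local = slopes * x[:, None] + offsets      # output of every local model
    return (w * local).sum(axis=1)

def fit_consequents(x, y, centers, widths):
    """Global least squares estimate of slopes and offsets, antecedents fixed."""
    x = np.asarray(x, dtype=float)
    w = normalized_weights(x, centers, widths)
    regressors = np.hstack([w * x[:, None], w])  # model is linear in the consequents
    theta, *_ = np.linalg.lstsq(regressors, np.asarray(y, dtype=float), rcond=None)
    n_rules = len(centers)
    return theta[:n_rules], theta[n_rules:]

# Illustrative data: approximate sin(x) on [0, 3] with four rules.
centers = np.array([0.0, 1.0, 2.0, 3.0])
widths = np.full(4, 0.5)
x_train = np.linspace(0.0, 3.0, 61)
y_train = np.sin(x_train)
slopes, offsets = fit_consequents(x_train, y_train, centers, widths)
y_fit = tsk_predict(x_train, centers, widths, slopes, offsets)
max_error = float(np.max(np.abs(y_fit - y_train)))
```

Because the normalized weights sum to one, the model output is linear in the consequent parameters once the antecedents are fixed, which is why an ordinary least squares solve suffices in this sketch.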
Although linguistic (or Mamdani) fuzzy models are easily interpretable for humans, they
usually require more fuzzy rules than a TSK model to describe the same phenomenon, which
makes them more complex. The advantage of using TSK models is reduced complexity,
which makes the models more comprehensible. If the main source of data is human experience,
Mamdani models could be easier to identify than TSK models. Since the main source
of data for the identification is process measurements, TSK models are more suitable than
Mamdani relations.
3.6 Concluding remarks
The structure of a hybrid fuzzy-first principles model consists of a framework of dynamic
mass and energy balances, supplemented with algebraic and fuzzy equations, formulated
in state space form. The advantage of such a structure is that it combines a high level of
interpretability with the expectation of good extrapolation properties. Because fuzzy logic
can deal with nonlinearity in a simple way, hybrid models are suited for nonlinear processes
with a large operating regime, such as batch or distributed parameter processes.
Three main sources of information can be used to build hybrid models: first principles, process
data and human expertise. First principles provide information about the physical framework,
while the behavior observed from process data can be used to identify the fuzzy relations.
Human expertise is mainly a source of structural information; process data is more
suitable for quantitative information.
For building hybrid models, a structured modeling approach has been designed which consists
of several independent steps, in which the modeling problem is reduced to several
smaller and simpler problems which can be solved individually. This is an advantage over a
global approach, in which the modeling problem is approached as a whole.
The modeling approach consists of three phases. In the first phase, the model objective
and quality requirements are formulated. In the second phase, the hybrid model structure is
designed and subprocesses are distinguished. Submodels for these subprocesses are subsequently
identified and combined to form the hybrid model. The final phase determines the
model quality and evaluates if the model meets the requirements. Model quality is determined
by evaluating model performance, complexity, interpretability and the sources of information
that were used.
The modeling approach is not a strict set of rules that has to be followed. It serves as a
guideline for developing hybrid fuzzy-first principles models and its execution depends on
the modeling problem at hand. The issues discussed in this chapter should, however, provide
the modeler with a basis for hybrid model structure development.
The following chapter will present the design phase of the modeling approach in more detail.
Chapter 5 will discuss general hybrid model properties.
4 Hybrid model design
Figure 4.1: Design phase. The problem is reduced to several smaller independent problems, which are
solved sequentially (diagram labels: key variables, input-output structure, basic modeling, behavior
estimation and identification, integration, hybrid model)
The design phase of the hybrid modeling procedure is the workhorse in hybrid model
development. This chapter will discuss the different steps in hybrid model design. The
first step is basic modeling, in which the hybrid model structure is determined. In addition,
the process is divided into several smaller subprocesses, which are modeled individually.
Basic modeling is followed by data acquisition, subprocess behavior estimation, submodel
identification and submodel integration. The design phase is visualized in figure 4.1. For
each step, different tools will be presented and their application will be discussed.
Throughout the chapter, the steps will be illustrated with a fed-batch bioreactor for penicillin
production. For this process, a simulator is available which will be used to represent the
actual process. The case is simple and does not represent an actual hybrid modeling problem.
However, the process does possess the properties for which hybrid modeling is useful: a large
operating regime and uncertainty about the phenomena that play a role, such as biomass
growth. In addition, the fuzzy submodels that will be developed are double input single
output systems. This allows simple visualization of the model and therefore provides a better
understanding of the results. Because of the simplicity, it provides a good basis for illustrating
and evaluating the various tools that are used during hybrid model design. The case will
therefore be presented as an illustration.
4.1 Basic modeling
Basic modeling is the first step of the design phase: it determines the hybrid model structure
and distinguishes the subprocesses. The following steps can be distinguished.
Step 1: Process description and information analysis
The function of the process is formulated and the available information is listed. It has practical
and economical advantages if a model can be built on the readily available process
information (that is, information that is available without performing specific experiments),
be it experimental data or human expertise. This will not always be possible, but the available
information should be analyzed before the model structure is designed. This step should
also inform the modeler whether, based on the readily available information, a model can be built
in accordance with the model objectives or whether additional experimentation is needed. The
following steps should be executed within the context of this analysis.
Step 2: Process hypotheses
Before the system can be described with mathematical equations, the physical behavior needs
to be described. Knowledge of basic principles from chemistry and physics can be used to
denote qualitative relations. These hypotheses serve as the basis for the formulation of the
mathematical equations.
Step 3: Process structure
The process is divided into subprocesses and the connections between these are determined.
The subprocesses of the model denote the entities that will be described by a certain equation
or a set of equations. Examples are chemical reaction, mass or heat transfer. These entities
are combined using mass or energy balances (the model framework). The inputs and outputs
of the model are also denoted. This step yields a behavioral or data flow diagram (Yourdon,
1989), in which the relations between the entities are shown.
In this step, the fuzzy and first principles parts of the model are denoted. Based on model
objectives, first principles, process expertise and the experience of the modeler, it can be
decided if it is feasible to describe certain effects using physical equations. In addition, it
can also be decided to lump certain effects that are too complex to describe physically, so that
transparency is guaranteed. At this stage, the structure of the hybrid model is determined.
Step 4: Basic equations
The mass and energy balances can be drawn up with the help of the determined hypotheses
about the behavior. Algebraic equations describing the driving forces and equilibria can also
be determined. Model parameters are not determined at this stage. This is also the case for
the fuzzy equations. These will be determined in the identification step.
The result of the basic modeling step is the model structure defined by the basic equations.
The subprocesses that will be modeled with fuzzy logic are also determined. Appropriate
process data can now be collected to facilitate model identification.
In the following example a hybrid model structure for a fed-batch bioreactor for penicillin
fermentation will be developed. This reactor will serve as an example throughout the rest of
the chapter and the model is based on the information provided in Thompson and Kramer
(1994).
Example 4.1 Assume that the problem definition phase has been completed. The model objective
is to describe the penicillin concentration during a batch run, based on process inputs and initial
conditions. The model should be based on global physical principles. In order to accomplish this,
the following key variables have been distinguished: the substrate concentration S, the biomass
concentration X, the penicillin concentration P and the volume V .
Step 1: Process description and information analysis
At the beginning of a batch, a small culture of biomass X is present. The tank reactor is filled
slowly with a feed F that contains a substrate $S_F$, which causes the biomass X to grow and
produce penicillin P. The feed rate is constant. The tank is stirred and a typical batch run lasts
200 hours. Measurements of the biomass X, substrate S, product P and volume V as well as the
inputs F and $S_F$ are available every hour.
Step 2: Process hypotheses
Based on the available literature (such as Bastin and Dochain (1990) and Dunn et al. (1992)), the
general mechanisms that play a role in the process can be identified. The key variables can be
described directly by state equations.
The volume balance is straightforward; the accumulation is given by the feed rate F.
The substrate S is added by the feed stream and is consumed by the biomass. This consumption is
used for biomass growth, biomass maintenance and penicillin production. In addition, the dilution
influences the substrate concentration.
The biomass accumulation is determined by biomass growth, biomass decay and a dilution factor
that is a result of the changing volume. Since a detailed description of these processes is not
required by the model objectives, the assumption is made that no dynamics are present in these
phenomena and that they can be described by algebraic equations. This is common practice (Dunn
et al., 1992).
The accumulation of the penicillin concentration, finally, is determined by the production rate,
dilution and product decay, which is assumed to be constant.
Although actual operation and behavior of this kind of process are more complex than suggested
by this model, its simple nature is useful for illustration purposes and procedure development.
For example, the following phenomena also characterize the actual process but are not
modeled:
• Growth rate decreases with biomass due to oxygen diffusion limitations.
• Cell decay becomes appreciable at high biomass concentration.
• Maintenance energy of the culture depends on biomass concentration.
• The penicillin product decays as the culture ages.
Step 3: Process structure
The process hypotheses can be organized in a data flow diagram which represents the hybrid model
structure, as shown in figure 4.2. It is assumed that the equations that govern the biomass growth,
the biomass decay and the production of penicillin are unknown. Biomass growth and decay are
therefore lumped to form a net growth rate. The behavior of the net growth rate and product
formation rate will be derived from the measurements. Using these measurements, fuzzy models
will be built that describe the rates.
Figure 4.2: Data flow diagram hybrid model bioreactor (blocks: biomass balance, volume balance,
substrate balance, product balance, dilution rate, net growth rate, substrate consumption, production rate)
Step 4: Basic equations
The set of equations for the hybrid model can now be set up. The accumulation balances that form
the physical framework are given by:
$$\frac{dX}{dt} = X(\mu - D) \tag{4.1}$$
$$\frac{dS}{dt} = -\sigma X + (S_f - S)D \tag{4.2}$$
$$\frac{dP}{dt} = q_p X - P(D + K) \tag{4.3}$$
$$\frac{dV}{dt} = F \tag{4.4}$$
The supplemental algebraic equations are given by:
$$D = \frac{F}{V} \tag{4.5}$$
$$\mu = f_{fuzzy}(S, X) \tag{4.6}$$
$$q_p = f_{fuzzy}(S, X) \tag{4.7}$$
=
Y
x/s
+
q
p
Y
p/s
+m
x
(4.8)
with
m
x
=
m
xm
X
X + 10
(4.9)
and
=
m
S
K
X
X + 10
(4.10)
Here, $Y_{x/s}$, $Y_{p/s}$, $m_{xm}$, $K_X$
and K are constants. The relation for is based on information from
Thompson and Kramer (1994). Now that the structure is designed, the model parameters and the fuzzy
equations can be identified. □
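As a rough illustration of how the model structure of example 4.1 could be simulated once the rate relations are available, the sketch below integrates balances 4.1 to 4.4 with the explicit Euler method. The rate functions `mu_fn`, `qp_fn` and `sigma_fn` are hypothetical placeholders for the fuzzy and algebraic submodels, the sign convention of the substrate balance is an assumption, and the numerical values in the usage lines are illustrative only.

```python
def simulate_hybrid_bioreactor(mu_fn, qp_fn, sigma_fn, X0, S0, P0, V0,
                               F, S_f, K, t_end=200.0, dt=0.01):
    """Explicit Euler integration of the balances for X, S, P and V."""
    X, S, P, V = float(X0), float(S0), float(P0), float(V0)
    for _ in range(int(round(t_end / dt))):
        D = F / V                        # dilution rate (eq. 4.5)
        mu = mu_fn(S, X)                 # net growth rate: slot for a fuzzy submodel
        qp = qp_fn(S, X)                 # production rate: slot for a fuzzy submodel
        sigma = sigma_fn(S, X, mu, qp)   # substrate consumption rate (assumed callable)
        dX = X * (mu - D)
        dS = -sigma * X + (S_f - S) * D  # assumed sign: consumption lowers S
        dP = qp * X - P * (D + K)
        dV = F
        X, S, P, V = X + dt * dX, S + dt * dS, P + dt * dP, V + dt * dV
    return X, S, P, V

# Usage with trivial placeholder rates (pure dilution, no reaction).
X_end, S_end, P_end, V_end = simulate_hybrid_bioreactor(
    mu_fn=lambda S, X: 0.0,
    qp_fn=lambda S, X: 0.0,
    sigma_fn=lambda S, X, mu, qp: 0.0,
    X0=5.0, S0=0.5, P0=0.0, V0=20.0, F=0.110, S_f=525.0, K=0.01, t_end=10.0)
```

With all rates set to zero the run reduces to dilution only, so the total biomass X·V should stay approximately constant, which gives a simple sanity check on the balance equations.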
4.2 Data acquisition
In order to be able to identify the submodels that describe the subprocesses, appropriate
process data needs to be available. This data can consist of historical plant data or it can be
obtained by performing experiments that are designed especially for this purpose.
When designing experiments, care has to be taken that the data will best reveal the behavior
of the subprocess. Experiments can be designed more effectively when knowledge that is
available is used. In addition, exploratory experiments can be useful. Additional experiments
are then designed on the basis of these initial results.
Experiments based on factorial design provide a useful way to measure behavior. In factorial
design, the process behavior is measured at different levels of certain variables (or factors).
Experiments are carried out for all possible combinations of the factors. To limit the number
of experiments for systems with many factors, fractional factorial design can be used (Box et
al., 1978). In this approach, suitable fractions of full factorial designs are generated based on
redundancy information.
Research on experiment design based on statistical methods is abundant and many text books
about experiment design are available. See for example Box et al. (1978).
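The difference between a full and a fractional factorial design can be sketched in a few lines. The factor names and levels below are illustrative assumptions, and the half fraction uses the common defining relation in which the last factor equals the product of the others; this is one possible construction, not necessarily the one used in the cited text books.

```python
from itertools import product

def full_factorial(levels):
    """All combinations of the given factor levels (dict: name -> list of levels)."""
    names = list(levels)
    return [dict(zip(names, combo))
            for combo in product(*(levels[name] for name in names))]

def half_fraction_two_level(n_factors):
    """2^(n-1) design in coded units (-1, +1): last factor = product of the others."""
    runs = []
    for combo in product((-1, 1), repeat=n_factors - 1):
        last = 1
        for level in combo:
            last *= level
        runs.append(combo + (last,))
    return runs

# Hypothetical factors: initial biomass concentration and feed rate.
design = full_factorial({"X0": [5.0, 15.0, 30.0], "F": [0.110, 0.220]})
fraction = half_fraction_two_level(3)
```

The full design grows as the product of the number of levels per factor, whereas the half fraction needs only half the runs of the corresponding two-level design at the cost of confounding some interactions.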
Example 4.2 Consider the bioreactor from example 4.1. For the identification of the two fuzzy
relations (equations 4.6 and 4.7), sufficient measurements are needed in order to obtain the behavior
of the relations. This means that the data needs to be distributed over a sufficiently large domain in
the input space, so that all operating conditions are covered.
The input space is the same for both relations; the input space is formed by S and X. Figure 4.3 (a)
shows the trajectory of a typical batch run in the input space. In order to obtain a good distribution,
it is assumed that several experiments can be done by varying the initial conditions and system
inputs. 16 different experiments were designed (see table 4.1) to accomplish this.
Batch #   X (g/l)   S (g/l)   P (g/l)   V (l)   F (l/h)   S_f (g/l)
ID1        5.0       0.5       0.0      20.0    0.110     525
ID2        5.0       0.5       0.0      20.0    0.132     525
ID3        5.0       0.5       0.0      20.0    0.154     525
ID4        5.0       0.5       0.0      20.0    0.176     525
ID5        5.0       0.5       0.0      20.0    0.198     525
ID6        5.0       0.5       0.0      20.0    0.220     525
ID7        7.5       0.5       0.0      20.0    0.110     525
ID8       10.0       0.5       0.0      20.0    0.110     525
ID9       12.5       0.5       0.0      20.0    0.110     525
ID10      15.0       0.5       0.0      20.0    0.110     525
ID11      17.5       0.5       0.0      20.0    0.110     525
ID12      20.0       0.5       0.0      20.0    0.110     525
ID13      22.5       0.5       0.0      20.0    0.110     525
ID14      25.0       0.5       0.0      20.0    0.110     525
ID15      27.5       0.5       0.0      20.0    0.110     525
ID16      30.0       0.5       0.0      20.0    0.110     525
VAL1       8.25      1.0       0.0      20.0    0.165     525
Table 4.1: Initial conditions for batch runs
No actual reactor setup was available to obtain the measurements. Instead, a simulator was used
to generate the measurements. The simulator model is taken from Thompson and Kramer (1994).
Noise was added to the simulation results. A detailed description of the model and the noise that
was added is given in appendix B.
The distribution in the input space that was obtained is shown in figure 4.3. Since the outputs of
the fuzzy relations, $\mu$ and $q_p$, cannot be measured, they are estimated from the states of the model.
Therefore, in addition to S and X, P and V are also recorded. □
Figure 4.3: Input space distribution of a typical batch run (a) and input space distribution of experiments (b).
Both panels show X (g/l) against S (g/l); panel (a) marks the start and end of the run.
4.3 Subprocess behavior estimation
Data preprocessing is an almost unavoidable step during process modeling. Obvious approaches
such as filtering, scaling or data reduction can be applied, but more importantly,
estimation techniques provide ways to obtain information about variables or parameters that
cannot be measured directly. In hybrid modeling, this is the case for many subprocess parameters,
such as reaction rates. Estimates of these parameters can then be used during
the identification process. Over the years, much research has been done and numerous papers
and text books have appeared on the subject (see for example Eykhoff (1974), Luyben (1990),
Ramirez (1994) and Seinfeld and Lapidus (1974)). For modeling of chemical processes,
the techniques should be able to estimate nonlinear and time varying variables or parameters.
This section will discuss two approaches well suited for the problem: the extended
Kalman filter and the PI-estimator.
4.3.1 Kalman filtering
The well-known Kalman filter (Kalman, 1960) was originally developed as a state estimator,
but can also be used for parameter estimation. The discussion of the filter will be
presented in discrete form (since measurements mostly will be available in discrete form).
Consider the following linear system:
$$x_{k+1} = \Phi x_k + \Gamma u_k + w_k \tag{4.11}$$
$$y_k = H x_k + v_k \tag{4.12}$$
where $x_k$ is the process state vector at time step $k$, $u_k$ is the process input vector, $\Phi$ and $\Gamma$ are
the state transition and input transition matrices, respectively, $y_k$ is the model output vector, $H$
is the measurement matrix and $w_k$, $v_k$ are vectors assumed to be sequences of white noise
with zero crosscorrelation. The covariance matrices for these noise vectors are given by
$$E[w_j w_i^T] = \begin{cases} Q_{ij} & \text{for } i = j \\ 0 & \text{for } i \neq j \end{cases} \tag{4.13}$$
$$E[v_j v_i^T] = \begin{cases} R_{ij} & \text{for } i = j \\ 0 & \text{for } i \neq j \end{cases} \tag{4.14}$$
$Q$ and $R$ are diagonal matrices. Assume an estimate of the process state $\hat{x}_k^-$ is available. The
hat denotes an estimate and the superscript minus indicates that it is an estimate prior to assimilating
the measurement at time step $k$. Let
$$e_k^- = x_k - \hat{x}_k^- \tag{4.15}$$
$$P_k^- = E[e_k^- (e_k^-)^T] \tag{4.16}$$
When an estimate is made, an observer equation can be used to improve the estimate in the
following way:
$$\hat{x}_k = \hat{x}_k^- + G_k i_k \tag{4.17}$$
$$i_k = y_k - H_k \hat{x}_k^- \tag{4.18}$$
with $i_k$ the innovation and $G_k$ the Kalman gain at time step $k$. The Kalman gain is given by
the following equation (Brown and Hwang, 1992):
$$G_k = P_k^- H_k^T (H_k P_k^- H_k^T + R_k)^{-1} \tag{4.19}$$
and the corresponding estimation covariance matrix for $\hat{x}_k$ is given by:
$$P_k = (I - G_k H_k) P_k^- \tag{4.20}$$
The optimal estimate $\hat{x}_k$ can thus be calculated. To determine the estimate at the next time
step, the process state can be projected ahead using equation 4.11. Its error can be written as:
$$e_{k+1}^- = x_{k+1} - \hat{x}_{k+1}^- \tag{4.21}$$
$$\phantom{e_{k+1}^-} = (\Phi x_k + \Gamma u_k + w_k) - \Phi \hat{x}_k - \Gamma u_k \tag{4.22}$$
$$\phantom{e_{k+1}^-} = \Phi e_k + w_k \tag{4.23}$$
which can be used to calculate the estimation error covariance matrix:
$$P_{k+1}^- = \Phi_k P_k \Phi_k^T + Q_k \tag{4.24}$$
The estimate and its covariance are now available, so the Kalman gain can be calculated for
this estimate to obtain the optimal estimate. This process is repeated for each time step.
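One cycle of the recursion described above (measurement update followed by projection to the next time step) can be sketched as follows. The function name and argument order are assumptions made for illustration; the symbols follow equations 4.11 to 4.24, with `Phi` and `Gamma` standing for the state transition and input transition matrices.

```python
import numpy as np

def kalman_step(x_prior, P_prior, y, u, Phi, Gamma, H, Q, R):
    """One Kalman filter cycle: correct the prior estimate, then project ahead."""
    innovation = y - H @ x_prior                                 # eq. 4.18
    G = P_prior @ H.T @ np.linalg.inv(H @ P_prior @ H.T + R)     # eq. 4.19
    x_post = x_prior + G @ innovation                            # eq. 4.17
    P_post = (np.eye(len(x_prior)) - G @ H) @ P_prior            # eq. 4.20
    x_next = Phi @ x_post + Gamma @ u                            # projection, eq. 4.11 without noise
    P_next = Phi @ P_post @ Phi.T + Q                            # eq. 4.24
    return x_post, P_post, x_next, P_next, innovation

# Scalar illustration: constant state, unit prior variance, unit measurement noise.
one = np.array([[1.0]])
zero = np.array([[0.0]])
x_post, P_post, x_next, P_next, innovation = kalman_step(
    x_prior=np.array([0.0]), P_prior=one, y=np.array([2.0]), u=np.array([0.0]),
    Phi=one, Gamma=zero, H=one, Q=zero, R=one)
```

In this scalar case the prior and the measurement are equally uncertain, so the gain is one half and the corrected estimate lies midway between them.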
The filter equations were derived for linear systems. Most systems under investigation here,
however, are nonlinear. The filter can be adjusted to deal with nonlinear systems
by introducing a state transition matrix $\Phi(x_k, u_k)$ that calculates the state transition for each
time step. The same can be done for the input transition matrix. This results in a local
linearization of the system. Equation 4.11 then becomes:
$$x_{k+1} = \Phi(x_k, u_k) x_k + \Gamma(x_k, u_k) u_k + w_k \tag{4.25}$$
The standard Kalman filter configuration is shown in figure 4.7.
Parameter estimation
Parameters or parameter vectors of the process model can also be estimated using the
Kalman filter. This can be done by introducing them as additional system states. This way,
they can easily be incorporated in the general filter structure. Additional state equations of
the following form are introduced:
$$\frac{d\theta}{dt} = 0 \tag{4.26}$$
in which $\theta$ denotes the parameter (vector). By setting the derivative to zero, the parameters are
assumed to be constant. However, the filter is also able to estimate time-varying parameters
using this approach. The quality of the estimates of time-varying parameters depends on the
tuning of the filter.
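The state augmentation idea can be illustrated with a small extended Kalman filter that estimates a growth rate from biomass measurements only. The sketch assumes a scalar biomass balance dX/dt = (mu - D)X discretized with Euler's method, a constant dilution rate, and noise-free synthetic data; the function name, tuning values and data are illustrative assumptions, not the filter used later in this chapter.

```python
import numpy as np

def ekf_growth_rate(measurements, x0, mu0, dilution, dt, Q, R, P0):
    """EKF on the augmented state [X, mu], with mu modelled as a constant state."""
    z = np.array([x0, mu0], dtype=float)
    P = np.array(P0, dtype=float)
    H = np.array([[1.0, 0.0]])                 # only X is measured
    history = []
    for y in measurements:
        # Measurement update
        S = H @ P @ H.T + R
        G = P @ H.T / S
        z = z + (G * (y - z[0])).ravel()
        P = (np.eye(2) - G @ H) @ P
        history.append(z.copy())
        # Time update: local linearization of the discretized balance
        X, mu = z
        Phi = np.array([[1.0 + dt * (mu - dilution), dt * X],
                        [0.0, 1.0]])
        z = np.array([X + dt * (mu - dilution) * X, mu])
        P = Phi @ P @ Phi.T + Q
    return np.array(history)

# Synthetic, noise-free data generated with a constant true growth rate.
true_mu, D, dt = 0.05, 0.0055, 1.0
X_true = [5.0]
for _ in range(59):
    X_true.append(X_true[-1] * (1.0 + dt * (true_mu - D)))
estimates = ekf_growth_rate(X_true, x0=5.0, mu0=0.0, dilution=D, dt=dt,
                            Q=np.diag([1e-4, 1e-6]), R=1e-2,
                            P0=np.diag([1.0, 1e-2]))
mu_hat = float(estimates[-1, 1])
```

The entry of Q that belongs to the parameter state controls how quickly the estimate may drift, which is the tuning handle for time-varying parameters mentioned above.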
Observability and robustness
In order to determine the state of the process uniquely from a series of measurements
over the interval $[t_0, t_f]$, the process needs to be observable. Consider the system described
by equations 4.11 and 4.12 and do not consider the noise vectors (this is the noiseless process
model). Because of its recursive structure, the state equation can also be written as:
$$x_{k+1} = \Phi^k x_0 + \Gamma^k u_0 \tag{4.27}$$
with $x_0$ and $u_0$ the state and input vector at $t = 0$. From the measurement equation, the
process output vector can be represented by:
$$y_k = H \Phi^k x_0 \tag{4.28}$$
The linear system is called observable when a unique representation of $x_0$ in terms of $y_k$
is possible. When the system is observable, all the estimates can be determined uniquely
with the measurement system. The observability is judged by evaluating the rank of the
observability matrix:
$$O_{0,k_f} = \sum_{i=0}^{k_f} (\Phi^i)^T H^T H \Phi^i \tag{4.29}$$
with $k_f$ the number of measurements. The system is observable if the observability matrix is
non-singular. For non-linear systems, the observability can be calculated by replacing $\Phi$ with
$\Phi(x_k, u_k)$. Instead of using $\Phi^k$, the product of the various state transition matrices from time
step 0 to time step $k$ is used:
$$\prod_{i=0}^{k} \Phi(x_i, u_i) \tag{4.30}$$
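The rank test on the observability matrix can be sketched for the linear, time-invariant case as follows. The function names and the small example system are assumptions chosen for illustration.

```python
import numpy as np

def observability_matrix(Phi, H, k_f):
    """Sum over i = 0..k_f of (Phi^i)^T H^T H Phi^i, as in equation 4.29."""
    n = Phi.shape[0]
    O = np.zeros((n, n))
    Phi_i = np.eye(n)                    # Phi^0
    for _ in range(k_f + 1):
        O += Phi_i.T @ H.T @ H @ Phi_i
        Phi_i = Phi @ Phi_i              # next power of Phi
    return O

def is_observable(Phi, H, k_f):
    """True when the observability matrix has full rank (is non-singular)."""
    O = observability_matrix(Phi, H, k_f)
    return bool(np.linalg.matrix_rank(O) == Phi.shape[0])

# Hypothetical two-state system: the second state drives the first.
Phi = np.array([[1.0, 1.0],
                [0.0, 1.0]])
measure_first = np.array([[1.0, 0.0]])
measure_second = np.array([[0.0, 1.0]])
observable_from_first = is_observable(Phi, measure_first, k_f=1)
observable_from_second = is_observable(Phi, measure_second, k_f=5)
```

Measuring the first state reveals the second through the coupling in `Phi`, whereas measuring only the second state never carries information about the first.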
The robust integrity of the Kalman filter can be verified by determining a stability region
(see Ramirez (1994)). Consider the measurement perturbation slope $n$ for the measurement
configuration of the filter and assume that the measurement covariance matrix $R$ is diagonal.
The Kalman filter is stable for all $n$ such that
$$n > \frac{1}{2}(1 - \lambda_0) \tag{4.31}$$
where $\lambda_0$ is the minimum eigenvalue of
$$(R^{1/2} G^T Q^{-1} G R^{1/2})^{-1} > 0 \tag{4.32}$$
The criterion imposes an upper bound on the gains in the Kalman filter. Gains much larger
than the process perturbations can cause the filter to respond too sensitively, resulting in instability.
A value of the bound in equation 4.31 close to the maximum value of 1/2 indicates a small region
of robust stability. If a high degree of integrity and robustness is required, then $\lambda_0$ must be
maximized by the choice of $Q$ and $R$.
Figure 4.4: Kalman estimates of $\mu$ (a), X (b), $q_p$ (c) and P (d), each plotted against t (h). Lines indicate
estimates, dots measurements.
Filter tuning
Although matrices $Q$ and $R$ are defined as covariance matrices, they can be used to
tune the Kalman filter and improve the performance. This is a trial and error process. The
innovation (equation 4.18) is an important tool for tuning the filter. Since the noise vector
$w_k$ in equation 4.11 is assumed to be a sequence of white noise, it is a good indication that
the filter is tuned well if the innovation is reduced to a sequence of white noise. This can be
checked by calculating the autocorrelation in the innovation.
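A minimal sketch of such a whiteness check is given below. The normalized sample autocorrelation is standard; the approximate 95% band of 1.96/sqrt(N) used to flag significant lags is an assumed rule of thumb, and the function names are illustrative.

```python
import numpy as np

def autocorrelation(series, max_lag):
    """Normalized sample autocorrelation for lags 0..max_lag."""
    x = np.asarray(series, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)
    return np.array([np.dot(x[:len(x) - lag], x[lag:]) / denom
                     for lag in range(max_lag + 1)])

def lags_outside_band(series, max_lag):
    """Number of nonzero lags whose autocorrelation exceeds the approximate 95% band."""
    r = autocorrelation(series, max_lag)
    bound = 1.96 / np.sqrt(len(series))
    return int(np.sum(np.abs(r[1:]) > bound))

# A strongly structured (alternating) innovation sequence is clearly not white.
alternating = [1.0, -1.0] * 50
r_alt = autocorrelation(alternating, 2)
n_significant = lags_outside_band(alternating, 5)
```

For a well tuned filter, only a small fraction of the nonzero lags would be expected to fall outside the band.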
Example 4.3 A simple Kalman filter can be designed to estimate $\mu$ and $q_p$ for the bioreactor. The
measurements of X and P are sufficient for the system to be observable. The system can be
represented as:
$$\frac{d}{dt}\begin{bmatrix} X \\ P \\ \mu \\ q_p \end{bmatrix} = \begin{bmatrix} (\mu - D) & 0 & 0 & 0 \\ q_p & -(D + K) & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} X \\ P \\ \mu \\ q_p \end{bmatrix} \tag{4.33}$$
in which
$$D = \frac{F}{V} \tag{4.34}$$
The measurement matrix $H$ for this system is given by:
$$H = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{bmatrix} \tag{4.35}$$
and the innovation vector is defined as:
$$i = \begin{bmatrix} X - \hat{X} \\ P - \hat{P} \end{bmatrix} \tag{4.36}$$
in which X and P are the measurements of the biomass concentration and the product concentration,
while $\hat{X}$ and $\hat{P}$ are the corresponding estimates. The covariance matrices were set to:
$$Q = \begin{bmatrix} 10^{-4} & 0 & 0 & 0 \\ 0 & 10^{-4} & 0 & 0 \\ 0 & 0 & 5 \cdot 10^{-6} & 0 \\ 0 & 0 & 0 & 7 \cdot 10^{-5} \end{bmatrix}, \quad R = \begin{bmatrix} 0.2 & 0 \\ 0 & 0.01 \end{bmatrix} \tag{4.37}$$
These settings were used for runs ID1 to ID16 and corresponded with an average stability border of
0.49, which indicates a small region of robust stability. The estimates, however, were found to be
of acceptable quality. Estimation results for run ID1 are shown in figure 4.4 and the autocorrelation
in the innovation is shown in figure 4.5. For clarity, only some of the measurements are shown. □
Figure 4.5: Autocorrelation in Kalman filter innovation as a function of the lag (curves for X and P)
4.3.2 PI-estimation
A simple alternative to the Kalman filter for nonlinear parameter estimation is the PI estimator,
which is similar to the PI feedback controller (Van Lith et al., 2001). It has many
similarities with the Kalman filter, but is much easier to set up. Consider the nonlinear parameter
of the system under study as a system input. The effect on the behavior of the model
of this new input is dictated by the model equations (as with normal system inputs), while
the nonlinear time-varying behavior of the parameter itself needs to be determined by some
other means.
Figure 4.6: PI-control scheme for parameter estimation
Since the parameter is viewed as an input, this input can be used to control the behavior of
the model. This behavior needs to be the same as the actual physical behavior of the system
described by measurements. The actual behavior can be seen as a trajectory that the model
needs to follow. The correct nonlinear estimation of this new input will result in correct
model behavior.
This task can be accomplished by designing a simple feedback controller which controls the
behavior of the model by manipulating the nonlinear parameter. The controller output serves
as an estimate for the behavior of the nonlinear parameter. This is only true if the complete
initial state is known. The controller takes the difference between the controlled model output
and the desired trajectory (i.e. the innovation) as input. The controller is chosen to be a
familiar PI-controller for fast response and elimination of offset:
$$\theta = L_1 i + L_2 \int i \, dt \tag{4.38}$$
where $\theta$ is the parameter vector and $i$ is the innovation. $L_1$ and $L_2$ are diagonal matrices of
appropriate dimensions (with respect to the parameter vector $\theta$).
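For a single parameter, the PI-estimator can be sketched as a discrete-time loop in which the model is driven by the current estimate and the innovation feeds the proportional and integral terms. The sketch assumes a scalar biomass balance dX/dt = (mu - D)X integrated with Euler's method, a known initial state and noise-free synthetic measurements; the function name and the data are illustrative assumptions.

```python
def pi_estimate_growth_rate(measurements, x0, dilution, gain, tau_i, dt):
    """Scalar PI-estimator: mu_hat = K*i + (K/tau_i)*integral(i), cf. equation 4.38."""
    x_model = float(x0)          # model state, assumed to start at the true initial state
    integral = 0.0
    estimates = []
    for y in measurements:
        innovation = y - x_model                      # measured minus modelled output
        integral += innovation * dt
        mu_hat = gain * innovation + (gain / tau_i) * integral
        estimates.append(mu_hat)
        x_model += dt * (mu_hat - dilution) * x_model  # model driven by the estimate
    return estimates

# Synthetic measurements generated with a constant true growth rate over 50 h.
true_mu, D, dt = 0.05, 0.0055, 0.01
X_meas = [5.0]
for _ in range(4999):
    X_meas.append(X_meas[-1] * (1.0 + dt * (true_mu - D)))
mu_estimates = pi_estimate_growth_rate(X_meas, x0=5.0, dilution=D,
                                       gain=0.06, tau_i=2.0, dt=dt)
mu_final = mu_estimates[-1]
```

The integral term removes the offset: once the model follows the measurements, the innovation vanishes and the integral alone carries the parameter estimate.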
The control scheme is shown in figure 4.6. In this figure, the control action is concatenated
to the input vector of the system to obtain the model input vector $u^*$:
$$u^* = \begin{bmatrix} u \\ \theta \end{bmatrix} \tag{4.39}$$
Following more conventional PI-controller notation, the tuning parameters, the elements on
the diagonals of $L_1$ and $L_2$, can be written in terms of the controller gain $K$ and integral time
constant $\tau_i$ as follows:
$$l_{1,ii} = K \tag{4.40}$$
$$l_{2,ii} = \frac{K}{\tau_i} \tag{4.41}$$
Comparison with Kalman filtering
The controller interpretation of parameter estimation shows some analogies with conventional
parameter estimation using state estimators. If a state estimator, such as a Kalman
filter (or Luenberger observer), is used to make parameter estimations, the parameter is introduced
as an additional state variable (see equation 4.26). The filter configuration is shown in
figure 4.7.
In figure 4.7 (a), equation 4.26 is a part of the function block f. Rearranging the equations
yields the configuration shown in figure 4.7 (b). The extra integral block is the state equation
for the parameter (equation 4.26), which is taken outside the function block f. This results in
$G$ being replaced with a different gain matrix $G_1$ and the introduction of an additional gain
matrix $G_2$. Thus figure 4.7 (a) and (b) are different representations of the same Kalman filter.
The configurations in figure 4.6 and figure 4.7 (b) are quite similar with respect to integral
action. But there are also some differences. Consider a single parameter $\theta$. First of all, in
the controller configuration there is an additional proportional term that adjusts the estimates
for $\theta$. In the Kalman filter, this term is not present. Secondly, the matrix $G_1$ of the Kalman
filter in figure 4.7 (b) adjusts the estimates of the complete state $x$. These adjustments are
made in addition to the adjustments that are only made to parameter $\theta$. So the estimates of $\theta$,
propagated through the model f, result in estimates of $x$, which are subsequently adjusted by
the innovation $i$ and $G_1$. Thus this mechanism can correct estimation errors in $x$ caused by
the estimates of $\theta$. This mechanism is not present in the controller configuration. However,
this can be seen as an advantage, because in the controller configuration, $x$ is only calculated
accurately if the estimates of $\theta$ are accurate.
Obviously, the calculation of the gain matrices $G$, $L_1$ and $L_2$ differs. The gains $L_1$
and $L_2$ are fixed and are set using PI tuning procedures. In practice, tuning of the Kalman filter
is not done by adjusting $G$, which is calculated by the Kalman equations, but by adjusting
the noise covariance matrices $Q$ and $R$, from which $G$ is calculated.
Controller configuration
When the controller configuration is used in multiple parameter estimation problems,
appropriate control loops need to be chosen. In other words, it has to be decided for each
parameter which state is used to obtain the estimates. Analysis has to be done to determine the
interactions between the various parameters and states, so that sensible control loops can be
chosen. Using the relative gain array (see Bristol (1966)), pairs of inputs $\theta_j$ (the parameters
that have to be estimated) and outputs $x_i$ (the states that are controlled by the parameter
estimates) can be selected in order to minimize the amount of interaction among the resulting
loops.
Figure 4.7: Standard (a) and alternate (b) Kalman filter configuration
If the static gain matrix $G_{static}$ of the transfer functions of the system is available, the relative
gain array can also be calculated as follows (Roffel and Chin, 1987):
$$\lambda_{ij} = g_{ij} \, (G_{static}^{-1})^T_{ij} \tag{4.42}$$
in which $g_{ij}$ denotes the ij-th element of the static gain matrix $G_{static}$ and $\lambda_{ij}$ is the ij-th
element of the relative gain array $\Lambda$.
The relative gain array provides a measure of the interaction based on steady-state considerations.
Therefore, the rule given above for the selection of loops does not guarantee that
the dynamic interaction between the loops will also be minimal. The relative gain array can
be replaced by its dynamic counterpart to account for this (Roffel and Chin, 1987). The
interaction can then be calculated for different frequencies.
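The steady-state relative gain array is an elementwise product of the gain matrix and the transpose of its inverse, which takes only a few lines to compute. The function name and the example gain matrices below are illustrative assumptions.

```python
import numpy as np

def relative_gain_array(G_static):
    """Elementwise product of G and the transpose of its inverse (cf. equation 4.42)."""
    G = np.asarray(G_static, dtype=float)
    return G * np.linalg.inv(G).T

# A triangular gain matrix (one-way interaction) gives the identity matrix.
rga_triangular = relative_gain_array([[2.0, 0.0],
                                      [3.0, 4.0]])

# A fully coupled example; rows and columns of the array always sum to one.
rga_coupled = relative_gain_array([[1.0, 2.0],
                                   [3.0, 4.0]])
```

Pairings are usually chosen on elements close to one; with a triangular gain matrix the diagonal pairing follows directly.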
Example 4.4 The parameters and q
p
of the bioreactor can be estimated by two PI-controllers.
The model f (see gure 4.6) is formed by the mass balance for the biomass concentration (equa-
tion 4.1) and the product concentration (equation 4.3); both states are measured and the parameters
and q
p
directly inuence the behavior of these states.
Form the equations, it is obvious to control the biomass concentration X with and the product
concentration P with q
p
. This is also conrmed by the relative gain array . Dene the manipu-
lated variable vector as
_
q
p
_
(4.43)
and the controlled variable vector as
_
X
P
_
(4.44)
To determine the static open loop gain matrix G
static
, the nonlinear dynamic system can be lin-
earized using a rst order Taylor expansion and transformed to deviation variables, assuming sta-
tionary operation. Although the system is non-stationary, the only purpose of the relative gain array
and thus the open loop static gain matrix is to determine which input should control which output,
so this assumption can be made. The linearization resulted in the following static gain matrix:
G
static
=
_
X
0
0
F
0
/V
0
0
q
p0
F
0
/V
0
+K
X
0
0
F
0
/V
0
X
0
F
0
/V
0
+K
_
(4.45)
in which the subscript 0 denotes the working point where the linearization was made. Element ij of $G_{\text{static}}$ is the open loop static gain of output i with respect to input j. Using equation 4.42, the relative gain array becomes:
\[ \Lambda = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \tag{4.46} \]
The relative gain array is independent of the point of linearization and shows that X should be matched with $\mu$ and P should be matched with $q_p$.
Hybrid model design
Estimator type          Settings
PI-Estimator $\mu$      K = 0.06, $\tau_i$ = 2
PI-Estimator $q_p$      K = 0.15, $\tau_i$ = 10
Table 4.2: PI-Estimator settings for bioreactor problem
The innovation vector is the same as for the Kalman filter. The two PI-estimators were tuned manually. The model structure allows the estimators to be tuned separately. First, the controller for X is tuned. Since $q_p$ has no influence on X, the controller for P has no influence on X, which means that this can be done. If this controller is tuned well, the error in the estimates of P is only caused by the error in $q_p$. The controller for P can be tuned subsequently.
Estimator settings are given in table 4.2. These settings were used for experiments ID1 to ID16. The results of experiment ID1 are shown in figure 4.8.
[Figure 4.8: PI-estimates of $\mu$ (a, in h$^{-1}$), X (b, in g/l), $q_p$ (c, in h$^{-1}$) and P (d, in g/l), all plotted against time t (h). Dots: measurements; line: estimates.]
The estimates are good; the trajectories of X and P are followed closely. The Kalman filter adjusts the estimates of $\mu$ as well as the estimates of X to minimize the innovation i, whereas the PI-estimator only uses the estimates of $\mu$. As a result, the noise level in the estimates of $\mu$ is somewhat higher than in the estimates made by the Kalman filter.
The PI-estimator only estimates the biomass X well if $\mu$ is estimated well, making the estimates of $\mu$ plausible. A similar statement can only be made for the Kalman filter in example 4.3 if the tuning parameters (the elements of the process noise covariance matrix Q) corresponding to the measured states are set at low values, although in this case, some direct correction on the measured states remains. □
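The idea of estimating a parameter with a PI-controller can be illustrated with a small simulation. The sketch below is not the bioreactor model of this chapter: it assumes a hypothetical one-state biomass balance dX/dt = (mu - D) X with a constant true growth rate, noise-free measurements, explicit Euler integration and a PI law of the form mu_hat = K (e + (1/tau_i) * integral of e), where e is the innovation. Only the values K = 0.06 and tau_i = 2 are taken from table 4.2; all other names and numbers are assumptions.

```python
def pi_estimate_mu(mu_true=0.12, D=0.1, K=0.06, tau_i=2.0, dt=0.01, t_end=200.0):
    """Estimate a constant growth rate with a PI loop around a one-state model."""
    X = X_hat = 1.0      # true and model biomass (hypothetical initial value)
    integral = 0.0       # running integral of the innovation
    mu_hat = 0.0
    for _ in range(int(t_end / dt)):
        e = X - X_hat                          # innovation: measurement minus model
        integral += e * dt
        mu_hat = K * (e + integral / tau_i)    # PI law acting on the innovation
        X += dt * (mu_true - D) * X            # "process" (Euler step)
        X_hat += dt * (mu_hat - D) * X_hat     # model driven by the estimate
    return mu_hat

print(pi_estimate_mu())
```

Because the model state only follows the measurement when the estimated parameter is right, the integral action keeps adjusting the estimate until the innovation vanishes.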
4.3.3 Remarks
With regard to the computational structure, the PI-estimator shows many resemblances with Kalman filtering. Therefore, it can be a good alternative to Kalman filtering. The results presented in the example were comparable with Kalman filter estimates. Although the Kalman filter has versatile application possibilities in combined state and parameter estimation, the PI-estimator has the advantage that it is simple, easy to use and easy to tune for simple single or multiple parameter estimation problems and therefore may be preferred over Kalman filtering.
4.4 Submodel identification
Submodel identification involves the design of the equations of the hybrid model. These include the accumulation balances, algebraic equations and the fuzzy equations. The identification of the balances and the algebraic equations is done in the same fashion as with first principles modeling. This section will focus on the identification of the fuzzy equations.
In hybrid modeling, TSK-type fuzzy models are used. The design of these models involves the determination of the number of local linear models (the number of rules), the parameters of these models or hyperplanes and the operating range for which they are valid (the premise part). The parameters of the fuzzy equations are derived from the input-output data.
Research efforts in the field of identification of TSK fuzzy models have been enormous, as is the number of algorithms. They vary from manual design and tree search methods (Nelles, 1997) to an abundance of combinations of soft computing algorithms. It is infeasible to present a thorough evaluation of the different techniques. However, three different approaches representing three different classes of identification algorithms are presented in order to be able to give some guidance with regard to building hybrid models. They are fuzzy clustering, genetic algorithms and neuro-fuzzy methods.
The calculation of the consequent part parameters using the three approaches is similar. The approaches differ in the way the rule structure and antecedent membership functions are determined. Fuzzy clustering is the most flexible. The presented approach determines the number of rules and the premise part parameters with minimal a priori knowledge. Genetic algorithms require more initial information. This approach searches for premise part parameters in a predefined search space. In addition, the number of rules needs to be specified. Finally, the neuro-fuzzy approach which is presented requires a complete initial fuzzy model, which is optimized to provide a desired input-output mapping.
For basic fuzzy set theory concepts, the reader is referred to appendix A.
4.4.1 Fuzzy clustering
The basic idea behind clustering is to divide a set of objects into self-similar groups (or clusters). This similarity is often defined as a distance norm. Clustering methods are usually based on assumptions about the geometry of the clusters, which include spheres, lines, hyperplanes, ellipsoids, etc. Various clustering methods can be used to develop fuzzy models. The goal of this section, however, is to present a useful approach for use with hybrid modeling. More detailed information about fuzzy clustering can be found in Jain and Dubes (1988), Pal et al. (1997) or Babuška (1996).
Since TSK models are used, the clustering algorithm must search for linear subspaces. Each cluster can then be represented by a rule of the fuzzy model. Cluster prototypes can be defined as linear subspaces (lines, planes, hyperplanes), while the similarity measure can be defined as the distance to such a prototype. Algorithms that use this approach include fuzzy c-varieties (Bezdek, 1981), fuzzy c-elliptotypes (Bezdek et al., 1981) and fuzzy regression models (Hathaway and Bezdek, 1993).
Gustafson-Kessel clustering
Gustafson-Kessel clustering (GK-clustering) has been found to be an effective tool for building TSK models (Babuška and Verbruggen, 1994; De Bruin and Roffel, 1996; Zhao et al., 1994). GK-clustering uses the covariance matrix of a cluster during the calculation of the distance measure. The advantage of GK-clustering is that the clusters can have different shapes (while they all have the same volume). This gives the GK-algorithm more flexibility in describing complex systems. In addition, it is insensitive to scaling of the data or initialization (Babuška, 1996).
The clustering algorithm can be formulated as follows. Let $X = \{x_j \mid j = 1, \ldots, N\}$ be a set of feature vectors in $\mathbb{R}^n$, where N is the number of measurements and n is the dimension of the input-output space. Let $P = (P_1, P_2, \ldots, P_K)$ be a K-tuple of cluster prototypes, each characterized by a cluster center $v_i$ and a covariance matrix $F_i$. A partition of X into K fuzzy clusters will be performed by minimizing the objective function
\[ J(P, U; X) = \sum_{i=1}^{K} \sum_{j=1}^{N} (\mu_{ij})^m \, d^2(x_j, P_i) \tag{4.47} \]
where $U = [\mu_{ij}]_{K \times N}$, $\mu_{ij} \in [0, 1]$, is the fuzzy partition matrix, which denotes to what extent a feature vector belongs to each cluster. This matrix satisfies the following conditions:
\[ 0 < \sum_{j=1}^{N} \mu_{ij} < N, \quad \forall i \tag{4.48} \]
\[ \sum_{i=1}^{K} \mu_{ij} = 1, \quad \forall j \tag{4.49} \]
The minimization of the objective function is performed iteratively. Given the initial fuzzy partition matrix $U^{(0)}$ (which is initialized randomly), the number of clusters K and the termination tolerance $\epsilon$, repeat for l = 1, 2, . . . :
Compute the cluster centers:
\[ v_i^{(l)} = \frac{\sum_{j=1}^{N} \mu_{ij}^m \, x_j}{\sum_{j=1}^{N} \mu_{ij}^m}, \quad 1 \le i \le K \tag{4.50} \]
Compute the new covariance matrices:
\[ F_i^{(l)} = \frac{\sum_{j=1}^{N} \mu_{ij}^m \, (x_j - v_i)(x_j - v_i)^T}{\sum_{j=1}^{N} \mu_{ij}^m}, \quad 1 \le i \le K \tag{4.51} \]
Compute the distances of the features to the cluster centers $d^2(x_j, P_i)$:
\[ d^2(x_j, P_i) = (x_j - v_i)^T \, |F_i|^{1/n} F_i^{-1} \, (x_j - v_i), \quad 1 \le i \le K, \ 1 \le j \le N \tag{4.52} \]
Compute the new membership grades:
\[ \mu_{ij}^{(l)} = \frac{d^2(x_j, P_i)^{-1/(m-1)}}{\sum_{k=1}^{K} d^2(x_j, P_k)^{-1/(m-1)}}, \quad 1 \le i \le K, \ 1 \le j \le N \tag{4.53} \]
in which m is the fuzzy exponent, which influences the resulting partition. As m approaches 1 from above, the partition becomes hard. As m approaches $\infty$, the partition becomes maximally fuzzy. For GK-clustering, m is usually set to 2.
If $d^2(x_j, P_i) = 0$ for some i = k, then set $\mu_{kj} = 1$ and set $\mu_{ij} = 0$, $\forall i \ne k$.
until $\| U^{(l)} - U^{(l-1)} \| < \epsilon$.
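The iteration above can be sketched in a few lines of NumPy. This is an illustrative implementation under stated assumptions, not the code used in this work: the function name `gk_cluster`, the maximum-norm stopping test, the small regularization added to each covariance matrix and the lower bound on the distances (which replaces the explicit zero-distance rule) are all choices made for the sketch.

```python
import numpy as np

def gk_cluster(X, K, m=2.0, tol=1e-6, max_iter=200, seed=0):
    """Gustafson-Kessel clustering sketch. X has shape (N, n).

    Returns the partition matrix U (K, N), the centers V (K, n) and the
    covariance matrices F (K, n, n) from the last completed iteration.
    """
    rng = np.random.default_rng(seed)
    N, n = X.shape
    U = rng.random((K, N))
    U /= U.sum(axis=0, keepdims=True)           # memberships per feature sum to one
    for _ in range(max_iter):
        Um = U ** m
        w = Um.sum(axis=1)                       # normalization per cluster
        V = (Um @ X) / w[:, None]                # cluster centers
        F = np.empty((K, n, n))
        D2 = np.empty((K, N))
        for i in range(K):
            diff = X - V[i]
            F[i] = (Um[i, :, None] * diff).T @ diff / w[i]   # fuzzy covariance
            F[i] += 1e-10 * np.eye(n)            # regularization (assumption)
            A = np.linalg.det(F[i]) ** (1.0 / n) * np.linalg.inv(F[i])
            D2[i] = np.einsum("jk,kl,jl->j", diff, A, diff)  # squared GK distance
        D2 = np.fmax(D2, 1e-12)                  # guard against zero distances
        inv = D2 ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=0, keepdims=True)         # new memberships
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return U, V, F
```

Like any alternating optimization of this kind, the result depends on the random initial partition, so in practice several restarts may be compared on the value of the objective function.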
Structure optimization
The number of clusters K needs to be determined a priori. However, it is usually not possible to determine the optimal number of clusters beforehand. The number of clusters can be limited by merging compatible clusters that show a certain degree of conformity. A suitable technique that can accomplish this is the modified compatible cluster merging (MCCM) algorithm (Kaymak and Babuška, 1995). This approach merges clusters based on cluster distances and covariance matrix eigenvectors.
Let the centers of two clusters be $v_i$ and $v_j$. Let the eigenvectors of the two corresponding covariance matrices be $\{\phi_{i,1}, \ldots, \phi_{i,n}\}$ and $\{\phi_{j,1}, \ldots, \phi_{j,n}\}$, arranged in descending order of the corresponding eigenvalues. The two criteria used for cluster merging are defined as:
\[ c^1_{ij} = | \phi_{i,n} \cdot \phi_{j,n} | \ge k_1, \quad k_1 \text{ close to } 1 \tag{4.54} \]
\[ c^2_{ij} = \| v_i - v_j \| \le k_2, \quad k_2 \text{ close to } 0 \tag{4.55} \]
The first criterion states that if the clusters, viewed as hyperplanes, are almost parallel, then they should be merged. The second criterion states that if the two clusters are sufficiently close, then they should be merged. Evaluating these criteria for all clusters yields two matrices, $C^1 = [c^1_{ij}]$ and $C^2 = [c^2_{ij}]$, whose elements indicate the degree of similarity between clusters i and j.
Based on these matrices, a decision must be made whether to merge two clusters. To make this decision, $C^1$ and $C^2$ are transformed to $\tilde{C}^1$ and $\tilde{C}^2$ using two exponential membership functions. These membership functions indicate the degree of compatibility between two clusters and are defined as follows:
\[ \tilde{c}^1_{ij} = \exp \left( - \frac{7 (c^1_{ij} - 1)^2}{(1 - a)^2} \right) \tag{4.56} \]
\[ \tilde{c}^2_{ij} = \exp \left( - \frac{7 (c^2_{ij} - 1)^2}{(1 - b)^2} \right) \tag{4.57} \]
in which
\[ a = \frac{1}{n(n-1)} \sum_{i=1}^{n} \sum_{j=1, j \ne i}^{n} c^1_{ij} \tag{4.58} \]
\[ b = \frac{1}{n(n-1)} \sum_{i=1}^{n} \sum_{j=1, j \ne i}^{n} c^2_{ij} \tag{4.59} \]
This makes the decision algorithm problem dependent. The criteria for closeness and parallelism may partially compensate for each other. Two clusters that are very close but slightly non-parallel may need to be merged. This is also true for parallel clusters that lie somewhat apart. To account for this, $\tilde{C}^1$ and $\tilde{C}^2$ are combined into a matrix $C^0$ in the following way:
\[ c^0_{ij} = \sqrt{ \tilde{c}^1_{ij} \, \tilde{c}^2_{ij} } \tag{4.60} \]
To decide which clusters are to be merged, the matrix $C^0$ is thresholded. One or several groups of clusters can be merged. The merging is done transitively, which means that if the values of $C^0$ indicate that clusters i and j should be merged and that clusters j and k should be merged, all three clusters i, j and k are merged together. This is accomplished by relational clustering (Dunn, 1974; Yang, 1993):
$C_0 := C^0$, i := 0
repeat
    i := i + 1
    $C_i := C_0 \circ C_{i-1}$
until $C_i = C_{i-1}$
$C := C_i$ (4.61)
in which $C_0 \circ C_{i-1} = [c_{ij}]$ with $c_{ij} = \bigvee_{k=1}^{K} (c_{ik} \wedge c_{kj})$, that is, the max-min composition. In addition, i = 1, . . . , K and j = 1, . . . , K.
The matrix C is now thresholded with a value $\gamma$. If $c_{ij} > \gamma$ then clusters i and j should be merged. Since $c_{ij}$ always lies between 0 and 1, a threshold of 1 means that clusters are never merged, while a threshold of 0 means that clusters are always merged. A value of $\gamma$ between 0.55 and 0.75 is recommended (Kaymak and Babuška, 1995).
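The transitive merging step can be sketched as follows. The sketch assumes a symmetric compatibility matrix with ones on its diagonal (every cluster is fully compatible with itself); the function names and the example matrix are hypothetical. It shows the geometric-mean combination, the repeated max-min composition and the grouping that results from thresholding.

```python
import numpy as np

def combine(C1_tilde, C2_tilde):
    """Geometric mean of the two compatibility matrices."""
    return np.sqrt(C1_tilde * C2_tilde)

def maxmin_compose(A, B):
    """Max-min composition: c_ij = max over k of min(a_ik, b_kj)."""
    return np.max(np.minimum(A[:, :, None], B[None, :, :]), axis=1)

def transitive_closure(C0):
    """Repeat C_i = C_0 o C_(i-1) until the matrix no longer changes."""
    C = C0.copy()
    while True:
        C_next = maxmin_compose(C0, C)
        if np.allclose(C_next, C):
            return C_next
        C = C_next

def merge_groups(C0, gamma):
    """Groups of cluster indices whose closure value exceeds gamma."""
    C = transitive_closure(C0)
    K = C.shape[0]
    groups, seen = [], set()
    for i in range(K):
        if i in seen:
            continue
        members = [j for j in range(K) if j == i or C[i, j] > gamma]
        seen.update(members)
        groups.append(members)
    return groups

# Hypothetical compatibility matrix for three clusters.
C0 = np.array([[1.0, 0.8, 0.1],
               [0.8, 1.0, 0.7],
               [0.1, 0.7, 1.0]])
print(merge_groups(C0, 0.75))
print(merge_groups(C0, 0.65))
```

In this example clusters 0 and 2 are only weakly compatible directly, yet with the lower threshold they end up in one group because both are compatible with cluster 1.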
Undesired results may be obtained by cluster merging if incompatible clusters are located between clusters that are compatible for merging. A heuristic step can solve this problem (Babuška, 1996). This step prevents merging of clusters if there is an incompatible cluster close to the mutual centers of the compatible clusters in the set M:
\[ \min_{v_i \in M, \ v_k \notin M} d_{ik} > \max_{v_i, v_j \in M} d_{ij} \tag{4.62} \]
in which $d_{ij}$ is the distance between cluster centers $v_i$ and $v_j$ projected on the input space:
\[ d_{ij} = \| \mathrm{proj}(v_i) - \mathrm{proj}(v_j) \| \tag{4.63} \]
If this criterion is not met, the set of clusters M is not merged.
A fuzzy partition matrix U
.
Model derivation
The result of the clustering algorithm is a description of the fuzzy system. The number of rules is given by the number of clusters. The premise part is given in terms of the fuzzy partition matrix U, which represents the clusters by multi-dimensional, point-wise defined membership functions in the input space. In essence, these membership functions represent the Degree Of Fire (DOF) of the rules. The cluster centers $v_i$ and their covariance matrices $F_i$ represent the consequent part. A model description that is independent of the identification data must be derived in order to be able to use the model.
Premise part parameters
The antecedent membership functions can be derived in several ways. The degree of membership of a feature for a cluster can be directly calculated in the product space of the input variables by using the distance measure from the clustering algorithm. An alternative is using a multi-input version of a parametric membership function that is fitted to the point-wise definition, given by the fuzzy partition matrix. The shape of the membership function given by the fuzzy partition matrix is similar to sigmoidal membership functions, while the shape of the clusters is ellipsoidal. The multi-input membership function can then be defined as a sigmoidal membership function with an ellipsoidal basis. For a system with two inputs $x_1$ and $x_2$, the multi-input membership function can be given by:
\[ \mu = \frac{1}{1 + \exp(a(r - b))} \quad \text{with} \quad r = \sqrt{ \frac{(x_1 - \bar{x}_1)^2}{c^2} + \frac{(x_2 - \bar{x}_2)^2}{d^2} } \tag{4.64} \]
with a, b, c, d function parameters and $\bar{x}_1$, $\bar{x}_2$ the cluster center.
[Figure 4.9: Projection of fuzzy partition matrix and fit of parametric membership function (membership, 0 to 1, against the input, 0 to 1)]
A disadvantage of multi-dimensional membership functions is that their description is less transparent than the descriptions by single-input membership functions. A more straightforward way is to project the multi-dimensional point-wise membership functions onto the input variables. A parametric single-input membership function can then be fitted through this projection. This is illustrated in figure 4.9.
The projection can be done directly onto the input variables, which results in an orthogonal projection. Projection is carried out for all of the input variables. After fitting, the clusters can be reconstructed by applying the AND intersection operator in the product space of the input variables, so that the Degree Of Fire is calculated.
The reconstruction is not exact. If clusters are located in a non-orthogonal way with respect to the input variables, the reconstruction may result in decomposition errors, illustrated in figure 4.10. The error can be reduced by rotating the input space to obtain an orthogonal orientation, which involves transformation of the input variables. This can be done by using eigenvector information from the cluster covariance matrices (Babuška, 1996) or by using cluster boundary information (De Bruin and Roffel, 1996). However, this error can also be reduced by least-squares calculation of the consequent parameters, which will be explained later on. Some information may be lost by projecting non-orthogonally oriented clusters, but it provides a better possibility to interpret the model because the input variables are not transformed.
Consequent part parameters
The calculation of the consequent parameters involves determining the hyperplanes that describe the identification data for each cluster. The calculation is divided into two categories. In the first category, the parameters for each hyperplane are calculated directly from the cluster center and eigenvector information from the covariance matrix. The normal of the desired hyperplane is the eigenvector that corresponds to the smallest eigenvalue of the covariance matrix. The parameters can be calculated from this eigenvector (Babuška, 1996) or from the remaining eigenvectors, which span the hyperplane (De Bruin and Roffel, 1996).
[Figure 4.10: Decomposition error caused by orthogonal projection (a cluster in the X-Y plane and the resulting decomposition error)]
A disadvantage of direct calculation of the consequent parameters is that model errors can occur due to the decomposition error. This model error can be partially compensated for by estimating the consequent parameters using a least squares approach. The estimation problem can be approached as several independent weighted least squares problems (one for each cluster) or as a global least squares problem following from the weighted mean defuzzification equation (Babuška, 1996). The global least squares approach provides an optimal fuzzy model in terms of a minimal model error. However, due to the global approach, the estimates of the local models are biased. In this work, the weighted least squares approach is preferred because it provides local interpretation and analysis of the fuzzy model.
The weighted least squares approach assumes that each cluster represents a local linear model of the system. The membership values $\mu_{ik}$ of the fuzzy partition serve directly as the weights expressing the relevance of the data feature to the local model i. Let X denote a matrix containing the N input features as its rows, let y denote a vector containing the corresponding output values and let $W_i$ denote a diagonal matrix having the membership values $\mu_{ik}$ as its kth diagonal element:
\[ X = \begin{bmatrix} x_1^T \\ x_2^T \\ \vdots \\ x_N^T \end{bmatrix}, \quad y = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_N \end{bmatrix}, \quad W_i = \begin{bmatrix} \mu_{i1} & 0 & \cdots & 0 \\ 0 & \mu_{i2} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \mu_{iN} \end{bmatrix} \tag{4.65} \]
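The consequent parameters of rule i, $a_i$ and $b_i$, are concatenated into a single parameter vector $\theta_i$:
\[ \theta_i = \left[ a_i^T ; \ b_i \right]^T \tag{4.66} \]
Appending a unitary column to X gives the extended input variable matrix $X_e$:
\[ X_e = [X; \ \mathbf{1}] \tag{4.67} \]
If the columns of $X_e$ are linearly independent and $\mu_{ik} > 0$ for $1 \le k \le N$, then
\[ \theta_i = \left[ X_e^T W_i X_e \right]^{-1} X_e^T W_i \, y \tag{4.68} \]
is the least squares solution of $y = X_e \theta + \epsilon$ where the kth data feature is weighted by $\mu_{ik}$. A more efficient computer implementation is obtained if each row of $X_e$ and y is first multiplied with $\sqrt{\mu_{ik}}$:
\[ \tilde{X}_i = \begin{bmatrix} \sqrt{\mu_{i1}} \, x_1^T \\ \sqrt{\mu_{i2}} \, x_2^T \\ \vdots \\ \sqrt{\mu_{iN}} \, x_N^T \end{bmatrix}, \quad \tilde{y}_i = \begin{bmatrix} \sqrt{\mu_{i1}} \, y_1 \\ \sqrt{\mu_{i2}} \, y_2 \\ \vdots \\ \sqrt{\mu_{iN}} \, y_N \end{bmatrix} \tag{4.69} \]
The parameters are then given by:
\[ \theta_i = \left[ \tilde{X}_i^T \tilde{X}_i \right]^{-1} \tilde{X}_i^T \tilde{y}_i \tag{4.70} \]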
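A minimal sketch of the weighted least squares estimate for one rule is given below. It scales the rows of the extended input matrix and of the output vector by the square root of the membership values and then solves an ordinary least squares problem, which avoids forming the diagonal weight matrix explicitly. The function name and the use of `numpy.linalg.lstsq` are choices made for this sketch.

```python
import numpy as np

def wls_consequent(X, y, mu_i):
    """Consequent parameters [a_i, b_i] of one rule by weighted least squares.

    X: (N, p) inputs, y: (N,) outputs, mu_i: (N,) membership values of rule i.
    """
    Xe = np.hstack([X, np.ones((X.shape[0], 1))])   # append a column of ones
    s = np.sqrt(mu_i)                               # row scaling factors
    theta, *_ = np.linalg.lstsq(Xe * s[:, None], y * s, rcond=None)
    return theta
```

Solving the scaled problem with a least squares routine is numerically preferable to inverting the normal equations, although both give the same parameters when the columns of the extended matrix are linearly independent.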
An $\alpha$-cut can be incorporated in the calculation in order to prevent biasing by data features with low membership values for a certain cluster. The low membership of these features is already taken into account by the weighting. However, if a cluster is small, then the number of features with a low membership value may be high, which can bias the calculation. The $\alpha$-cut $\alpha_c$ can be performed as follows:
\[ \tilde{X}_i^{\alpha} = \{ x_{ik} \mid x_{ik} \in \tilde{X}_i, \ \mu_{ik} > \alpha_c, \ \alpha_c \in [0, 1] \} \tag{4.71} \]
\[ \tilde{y}_i^{\alpha} = \{ y_{ik} \mid y_{ik} \in \tilde{y}_i, \ \mu_{ik} > \alpha_c, \ \alpha_c \in [0, 1] \} \tag{4.72} \]
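In code, the cut amounts to discarding all data features whose membership in the cluster does not exceed the cut level before the local model is estimated. The helper below is a hypothetical illustration of that selection step.

```python
import numpy as np

def alpha_cut(X, y, mu_i, alpha_c=0.5):
    """Keep only the features whose membership in cluster i exceeds alpha_c."""
    keep = mu_i > alpha_c
    return X[keep], y[keep], mu_i[keep]

# Hypothetical data: four features with their memberships in one cluster.
X = np.arange(8.0).reshape(4, 2)
y = np.arange(4.0)
mu_i = np.array([0.9, 0.5, 0.2, 0.7])
X_cut, y_cut, mu_cut = alpha_cut(X, y, mu_i, alpha_c=0.5)
print(y_cut)
```

The retained memberships can still be used as weights, so the cut and the weighting complement each other.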
Example 4.5 As an example, fuzzy clustering will be used to build a model of the net growth rate $\mu$ as a function of the substrate S and biomass X for the hybrid model of example 4.1. Assume that an input-output data set has been created using experiments ID1 to ID16 (example 4.2) and that the PI-estimator designed in example 4.4 is available. This means that input-output data is available in the form of measurements of S and X and estimates of $\mu$.
The data set can be presented directly to the clustering algorithm. However, due to the nature of the batch runs, the data is not evenly distributed in the input space. During the latter hours of a batch, a sort of pseudo steady state is obtained in which X and S do not vary as much as during the first hours. This causes data features to lie closer together in the area where X is high and S is low than data features in the area where X is low and S is high. Since the clustering approach uses least squares calculation, results may be biased because of this. This problem is solved by reduction of the data set: all features for which the distance in the input space to another feature is smaller than a pre-determined threshold are removed. This threshold is set manually by finding a balance between the number of data features in the reduced data set and the average distance between these features. The threshold was set to 2 and the distribution of the resulting data set is shown in figure 4.11.
[Figure 4.11: Input space distribution before (a) and after (b) reduction; both panels show X (g/l) against S (g/l)]
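One possible way to implement such a reduction is a greedy pass over the data that keeps a feature only if it lies at least the threshold distance away from every feature kept so far. This is an assumption about the procedure: the text does not state in which order features are removed, and the function name is hypothetical.

```python
import numpy as np

def reduce_dataset(Z, threshold):
    """Return indices of features that are at least `threshold` apart.

    Z: (N, p) array of input-space coordinates. Greedy, order dependent.
    """
    kept = []
    for idx, z in enumerate(Z):
        if all(np.linalg.norm(z - Z[k]) >= threshold for k in kept):
            kept.append(idx)
    return np.array(kept, dtype=int)

# Hypothetical input-space points; two of them crowd an already kept point.
Z = np.array([[0.0, 0.0], [0.5, 0.0], [3.0, 0.0], [3.2, 0.1], [10.0, 10.0]])
print(reduce_dataset(Z, threshold=2.0))
```

The same index array can be used to select the corresponding output values, so that inputs and outputs stay aligned.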
The reduced data set was clustered and the merging threshold was set to 0.4. The initial number of clusters was 10. This number was reduced to three by the merging algorithm. The next step is to project the fuzzy partition matrix onto the S and X axes, so that parametric membership functions can be determined. This is shown in figure 4.12.
The fuzzy model can now be completed by calculating the consequent part parameters using the least squares approach. For the $\alpha$-cut, $\alpha_c$ was set to 0.5. This yields the following local linear models:
\[ y_1 = 0.0114 S + 0.0048 X - 0.0859 \tag{4.73} \]
\[ y_2 = 0.0061 S - 0.0001 X - 0.0054 \tag{4.74} \]
\[ y_3 = 0.0092 S - 0.0028 X - 0.0327 \tag{4.75} \]
[Figure 4.12: Fuzzy partition matrix projections and fitted parametric membership functions. Panels (a) to (c): mf1, mf2 and mf3 against S (g/l); panels (d) to (f): mf1, mf2 and mf3 against X (g/l).]
[Figure 4.13: Fuzzy relation for $\mu$ (a, $\mu$ in h$^{-1}$ against S and X in g/l) and rule locations (b, rules 1 to 3 in the S-X plane). Dots indicate identification data.]
The fuzzy model can now be simply formulated as:
IF S = $mf_1$ AND X = $mf_1$ THEN $\mu = y_1$
IF S = $mf_2$ AND X = $mf_2$ THEN $\mu = y_2$
IF S = $mf_3$ AND X = $mf_3$ THEN $\mu = y_3$ (4.76)
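A rule base of this form can be evaluated as sketched below. The sketch assumes that the AND operator is implemented as a product and that the rule outputs are combined by the weighted mean of the local models; the function name, the rule representation and the constant memberships in the usage example are hypothetical.

```python
import numpy as np

def tsk_output(x, rules):
    """Weighted-mean output of a TSK model.

    x: input vector. rules: list of (membership_functions, theta), where
    membership_functions holds one callable per input and theta = [a..., b].
    """
    x = np.asarray(x, dtype=float)
    # Degree of fire of each rule: product of the single-input memberships.
    dof = np.array([np.prod([mf(xi) for mf, xi in zip(mfs, x)])
                    for mfs, _ in rules])
    # Output of each local linear model.
    local = np.array([np.dot(theta[:-1], x) + theta[-1] for _, theta in rules])
    return float(np.sum(dof * local) / np.sum(dof))

# Hypothetical two-rule model with constant memberships, for illustration.
rules = [((lambda s: 0.5, lambda x: 0.5), [0.0, 0.0, 1.0]),
         ((lambda s: 1.0, lambda x: 0.75), [0.0, 0.0, 3.0])]
print(tsk_output([10.0, 20.0], rules))
```

In a real model the callables would be the fitted parametric membership functions and theta would hold the estimated consequent parameters of each rule.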
Figure 4.13 shows the function hyperplane and the locations of the rules. The Root Mean Squared Error (RMSE) with respect to the identification data is 0.0248. This error is mainly the result of a bad fit in the region where X is low; the RMSE with respect to data features in the region where X is high is 0.0091. A more complex fuzzy relation can provide a better description. However, the observed behavior of the net growth rate is not so complex that a more complex fuzzy relation is justified from a transparency point of view. In addition, the PI-estimator lags behind the actual value of the growth rate due to its feedback nature. Estimates of $\mu$ are thus lower than the actual value. Therefore, the final evaluation of the fuzzy model should be done after integration in the hybrid model structure.
Although fuzzy logic is a black box technique, a posteriori analysis of the model shows that the rules represent three phases during a batch: an initial phase, in which the substrate concentration is still high and the biomass culture is growing rapidly (rule 1), a final phase which occurs at the end of a batch, in which not much substrate is present and the net growth rate is not very high (rule 2) and an intermediary phase (rule 3). □
4.4.2 Genetic algorithms
Genetic Algorithms (GAs) are well known for their optimization capabilities. Following basic Darwinian propagation, the method is based on a survival of the fittest principle, in which only the solution candidates with the most desirable properties (e.g. smallest error) from a population will survive. The candidates that will survive are selected by evaluating their fitness value through the fitness function (similar to the objective function in more traditional optimization algorithms).
This section describes the basic concept of genetic algorithms and the techniques used in this work. For a more detailed discussion, see Goldberg (1989).
When genetic algorithms are used to identify a fuzzy system, the identification is viewed as an optimization problem. Many applications of developing fuzzy systems with GAs have been reported (Back and Kursawe, 1995). Using GAs to set up a fuzzy system involves coding the problem into chromosomes or strings and setting up a fitness function (goal function).
Since TSK models are used, the consequent part of the fuzzy model can be calculated using a least squares approach, if the premise part is available (see section 4.4.1). Therefore a hybrid identification approach is used: only the premise part of the fuzzy model is coded into chromosomes and optimized by the GA. In each iteration, the consequent parts of all candidates are calculated using the least squares approach, after which the fitness function is calculated. Since the local models in the consequent part of each rule are least squares optimal, no rule structure optimization is necessary. Optimization of the number of rules involves a more elaborate approach and significantly increases the search space.
In order to be able to code the problem, an initial model structure needs to be formulated. Based on the number of rules and membership functions in this structure, a binary string can be designed.
Premise part coding
Usually, the problem is coded into a binary string or chromosome which is presented to the GA for optimization. Real-valued coding is less common, but can also be used. The coding of the premise part of the fuzzy system involves representing the parameters of the membership functions in a binary format. The representation chosen here is straightforward and taken from KrishnaKumar et al. (1995).
The first step is to map a single parameter to a binary string. To accomplish this, allowed maximum and minimum values of the parameter are chosen. In addition, the string length should be chosen. The parameter is discretized by mapping to the string from the minimum value to the maximum value with a linear mapping in between. For example, for a string length of 5, 00000 corresponds to the minimum value and 11111 corresponds to the maximum value. The string length is determined by selecting the resolution r, the number of discrete values in the mapping.
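The linear mapping between a parameter value and its binary string can be sketched as follows. The function names are hypothetical; a string of a given length offers two to the power of that length discrete values, so a resolution of 64 corresponds to 6 bits per parameter.

```python
def decode(bits, lo, hi):
    """Map a binary string linearly onto the interval [lo, hi]."""
    level = int(bits, 2)
    return lo + (hi - lo) * level / (2 ** len(bits) - 1)

def encode(value, lo, hi, length):
    """Map a value in [lo, hi] to the nearest binary string of given length."""
    levels = 2 ** length - 1
    level = round((value - lo) / (hi - lo) * levels)
    level = min(max(level, 0), levels)       # clip values outside the range
    return format(level, f"0{length}b")

print(decode("00000", 0.0, 10.0), decode("11111", 0.0, 10.0))
print(encode(3.0, 0.0, 10.0, 6))
```

A complete premise part is then obtained by concatenating the strings of all membership function parameters.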
The discrete mappings of the different parameters of a membership function are concatenated to form a string that represents the membership function. Double sigmoid membership functions, which contain 4 parameters, will be used. Different membership function strings are then concatenated to form the string with length $\ell$ that represents the premise part, as shown in figure 4.14.
[Figure 4.14: GA coding of the premise part. The string is the concatenation, per input, of the binary-coded parameters of each membership function, for example Input 1: membership function 1 (01101 11010 ...), membership function 2 (11110 00101 ...), ...; Input 2: membership function 1 (10101 00011 ...), membership function 2 (00110 00111 ...), ... Each 5-bit group codes one membership function parameter.]
Optimization procedure
The identification involves three main steps, which are performed iteratively:
- Optimization of the consequent part using weighted least squares
- Evaluation of the fitness of the population of possible solution candidates for the premise part
- Evolution of the population, based on the population fitness
The procedure is initialized by creating a population of possible solution candidates, in
this case a set of different premise parts for the system, coded as binary strings. This set is
initialized randomly in the search space, spanned by the selected maximum and minimum
values of the coded parameters. The size M of the population depends on the complexity of
the problem and the number of parameters that need to be optimized. One way to assure that
a suitable solution can be found is to create a relatively large population, but this can result
in a large computational effort. Some researchers have found that a good rule of thumb is to
select a population size of 25% of the string length or larger.
The first step in the procedure is to optimize the consequent part for each member in the population. The weighted least squares calculation is explained in section 4.4.1. The next step is to calculate the fitness of each member in the population. Since a set of data needs to be described by the model, the goal is to minimize the model error with respect to the data. The fitness is expressed in terms of this model error: the smaller the error, the fitter the member of the population. A squared error, common in optimization procedures, is used:
\[ J = \frac{1}{2} \| y - \hat{y} \|_2^2 = \frac{1}{2} \sum_{i=1}^{N} (y_i - \hat{y}_i)^2 \tag{4.77} \]
in which $\hat{y}$ denotes the model output vector and y denotes the desired output vector.
Evolution of the population takes place by reproduction. Based on their fitness value, members of the population are selected for reproduction. Members that are selected to reproduce are copied and replace members with a low fitness value. The number of members M of a population remains unchanged. This means that good strings get more copies in the next generation, which emphasizes the basic survival of the fittest concept.
Different strategies exist for the selection of population members. The best known procedure is the roulette wheel. Each member of the population has a roulette wheel slot sized in proportion to its fitness. The roulette wheel is spun and the string at whose slot the roulette wheel stops is selected for reproduction. This means that population members with high fitness are more likely to be selected than strings with low fitness. Spinning is repeated until the next generation is full.
However, performance difficulties using the roulette wheel have been reported and improvements of this basic selection procedure have been presented (Goldberg, 1989). Tournament selection is an example of a procedure that does not have these restrictions. With tournament selection, a pair of individuals is selected at random from the population. The fittest of the two becomes a member of the next generation. This process is repeated until the next generation is full.
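Both selection schemes can be sketched with NumPy as follows. The sketch assumes a non-negative fitness vector in which larger values are better (for an error criterion J one could, for instance, use 1/(1 + J), which is an assumption and not the choice documented here). The function names are hypothetical; each function returns the indices of the members that enter the next generation.

```python
import numpy as np

def roulette_select(fitness, rng):
    """Draw M members with probability proportional to their fitness."""
    p = fitness / fitness.sum()
    return rng.choice(len(fitness), size=len(fitness), p=p)

def tournament_select(fitness, rng):
    """Draw M random pairs and keep the fitter member of each pair."""
    M = len(fitness)
    a = rng.integers(0, M, size=M)
    b = rng.integers(0, M, size=M)
    return np.where(fitness[a] >= fitness[b], a, b)

rng = np.random.default_rng(0)
fitness = np.array([1.0, 2.0, 3.0, 10.0])
print(roulette_select(fitness, rng))
print(tournament_select(fitness, rng))
```

Tournament selection only compares fitness values, so it is unaffected by the scaling of the fitness function, whereas the roulette wheel is sensitive to it.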
Two processes, crossover and mutation, take place during reproduction. The result of these processes is that population members with new properties can be created. A simple crossover process is performed in three steps. First, the newly selected strings are paired together at random. Second, an integer position n along every pair of strings is selected uniformly at random. Finally, based on a probability of crossover $p_c$, the pairs of strings undergo crossover at the integer position n along the string. This results in new pairs of strings that are created by swapping all the characters between and including characters 1 and n. Crossover can be at a single point, two points or at k points. The probability of crossover $p_c$ is typically around 0.75.
Mutation is simply an occasional random alteration of a string character, which in this case involves changing a 1 to a 0 and vice versa. Mutation is based on the probability of mutation $p_m$, which typically is around 0.001. The mutation operator helps in avoiding the possibility of mistaking a local minimum for a global minimum because it introduces a random search element in the vicinity of the population.
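Single-point crossover and bit-flip mutation on binary strings can be sketched as follows. The function names are hypothetical, and the decision whether a pair undergoes crossover at all (with the crossover probability) is left to the caller.

```python
import random

def crossover(s1, s2, n):
    """Swap characters 1 to n (1-based, inclusive) between two strings."""
    return s2[:n] + s1[n:], s1[:n] + s2[n:]

def mutate(s, p_m, rng):
    """Flip each bit independently with probability p_m."""
    return "".join(
        ("1" if ch == "0" else "0") if rng.random() < p_m else ch
        for ch in s
    )

rng = random.Random(0)
print(crossover("00000", "11111", 2))
print(mutate("0101010101", 0.1, rng))
```

With a mutation probability of about 0.001 most strings pass through unchanged, so mutation acts as a rare perturbation on top of the recombination performed by crossover.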
The three iteration steps (consequent part optimization, fitness evaluation and evolution) are repeated until a pre-determined number of generations G has been evaluated or until there is no further improvement.
Example 4.6 To illustrate the GA approach, a fuzzy model will be identied for the net growth
rate as a function of the substrate concentration S and the biomass concentration X, as a part
of the hybrid model for the bioreactor described in example 4.1. This will be done using the same
reduced data set as was used in example 4.5.
The GA does not incorporate the optimization of the number of rules nor does it derive the model
structure from the data. Therefore, several different identication runs were performed and ana-
lyzed in order to determine an acceptable number of rules. In each case, the rules were independent.
Double sigmoid membership functions were used in each experiment. In order to guarantee a
certain level of transparency, the parameter description of this membership function was slightly
changed (see equation A.4):
=
1
1 + exp a(x b)
1
1 + expc(x d )
(4.78)
81
Hybrid model design
# rules r G M p
c
p
m
RMSE
2 64 96 250 101 0.77 0.0077 0.0197
3 64 144 250 101 0.77 0.0077 0.0132
4 64 192 250 101 0.77 0.0077 0.0129
Table 4.3: GA setting and results
with
$$\delta = \left(-\frac{1}{c} - \frac{1}{a}\right) \ln\left(\frac{1}{0.99} - 1\right) + b \tag{4.79}$$
If $d \geq 0$, then this membership function always reaches approximately 1. Parameters b and d
determine the position of the membership function, while a and c determine the slope of the
S-curves. For the coding of the premise part, maximum and minimum values for b and d were
based on the input domain. Maximum and minimum values for a and c were chosen arbitrarily to
guarantee a certain level of fuzziness; high values lead to crisp sets, while low values lead to sets
that are too fuzzy.
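The effect of this parameterization can be checked numerically. The sketch below is an illustration only: it assumes the reading of equations 4.78 and 4.79 in which the second sigmoid is shifted by d plus an offset (called delta here, a hypothetical name) that places its 0.01 level exactly where the first sigmoid reaches 0.99 when d = 0. The function and argument names are not taken from this work.

```python
import math


def double_sigmoid(x, a, b, c, d, level=0.99):
    """Difference of two sigmoids; the second is shifted by d + delta.

    Assumed reading of equations 4.78 and 4.79, see the accompanying text.
    """
    log_term = math.log(1.0 / level - 1.0)        # ln(1/0.99 - 1), negative
    delta = (-1.0 / c - 1.0 / a) * log_term + b   # offset of equation 4.79
    rising = 1.0 / (1.0 + math.exp(-a * (x - b)))
    falling = 1.0 / (1.0 + math.exp(-c * (x - d - delta)))
    return rising - falling


def plateau_start(a, b, level=0.99):
    """Point where the first sigmoid reaches the chosen level."""
    return b - math.log(1.0 / level - 1.0) / a
```

Under these assumptions the membership degree at `plateau_start(a, b)` equals 0.99 - 0.01 = 0.98 for d = 0, and it can only increase for d > 0, which is the sense in which the function "always reaches approximately 1".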
GA settings and results for three different runs are shown in table 4.3. All models describe the
identification data well and the model errors are of the same order of magnitude. Not surprisingly, the
model with 4 rules has the smallest error.
The GA searches for a solution in a large search space. It determines starting points for the search
itself, which makes it not very sensitive to initial parameter values. It is not the initial values of the
parameters that affect the result, but the specified search space. However, the GA does not take
the identification data as a starting point for model derivation, as fuzzy clustering does. A
priori information about the model structure has to be supplied. In addition, rules can be located in
areas of the domain where no data is present. This means that, depending on the model structure,
undesired extrapolation behavior can occur. In this example, this is the case for the fuzzy models
with 3 and 4 rules (see figures 4.16 and 4.17).
Because of the undesired rule locations and extrapolation behavior of the more complex models,
the 2-rule model is selected, despite its larger error. The model is shown in figure 4.15.
Figure 4.18 shows the parametric membership functions of the 2-rule model. For the $\alpha$-cut in the
consequent part parameter calculation, $\alpha$ was set to 0.5, which yields the following local linear
models:
$$y_1 = 0.0060\,S + 0.0355\,X - 0.2264 \tag{4.80}$$
$$y_2 = 0.0079\,S - 0.0003\,X - 0.0070 \tag{4.81}$$
The fuzzy model can now be formulated as:
IF $S = \mathrm{mf}_1$ AND $X = \mathrm{mf}_1$ THEN $\mu = y_1$
IF $S = \mathrm{mf}_2$ AND $X = \mathrm{mf}_2$ THEN $\mu = y_2$
(4.82)
Figure 4.15: Fuzzy relation for $\mu$ (a) and rule locations (b) for the 2-rule model. Dots indicate identification data. Axes: S (g/l), X (g/l), $\mu$ (h$^{-1}$).
Figure 4.16: Fuzzy relation for $\mu$ (a) and rule locations (b) for the 3-rule model. Dots indicate identification data. Axes: S (g/l), X (g/l), $\mu$ (h$^{-1}$).
Figure 4.17: Fuzzy relation for $\mu$ (a) and rule locations (b) for the 4-rule model. Dots indicate identification data. Axes: S (g/l), X (g/l), $\mu$ (h$^{-1}$).
Figure 4.18: Parametric membership functions for the 2-rule fuzzy model: mf1 (a) and mf2 (b) for S (g/l), mf1 (c) and mf2 (d) for X (g/l).
4.4.3 Neurofuzzy systems
Neurofuzzy systems can be viewed as a combination of fuzzy systems and artificial
neural networks (ANNs). The fuzzy inference system is implemented in the framework of
these adaptive networks. This makes it possible to use the backpropagation learning rules that are
commonly used to train these nets. Several approaches have been developed in the past.
Some of these have been applied within the context of hybrid modeling, such as NEFPROX
(Nauck and Kruse, 1997) and ASMOD (Kavli, 1993), covering both linguistic and TSK type
fuzzy models. Here, Jang's well-known Adaptive Network based Fuzzy Inference System
(ANFIS) approach (Jang, 1993) is discussed in more detail.
Jang interprets a TSK fuzzy system as an adaptive network, on which adaptive learning rules
can be applied to optimize the system parameters. The interpretation consists of 5 layers.
Consider a simple 2 input, 1 output TSK model consisting of two rules:
IF $x_1 = \mathrm{mf}_{11}$ AND $x_2 = \mathrm{mf}_{21}$ THEN $y = y_1$
IF $x_1 = \mathrm{mf}_{12}$ AND $x_2 = \mathrm{mf}_{22}$ THEN $y = y_2$
(4.83)
in which
$$y_1 = p_1 x_1 + q_1 x_2 + r_1 \tag{4.84}$$
$$y_2 = p_2 x_1 + q_2 x_2 + r_2 \tag{4.85}$$
The ANFIS interpretation of this fuzzy model is shown in figure 4.19 and can be described
as follows:
Figure 4.19: ANFIS architecture for a 2 input, 1 output TSK model (layers 1 to 5, from the inputs $x_1$ and $x_2$ to the output $y$)
Layer 1
The nodes in layer 1 contain the premise part membership functions of the model. For
example, the node function for membership function j of input i is given by:
$$o_{ij} = \mathrm{mf}_{ij}(x_i) \tag{4.86}$$
Layer 2
The nodes in layer 2 determine the Degree of Fire (DOF) of the rules of the model.
The DOF of rule i is calculated by the AND intersection operator:
$$\beta_i = o_{1i} \, o_{2i} \tag{4.87}$$
In this case, the product is used as the AND operator.
Layer 3
Layer 3 calculates the weighted (normalized) DOF of each rule. The weighted DOF for rule i is
calculated by:
$$\bar{\beta}_i = \frac{\beta_i}{\sum_{j=1}^{2} \beta_j} \tag{4.88}$$
Layer 4
Using the local linear models, the weighted output of each rule is calculated. The
output of rule i is calculated as:
$$\bar{\beta}_i y_i = \bar{\beta}_i (p_i x_1 + q_i x_2 + r_i) \tag{4.89}$$
Layer 5
The output of the fuzzy model is calculated by layer 5:
$$y = \sum_{i=1}^{2} \bar{\beta}_i y_i \tag{4.90}$$
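The five layers can be traced in a short sketch. It is an illustration under stated assumptions: the Gaussian membership function, the names `beta` and `beta_bar` for the degree of fire and its normalized value, and the data layout are hypothetical choices made for this example and do not reproduce the implementation used in this work.

```python
import math


def gaussian(center, width):
    """Hypothetical membership function, used only for illustration."""
    return lambda x: math.exp(-0.5 * ((x - center) / width) ** 2)


def anfis_forward(x1, x2, mfs, consequents):
    """Evaluate a two-input TSK model layer by layer.

    mfs[i][j] is membership function j of input i;
    consequents[j] is the tuple (p, q, r) of rule j.
    """
    inputs = (x1, x2)
    # Layer 1: membership degrees o_ij = mf_ij(x_i)
    o = [[mf(inputs[i]) for mf in mfs[i]] for i in range(2)]
    # Layer 2: degree of fire of each rule, product as AND operator
    beta = [o[0][j] * o[1][j] for j in range(len(consequents))]
    # Layer 3: normalized degree of fire (assumes at least one rule fires)
    total = sum(beta)
    beta_bar = [b / total for b in beta]
    # Layer 4: weighted output of each local linear model
    weighted = [bb * (p * x1 + q * x2 + r)
                for bb, (p, q, r) in zip(beta_bar, consequents)]
    # Layer 5: overall model output
    return sum(weighted)
```

For an input that lies exactly halfway between two identically shaped rules, both rules fire equally and the output is the mean of the two local linear models; if all rules share the same consequent, the output reduces to that single linear model.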
Parameter optimization is carried out using a so-called forward pass and a backward pass.
During every iteration, the consequent part parameters of the fuzzy model are determined
using the weighted least squares approach (see section 4.4.1). An alternative to the
weighted least squares calculation is an iterative calculation of the solution to the least squares
problem (Jang, 1993). In this approach, no matrix inversion is used, which results in a
computationally more efficient calculation. In this work, the weighted least squares approach is
used. Calculation of the consequent parameters is called the forward pass.
In the backward pass, the error rates propagate backwards and the premise part parameters in
layer 1 are updated by a standard gradient descent learning rule. The parameters in layer 1
are the only parameters that are updated by the learning rule. This means that the error in the
output needs to be propagated to this layer.
Denote the output of the node in the ith position of the kth layer by $o^k_i$ and let $O^k_i$ be its function.
Let $X = \{x_j \mid j = 1, \ldots, N\}$ be a set of feature vectors in $\mathbb{R}^n$, where N is the number of
measurements and n is the dimension of the input-output space. In this example, n = 3. The
error measure of the ANFIS network is defined as the sum of the squared errors:
$$E_i = (x_{i,3} - y_i)^2 \tag{4.91}$$
$$E = \sum_{i=1}^{N} E_i \tag{4.92}$$
where $x_{i,3}$ is the measured output in the ith feature vector and $y_i$ is the corresponding model output.
In order to develop a learning procedure that implements gradient descent in E over the
parameter space, the error rate $\partial E_i / \partial o$ for the ith feature vector and for each node output o
needs to be calculated. The error rate for the output node can be calculated from equation 4.91:
$$\frac{\partial E_i}{\partial o^5_{1,i}} = -2\,(x_{i,3} - o^5_{1,i}) \tag{4.93}$$
For a node j in layer k, the error rate can be derived by the chain rule:
$$\frac{\partial E_i}{\partial o^k_{j,i}} = \sum_{m=1}^{\#(k+1)} \frac{\partial E_i}{\partial o^{k+1}_{m,i}} \, \frac{\partial o^{k+1}_{m,i}}{\partial o^k_{j,i}} \tag{4.94}$$
in which #(k) denotes the number of nodes in layer k. This means that the error rate of an
internal node can be expressed as a linear combination of the error rates of the nodes in the
next layer.
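Now if a is a membership function parameter in a node in layer 1, the error rate can be
described by:
$$\frac{\partial E_i}{\partial a} = \sum_{o \in S} \frac{\partial E_i}{\partial o} \, \frac{\partial o}{\partial a} \tag{4.95}$$
where S is the set of nodes whose outputs depend on a. Then the derivative of the overall
error measure E with respect to a is:
$$\frac{\partial E}{\partial a} = \sum_{i=1}^{N} \frac{\partial E_i}{\partial a} \tag{4.96}$$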
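The update equation for the parameter a is:
$$\Delta a = -\eta \, \frac{\partial E}{\partial a} \tag{4.97}$$
in which $\eta$ is a learning rate, which is expressed by:
$$\eta = \frac{K}{\sqrt{\sum_a \left( \frac{\partial E}{\partial a} \right)^2}} \tag{4.98}$$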
where K is the step size, the length of each gradient transition in the parameter space. The
value of K can be changed to vary the speed of convergence. If K is small, the gradient
method will closely approximate the gradient path, but convergence will be slow. If K is
large, convergence will initially be fast, but the algorithm will oscillate about the optimum.
Based on this, K can be updated according to the following heuristic rules (Jang, 1993):
- If the error measure undergoes four consecutive reductions, set $K = r_{Ki} K$, in which
$r_{Ki}$ is the increase ratio.
- If the error measure undergoes two consecutive combinations of one increase and one
reduction, set $K = r_{Kd} K$, in which $r_{Kd}$ is the decrease ratio.
Jang chooses to increase or decrease the step size by 10%. This number is chosen arbitrarily.
In addition, the initial value of K is usually not critical as long as it is not too large.
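The update rule and the step size heuristics can be sketched as follows. This is a minimal illustration, not the implementation used in this work: the function names are hypothetical, and evaluating the heuristics on the last five values of the error measure is an assumption about how "consecutive" is tracked.

```python
import math


def gradient_step(params, grads, K):
    """Apply delta_a = -eta * dE/da with eta = K / sqrt(sum (dE/da)^2)."""
    norm = math.sqrt(sum(g * g for g in grads))
    if norm == 0.0:
        return list(params)          # stationary point, nothing to update
    eta = K / norm
    return [a - eta * g for a, g in zip(params, grads)]


def adapt_step_size(K, errors, r_inc=1.1, r_dec=0.9):
    """Apply the two heuristic rules to the history of the error measure."""
    if len(errors) < 5:
        return K
    recent = errors[-5:]
    changes = [recent[i + 1] - recent[i] for i in range(4)]
    if all(c < 0 for c in changes):
        return r_inc * K             # four consecutive reductions
    if changes[0] > 0 and changes[1] < 0 and changes[2] > 0 and changes[3] < 0:
        return r_dec * K             # increase, reduction, increase, reduction
    return K
```

Because the gradient is divided by its own norm, every update moves the parameter vector over a distance of exactly K, which is why K is referred to as the step size.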
The forward and backward pass are performed iteratively until some criterion is satisfied,
for example, until the change in parameters is very small or until a predefined number of
iterations e has been completed.
The procedure starts with an initial fuzzy model. This model is trained by optimizing its
membership function parameters, while the rule structure is left intact. This means that the
rule structure needs to be defined beforehand. Neurofuzzy approaches that include rule
learning are available (for example Kavli (1993)), but are not presented here.
Example 4.7 The ANFIS approach is illustrated by building a fuzzy model for the net growth rate
$\mu$. The same data set is used as in examples 4.5 and 4.6.
As with the genetic algorithms example, no optimization of the rules is included. In the ANFIS case, this
would involve the optimization of the network structure. Therefore, several identification experiments
were executed in order to determine a suitable number of rules. All rules were independent and
double sigmoid membership functions were used in each experiment.
The structure of the initial models influences the end result. Therefore, the initial models were built
properly, i.e., the membership functions were designed in such a way that a fuzzy partition in the
input space was formed.
Settings and results for three different runs are shown in table 4.4. As with the GA, all models
perform well and the most complex model has the smallest error.
# rules   e      $r_{Ki}$   $r_{Kd}$   $K_{init}$   RMSE
2         1000   1.1        0.9        0.05         0.0208
3         1000   1.1        0.9        0.05         0.0197
4         1000   1.1        0.9        0.05         0.0179
Table 4.4: ANFIS settings and results
For all three runs, it is observed that the premise part is not altered much by the identification:
the membership functions are only slightly changed. As can be expected from a gradient descent
approach, ANFIS is very sensitive to initialization.
No mechanism is provided to maintain a fuzzy partition in the input space. Although this does not
have to impair model results, it may result in models that are difficult to interpret. For the 3-rule
model, rules 1 and 2 completely overlap after identification (see figure 4.21). This indicates that
the chosen model structure may be too complex.
Similar to the presented GA approach, ANFIS does not take the data as a starting point to determine
the model structure but uses an a priori defined model. This can result in undesired extrapolation
behavior if the model structure is too complex, especially in areas of the input-output domain where
there is little data. In this case, the 3-rule and 4-rule models show this behavior. The anomalous
extrapolation is caused by the rules in the area where there is little data.
The rules in the 2-rule model have sufficient data in their activation domain to calculate the
consequent parameters successfully. The output surface is similar to that of the 2-rule model identified with the
GA. The premise part membership functions are shown in figure 4.23 and the local linear models
are:
$$y_1 = 0.0111\,S + 0.0132\,X - 0.1383 \tag{4.99}$$
$$y_2 = 0.0054\,S + 0.0030\,X - 0.1964 \tag{4.100}$$
The fuzzy model can be formulated as
IF $S = \mathrm{mf}_1$ AND $X = \mathrm{mf}_1$ THEN $\mu = y_1$
IF $S = \mathrm{mf}_2$ AND $X = \mathrm{mf}_2$ THEN $\mu = y_2$
(4.101)
4.4.4 Remarks
Fuzzy clustering requires less a priori structure information than the GA and ANFIS.
The latter two methods need a pre-determined rule base structure and membership functions
to initialize parameter identification, whereas the clustering approach determines the number
of rules automatically. The results are therefore less sensitive to initialization. ANFIS uses
an initial model as a starting point for further optimization. Such an initial model may be
difficult to set up if no prior information is available, unless an expert can give some
qualitative information. Furthermore, experimental results have shown what can be anticipated:
the identified model is closely related to the initial model with respect to the premise part
parameters. This makes ANFIS sensitive to choices made before identification. With the GA,
information about the structure of the model (in terms of the rule base and corresponding
membership functions) also has to be provided in advance. The GA searches for a solution in a
much larger search space than the backpropagation algorithm and determines starting points
for the search itself, which makes it not very sensitive to initial parameter values. It is not the
initial values of the parameters that affect the result, but the specified search space.
Figure 4.20: Fuzzy relation for $\mu$ (a) and rule locations (b) for the 2-rule model. Dots indicate identification data. Axes: S (g/l), X (g/l), $\mu$ (h$^{-1}$).
Figure 4.21: Fuzzy relation for $\mu$ (a) and rule locations (b) for the 3-rule model. Dots indicate identification data. Axes: S (g/l), X (g/l), $\mu$ (h$^{-1}$).
Figure 4.22: Fuzzy relation for $\mu$ (a) and rule locations (b) for the 4-rule model. Dots indicate identification data. Axes: S (g/l), X (g/l), $\mu$ (h$^{-1}$).
Figure 4.23: Parametric membership functions for the 2-rule fuzzy model: mf1 (a) and mf2 (b) for S (g/l), mf1 (c) and mf2 (d) for X (g/l).
The input-output data for the growth rate is not distributed over the complete input space
of the system. Normal operation of the reactor causes S and X to be limited to a certain
part of the input space. A fuzzy model with independent rules will be able to cope with this
data much better, because the rules of these models will be able to describe working areas
within the part where data is present, without influencing other rules. For example, if the best
partitioning of the input space is non-orthogonal, fuzzy models with dependent rules will be
limited in providing a simple rule base that handles this.
The advantage of fuzzy clustering is that it focuses on the data and derives a fuzzy model
with independent rules. To obtain the same result with the GA or with ANFIS, prior
knowledge has to be provided about the structure and initial location of the rules. This may be
cumbersome for high dimensional systems. If this prior knowledge is not provided, rules
may be present that have no meaning and that can complicate optimization. For example, this
was observed with the 4-rule model that was built with ANFIS. Although the overall result is
good for the part where data is present, rule 4 in the area with high S and X is not desirable
(figure 4.22 (b)).
As with all black box techniques, care has to be taken in extrapolating these fuzzy
models. The 3- and 4-rule models that were built with ANFIS show an example of extrapolation
properties that will seriously impair hybrid model results. Post-processing of the
identification results can improve this by assuming linear behavior when extrapolating, which often
is the best assumption that can be made. Since TSK models are a collection of local linear
models, evaluating rules located at the edge of the input space and adjusting membership
functions when necessary will ensure this.
Based on these results, the Gustafson-Kessel clustering approach with structure optimization
will be used as the main identification tool for fuzzy systems throughout the rest of this work.
4.5 Submodel integration
Since the general structure of the hybrid model is a framework of accumulation balances
accompanied with algebraic fuzzy relations, integration of the physical and fuzzy parts is
straightforward. The fuzzy submodels are part of the set of equations that make up the hybrid
model.
However, with respect to the fuzzy submodels, two sources of error may result in
unacceptable hybrid model performance. First of all, estimates are made in order to obtain
input-output data. Estimation errors will manifest themselves through the fuzzy model in the hybrid
model. Secondly, the fuzzy models are fit to input-output data. Errors resulting from fuzzy
model identification can also cause hybrid model errors. Since the hybrid models are
dynamical and usually are simulated as a free run (numerically and in an autoregressive manner),
small errors are integrated, which eventually can result in a large offset.
If hybrid model performance is unacceptable, it can be improved by manipulating the fuzzy
parts of the hybrid model. This means optimizing fuzzy model parameters with respect to the
hybrid model output.
This section will discuss the integration of the fuzzy submodels of a hybrid model by
optimization. The main idea is to improve hybrid model performance without discarding the
properties of the fuzzy submodels.
4.5.1 Parameters
The number of parameters in the fuzzy submodels that has to be optimized is quite large.
One rule of the fuzzy model for the net growth rate in the bioreactor case, for example,
contains about 10 parameters, depending on the type of membership function that is used.
Due to the curse of dimensionality the number of parameters increases exponentially for
systems with higher dimensions.
During optimization, it is proposed to account for the meaning of the parameters of the fuzzy
model. With TSK models, the premise part parameters determine the operating range for
which the local linear models are valid. In order to maintain some of the information about
the operating range that the fuzzy identification algorithm has determined, an optimization
algorithm has to be selected that can deal with this. This means that constraints for the premise
part parameters should be introduced. These constraints can put limitations on the level of
fuzziness of the sets and their location in the input domain. The constraints can be determined
from the fuzzy model and heuristic knowledge. Usually the constraints can be formulated as
an allowed percentage of variation.
The consequent parameters have the largest impact on the model performance.
Since their impact on model transparency is relatively low, the optimization of these parameters
can be less constrained. If local interpretation of the model is important, however, the
optimization should be more constrained than if this is not the case: since the optimization is
performed globally, and not with respect to the local linear model output, local interpretation
may be impaired.
To reduce the number of parameters, the optimization can also be performed on rule weights
or fuzzy model weights. The optimization of rule weights corresponds with the adjustment of
the parameters of a single local linear model by the same amount, while optimization of the
model weights corresponds with the adjustment of all consequent parameters. Optimization
of rule weights or model weights can be useful for systems with a large number of parameters, if similar error
behavior over the entire input domain is observed, or if model errors can be traced to a
specific rule in the fuzzy model. In addition, local interpretation is preserved better than with
unconstrained consequent parameter optimization.
4.5.2 Optimization
In principle, any optimization algorithm that the modeler finds appropriate can be
applied for hybrid model optimization. Care has to be taken, however, that the optimization
algorithm is able to deal with the number of parameters that will be optimized.
Since the goal is to improve the hybrid model performance by using the results from the
submodel identication step as a starting-point, direct search (simplex) or gradient-based
approaches are suitable to perform the task. Approaches such as genetic algorithms (for
example applied in Roubos et al. (1999)) discard the existing model parameters and are only
appropriate if the search space is limited, thus if heavy constraints are imposed that are based
on the parameters of the fuzzy submodel before optimization.
For large parameter optimization problems, the approach described in Branch et al. (1999)
was found to be suitable. This approach transforms large parameter problems into a
two-dimensional quadratic approximation for a certain trust region by using a preconditioned
conjugate gradient approach. This quadratic problem is subsequently solved. Box constraints
are incorporated by reflecting the search path when it encounters a bound. The algorithm
is available commercially as part of the MATLAB Optimization Toolbox. Here, only a brief
overview is given.
Trust region method
Let f(x) be the objective function to be minimized with respect to the vector x with n
parameters. The basic idea is to approximate f(x) with a simpler function q that describes
the original function reasonably well in the neighborhood of x. This neighborhood is the
so-called trust region N. A trial step d is calculated by minimizing q over N. The optimization
problem can be reformulated as:
$$\min_d \{ q(d) \mid d \in N \} \tag{4.102}$$
The current point is updated to x +d if f(x +d) < f(x). If this is not the case, the current
point remains unchanged and N is shrunk, after which a new trial step is calculated. To be
able to solve the problem, an appropriate approximation function q, a method for choosing
the trust region N and a solution method for the trust region subproblem (equation 4.102)
need to be available.
The approximation q can be defined by the first two terms of the Taylor expansion of f at x.
The trust region N is usually spherical or ellipsoidal. The problem can now be stated as:
$$\min_d \left\{ \tfrac{1}{2} d^T H d + d^T g \;\middle|\; \| D d \|_2 \leq \Delta \right\} \tag{4.103}$$
with g the gradient and H the Hessian of the objective function, D a positive diagonal scaling
matrix and $\Delta$ a positive scalar.
Good algorithms are available for solving equation 4.103 (Byrd et al., 1988) but they are time
consuming for large numbers of parameters. Therefore, a different approach is applied. If
the trust region problem is restricted to a two-dimensional subspace S, the solution of 4.103
requires little computational effort.
Subspace calculation
The two-dimensional subspace $S = \langle s_1, s_2 \rangle$ is spanned by two vectors: the direction
of the gradient $s_1 = D^{-2} g$ (the scaled gradient vector) and either a direction of negative
curvature:
$$s_2^T H s_2 < 0 \tag{4.104}$$
or an approximate scaled Newton step:
$$\hat{M} s_2 = -\hat{g} \tag{4.105}$$
where
$$\hat{g} = D^{-1} g = \mathrm{diag}(|v|^{1/2})\, g \tag{4.106}$$
$$\hat{M} = D^{-1} H D^{-1} + \mathrm{diag}(g)\, J_a \tag{4.107}$$
$J_a$ is the Jacobian of $|v|$ and v is a vector that implements box constraints. The solution to
equation 4.105 can only be found if $\hat{M}$ is positive definite. If this is not the case, the direction
of negative curvature (equation 4.104) is selected for $s_2$.
Box constraints
If constraints are imposed on the parameter vector x, then the optimization problem can
be formulated as:
$$\min_x \{ f(x) \mid l < x < u \} \tag{4.108}$$
where l is a vector of lower bounds and u is a vector of upper bounds. Some (or all)
components of l may be equal to $-\infty$ and some components of u may be equal to $\infty$. The constraints
are implemented by the vector v:
IF $g_i < 0$ AND $u_i < \infty$ THEN $v_i = x_i - u_i$
IF $g_i \geq 0$ AND $l_i > -\infty$ THEN $v_i = x_i - l_i$
IF $g_i < 0$ AND $u_i = \infty$ THEN $v_i = -1$
IF $g_i \geq 0$ AND $l_i = -\infty$ THEN $v_i = 1$
(4.109)
If, during a step, a constraint is hit, the step is reflected off the constraint. This increases the
step size (Branch et al., 1999). For a (single) reflection step, given a step d that intersects a
constraint, consider the first bound constraint crossed by d and assume it is the i-th bound
constraint. Then the reflection step $d^R = d$ except in the i-th component, where $d^R_i = -d_i$.
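The bookkeeping for the box constraints can be sketched as follows. The sketch is illustrative only: it assumes the reading of equation 4.109 in which the lower bound $l_i$ appears in the cases with a non-negative gradient component, the function names are hypothetical, and only the reflected step vector is returned rather than the complete piecewise linear search path.

```python
import math


def scaling_vector(x, g, lower, upper):
    """Vector v of equation 4.109 (assumed reading, see accompanying text)."""
    v = []
    for xi, gi, li, ui in zip(x, g, lower, upper):
        if gi < 0:
            v.append(xi - ui if ui < math.inf else -1.0)
        else:
            v.append(xi - li if li > -math.inf else 1.0)
    return v


def reflect_step(x, d, lower, upper):
    """Reflect step d at the first bound crossed along x + t*d, 0 < t <= 1."""
    first_t, first_i = None, None
    for i, (xi, di, li, ui) in enumerate(zip(x, d, lower, upper)):
        if di > 0 and xi + di > ui:
            t = (ui - xi) / di       # fraction of the step at which u_i is hit
        elif di < 0 and xi + di < li:
            t = (li - xi) / di       # fraction of the step at which l_i is hit
        else:
            continue
        if first_t is None or t < first_t:
            first_t, first_i = t, i
    if first_i is None:
        return list(d)               # no bound is crossed
    reflected = list(d)
    reflected[first_i] = -d[first_i]
    return reflected
```

Infinite bounds can be passed as `math.inf` and `-math.inf`; such components are never crossed and therefore never reflected.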
Algorithm
The optimization algorithm can be summarized as follows:
1. Subspace calculation: Formulate the two-dimensional trust region subproblem S
2. Trust region method: Solve the two-dimensional subproblem S. Calculate the trial step
and use reflections if necessary
3. If f(x+d) < f(x) then set x = x+d and start again with step 1 or else go to step 4
4. Adjust $\Delta$ to shrink the size of the trust region and determine a new trial step d in step 2
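The accept-or-shrink logic of steps 3 and 4 can be illustrated with a strongly simplified loop. Note the assumptions: instead of the two-dimensional subspace of the large scale algorithm, the trial step below minimizes the quadratic model along the steepest descent direction only (the so-called Cauchy point), box constraints and reflections are left out, and all function and argument names are hypothetical. It is not the algorithm of Branch et al. (1999), only a sketch of the trust region mechanism.

```python
import math


def trust_region_minimize(f, grad, hess_vec, x0, radius=1.0, shrink=0.5,
                          tol=1e-8, max_iter=200):
    """Minimal trust-region loop with a steepest-descent trial step.

    hess_vec(x, v) must return the product of the Hessian at x with v.
    """
    x = list(x0)
    for _ in range(max_iter):
        g = grad(x)
        g_norm = math.sqrt(sum(gi * gi for gi in g))
        if g_norm < tol or radius < tol:
            break
        # Minimize the quadratic model along -g, limited to the trust region
        Hg = hess_vec(x, g)
        curvature = sum(gi * hgi for gi, hgi in zip(g, Hg))
        step_len = radius / g_norm
        if curvature > 0:
            step_len = min(step_len, g_norm ** 2 / curvature)
        trial = [xi - step_len * gi for xi, gi in zip(x, g)]
        if f(trial) < f(x):
            x = trial                # step 3: accept the trial step
        else:
            radius *= shrink         # step 4: shrink the trust region
    return x
```

In this sketch the radius is only ever reduced, in line with the summary above; practical implementations usually also enlarge the trust region after successful steps.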
4.5.3 Objective function
All relevant hybrid model outputs should be incorporated in the objective function. For
dynamical models, the objective function should be able to deal with measurement trends instead
of steady state errors. In addition, scaling of the different errors is appropriate. A
straightforward and common least squares criterion that incorporates this can be formulated as follows:
$$J = \frac{1}{2} \sum_{j=1}^{m} \sum_{i=1}^{n} w_i \, (y_{ij} - \hat{y}_{ij})^2 \tag{4.110}$$
in which $y_i$ is the vector with (normalized) measurements of output i and $\hat{y}_i$ is the vector
with (normalized) model output values of output i. w is a weighting vector. The choice of
the weights depends on the problem.
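Equation 4.110 translates directly into a few lines of code. The sketch below is a plain illustration, assuming that the measurements and model outputs are stored as one list per output; the function name and the data layout are hypothetical.

```python
def weighted_least_squares(measured, predicted, weights):
    """J = 1/2 * sum_j sum_i w_i * (y_ij - yhat_ij)^2.

    measured[i][j] and predicted[i][j] hold output i at sample j;
    weights[i] is the weight of output i.
    """
    J = 0.0
    for w, y_row, yhat_row in zip(weights, measured, predicted):
        J += w * sum((y - yh) ** 2 for y, yh in zip(y_row, yhat_row))
    return 0.5 * J
```

For two outputs with two samples each, for example, measurements [[1, 2], [3, 4]], model outputs [[1, 1], [2, 4]] and weights [1, 2] give J = 0.5 (1·1 + 2·1) = 1.5.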
In Psichogios and Ungar (1992), the parameters of a neural net (that is part of the hybrid
model) are optimized with respect to the hybrid model output. The objective function in that
work incorporates sensitivity functions (partial derivatives of the output with respect to the parameters that
are optimized). This way, the error is propagated back to the output
of the neural net. However, if gradient descent based optimization approaches are used,
sensitivity information of the model output with respect to the parameters will be calculated
by the algorithm. This removes the need to incorporate sensitivity equations explicitly in the
objective function.
4.5.4 Remarks
The optimization of the hybrid model as presented in this section can be extended to
incorporate other model parameters. However, the larger the number of parameters that is
optimized, the more likely it is that the different parameters will compensate for each other's
errors. For example, if several fuzzy models are optimized simultaneously, then they can
compensate for each other's contribution to the hybrid model output, which results in a loss
of meaning of the fuzzy model output. This is even more the case if other model parameters
are included.
To gain insight into the influence of the different fuzzy models or model parameters on the
hybrid model output, sensitivity analysis can be worthwhile before performing the integration
step. This will provide information about the main sources of the error in the model output.
With respect to this, it is also recommended to investigate the contributions of the specific
rules in the fuzzy models. The analysis forms the basis for selecting the set of parameters
that is optimized and for determining objective function weights.
It is clear that the complexity of the problem is reduced if the integration is performed on
smaller, independent parts of the hybrid model. For example, if the fuzzy model for a
reaction rate is optimized, only the mass balance which contains this reaction rate should be
incorporated. Whether this is possible depends on the measurements that are available, the
model structure and the sensitivity analysis.
With the optimization of large sets of parameters, it is difficult to determine whether the global
optimum is found. By using gradient based algorithms, it is more likely that a local optimum
is found. However, it is usually possible to formulate criteria for acceptable performance of a
model. The goal of the optimization should therefore not be the determination of the optimal
set of parameters, but finding a solution that meets these criteria.
Figure 4.24: Fuzzy relation for $q_p$ (a) and rule locations (b). Dots indicate identification data. Axes: S (g/l), X (g/l), $q_p$ (h$^{-1}$).
Example 4.8 Consider the hybrid model for the fed-batch bioreactor (example 4.1) and assume
that the fuzzy model for the net growth rate $\mu$ that was identified with fuzzy clustering in
example 4.5 is available. Furthermore, assume that a fuzzy model for the production rate $q_p$ as a
function of the substrate concentration S and the biomass concentration X is available and that this
model was identified with fuzzy clustering, based on estimates by a PI-estimator. The estimates
of $q_p$ were made by using measurements of P and the mass balance for the product concentration
(equation 4.3). The settings for the PI-estimator are shown in table 4.2, while the settings for the
clustering were 5 for the initial number of clusters and 0.55 for the cluster merging threshold.
The fuzzy model is shown in figure 4.24.
In this example, the large scale algorithm will be illustrated. The initial performance of the hybrid
model for run VAL1, before optimization, is shown in figure 4.25. It can be seen that for this
run, the concentration of P starts to decrease after about 160 hours, which is not possible. This is
caused by anomalous behavior of $q_p$. The production rate decreases after 160 hours, which results
in a decrease in production of P. Because the amount of P in the reactor is diluted, the net result
is a decrease in the concentration of P. The increase in S is caused by a decrease in the substrate
consumption rate, which is a result of the decrease in $q_p$.
The goal is to improve hybrid model performance with respect to the errors in the biomass X, the
substrate S and the product P. This will be done by using the measurements from the identification
batch runs ID1 to ID16 (see example 4.2) as a reference. The objective function is formulated
according to equation 4.110:
$$J = \frac{1}{2} \left( e_S^2 + e_X^2 + e_P^2 \right) \tag{4.111}$$
in which $e_S$, $e_X$ and $e_P$ are normalized error signals, defined as:
$$e_{S,k} = \left| \frac{S_{ij} - \hat{S}_{ij}}{\bar{S}_j} \right| \quad \text{with } k = i \cdot j \tag{4.112}$$
$$e_{X,k} = \left| \frac{X_{ij} - \hat{X}_{ij}}{\bar{X}_j} \right| \quad \text{with } k = i \cdot j \tag{4.113}$$
$$e_{P,k} = \left| \frac{P_{ij} - \hat{P}_{ij}}{\bar{P}_j} \right| \quad \text{with } k = i \cdot j \tag{4.114}$$
with index i indicating the time step and index j indicating the batch run. $\hat{S}$ indicates model
estimates for S and $\bar{S}$ indicates the average value of S for a batch run, with similar definitions for
X and P.
Figure 4.25: Run VAL1 results before and after optimization for X (a), S (b) and P (c), plotted against t (h). Dots indicate measurements, the dashed line the hybrid model before optimization and the dotted line the hybrid model after optimization.
Although it is possible to optimize the fuzzy models for $\mu$ and $q_p$ sequentially (first $\mu$ with respect
to X and then $q_p$ with respect to P), it is interesting to see how the large scale algorithm deals with
optimization of a large set of parameters in a hybrid model. Therefore, all the parameters of the
two fuzzy models will be optimized simultaneously. The result is that a set of 66 parameters will
be optimized: 48 premise part parameters and 18 consequent part parameters. The premise part
parameters are constrained; the bounds are set at the initial values +/- 10 %. No constraints were
placed on the consequent part parameters.
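Setting bounds as a percentage of the initial values is simple, but negative and zero initial values need some care. The helper below is a hypothetical sketch of one way to build such bounds; it is not taken from this work.

```python
def relative_bounds(values, fraction=0.10):
    """Return (lower, upper) bounds at the initial values +/- fraction.

    The margin uses the absolute value, so that lower <= upper also holds
    for negative parameters. A parameter that is exactly zero gets a
    zero-width interval and would need an absolute margin instead.
    """
    lower, upper = [], []
    for v in values:
        margin = abs(v) * fraction
        lower.append(v - margin)
        upper.append(v + margin)
    return lower, upper
```

The unconstrained consequent part parameters can be represented in the same vectors by using infinite bounds for those components.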
The results of the optimization are shown in figure 4.25 (for clarity, only some of the measurements
are shown). The anomalous behavior has been removed and model offset has been reduced to
acceptable levels. The results are shown in table 4.5. The errors are the normalized errors for run
VAL1 at the end of the batch.
Figure 4.26 shows the local linear models of the fuzzy relations before and after the
optimization. In both models, the rule locations have not been altered much, due to the constraints. The
consequent parameters of the fuzzy model for $\mu$ have only been changed slightly in order to
accomplish good behavior of X. The consequent parameters in the model for $q_p$ have been changed
substantially. The local interpretation of the linear submodels has been impaired.
Error    Before optimization   After optimization
$e_X$    0.29                  0.03
$e_S$    0.76                  0.06
$e_P$    0.17                  0.02
Table 4.5: Optimization results
These results can be explained in two ways. First, the rules in the fuzzy model overlap, as a result
of which the overall model output differs from the output described by the individual linear models.
This is shown in figure 4.27. It is this overall output that determines the performance of the
hybrid model. Second, the sensitivity of P to changes in $q_p$ is relatively small for rules 1 and 2.
The sensitivity can be calculated using the sensitivity equations (Caracotsios and Stewart, 1985).
Results are shown in figure 4.28. This means that changes in the parameters of these rules do not
have a large impact on hybrid model performance. The sensitivity is relatively large for rule 3. The
parameters of this rule have only been adjusted slightly to improve the hybrid model (see figure 4.26).
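The effect of rule overlap can be illustrated with a small one-dimensional example. The sketch below assumes a Takagi-Sugeno type model with Gaussian membership functions and two made-up rules; the function `ts_output`, the rule values and the widths are hypothetical and only serve to show that the overall output at a rule center moves away from the local linear model when the memberships overlap.

```python
import math


def ts_output(x, rules, sigma):
    """Takagi-Sugeno output: membership-weighted mean of local linear models.

    rules is a list of (center, slope, offset); memberships are Gaussian with width sigma.
    """
    weights = [math.exp(-((x - c) ** 2) / (2.0 * sigma ** 2)) for c, _, _ in rules]
    local = [a * x + b for _, a, b in rules]
    return sum(w * y for w, y in zip(weights, local)) / sum(weights)


# Two hypothetical rules: y = x around x = 0 and y = -x + 2 around x = 1.
rules = [(0.0, 1.0, 0.0), (1.0, -1.0, 2.0)]

local_at_center = 1.0 * 0.0 + 0.0           # rule 1 evaluated at its own center
narrow = ts_output(0.0, rules, sigma=0.1)   # almost no overlap between the rules
wide = ts_output(0.0, rules, sigma=1.0)     # strong overlap between the rules

print(f"local model: {local_at_center:.3f}, narrow: {narrow:.3f}, wide: {wide:.3f}")
```

With narrow memberships the model output at the center of rule 1 coincides with the local linear model; with wide memberships the second rule contributes noticeably, which is why a local linear model alone may give a misleading picture of the relation.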
Whether behavior as illustrated by the model of $q_p$ should be accepted depends on the objectives
of the modeler. The overall hybrid model performance is good. If, however, according to the
modeler's judgement, the fuzzy relationship in a certain working area is unrealistic, it could be
rejected. It should be noted that fuzzy logic is still a black box technique and that care should
be taken in associating a physical meaning with the results. Furthermore, it could be argued that
less importance should be attached to areas with low sensitivity than to areas with larger
sensitivity. An advantage is that fuzzy logic provides a means to learn more about the relative
[Figure 4.26, panels (a)-(d): surface plots of $\mu$ (h^-1) and $q_p$ (h^-1, scale x 10^-3) over S (g/l)
and X (g/l), with rules 1, 2 and 3 indicated.]
Figure 4.26: Local linear models for $\mu$ and $q_p$ before ((a) and (b)) and after ((c) and (d)) optimization
[Figure 4.27, panels (a)-(d): surface plots of $\mu$ (h^-1) and $q_p$ (h^-1, scale x 10^-3) over S (g/l)
and X (g/l).]
Figure 4.27: Fuzzy models for $\mu$ and $q_p$ before ((a) and (b)) and after ((c) and (d)) optimization
importance of working areas. In addition, it allows the modeler to tune performance independently
in these areas.
The sensitivity equations could be used to reduce the size of the optimization problem in advance
by leaving out parameters with limited sensitivity. It should be noted, however, that the analysis
above was done after the results were obtained and that optimization results may be slightly
different if the complete model is not included in the optimization. It may, for example, be
difficult to decide beforehand when the sensitivity is low enough to exclude a rule from optimization.
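A possible screening step is sketched below. It replaces the sensitivity equations of Caracotsios and Stewart by a plain forward finite difference, in the spirit of equation (2.6), because that keeps the example short. The toy model, the threshold value and the function names are assumptions made for illustration only.

```python
def finite_difference_sensitivity(model, params, index, rel_step=1e-3):
    """Approximate the derivative of the model output to params[index] by a forward difference."""
    step = rel_step * max(abs(params[index]), 1.0)
    perturbed = list(params)
    perturbed[index] += step
    return (model(perturbed) - model(params)) / step


def select_parameters(model, params, threshold):
    """Return the indices whose absolute sensitivity reaches the threshold, plus all sensitivities."""
    sens = [finite_difference_sensitivity(model, params, i) for i in range(len(params))]
    keep = [i for i, s in enumerate(sens) if abs(s) >= threshold]
    return keep, sens


# Hypothetical scalar model output: strongly dependent on theta[2], weakly on the others.
def toy_model(theta):
    return 0.01 * theta[0] + 0.02 * theta[1] ** 2 + 5.0 * theta[2]


keep, sens = select_parameters(toy_model, [1.0, 1.0, 1.0], threshold=0.1)
print("parameters kept for optimization:", keep)
```

The choice of threshold is exactly the difficulty mentioned in the text: a parameter that looks insensitive at the initial values may become relevant once the other parameters have moved.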
4.6 Model adjustment
The model that is built using basic modeling is based on the hypotheses and the available
process measurements. Sometimes, this information may not prove to be sufficient to build
the model. In addition, some of the hypotheses may not be correct (mainly due to simplification).
This means that the model needs to be adjusted.
In essence, model adjustment is basic modeling, based on the model analysis results. The
results from this step need to state the nature of the problems as clearly as possible. With
this information, a new hybrid model structure is designed, following the basic modeling
[Figure 4.28, panels (a) and (b): surface plots of $\partial X / \partial \mu$ and $\partial P / \partial q_p$ over S (g/l) and X (g/l).]
Figure 4.28: Sensitivities for X with respect to $\mu$ (a) and P with respect to $q_p$ (b)
procedure.
Model adjustment can be done in two ways: at the same level of detail or at a higher level of
detail. It is advisable to first try developing another model at the same level of detail before
going to a higher level of detail. If this is not possible, then a higher level of detail needs to be
selected. The increase in detail needs to be done in small steps. Based on the results of the
model analysis step, the modeler needs to decide which structure adjustments are necessary.
In principle, this means the addition of model equations. An example is the introduction of
temperature dependency in heat capacity coefficients. In some cases, the adjustment of the
model structure requires the addition of balance equations. If this is the case, it needs to be
considered whether the increase in modeling effort and complexity is justified with
respect to model performance.
4.7 Concluding remarks
The design phase of the hybrid modeling procedure can be summarized as follows:
- Determine the hybrid model structure and distinguish subprocess modeling problems,
  for which fuzzy submodels should be developed
- Based on this structure, acquire relevant process data
- Estimate unmeasurable behavior of the subprocesses
- Identify the fuzzy submodels using fuzzy clustering
- Integrate the fuzzy submodels by optimizing their parameters with respect to the hybrid
  model output
The steps are performed sequentially and independently of each other. The advantage is that
the modeling problem is split into several simpler problems. Combining the solutions of
these problems may lead to a non-optimal hybrid model; the result is that an integration step
is needed.
For the subprocess behavior estimation, the PI-estimator is a simple and useful alternative
to more elaborate estimation approaches. Its simple nature makes it possible to set it up
quickly. Care should be taken when there is strong coupling between two or more parameters
that are estimated simultaneously. For many applications, however, the approach can be
applied without many problems (Van Lith et al., 2001).
Fuzzy clustering provides a good way to identify the fuzzy submodels. It requires little a priori
model structure information. In addition, because of the structure optimization algorithm,
it is insensitive to initialization. It also derives a fuzzy model with independent rules directly
from the data, which results in models that are not likely to show anomalous extrapolation
behavior.
During submodel integration, the parameters of the fuzzy submodels are optimized with respect
to the hybrid model output. To maintain part of the results from the identification step,
these fuzzy models should be used as a starting point, which makes gradient based optimization
algorithms suitable tools.
Although algorithms for optimizing large sets of parameters are available, the number of
parameters should be kept as low as possible. This makes sensitivity analysis
worthwhile. In addition, the optimization problem can be reduced by optimizing rule weights
or model weights.
Model adjustment essentially provides a feedback procedure to adjust the model structure
if a hybrid model performs unacceptably. The model adjustment step incorporates the same
elements as the basic modeling step. Based on the initial hybrid model, a new model structure
can be developed, after which the data acquisition, subprocess behavior estimation, submodel
identification and integration steps are performed.
5 Hybrid model properties
To illustrate the concepts of the structured hybrid modeling approach, this chapter will
deal with the development of a more elaborate hybrid model than discussed in the previous
chapters. In addition, this model will provide the basis for a discussion on hybrid model
properties, by comparing it with the two extremes of the hybrid modeling spectrum: a first
principles model and a fuzzy model. This provides more information about the applicability
of hybrid models in general.
A continuous pulp digester was selected as the test case. This is a unit operation used to
produce paper pulp from wood chips. It is a nonlinear distributed parameter process, of which
the process conditions vary considerably. In addition, the description of the internal behavior
often is empirical, for which many different approaches exist. This makes the process suitable
for development of a hybrid model.
For the comparison, a first principles model is available in the form of the Extended Purdue
Model, an industrially accepted benchmark model for the process. This is a rigorous model
that describes the internal and overall dynamic behavior of the digester. The first principles
model will provide the basis for hybrid model development, which means that the hybrid
model is not based on actual process data. However, by using this approach, it will be possible
to investigate which essential characteristics of the process should be described, while
good performance and interpretability are maintained. The rigorous model will be reduced to
a simpler model structure which describes the essential behavior of the digester, given the
requirements that were defined for the hybrid model. As a result, new model equations need to
be derived, by which the hybrid modeling approach can be illustrated. The fuzzy model will
also be based on the first principles model. Input-output data, generated by the first principles
model, is used to construct the fuzzy model.
The comparison of the three models will be based on the five model quality aspects: static
performance, dynamic performance, complexity, interpretability and process independence.
The evaluation will be related to the three general areas of model applications: research and
development, design and control.
First, this chapter will discuss the pulping process and the first principles model. Based
on this information, the hybrid model objective and model requirements will be discussed,
followed by the design and evaluation. Then the fuzzy model will be designed and evaluated,
after which the three models will be compared.
5.1 Process description
The production of pulp for paper products has been a widely studied subject, especially
in the Scandinavian countries and Canada. In particular, the continuous digester, the unit op-
eration for manufacturing pulp, has received much attention. It is a challenging unit operation
for control studies due to unmeasured disturbances, long time delays and nonlinear behavior
(Wisnewski et al., 1997). In addition, different interpretations about the physical processes
that take place have been the basis for model development, resulting in a variety of models.
This makes the unit operation an interesting candidate for developing a hybrid model.
For more information about wood and pulp processing, the reader is referred to Jahn and
Strauss (1992).
5.1.1 Wood and pulp processing
Wood contains a mixture of polymers that can be split in three groups:
- Lignin (20 - 30 %)
- Cellulose
- Hemicellulose; anhydrides of Xylose, Arabinose, Glucose, Mannose, Galactose, Xylan
  and Galactoglucomannan
The strength of the wood structure comes from the cellulose fibers. Cellulose is a very strong,
chemically highly resistant, fibrous polymer. The other elements are easily degradable by
treatment with acids or bases.
Lignin serves to hold the cellulose fibers and other elements and to bind them together into
the anatomical structure we know as wood. Lignin is susceptible to degradation and dissolution
by treatment with strong bases at elevated temperatures. Lignin can also disintegrate
when treated with acid sulphite solutions or oxidizing agents. Thus, lignin can be
removed from the wood, leaving separated cellulose fibers in the form of a pulp. This pulp can
then be processed to produce paper. The main quality parameter for the pulp is the so-called
Kappa number $\kappa$, which is a measure for the amount of lignin in the wood.
In general, the paper production process incorporates:
- Wood preparation
- Pulping (mechanical or chemical)
- Pulp screening and cleaning
- Pulp bleaching
- Paper production
Mechanical pulping is used for low quality paper. With mechanical pulping, a log is pressed
against a rotating disc or cylinder, which yields what is called groundwood pulp. In this work,
the focus is on chemical pulping.
Wood preparation
Bark contains many short fibers besides colored, non-fibrous materials. This yields pulp
that is not suitable for making paper. If the pulp contains bark, the resulting paper is not very
strong and discolors. Therefore, the bark needs to be removed as much as possible. This is
accomplished by letting the logs scrub each other in a rotating cylinder. Other methods exist
to debark the logs, for example removing the bark with knives or by means of water jets.
After the logs have been debarked, the wood can be used directly in the following production
steps. When the wood is used as raw material for chemical pulping, it is necessary to chop
the logs into wood chips. These woodchips usually have dimensions of about 0.5 x 2 cm.
The logs are chopped into thin slices using a rotating disc containing radially placed knives.
The slices fall apart into the desired woodchips due to the enormous forces the rotating discs
exert on them.
Chemical pulping
Lignin constitutes approximately 20 to 30 % of the total weight of wood. In manufacturing
purified pulps of high whiteness, it is necessary to remove as much lignin as possible with
minimum loss or degradation of the carbohydrate cell wall. Usually this is done by pulping
to liberate the fibers and then bleaching the fibers to the desired whiteness. Unbleached pulps
are usually dark brown in colour and are used in that form for grocery bags and wrapping
paper.
Lignin is not a pure component, but can be considered as a class of substances. Lignin is a
complex, three-dimensional polymer containing mostly phenylpropane units joined together
by various ether and carbon-carbon linkages. The aim of the chemical pulping processes is to
remove the lignin by dissolution or degradation. There are two methods to accomplish this:
reactions with acids or bases. Both methods are carried out in an aqueous environment under
elevated pressure and temperature.
The sulphite process uses a cooking liquor of sulphurous acid and a salt of this acid. While
calcium was the most widely used base at one time, it has been replaced with sodium, magnesium
and ammonia. The sulphate process uses a mixture of sodium hydroxide and sodium
sulphide as the active chemical. The term sulphate process is used because sodium sulphate
is used as make-up chemical. The word Kraft, the German word for strength, is also used to
refer to this process, because this method produces the strongest pulp. In the past,
ammonium base NH4OH was used and had the advantage that this liquor could be vaporized
and burned without any residue. Only air pollution occurred. The SO2 in the gas effluents
can be led through a gas washer and react with fresh ammonium hydroxide.
The sulphite process is carried out in batch reactors using longer residence times than the
Kraft process with temperatures of 140-150 °C. This process is most appropriate for wood
species such as spruce and pine trees where a relatively light colored and strong pulp is
[Figure 5.1: schematic of the digester. Labels: wood chips/white liquor feed, chips flow, liquor flow,
upper heater, lower heater, wash heater, quench recirculation, black liquor, filtrate, pulp product;
upper heater pipe and screen, lower heater pipe and screen, quench pipe, extraction screen, wash pipe,
wash screen. Zones: impregnation, heating, cooking, washing, cooling.]
Figure 5.1: Digester unit operation (a) and functional zones (b)
produced. The sulphite process can therefore be found in those regions where the wood
species mentioned above exist.
5.1.2 The continuous pulp digester
The continuous pulp digester is a unit operation designed to convert woodchips into a
cellulose fiber pulp. The pulping of the woodchips is accomplished by cooking the woodchips
in a hot solution of sodium hydroxide and sodium sulphide, referred to as white liquor. The
digester is essentially a tubular reactor, where woodchips travel from the inlet at the top to
the outlet at the bottom of the digester. The surrounding liquor flow is either co-current
or counter-current with respect to the woodchips, depending on the functional zone through
which the woodchips are flowing. A schematic overview of the digester and its functional zones is
given in figure 5.1.
In the impregnation zone the woodchips are brought into contact with a co-current flow of
white liquor. The white liquor components (primarily sodium hydroxide and sodium sulphide)
diffuse into the pores of the porous woodchips. As the temperatures in this zone are relatively
low (around 385 K), practically no delignification takes place. The length of the impregnation
zone is 5.5 m.
Reactor height           32 m
Reactor diameter         5 m
Throughput               3.7 m^3/min
Pulp yield               600 t/d
Residence time           2 h
Pressure cooking zone    7 bar
Reaction temperature     430 K
Table 5.1: Typical digester dimensions and operating conditions
In order to reach the required reaction temperatures in the various zones, part of the white
liquor flow is drawn off the digester. After being heated in external heat exchangers, the white
liquor flow is fed back into the heating zone. The white liquor is supplied through a system
of concentric pipes. The digester contains two heating zones for gradually heating the white
liquor to the desired reaction temperatures. This way, the upper heater may be considered as
a pre-heater. The actual temperature control is achieved by adjusting the lower heater outlet
temperature. The upper heater outlet temperature is approximately 410 K, the lower heater
outlet temperature is about 430 K. The total length of the heating zones extends to 4.7 m.
The majority of the delignification takes place in the cooking zone, which extends to 12 m.
Here, the Kappa number of the pulp is drastically reduced. The reaction products diffuse into
the white liquor at the same time the reactants diffuse from the white liquor into the pores of
the woodchips. In the cooking zone,
typical temperatures are about 430 K and pressures are about 7 bar (Jahn and Strauss, 1992).
A concentric pipe ending 1.5 m above the end of the cooking zone injects a relatively cold,
non-reactive liquor flow. The temperature of the woodchips is reduced and the delignification
reactions are quenched. The quench liquor is a process flow coming from the digester,
referred to as black liquor, which is essentially a contaminated mixture of white liquor and
wash liquor. Black liquor contains few reactive components.
In the washing zone, a counter-current flow washes the degradation products from the pulp. In
the counter-current region a large average driving force is maintained between the woodchips
and the washing liquor. A process flow referred to as filtrate is used as washing liquor.
The wash liquor feed configuration presented in figure 5.1 is a simplified one. In an actual
digester, a part of the wash liquor is fed into the digester through the cold blow line, which
prevents caking of the pulp in the bottom of the digester. The temperature of the wash liquor
has a large influence on the pulp quality. As diffusion is favored by high temperatures, more
degradation products are washed from the pulp at higher temperatures. However, too high
temperatures cause continuation of the delignification process, which damages the cellulose
fiber structure and yields an inferior paper quality.
In the cooling zone, part of the injected filtrate flow goes down and thus travels co-current
with respect to the woodchips. This flow cools the woodchips. At the bottom of this zone,
the outlet device is located.
Typical digester dimensions and process conditions are given in table 5.1.
5.1.3 Digester control
The primary control objective for the digester is to produce pulp at a specified Kappa
number at maximal yield. A Kappa number that is too high increases the cost of bleaching
the pulp; a Kappa number that is too low means damaging the cellulose fibers
and using more chemicals and energy than necessary. Process optimization involves using
a minimum input of energy and chemicals to achieve the target Kappa number. Adequate
process control minimizes the variation in the measured Kappa number and creates room for
optimization because it enables operating more closely to the target Kappa number.
Controlling the digester is a difficult task because of the long time delays involved, non-linear
behaviour and infrequent availability of process measurements. The Kappa number can only
be determined by an off-line analysis, taking up to one hour. If a disturbance affecting the
Kappa number occurs, it will take at least one process dead time plus the time required
for analysis before the disturbance is detected. Furthermore, because many inputs
affect the Kappa number, multivariable control is highly appropriate for the digester.
Traditional digester control involves stabilizing the most important variables of the process
in such a way that reaction conditions remain constant. Variations in observed or estimated
Kappa number are compensated by adjusting the lower heater outlet temperature.
5.2 Extended Purdue Model
The Kraft pulping process for both batch and continuous digesters has been modeled at
various levels of complexity. Wisnewski et al. (1997) present a short overview of digester
modeling based on physical principles. In addition, modeling approaches include black box
techniques such as Wiener models (Godasi and Palazoglu, 2001), Partial Least Squares models
(Alexandridis et al., 2001) and neural networks (Dufour et al., 2001).
A well known first principles model for a continuous pulp digester is the Purdue Model
(Smith and Williams, 1974), which is a detailed and industrially accepted digester model
(Wisnewski et al., 1997). The model has been extended by providing improved definitions
of mass concentrations and volume fractions and a more detailed description of the mass and
energy transport. This Extended Purdue Model (EPM) is described in detail in Wisnewski et
al. (1997) and will serve as the reference model for hybrid and fuzzy model development.
5.2.1 Model structure
In Wisnewski et al. (1997), the continuous pulp digester is modeled as a series of 50
Continuously Stirred Tank Reactors (CSTRs) leading to a lumped parameter system. Consequently,
true plug flow behaviour is approached. Modeling the digester as a Plug Flow
Reactor (PFR) instead would yield partial differential equations (PDEs). However, modern
[Figure 5.2: (a) digester unit operation as in figure 5.1; (b) model flowsheet with sections A to J,
chips flow, liquor flow, black liquor, filtrate and the upper, lower and wash heaters. Zone boundaries
along the height: impregnation 0-5.5 m, heating 5.5-10.2 m, cooking 10.2-21.5 m, washing 21.5-28.3 m,
cooling 28.3-32 m.]
Figure 5.2: Digester unit operation (a) and model flowsheet (b)
simulation packages such as gPROMS are able to solve PDEs directly. Therefore, the CSTR
description is transformed to a distributed parameter system (Jansen, 2000). This yields the
flexibility to solve the model as accurately as needed. By specifying more discretization
intervals, more accurate results will be obtained. In a CSTR series approach, this can only be
achieved by changing the model structure. The digester model flowsheet is based on the
CSTR model structure and incorporates the heater and recycle loops. The model flowsheet is
shown in figure 5.2.
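The way a series of CSTRs approaches plug flow can be made visible with the classical tanks-in-series result for the response to a step in the inlet concentration. The sketch below considers mixing only (no reaction, no heat effects) and normalizes the total residence time to one; the function name and the chosen tank numbers are assumptions for illustration. For an ideal plug flow reactor the response would be a pure delay: zero before one residence time and one after it.

```python
import math


def tanks_in_series_step(t, n_tanks, tau_total=1.0):
    """Outlet response at time t to a unit inlet step for n_tanks equal CSTRs in series.

    Analytic result: F(t) = 1 - exp(-x) * sum_{k=0}^{n-1} x**k / k!, with x = n * t / tau_total.
    """
    x = n_tanks * t / tau_total
    term = 1.0      # x**0 / 0!
    partial = 1.0   # running sum of x**k / k!
    for k in range(1, n_tanks):
        term *= x / k
        partial += term
    return 1.0 - math.exp(-x) * partial


for n in (1, 5, 50):
    early = tanks_in_series_step(0.5, n)
    late = tanks_in_series_step(1.5, n)
    print(f"N = {n:2d}: F(0.5 tau) = {early:.4f}, F(1.5 tau) = {late:.4f}")
```

The variance of the residence time distribution of N equal tanks is the square of the total residence time divided by N, so the front sharpens only slowly with N. In the CSTR formulation the number of tanks is part of the model structure, whereas in the distributed formulation the number of discretization intervals is a solver setting.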
The digester model is implemented by treating certain disturbances as constant and implementing
them as parameters. In an actual digester, these quantities are not constant and change
over time. Some of the disturbances are non-measurable and difficult to anticipate. The
context diagram (figure 5.3) shows the most important digester unit operation inputs, outputs and
disturbances.
In the model, the following process input disturbances are treated as constant with respect to
time:
- Compaction behaviour. The model uses a linear compaction profile throughout the
  digester. In an actual digester, the compaction profile will vary with time and place.
  Especially at the extraction screen, the woodchip hold-up fluctuates. Furthermore, the
  assumed compaction profile will represent a mean situation that never occurs in actual
[Figure 5.3: context diagram of the continuous pulp digester. Inputs: volumetric wood chips flow,
volumetric white liquor flow, temperature upper heater, temperature lower heater, temperature entering
wood chips, temperature entering white liquor, temperature quench, quench flow. Outputs: pulp
properties / Kappa number, extraction flow composition, additional composition measurements, various
temperature measurements, wash liquor composition at outlet, yield. Disturbances: composition/moisture
wood chips, reaction properties wood chips, composition filtrate, composition white liquor, compaction
behavior.]
Figure 5.3: Digester context diagram
  digester operation.
- White liquor is supplied from a preceding unit and its alkali content will vary
  with time.
- The filtrate composition (weak black liquor) will also vary, because it is made up in
  preceding process apparatus.
- Kinetic parameters. Different wood species have different kinetic parameters and these
  will more or less vary among woodchips. The modeled values are to be treated as mean
  values.
- Composition of woodchips. There are many wood species and composition will vary
  among individual woodchips of the same species, due to variable moisture content or
  age.
5.2.2 Modeled states
The most important output variable of a digester is the Kappa number. The primary
objective of digester control is to maintain the Kappa number within certain limits. The
Kappa number measures the amount of residual lignin within the woodchips. In addition,
the overall digester yield is often calculated. The yield is a measure of the amount of wood
substance recovered in comparison with the amount of wood substance fed to the digester.
In order to calculate these properties using a physical model, the appropriate wood and liquor
components need to be modeled, as well as the temperature in the digester. In the EPM,
Phase        Index  State                    Modeled effects                                               Additional algebraic equations
Chip         s,1    High reactive lignin     Convection, reaction                                          Reaction rate
             s,2    Low reactive lignin      "                                                             Reaction rate
             s,3    Cellulose                "                                                             Reaction rate
             s,4    Galactoglucomannan       "                                                             Reaction rate
             s,5    Araboxylan               "                                                             Reaction rate
                    Temperature              Convection, reaction heat, conduction, diffusion, bulk flow   Diffusion coefficient
Entrapped    e,1    Active alkali            Convection, reaction, diffusion, bulk flow                    -
             e,2    Passive alkali           "                                                             -
             e,3    Active hydrosulfide      "                                                             -
             e,4    Passive hydrosulfide     "                                                             -
             e,5    Dissolved lignin         "                                                             -
             e,6    Dissolved carbohydrates  "                                                             -
Free liquor  f,1    Active alkali            Convection, diffusion, bulk flow                              -
             f,2    Passive alkali           "                                                             -
             f,3    Active hydrosulfide      "                                                             -
             f,4    Passive hydrosulfide     "                                                             -
             f,5    Dissolved lignin         "                                                             -
             f,6    Dissolved carbohydrates  "                                                             -
                    Temperature              Convection, reaction heat, conduction, diffusion, bulk flow   -
Table 5.2: EPM overview
three phases are distinguished: the solid phase, which comprises the wood substance, the free
liquor phase, and the entrapped liquor phase, which comprises the liquor that has entered the
porous wood chips. The entrapped phase serves as a transport medium for mass and energy from the
internal wood surface of the woodchips to the surrounding bulk, the free liquor phase. The
woodchips, having a slightly larger velocity, travel down with the free liquor. The modeled
components are shown in table 5.2. The index denotes how the components will be referenced
in the equations.
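The Kappa number $\kappa$ and the yield can be calculated as follows:
$$\kappa = \frac{\rho_{s,1} + \rho_{s,2}}{0.00153 \sum_{i=1}^{5} \rho_{s,i}} \qquad (5.1)$$
$$\text{yield} = \frac{\sum_{i=1}^{5} \rho_{s,i,\text{exiting}}}{\sum_{i=1}^{5} \rho_{s,i,\text{entering}}} \qquad (5.2)$$
in which $\rho$ denotes the mass concentration of a substance and the subscripts refer to the
species given in table 5.2.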
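As a worked illustration of equations (5.1) and (5.2), the sketch below evaluates both quantities for one set of solid phase concentrations. The numerical values are hypothetical and are not taken from the EPM; only the constant 0.00153 and the structure of the two equations come from the text.

```python
KAPPA_FACTOR = 0.00153  # constant from equation (5.1)


def kappa_number(rho_solid):
    """Equation (5.1): lignin (s,1 + s,2) over 0.00153 times the total solid concentration."""
    if len(rho_solid) != 5:
        raise ValueError("expected the five solid components s,1 ... s,5")
    lignin = rho_solid[0] + rho_solid[1]
    return lignin / (KAPPA_FACTOR * sum(rho_solid))


def pulp_yield(rho_exiting, rho_entering):
    """Equation (5.2): total exiting solid concentration over total entering solid concentration."""
    return sum(rho_exiting) / sum(rho_entering)


# Hypothetical concentrations in arbitrary but consistent units, ordered s,1 ... s,5.
entering = [60.0, 90.0, 250.0, 60.0, 40.0]
exiting = [2.0, 13.0, 230.0, 30.0, 25.0]

kappa_in = kappa_number(entering)
kappa_out = kappa_number(exiting)
yield_out = pulp_yield(exiting, entering)
print(f"kappa entering: {kappa_in:.1f}, kappa exiting: {kappa_out:.1f}, yield: {yield_out:.2f}")
```

Because both lignin fractions appear in the numerator and all five solid components in the denominator, the Kappa number drops when lignin is removed faster than the carbohydrates, while the yield drops with any loss of wood substance.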
The EPM incorporates temperature dependent mass diffusion, temperature dependent reaction
rates, composition dependent heat capacities and heat exchange between entrapped and
free liquor phase due to conduction. In addition, the following assumptions have been made:
- Adiabatic behavior; no heat exchange with the surrounding environment
- No radial gradients in temperature or composition
- Entrance, wall and cross flow effects are neglected
[Figure 5.4, panels (a) and (b): $\kappa$ (-) and T (K) plotted against z (m) over 0-30 m.]
Figure 5.4: Steady state Kappa profile (a) and temperature profiles (b). Solid line $T_c$, dashed $T_f$
- Compaction profile is known and constant with time
The EPM equations and parameter values are given in appendix C. The data ow diagram
(DFD) is also presented there.
5.3 Extended Purdue Model analysis
Since the EPM will serve as the reference in this study, the complexity, interpretability
and process independence of the model will be analyzed (see figure 3.2). This analysis can be
used later for comparison with the hybrid model and the fuzzy model. In addition, the EPM
will be analyzed with respect to its purpose in this work: to provide a basis for developing
a hybrid model that is used to investigate hybrid model quality properties. First, the general
behavior of the EPM will be discussed.
5.3.1 Model behavior
Static and dynamic behavior
The steady state Kappa number profile $\kappa(z)$ of the reactor for typical operating conditions
is given in figure 5.4. The temperature profiles of the solid and entrapped phase $T_c(z)$
and of the free liquor phase $T_f(z)$ are shown as well. The model simulation was carried out
with 50 discretization steps for the reactor height z.
The Kappa profile indicates that little reaction occurs in the impregnation zone (0 < z <
5.5 m). This is due to relatively low temperatures. After the heating zone, the temperature is
much higher and significant reaction takes place, which results in a decrease of $\kappa$. In the
[Figure 5.5, panels (a) and (b): $\rho_{e,1}$, $\rho_{f,1}$ and $\rho_{e,3}$, $\rho_{f,3}$ plotted against z (m) over 0-30 m.]
Figure 5.5: Steady state active alkali (a) and active hydrosulfide profiles (b). Solid line entrapped phase,
dashed free liquor phase
wash zone (21.5 < z < 28.3 m), the reaction is quenched by the cold counter current wash
ow. The degradation products are washed from the pulp, not affecting the Kappa number
#.
The wood chips that enter the digester have a higher temperature than the white liquor (g-
ure 5.4). At the end of the impregnation zone, the temperatures are the same. At the upper
heater outlet, hot free liquor enters the digester and as a result, there is a discontinuity in the
free liquor temperature. The same is true for the temperature at the lower heater outlet. In the
cooking zone, reaction temperature is reached. The exothermic reactions cause a further in-
crease in temperature; the temperature of the wood chips is higher than that of the free liquor.
The counter-current ow in the washing section results in a constant temperature difference
between the chips and the liquor.
Figure 5.5 shows the active alkali
e,1
,
f,1
and active hydrosulde
e,3
,
f,3
concentration
proles, the main components of the white liquor. In the impregnation zone, concentration
equalization due to diffusion takes place. The pores of the woodchips are initially lled
with water. As the woodchips travel co-currently with the surrounding free liquor along the
impregnation zone, concentrations of entrapped and free phase components approach each
other. At the heater outlets, small discontinuities occur due to mixing with free liquor having
a slightly different composition. Along the cooking zone, the active alkali concentration
decreases due to exhaustion. Diffusion fromthe free phase into the entrapped phase within the
woodchips takes place. In the washing zone, the effective alkali concentrations are reduced
drastically, due to the counter current wash ow.
In the impregnation zone, the dissolved lignin diffuses from the free liquor phase ($\rho_{f,5}$) into
the entrapped phase ($\rho_{e,5}$). Because the entering free liquor flow is mixed with black liquor,
there are already dissolved solids present in the entering free liquor flow (figure 5.6). In the
beginning of the cooking zone, the temperature is high enough to cause significant
delignification. As a result, the concentration of dissolved lignin in the entrapped phase becomes
higher than that in the surrounding free liquor phase. Diffusion is the mechanism responsible for
the flow of dissolved solids to the free liquor. Due to the delignification process, the free
liquor absorbs more and more dissolved lignin, causing the concentrations to increase. In the
washing zone, the degradation products are washed from the pulp by bringing the entrapped
liquor into contact with a relatively clean counter-current washing liquor.
Figure 5.6: Steady state dissolved lignin profile as a function of z (m). Solid line: entrapped phase
($\rho_{e,5}$); dashed line: free liquor phase ($\rho_{f,5}$)
To illustrate the dynamic behavior of the EPM, several experiments were carried out, based
on the experiments in Wisnewski et al. (1997). The experiments involve step changes in the
lower heater heat supply $Q_{lh}$ (with constant upper heater heat supply $Q_{uh}$) and the white
liquor flow $\phi_f$. In addition, Bode diagrams for the Kappa number and the yield as a function
of $Q_{lh}$ and $\phi_f$ are constructed. Frequency analysis results are shown in table 5.3. The table
also shows the response amplitudes of the Kappa number and the yield for a positive and a
negative step change, respectively.
The step changes are performed at t = 0 min. The response to a 10 % increase or decrease in
lower heater heat supply (figure 5.7) indicates a negative gain (the higher temperature results
in a higher reaction rate and thus in a lower Kappa number). The gain of the response to a 7 %
step change in the free liquor flow rate $\phi_f$ at constant $Q_{lh}$ and $Q_{uh}$ is positive (the reaction
temperature is lower since the same heat is supplied to more liquor), as shown in figure 5.8.
Figure 5.7: Kappa (a) and yield (b) response to steps in lower heater heat supply, plotted against
t (min). Solid line: positive step; dashed line: negative step
Figure 5.8: Kappa (a) and yield (b) response to steps in the liquor flow rate, plotted against
t (min). Solid line: positive step; dashed line: negative step
Figures C.4 to C.7 in appendix C show Bode diagrams for the yield and the Kappa number
of the EPM. The response of the system was not simulated for very high frequencies due to
numerical inaccuracies. Time constants and system order O are obtained from the Bode
diagrams and are listed in table 5.3. The figures show second order behavior for changes in
$Q_{lh}$, while third order behavior is observed for changes in $\phi_f$. The higher order for changes
in the flow rate (while $Q_{lh}$ is constant) is caused by the change in concentration. This is
an additional effect that is not observed for changes in $Q_{lh}$, since during these changes the
flow rate is kept constant. The fractional part of the orders is caused by numerical dispersion. For
example, a block pulse is affected by numerical dispersion, which results in detection of a
slightly higher order. In addition, there may be some inaccuracies in the Bode diagram. The
dead times were determined using step changes in the inputs.
Overall mass balance
The performance of the EPM can be verified with respect to its overall mass balance. The
mass balance is calculated for standard operating conditions, while ignoring the mass flow
of the solvent (water). This way, the mass balance describes only the ingoing and outgoing
mass flows for the dissolved components and the solid phase mass flow. The actual
throughput will be larger. Results are shown in table 5.4.
Minor differences due to numerical inaccuracies may be tolerated. The error in the mass
balance amounts to 1.0 %. This is very reasonable and indicates that there are no errors in
model consistency with respect to the mass balances.
Input      Order    Order    Time constant  Time constant  Dead time  Step amplitude Kappa (-)  Step amplitude yield (-)
           Kappa    yield    Kappa (min)    yield (min)    (min)      (positive, negative)      (positive, negative)
$Q_{lh}$   2.3      2.3      42             43             87         -5.51, 6.17               -0.022, 0.022
$\phi_f$   2.9      3.3      35             35             113        4.71, -4.06               0.019, -0.020
Table 5.3: Frequency response analysis results for EPM
Stream       Mass flow in (kg/min)   Mass flow out (kg/min)
Solid        751.9                   416.1
Entrapped    0.057                   118.7
Free         496.5                   51.7
Extraction   -                       802.4
Filtrate     154.7                   -
Total        1403                    1389
Table 5.4: Overall mass balance EPM
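The closure of the overall mass balance can be checked with a few lines of code. The sketch below is only an illustration: it recomputes the totals and the relative error from the stream values in table 5.4, and it assumes that the error is expressed relative to the total ingoing mass flow. The names `streams` and `closure_error` are not part of the EPM.

```python
# Stream mass flows in kg/min, copied from table 5.4 as (in, out).
# None marks an empty table cell.
streams = {
    "Solid": (751.9, 416.1),
    "Entrapped": (0.057, 118.7),
    "Free": (496.5, 51.7),
    "Extraction": (None, 802.4),
    "Filtrate": (154.7, None),
}


def closure_error(streams):
    """Return total in, total out and the error relative to the total in."""
    total_in = sum(m_in for m_in, _ in streams.values() if m_in is not None)
    total_out = sum(m_out for _, m_out in streams.values() if m_out is not None)
    return total_in, total_out, (total_in - total_out) / total_in


total_in, total_out, rel_error = closure_error(streams)
print(f"in = {total_in:.0f} kg/min, out = {total_out:.0f} kg/min, "
      f"error = {100 * rel_error:.1f} %")
```

With the values of table 5.4 this reproduces the totals of 1403 and 1389 kg/min and an error of about 1.0 %.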
5.3.2 Complexity
The complexity of the EPM is analyzed using the model flowsheet and the DFD, given in
appendix C. The model flowsheet consists of 10 sections of different length. These sections
are a more detailed representation of the five functional sections in the digester (impregnation,
heating, cooking, washing and cooling) and are needed to be able to implement the various
heater recycle loops and screens. The flowsheet is straightforward and not complex.
The DFD, however, is somewhat more complex. It represents the model that is used for each
section in the flowsheet. The diagram is constructed in accordance with general guidelines
for designing DFDs (Yourdon, 1989). The connections between the effects that are modeled
(see also table 5.2) are clearly shown.
The DFD is constructed for the general classes of components (solid phase, entrapped phase
or free liquor phase components). Including each component separately would result in an
unnecessarily complex diagram; there are 17 components in total. The disadvantage is that the
connection between the components through the reaction stoichiometry and heat capacity is
not represented.
The most complex equations in the EPM are the mass and energy balances. The other
equations are nonlinear algebraic equations that are rather straightforward. The complexity of the
model equations is relatively low because equation substitutions are kept to a minimum.
5.3.3 Interpretability
The model flowsheet represents the physical structure of the digester and its connection
with other unit operations (the heaters). Due to the low level of complexity, interpretation is
straightforward. The strength of the model is that it provides information at a detailed level.
The different physical phenomena and phases that are modeled by section models of the EPM
are clearly represented in the DFD and can be interpreted.
The equations that can be interpreted best are the mass and energy balances. Each term in
these equations represents a physical phenomenon such as mass transfer or heat generated
by the reaction. The algebraic equations that describe these phenomena cannot be easily
interpreted from a first principles point of view. Many include lumped parameters (such as the
heat transfer coefficient U) or are fitted to empirical data (such as the relation for the diffusion
coefficient $D_{ce}$). This is the result of the level of detail of the model; more detailed behavior
(for example on the level of a wood chip particle) is not included. The only interpretation
that can be given to these equations is the type of mathematical dependence (linear, quadratic,
etc.).
5.3.4 Process independence
Process independence is determined by the sources of information that are used during
the construction. For the EPM, first principles are the most important source of information.
No or little experimental data is available to construct a model that can describe the
concentration profiles in the way the EPM does. The behavior is mainly determined by the level of
first principles information that is used. This is augmented by expertise that is available in
the industry, which is based on lab experiments or observed behavior. The relation for the
diffusion coefficient and the introduction of several fit parameters are examples of this.
The detailed description of the behavior and the first principles basis result in a certain level of
process independence. The model can be used for different installations of this type of reactor,
probably after fine-tuning of some of the coefficients. In addition, the EPM is an extension of
an industrially accepted digester model, the Purdue Model (Wisnewski et al., 1997), which
illustrates successful application on a wide variety of installations.
With respect to process design, the EPM has limited application possibilities. It only
provides global information about design parameters. The empirical relations may have limited
validity, which makes the model more suited for process operation studies than for process
design.
5.3.5 Remarks
The quality of the EPM with respect to a traditional application (controller design, process
optimization) is not discussed here. Some aspects of application in the three general
areas (R&D, design and control) will be discussed later. However, whether the model quality
meets the requirements with respect to the application in this work can be analyzed.
The EPM will serve as the basis for hybrid and fuzzy model development. In addition, it
will be compared with the hybrid and the fuzzy model with respect to model quality. The
EPM is suitable for these purposes, as was shown in this section. It provides a dynamic
description of a nonlinear process with a large operating regime. The description is detailed
and can be reduced to a simpler structure, which only describes the essential behavioral
characteristics. This means that new model equations need to be derived, as will be presented
in the following section. In addition, the EPM can be used as a simulator to provide process
data for the identification of hybrid and fuzzy model parameters.
5.4 Hybrid model problem definition
5.4.1 Objective
The objective of the hybrid model of the pulp digester is to facilitate the investigation of
different hybrid model properties in comparison to models at the extremes of the spectrum:
first principles models (in this case the EPM) and fuzzy models. Based on this analysis,
conclusions can be drawn about the position of hybrid models in this spectrum, which
provides more information about the applicability of hybrid models in general. The analysis can
be performed by evaluating model quality with respect to model performance, complexity,
interpretability and process independence.
In addition, it is interesting to determine how well such a hybrid model can deal with
simplified dynamics, without impairing performance and physical interpretation. In other words,
the hybrid model needs to describe the essential physical characteristics of the pulp digester
with respect to the application of the model. The key variables and model requirements then
provide the basis for determining the most important characteristics that should be included
in the model.
5.4.2 Key variables and requirements
The major quality requirement of the hybrid model is that it can be compared to the EPM.
Together with the objective, this breaks down into the following requirements:
1. The hybrid model of the pulp digester should describe the Kappa number # and
the yield.
The two most important variables are the Kappa number # and the yield, especially
at the reactor outlet. These are the key variables of the hybrid model. This means that
the model needs to describe the concentrations of the reactants in such a way that both
key variables can be calculated.
2. Dynamic and static behavior should match the behavior of the EPM.
This means that the hybrid model should be dynamic and that the states should be
modeled in such a way that the same type of behavioral interpretation can be given to
the hybrid model as can be given to the EPM.
3. The hybrid model structure needs to be based on physical principles.
Since the model will be hybrid, part of the structure will be based on first principles.
In addition, the input-output structure of the fuzzy submodels needs to be physically
interpretable. In general, this does not have to be the case. For this model, however,
input-output structure interpretability is required to maintain a certain level of hybrid
model interpretation, in order to make comparisons to the EPM possible.
4. The hybrid model structure needs to be less complex than the EPM.
One of the goals is to describe the digester behavior with a model that is based on the
essential physical phenomena that play a role. This results in a model that is less
complex than the EPM, but still provides a high level of interpretability. With respect to
this, it is interesting to investigate how model performance depends on the
simplifications.
From equations 5.1 and 5.2, it can be concluded that the concentrations of the wood
components need to be modeled in order to calculate the Kappa number # and the yield. Not all
concentrations need to be modeled separately, though. With respect to the EPM, components
(s, 3) to (s, 5) can be lumped, which still provides sufficient information to calculate the two
key variables. The next step is to determine in detail which components need to be modeled
and what the hybrid model structure should be to meet the requirements.
5.5 Hybrid model design
The design of the hybrid model for the pulp digester can be interpreted as a bottom-up
procedure. Taking the key variables and the requirements as a starting point, the relevant
states and physical processes are determined, followed by the model structure with respect to
the unit operation. After this, model identification can take place.
This section presents the various steps of the design phase: basic modeling, data acquisition,
subprocess behavior estimation, submodel identification and submodel integration.
5.5.1 Basic modeling
The apparent choice is to derive the hybrid model structure from the EPM by a form
of model reduction. By formulating interpretability requirements, the model reduction is
performed from a physical point of view instead of a mathematical point of view. This way,
requirement 4 is met. Based on requirements 1, 2 and 3, the behavior of the digester can be
analyzed and process hypotheses describing the essential behavior can be formulated.
Step 1: Process description and information analysis
The process is described in section 5.1. The EPM provides detailed information about the
behavior, which can be used during model identification. In particular, the state profiles
over the reactor are useful; it is quantitative physical information. From these, all additional
variables and parameters can be derived. In addition to the model structure and expertise
that is available from the field of digester modeling, these profiles are an important source of
information.
Step 2: Process hypotheses
In the digester, wood reacts with liquor to remove lignin from the wood. It is a two-phase
system; there is a wood phase and a liquor phase. The flow regime is plug flow. To start
the reaction, the liquor is heated. At a certain point, the reaction is quenched by washing the
wood with cold liquor.
In this system, the following phenomena take place: convection, reaction, mass transfer and
heat transfer. These phenomena can be unified in a framework of dynamic mass and energy
balances. In the EPM, the phenomena are described by static algebraic equations, which
determines the level of detail that is available. To meet requirement 2, the phenomena should
be described in a similar fashion in the hybrid model.
Step 3: Process structure
The derivation of the process structure involves the model reduction step. The reduction
will be performed on two levels: on the level of the model flowsheet and on the level of the
section model. In both cases, the goal of the reduction is to determine the most important
model characteristics, with a minimum loss of transparency (requirements 2 and 3).
The model flowsheet of the EPM consists of 10 sections, which are located in 5 functional
zones (see figure 5.1). The functional zones are relevant for the interpretation of the
behavior and should be maintained in the hybrid model. The number of sections, however, is less
important for meeting the interpretability requirements; interpretation of the Kappa number
profile, for example, does not require detailed information about the quench recirculation. In
particular, the double heater recycle loop can be omitted without compromising
interpretability. The function of the heaters is to initiate the reaction, which for the most part takes place
in the cooking zone. This is indicated by the drop in # in figure 5.4. The same functional
interpretation can be given to a single heater that provides the same amount of energy as the
two heaters combined. In addition, the heater recycles can be omitted. This may influence the
dynamic behavior of the model and therefore should be investigated in the evaluation phase.
As a result, the hybrid model can describe each functional zone with one section. There are
thus four sections (figure 5.9):
Section I: Impregnation
Section II: Cooking
Section III: Washing
Section IV: Cooling
The section model of the EPM describes three phases which contain 17 components in total.
Mathematically, there is no need to describe the solid and entrapped phase separately. The
EPM does not describe mass or heat transfer between the two phases. The description of the
reactions that take place does not require specific information about the phases, except for the
concentrations of the reactants. The concentration of the solid species is based on the volume
of the chips (which is equal to the sum of the volumes of the solid and entrapped phase),
while the concentration of the entrapped species is based on the volume of the entrapped
phase. In addition, the concentrations in the entrapped phase are required to describe mass
and heat transfer between the entrapped and free liquor phase.
The section model can be simplified by lumping the entrapped and solid phase into a reaction
phase, which essentially describes the physical phenomena in the wood chips. The porosity
(see equation C.16) can be used to compensate for the change in concentration of the
entrapped components, which as a result of the lumping is based on the chip volume instead
of the entrapped liquor volume. As a result, the hybrid model will describe two phases: a
reaction phase, which corresponds to the solid and entrapped phases of the EPM, and a liquor
phase, which corresponds to the free liquor phase in the EPM. The assumption will be
made that the reaction phase is ideally mixed and that bulk flow can be neglected.
Figure 5.9: Digester unit operation (a) and hybrid model flowsheet (b). Panel (a) shows the chips
flow and liquor flow with the wood chips/white liquor feed, the upper, lower and wash heaters, the
quench recirculation and the black liquor, filtrate and pulp product streams. Panel (b) shows
sections I (impregnation), II (cooking), III (wash) and IV (cool) along the digester height (marks at
0, 5.5, 21.5, 26.7 and 32 m), with a single heater, the wash heater, the filtrate and the black liquor
The hybrid model needs to describe the Kappa number # and the yield. This means that
the concentrations of the wood species (s, 1) to (s, 5) are required. However, species (s, 3)
to (s, 5) are not required explicitly. The section model can be simplified by lumping them into
a single species, carbohydrates.
A similar approach can be followed for the species in the liquor phase. In the EPM, the use
of different species in the free liquor phase is only required to be able to describe the reaction
rate; the reaction rate equation C.6 requires species (e, 1) and (e, 3). Thus (f, 1) and (f, 3)
are required. A transparent simplification can be accomplished by viewing the reaction
as a reaction between lignin (species (s, 1) and (s, 2)), carbohydrates (species (s, 3) to
(s, 5) lumped) and liquor (species (e, 1) and (e, 3) lumped). This reduces the number of
differential equations in the model, while still meeting interpretability requirements.
The result of the lumping of the phases and species is that the kinetic equations of the EPM
are no longer valid. These relations are nonlinear and cannot be easily transformed to
accommodate the new species (see equations C.6, C.7 and C.8). To describe the reaction rates in the
hybrid model, fuzzy equations will be used. These will be derived from observed behavior.
In addition, the diffusion coefficient $D_{ce}$ will be described by a fuzzy equation. In the EPM,
an empirical equation was used. Due to the lumping of the entrapped and free liquor phase
species, diffusion behavior is affected (see equation C.3).
The reaction rates of the other species in the EPM are calculated through reaction
stoichiometry (equations C.9 and C.35). Since this operation and the lumping are both linear with
respect to the components, the new reaction stoichiometry can readily be calculated from the
stoichiometry in the EPM.
To build a consistent hybrid model with closed mass and energy balances, the reaction
products need to be described as well. This results in the overall section model as is presented in
table 5.5.
Phase     Index  State                                                      Modeled effects                                   Additional fuzzy equations
Reaction  r, 1   High reactive lignin                                       Convection, reaction, diffusion                   Reaction rate
          r, 2   Low reactive lignin                                        Convection, reaction, diffusion                   Reaction rate
          r, 3   Carbohydrates (cellulose, galactoglucomannan, araboxylan)  Convection, reaction, diffusion                   Reaction rate
          r, 4   Active liquor (active alkali, active hydrosulfide)         Convection, reaction, diffusion                   -
          r, 5   Passive liquor (passive alkali, passive hydrosulfide)      Convection, reaction, diffusion                   -
          r, 6   Dissolved lignin                                           Convection, reaction, diffusion                   -
          r, 7   Dissolved carbohydrates                                    Convection, reaction, diffusion                   -
                 Temperature                                                Convection, reaction heat, conduction, diffusion  Diffusion coefficient
Liquor    l, 1   Active liquor (active alkali, active hydrosulfide)         Convection, reaction, diffusion                   -
          l, 2   Passive liquor (passive alkali, passive hydrosulfide)      Convection, reaction, diffusion                   -
          l, 3   Dissolved lignin                                           Convection, reaction, diffusion                   -
          l, 4   Dissolved carbohydrates                                    Convection, reaction, diffusion                   -
                 Temperature                                                Convection, reaction heat, conduction, diffusion  -
Table 5.5: Hybrid model overview
Step 4: Basic equations
Similar to the EPM, each section of the hybrid model will be modeled using the same section
model. The Kappa number # and the yield $\eta$ can be calculated as follows:
\[ \# = \frac{\rho_{r,1} + \rho_{r,2}}{0.00153 \sum_{i=1}^{3} \rho_{r,i}} \tag{5.3} \]
\[ \eta = \frac{\sum_{i=1}^{3} \rho_{r,i,\mathrm{exiting}}}{\sum_{i=1}^{3} \rho_{r,i,\mathrm{entering}}} \tag{5.4} \]
in which $\rho$ denotes the concentration in kg/m^3 of a species and the subscripts refer to the
species given in table 5.5.
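As an illustration of equations 5.3 and 5.4, the following minimal sketch computes both key variables from the concentrations of the three lumped wood species (r, 1) to (r, 3). The function names and the concentration values are hypothetical and are not taken from the EPM or from any identification run; only the factor 0.00153 comes from equation 5.3.

```python
def kappa_number(rho_r1, rho_r2, rho_r3):
    """Equation 5.3: lignin fraction of the wood solids divided by 0.00153."""
    solids = rho_r1 + rho_r2 + rho_r3
    return (rho_r1 + rho_r2) / (0.00153 * solids)


def pulp_yield(rho_exiting, rho_entering):
    """Equation 5.4: ratio of exiting to entering wood solids concentration."""
    return sum(rho_exiting) / sum(rho_entering)


# Hypothetical concentrations in kg/m^3 for species (r,1), (r,2), (r,3).
entering = (30.0, 110.0, 560.0)
exiting = (2.0, 18.0, 380.0)

kappa = kappa_number(*exiting)
yield_fraction = pulp_yield(exiting, entering)
print(f"Kappa number = {kappa:.1f}, yield = {yield_fraction:.3f}")
```

Because both quantities only need the lignin species and the sum of all wood solids, lumping the carbohydrate species into (r, 3) does not change the result.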
The physical framework of the hybrid model will consist of 13 states:
Mass balances for species (r, 1) to (r, 7)
Mass balances for species (l, 1) to (l, 4)
Energy balance for the reaction phase, resulting in the reaction phase temperature $T_r$
Energy balance for the liquor phase, resulting in the liquor phase temperature $T_l$
To meet requirement 3, the states are described as a function of time and location in the
reactor. The structure of the state equations is similar to the structure of the state equations in
the EPM.
The reaction rates $R_{r,i}$ of components (r, 1) to (r, 3) will be described by fuzzy equations,
while the other reaction rates are derived from them through stoichiometry. The reaction
rate depends on the species concentration, the liquor concentration, and the reaction phase
temperature. This results in the following structure:
\[ R_{r,i} = f_{\mathrm{fuzzy}}(\rho_{r,i}, \rho_{r,4}, T_r) \quad \text{for } i = 1 \ldots 3 \tag{5.5} \]
The diffusion coefficient $D_{cr}$ is also described by a fuzzy equation. The simplifications affect
the behavior of the diffusion. The lumping of species (e, 1) and (e, 3), and of (f, 1) and (f, 3),
is nonlinear with respect to the diffusion term. This will be discussed later.
Similar to the EPM, the diffusion coefficient will depend on the reaction phase temperature:
\[ D_{cr} = f_{\mathrm{fuzzy}}(T_r) \tag{5.6} \]
Other section model equations involve simple linear algebraic equations that calculate the
total mass of a phase, the overall heat capacity of a phase, etc.
The heaters in the hybrid model are described by the same equations as in the EPM. The
complete set of model equations and the hybrid model DFD are presented in appendix D.
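To give an impression of what a static fuzzy equation such as equation 5.6 can look like in code, the sketch below evaluates a single-input Takagi-Sugeno model with Gaussian membership functions. This structure is an assumption made for illustration only; the text above does not fix the type of fuzzy model. The rule centers, widths and consequents are hypothetical and are not identified values.

```python
import math


def ts_fuzzy(x, rules):
    """Evaluate a single-input Takagi-Sugeno fuzzy model.

    Each rule is (center, width, slope, offset): a Gaussian membership
    function around `center` and a local linear model slope * x + offset.
    The output is the membership-weighted average of the local models.
    """
    weights = [math.exp(-0.5 * ((x - c) / w) ** 2) for c, w, _, _ in rules]
    outputs = [a * x + b for _, _, a, b in rules]
    return sum(wt * y for wt, y in zip(weights, outputs)) / sum(weights)


# Two hypothetical rules for D_cr as a function of T_r:
# "T_r is low -> D_cr = 0.05" and "T_r is high -> D_cr = 0.30".
rules_dcr = [
    (380.0, 20.0, 0.0, 0.05),
    (440.0, 20.0, 0.0, 0.30),
]

d_cr_mid = ts_fuzzy(410.0, rules_dcr)  # halfway between the two rule centers
print(f"D_cr(410) = {d_cr_mid:.3f}")
```

A multi-input model such as equation 5.5 would follow the same pattern, with the membership degree of a rule computed from all three inputs.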
5.5.2 Data acquisition
Except for the fuzzy models, all parameters of the hybrid model can be obtained from
the EPM. The EPM is used as a benchmark model and has been extensively verified with
industrial data. To be able to identify the fuzzy relations, sufficient data about the behavior
is needed, which can be generated by using the EPM as a simulator. Since the hybrid model
describes the internal behavior of the digester, the data needs to be available in the form of
state profiles that represent the states as a function of the location in the reactor.
This data can be obtained by performing several simulation experiments under different
operating conditions. The measurements of the state profiles can then be used as
identification data. Since the fuzzy relations are static, no dynamic data is required. This means that
a limited number of experiments is sufficient; only steady state measurements of the process
operated under different conditions need to be obtained. Measurements for identification as
well as for validation need to be available.
Liquor flow $\phi_f$ (columns): 1.76  2.0  2.34  2.6  2.94  3.2
$T_{lh}$ = 420:  ID1   ID4   ID7
$T_{lh}$ = 430:  VAL1  VAL3  VAL5
$T_{lh}$ = 438:  ID2   ID5   ID8
$T_{lh}$ = 445:  VAL2  VAL4  VAL6
$T_{lh}$ = 450:  ID3   ID6   ID9
$T_{lh}$ = 460:  VAL7  VAL8
Table 5.6: Identification and validation run settings
The different steady state profiles can be obtained by manipulating process inputs. During
simulation of the model, the assumption is made that the wood flow rate and composition
do not change. This reduces the number of experiments that have to be set up while the
hybrid model requirements are still met. The main process inputs that can be used to control
the Kappa number # (lower heater temperature and free liquor flow rate) will be used to
acquire the data. It is also possible to manipulate the filtrate flow rate, but this only affects
the process behavior in the last part of the reactor, which results in a marginal contribution.
To compare results after identification, the hybrid model has to be simulated under the same
conditions as the EPM. Since in the hybrid model the upper heater and lower heater are
lumped, the simulation conditions of the hybrid model are based on the heat supply $Q_h$
instead of heater temperature $T_h$. The total heat supply can be determined by calculating the
heat supply of the lower and upper heater in the EPM, using equation C.29 and the lower and
upper heater temperatures.
The hybrid model is being built based on model reduction of the EPM. This means that
the EPM generates the required identification data. To acquire sufficient information, the
lower heater temperature or energy supply and the liquor flow rate can be varied within the
validity range of the EPM. However, to facilitate investigation of extrapolation behavior with
physically meaningful experiments, the range of variation during identification should be
limited. Therefore, the lower heater temperature and free liquor flow rate are varied between
approximately +/- 10 % of their normal operating values (appendix C). This results in 9
different identification runs ID1 to ID9. In addition, 8 validation runs VAL1 to VAL8 are
designed. An overview is given in table 5.6.
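The nine identification runs form a full grid of three lower heater temperatures and three liquor flow rates. The sketch below enumerates such a grid. The heater temperatures per run follow table 5.6; the assignment of the flow rate values 1.76, 2.34 and 2.94 to the identification runs is an assumption, chosen so that ID5 is the center point, and the dictionary keys are hypothetical names.

```python
from itertools import product

# Assumed grid values: temperatures as listed per ID run in table 5.6,
# flow rates assumed to be the first, third and fifth column value.
liquor_flows = (1.76, 2.34, 2.94)
heater_temps = (420, 438, 450)

# Numbering runs down each flow column gives ID1..ID3 at the lowest flow,
# ID4..ID6 at the middle flow and ID7..ID9 at the highest flow.
id_runs = {
    f"ID{n}": {"liquor_flow": flow, "T_lh": temp}
    for n, (flow, temp) in enumerate(product(liquor_flows, heater_temps), start=1)
}

for name, settings in id_runs.items():
    print(name, settings)
```

Each entry can then be passed to the simulator as the settings of one steady state experiment.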
5.5.3 Subprocess behavior estimation
To identify the fuzzy models for $R_{r,1}$, $R_{r,2}$, $R_{r,3}$ and $D_{cr}$, input-output data is required.
The inputs can be obtained from the EPM, the outputs cannot. In the hybrid model, species
(e, 1) and (e, 3) are lumped to form species (r, 4). Species (e, 1) and (e, 3) are both inputs
to the EPM kinetic equations. However, since the kinetics are nonlinear, they are no longer
valid for species (r, 4). This means that the outputs of the fuzzy models, $R_{r,1}$, $R_{r,2}$ and $R_{r,3}$,
cannot be readily obtained from the EPM. The outputs are therefore derived from observed
behavior by estimation. Since the lumping affects the behavior of the diffusion term in the
model, the diffusion coefficient is also estimated.
The reaction rates and diffusion coefficient can be estimated with the help of the mass
balances of species (r, 1) to (r, 3) and (l, 1) of the hybrid model (equations D.1 and D.3) and
the steady state profiles from the identification experiments. Because these profiles describe
a steady state, the state equations reduce to ODEs. This facilitates the use of PI-estimators
(section 4.3.2). The system is mathematically equivalent to a batch process; as in a batch
process the conditions vary with time, the conditions in the digester vary with the location in
the reactor. The profiles can be interpreted as a dynamic response with respect to the location
in the reactor.
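Four PI-estimators were designed. Since detailed information on the behavior of the EPM
is available (all state variables can be measured), the model parts f of the PI-estimators
(as shown in figure 4.6) are kept simple. Consider the estimation of reaction rate $R_{r,1}$. The
estimates can be derived from the steady state mass balance of species $\rho_{r,1}$:
\[ 0 = -\frac{\phi_r}{S(1-\varepsilon)} \frac{\partial \rho_{r,1}}{\partial z} + R_{r,1} \tag{5.7} \]
which can be rewritten to
\[ \frac{\partial \rho_{r,1}}{\partial z} = \frac{S(1-\varepsilon)}{\phi_r} R_{r,1} \tag{5.8} \]
Incorporating this equation in a PI-estimator structure, an estimated profile for reaction rate
$R_{r,1}$ can be obtained. The model equations for the PI-estimators of $R_{r,2}$ and $R_{r,3}$ are similar.
For the estimation of $D_{cr}$, the PI-estimator model equation becomes:
\[ \frac{\partial \rho_{l,1}}{\partial z} = \frac{S\varepsilon}{\phi_l} D_{cl} \left( \frac{1}{\varepsilon} \rho_{r,4} - \rho_{l,1} \right) \tag{5.9} \]
Substituting
\[ D_{cl} = D_{cr} \frac{1-\varepsilon}{\varepsilon} \tag{5.10} \]
results in
\[ \frac{\partial \rho_{l,1}}{\partial z} = \frac{S(1-\varepsilon)}{\phi_l} D_{cr} \left( \frac{1}{\varepsilon} \rho_{r,4} - \rho_{l,1} \right) \tag{5.11} \]
In this equation, $\rho_{r,4}$ is treated as an input.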
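The estimation principle can be sketched in a few lines of code. The example below assumes a steady state balance of the form d(rho)/dz = c * R, as in equation 5.8, with c a known constant, and a PI law that adjusts the estimate of R until the model profile tracks the measured profile. The actual estimator structure is defined in section 4.3.2; the function `pi_estimate`, the explicit Euler integration and all numerical values here are illustrative assumptions, not the settings of table 5.7.

```python
def pi_estimate(z, rho_meas, c, gain, tau_i, r_init=0.0):
    """Estimate R(z) from a measured profile rho(z), assuming d(rho)/dz = c * R.

    The model profile is integrated along z with explicit Euler steps. The
    estimate of R is corrected by a PI law acting on the difference between
    the measured and the modeled concentration.
    """
    rho_model = rho_meas[0]  # the model starts on the first measurement
    integral = 0.0           # integrated tracking error
    estimates = []
    for k in range(len(z) - 1):
        step = z[k + 1] - z[k]
        error = rho_meas[k] - rho_model
        integral += error * step
        r_hat = r_init + gain * (error + integral / tau_i)
        estimates.append(r_hat)
        rho_model += step * c * r_hat  # advance the model with the estimate
    return estimates


# Synthetic test profile generated with a known constant rate.
c_true, r_true = 1.0, -0.2
z_grid = [0.05 * k for k in range(401)]
rho_profile = [30.0 + c_true * r_true * zk for zk in z_grid]

r_hat = pi_estimate(z_grid, rho_profile, c=c_true, gain=2.0, tau_i=1.0)
print(f"final estimate: {r_hat[-1]:.4f} (true value {r_true})")
```

In this sketch the product of c and the gain must be positive for the loop to be stable, which is consistent with a negative gain being needed when the multiplying factor of the estimated quantity is negative. Restarting the function per section with `r_init` set to the last estimate of the previous section mimics a section-wise design.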
The PI-estimators were tuned manually by comparing the concentration profiles with the
reference values from the EPM. During setup, it was not possible to find appropriate tuning
parameters that yielded acceptable estimate profiles for the complete digester. A single
PI-estimator cannot cope sufficiently with the change in EPM behavior in the different sections.
Therefore, PI-estimators for each of the four sections of the hybrid model were designed.
While designing the PI-estimators for $D_{cr}$ in the four sections, it was observed that the
estimators in sections I, III and IV provided little useful information. In section I, both
temperature $T_r$ and estimated $D_{cr}$ do not change much. In sections III and IV, no acceptable tuning
parameters could be determined. To obtain small errors, the estimators had to be tuned
aggressively. Due to the limited number of measurements that were available for the estimators
of the smaller sections III and IV, overshoot and oscillation were observed. Therefore, only a
PI-estimator for $D_{cr}$ in section II was used.
The PI-estimator settings were used for all identification runs (ID1 to ID9). Table 5.7 gives
an overview of the PI-estimator structures and tuning parameters K and $\tau_I$.
Estimator   Controlled    Other variables                        K                           $\tau_I$
            variable      (input)                                I     II      III   IV      I    II   III  IV
$R_{r,1}$   $\rho_{r,1}$  $\phi_r$                               0.2   0.25    0.25  0.35    2    2    0.4  0.8
$R_{r,2}$   $\rho_{r,2}$  $\phi_r$                               0.15  0.2     0.2   0.4     1    0.5  0.3  10
$R_{r,3}$   $\rho_{r,3}$  $\phi_r$                               0.2   0.2     0.2   0.3     5    1    0.3  0.8
$D_{cr}$    $\rho_{l,1}$  $\phi_l$, $\rho_{r,4}$, $\varepsilon$  -     -0.045  -     -       -    5    -    -
Table 5.7: PI-estimator structures and settings
The estimates of the reaction rates and the corresponding species are shown in figure 5.10.
Only the results of run ID5 (the standard operating conditions) are shown; the results of the
other runs are similar. The estimates are of good quality. The errors in the estimates $\rho_{r,1}$ to
$\rho_{r,3}$ and $\rho_{l,1}$ are small. The estimates of the reaction rates are negative because they describe
consumption.
In the estimates of $D_{cr}$, a spike occurs around z = 8 m. This is caused by a sudden increase
in the estimation error of $\rho_{l,1}$ (figure 5.10 (h)). The measured profile shows a sudden
decrease in $\rho_{l,1}$, which is the result of the lower heater. The lower heater outlet is located at
z = 8.67 m. The lower heater inlet is located at z = 10.2 m, where the concentration $\rho_{l,1}$
is lower. The recycle flow is mixed with the liquor phase at the lower heater outlet, resulting
in a decrease in concentration. The estimator cannot compensate for this bump.
Consider the estimates of the reaction rates $R_{r,1}$ to $R_{r,3}$. For each of the reaction rates, PI-estimator reinitialization can be observed. At the start of each section, the estimation error is set to zero and the initial estimate of the reaction rate is set to the last estimate of the previous section (for section I, the initial estimates of the reaction rates are set to zero, based on the assumption that no reaction takes place at z = 0 m). Because the PI-estimator lags one step (due to the feedback structure), the estimates of the reaction rates remain constant for the first time step (figures 5.10 (a), (c) and (e)).
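The reinitialization and the one-step lag described above can be illustrated with a minimal sketch. It is not the estimator used in this work: the toy balance dx/dz = R, the function name `run_pi_estimator` and all numerical values are assumptions chosen only to show the mechanism.

```python
def run_pi_estimator(measurements, dz, gain, tau_i, x0, r0=0.0):
    """Toy PI-estimator: adjust a rate R so that a model dx/dz = R tracks data.

    Assumed structure (illustration only): the rate estimate is the initial
    estimate r0 plus a PI correction on the error between measurement and
    model state. The integral term starts at zero in every section.
    """
    x = x0
    rate = r0
    integral = 0.0                      # estimation error is reset per section
    estimates = []
    for y in measurements:
        estimates.append(rate)          # estimate available at this position
        error = y - x                   # measurement minus model state
        integral += error * dz
        rate = r0 + gain * (error + integral / tau_i)
        x += rate * dz                  # advance the toy model one step
    return estimates, x, rate


# Section 1: start from a zero rate estimate; the data decline with slope -0.5.
data = [10.0 - 0.5 * k * 0.1 for k in range(300)]
est_1, x_end, r_end = run_pi_estimator(data, dz=0.1, gain=2.0, tau_i=1.0, x0=10.0)

# Section 2: reinitialize with the last estimate of the previous section.
est_2, _, _ = run_pi_estimator([x_end], dz=0.1, gain=2.0, tau_i=1.0,
                               x0=x_end, r0=r_end)
```

Because the estimate is stored before the feedback correction is applied, the first entry of each section equals the initial estimate, which mirrors the constant first step visible in the reaction rate estimates.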
The estimates of the reaction rates can be interpreted physically. In section I at z = 0 m, the
reaction rate is low. In the beginning of section I, the reactants are brought together and the
reaction starts slowly. By the end of section I, the rates have converged to a pseudo steady
state.
In section II, the reaction phase is heated up. The influence of the two heaters can clearly be observed. The upper heater (at z = 5.5 m) causes an increase in the reaction rate, followed by another increase caused by the lower heater (at z = 8.7 m). After that, the reaction rate decreases because the reactants are being depleted. In sections III and IV, the reaction is quenched and the reaction rate decreases to zero.
[Figure 5.10: PI-estimates run ID5. Solid line estimates, dots measurements. Panels (a), (c), (e): $R_{r,1}$, $R_{r,2}$, $R_{r,3}$ (kg/(m^3 min)) against z (m), sections I to IV. Panels (b), (d), (f): concentrations of species (r,1), (r,2), (r,3) (kg/m^3) against z (m). Panel (g): $D_{cr}$ (min^-1) against z (m), section II. Panel (h): concentration of species (l,1) (kg/m^3) against z (m), section II.]
Fuzzy model  Inputs                              Data reduction   # data     k_0   Merging          # rules   RMSE
                                                 threshold (dr)   features         threshold (cm)
R_{r,1}      conc. (r,1), conc. (r,4), T_r       2                179        10    0.7              4         0.024
R_{r,2}      conc. (r,2), conc. (r,4), T_r       1                333        10    0.7              6         0.14
R_{r,3}      conc. (r,3), conc. (r,4), T_r       5                198        10    0.7              6         0.12
D_{cr}       T_r                                 1                36         5     0.2              2         0.026
Table 5.8: Fuzzy model identification settings and results
The influence of the lower heater results in some overshoot in reaction rates $R_{r,2}$ and $R_{r,3}$. From a physical point of view, smoother behavior is expected. However, the overshoot could not be reduced any further without compromising the estimates of the concentrations. Therefore, the estimation results as presented in figure 5.10 were accepted.
The nine identification runs resulted in a total of 450 estimates for the reaction rates and 243 estimates for the diffusion coefficient $D_{cr}$. The required input-output data for the identification of the fuzzy models is now available.
5.5.4 Submodel identification
The four fuzzy models will be identified with GK-clustering in combination with cluster merging. This way, a simple fuzzy model is derived from the input-output data without the need to impose an initial model structure (section 4.4.1). The identification step consists of preparation of the data sets, clustering in combination with cluster merging, premise part construction by projection and consequent part construction using weighted least squares.
The first step is to reduce the data sets in such a way that the data features are distributed evenly across the input space. Usually, the data reduction threshold (see example 4.5) is related to the maximum distance between two neighboring features. However, in this case, the data sets for the four fuzzy models consist of 9 trajectories, in which the data features have a small mutual distance, but between which the distance is relatively large. Care has to be taken that the threshold is not set at a value which results in loss of information. The thresholds were set manually and are shown in table 5.8.
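The effect of a distance threshold on a data set can be sketched as follows. The greedy rule below (keep a feature only if it lies at least the threshold away from every feature already kept) is an assumption made for illustration; the procedure of example 4.5 may differ, and the function name `reduce_features` and the sample points are hypothetical.

```python
import math


def reduce_features(features, threshold):
    """Greedy thinning: keep a feature only if it is at least `threshold`
    away (Euclidean distance) from all features kept so far."""
    kept = []
    for candidate in features:
        if all(math.dist(candidate, k) >= threshold for k in kept):
            kept.append(candidate)
    return kept


# Densely sampled points along one trajectory, plus a distant point.
points = [(0.0, 0.0), (0.1, 0.0), (0.2, 0.0), (1.0, 0.0), (1.05, 0.0), (3.0, 0.0)]
reduced = reduce_features(points, threshold=0.5)
```

A threshold that is large compared with the spacing inside a trajectory removes most of that trajectory, which is why the thresholds have to be chosen with care when the trajectories themselves lie far apart.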
For clustering, the following parameters have to be set: the initial number of clusters $k_0$ and the merging threshold. For each model, several values of the merging threshold were investigated. The values that were selected yielded relatively simple models with acceptable performance. Settings as well as clustering results are also shown in table 5.8.
The models for the reaction rates are shown in appendix D. Figures 5.11 (a), (c) and (e) show the performance of the reaction rate models for run ID5. These results were generated by supplying the measured profiles of the input variables: the concentrations of species (r,i) and (r,4), and $T_r$. In addition, figures 5.11 (b), (d) and (f) show parity plots with respect to all identification data, which indicate the quality of the fit.
All fuzzy models perform well. The overall behavior is similar to that of the estimates. The parity plots indicate that modeling errors are larger when the reaction rate is larger (more negative). When the fuzzy models are incorporated in the hybrid model, these errors are integrated. Whether this effect plays a major role in the performance of the hybrid model will become apparent during submodel integration.
[Figure 5.11: Fuzzy model results reaction rates (ID5) and parity plots (ID1-ID9). In (a), (c) and (e), dots indicate estimates, solid line fuzzy model. Panels (a), (c), (e): $R_{r,1}$, $R_{r,2}$, $R_{r,3}$ (kg/(m^3 min)) against z (m), sections I to IV. Panels (b), (d), (f): $R_{r,i,fuz}$ against $R_{r,i,est}$ (kg/(m^3 min)).]
[Figure 5.12: Fuzzy model results diffusion (a) and reduced data set (b). Both panels: $D_{cr}$ (min^-1) against $T_c$ (K).]
Figure 5.12 shows the input-output identification data of runs ID1 to ID9 for the fuzzy model for $D_{cr}$. In figure 5.12 (a), the spikes in the estimates have been removed manually, resulting in limited information between $T_r$ = 405 K and $T_r$ = 415 K. Figure 5.12 (b) shows the reduced data set. The figure shows that the diffusion coefficient $D_{cr}$ is not a strict function of the reaction phase temperature $T_r$ but that there is some other influence. This is caused by the nonlinear effect of lumping species (f, 1) and (f, 3). It can be explained by solving the mass balances for (f, 1) and (f, 3) analytically (neglecting the bulk flow and assuming steady state) and lumping the analytical solutions to form an equation that describes species (l, 1). If this result is compared with the analytical solution of the mass balance that is imposed on (l, 1) (equation 5.11), it can be seen that the structures are different. The PI-estimator derives $D_{cr}$ from the behavior that is described by the lumped analytical solution while using an incorrect imposed model structure, which results in the estimates for $D_{cr}$ shown in figure 5.12.
The problem can be solved by lumping the diffusion term into a relation that describes diffusion as a function of the concentrations of species (l,1) and (r,4) and of $T_r$. However, this results in a loss of transparency. Despite the mediocre quality of the estimates, the proposed structure to describe diffusion is maintained. Consequently, diffusion can still be interpreted as described by a driving force (formed from the concentrations of species (l,i) and (r,i+3)) and a diffusion coefficient, which is temperature dependent. The reduced data set is used to build a fuzzy model for $D_{cr}$. Results are shown in figure 5.13. The influence of errors in the fuzzy model on hybrid model results will be investigated during submodel integration.
[Figure 5.13: Fuzzy model results for $D_{cr}$, run ID5 (a) and $D_{cr}$ as a function of $T_c$ (b). Panel (a): $D_{cr}$ (min^-1) against z (m), section II. Panel (b): $D_{cr}$ (min^-1) against $T_c$ (K).]
5.5.5 Submodel integration
The fuzzy submodels for the reaction rates $R_{r,1}$, $R_{r,2}$ and $R_{r,3}$ and for $D_{cr}$ were combined with the physical framework to form the hybrid model. Figure 5.14 shows the Kappa number profile for ID5; table 5.9 shows the errors in the key variables at z = 32 m, the reactor outlet.
The hybrid model shows the same qualitative behavior for the Kappa number as the EPM. However, there is a discrepancy between the two profiles. Table 5.9 shows the model errors at the reactor outlet; there are significant errors in the hybrid model (for both the Kappa number and the yield, the table lists an absolute error and a relative error of the hybrid model, subscript hm).
The hybrid model errors are caused by:
- Model reduction. The simplifications result in slightly different behavior.
- PI-estimates. Errors in the estimates of the reaction rates and the diffusion coefficient manifest themselves through the fuzzy models in the hybrid model results.
- Identification errors. The error of the fuzzy models with respect to the estimates is also present in the hybrid model results.
- Error integration. The error increases towards the end of the reactor.
[Figure 5.14: Initial Kappa number profile hybrid model run ID5. Kappa number (-) against z (m), sections I to IV.]
Error (hm)               ID1    ID2    ID3    ID4    ID5    ID6    ID7    ID8    ID9
Kappa number, absolute   -2.60  -3.03  3.88   -5.22  -3.85  2.68   -5.78  -3.23  5.57
Kappa number, relative   0.03   0.07   0.15   0.07   0.12   0.17   0.08   0.11   0.43
Yield, absolute          -0.02  -0.03  -0.03  -0.02  -0.03  -0.03  -0.01  -0.02  -0.02
Yield, relative          0.03   0.05   0.06   0.03   0.05   0.07   0.02   0.03   0.04
Table 5.9: Hybrid model errors before submodel integration
The fuzzy models will have to be adjusted in order to improve hybrid model performance. This involves optimization of the fuzzy model parameters. To gain insight into the influence of the fuzzy models on the errors, the sensitivity of the Kappa number and the yield with respect to $R_{r,1}$, $R_{r,2}$, $R_{r,3}$ and $D_{cr}$ is calculated.
Sensitivity analysis
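The sensitivity of the Kappa number and the yield with respect to changes in the reaction rates and the diffusion coefficient was determined by introducing fuzzy model weights $w_m$ in the hybrid model. The model weights can be interpreted as multipliers of the reaction rates and the diffusion coefficient. The model is evaluated for different values of these multipliers. The relative change in the Kappa number and the yield with respect to the relative change in the multipliers gives the sensitivity. Equations 5.12 and 5.13 give the sensitivities of the Kappa number, equations 5.14 and 5.15 those of the yield:
$$S^{\#}_{R_{r,i}} = \frac{\Delta \# / \#}{\Delta w_m^{R_{r,i}} / w_m^{R_{r,i}}} \quad \text{for } i = 1, \ldots, 3 \qquad (5.12)$$
$$S^{\#}_{D_{cr}} = \frac{\Delta \# / \#}{\Delta w_m^{D_{cr}} / w_m^{D_{cr}}} \qquad (5.13)$$
$$S^{\text{yield}}_{R_{r,i}} = \frac{\Delta(\text{yield}) / \text{yield}}{\Delta w_m^{R_{r,i}} / w_m^{R_{r,i}}} \quad \text{for } i = 1, \ldots, 3 \qquad (5.14)$$
$$S^{\text{yield}}_{D_{cr}} = \frac{\Delta(\text{yield}) / \text{yield}}{\Delta w_m^{D_{cr}} / w_m^{D_{cr}}} \qquad (5.15)$$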
The hybrid model was simulated for the conditions of run ID5; the assumption is made that sensitivity does not change much within the identification domain. Each multiplier was changed from 1.00 to 1.05. Results are shown in table 5.10.
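The calculation amounts to a one-sided finite difference on each multiplier. The sketch below shows the arithmetic on a stand-in model; the function `toy_output` and its exponents are hypothetical and bear no relation to the hybrid model, which would take the place of `model` in practice.

```python
def relative_sensitivity(model, weights, name, step=0.05):
    """Relative change in the model output divided by the relative change
    in one multiplier, which is raised from w to w * (1 + step)."""
    base = model(weights)
    perturbed = dict(weights)
    perturbed[name] = weights[name] * (1.0 + step)
    return ((model(perturbed) - base) / base) / step


def toy_output(w):
    # Hypothetical stand-in for a simulation result at the reactor outlet.
    return 30.0 * w["rate_a"] ** 1.5 * w["rate_b"] ** 0.2


multipliers = {"rate_a": 1.0, "rate_b": 1.0}
s_a = relative_sensitivity(toy_output, multipliers, "rate_a")
s_b = relative_sensitivity(toy_output, multipliers, "rate_b")
```

For a power-law dependence the result is close to the exponent, so a value near 1.5 means that a 5 % change in the multiplier changes the output by roughly 7.5 %.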
The sensitivity of the yield is the highest for changes in $R_{r,3}$. The sensitivity for the other parameters is lower than for $R_{r,3}$, but approximately of the same order of magnitude. The sensitivity of the Kappa number is the highest for changes in $R_{r,2}$, while the sensitivity for changes in the other parameters is relatively low.
Output             S_{R_{r,1}}   S_{R_{r,2}}   S_{R_{r,3}}   S_{D_{cr}}
Kappa number (#)   0.19          1.50          0.26          0.03
Yield              0.01          0.07          0.14          0.05
Table 5.10: Sensitivities hybrid model
Parameter optimization
The most straightforward approach is to improve the yield by optimizing $R_{r,3}$ and the Kappa number by optimizing $R_{r,2}$. This means that two optimizations are performed sequentially. If this yields unacceptable results, other reaction rates or the diffusion coefficient can be incorporated in the optimization procedure.
The reaction rates $R_{r,2}$ and $R_{r,3}$ influence both the Kappa number and the yield. For $R_{r,2}$, the sensitivity of the Kappa number is much higher than that of the yield (1.50 versus 0.07 in table 5.10), while for $R_{r,3}$ the two sensitivities are of the same order of magnitude (0.26 and 0.14). Because the Kappa number is far more sensitive to $R_{r,2}$ than the yield is, the optimization of $R_{r,2}$ will not affect the yield as much. Therefore, $R_{r,2}$ should be optimized after $R_{r,3}$ is optimized.
The two fuzzy models contain 192 parameters. To keep the optimization simple, the number of parameters that is optimized should be kept as low as possible. This can be accomplished by optimizing fuzzy model weights or rule weights instead of the individual model parameters. To provide sufficient flexibility to deal with nonlinearity, the rule weights $w_j^{R_{r,i}}$ of the two models are optimized. This results in the optimization of 6 parameters in each of the two fuzzy models.
The goal function for the optimization of $R_{r,3}$ is a standard quadratic error criterion. Instead of incorporating the error in the yield, the error in the concentration of species (r,3) is used; if the states of the model are described well, the model outputs are described well. In addition, interpretation of the behavior of the states is improved. The goal function is given by:
$$J = \frac{1}{2} \sum_{j=1}^{9} \sum_{i=1}^{n} \left( \rho_{r,3,epm,i,IDj} - \rho_{r,3,hm,i,IDj} \right)^2 \qquad (5.16)$$
in which $\rho_{r,3,epm,i,IDj}$ denotes the ith value in the EPM concentration profile of species (r,3) for experiment IDj and $\rho_{r,3,hm,i,IDj}$ denotes the ith value in the hybrid model concentration profile of species (r,3) for experiment IDj. n is the number of features per experiment. The goal function for the optimization of $R_{r,2}$ is similar to equation 5.16:
$$J = \frac{1}{2} \sum_{j=1}^{9} \sum_{i=1}^{n} \left( \rho_{r,2,epm,i,IDj} - \rho_{r,2,hm,i,IDj} \right)^2 \qquad (5.17)$$
Bounds were imposed on the rule weights:
$$0 \le w_j^{R_{r,i}} \le 2 \qquad (5.18)$$
in which i denotes the species and j denotes the rule number.
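As a minimal sketch, the quadratic criterion and the bounds can be written as below. The function names and the short profiles are hypothetical; in practice each hybrid model profile would come from a simulation of one identification run with the candidate rule weights.

```python
def goal_function(epm_profiles, hm_profiles):
    """Half the sum of squared differences between reference (EPM) and
    hybrid model concentration profiles, summed over all experiments."""
    return 0.5 * sum(
        (ref - sim) ** 2
        for ref_profile, sim_profile in zip(epm_profiles, hm_profiles)
        for ref, sim in zip(ref_profile, sim_profile)
    )


def within_bounds(rule_weights, lower=0.0, upper=2.0):
    """Check the box constraint imposed on the rule weights."""
    return all(lower <= w <= upper for w in rule_weights)


# Two hypothetical experiments with two features each.
reference = [[1.0, 2.0], [3.0, 4.0]]
simulated = [[1.0, 1.0], [2.0, 4.0]]
j_value = goal_function(reference, simulated)
```

Using the state profiles rather than a single outlet value gives the optimizer information from the whole reactor length, which matches the argument that well-described states lead to well-described outputs.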
Rule #:    1      2      3      4      5      6
R_{r,2}    0.73   0.93   0.47   1.19   1.14   1.70
R_{r,3}    0.99   1.38   1.04   1.95   1.25   1.12
Table 5.11: Optimized rule weights
A standard Nelder-Mead simplex algorithm is used to perform the optimization. A direct
search method omits the need to calculate gradient information, which is cumbersome for the
hybrid model. Although the algorithm may require many iterations and can get stuck in local
optima, it can be used as long as performance requirements are met.
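To make the idea of a gradient-free simplex search concrete, a simplified variant is sketched below. It is not the implementation used in this work: it uses reflection, expansion, inside contraction and shrinking only, handles the bounds by returning infinity outside the box (an assumption), and is tested on a hypothetical two-weight quadratic instead of the hybrid model.

```python
def bounded(f, lower, upper):
    """Wrap f so that points outside the box get an infinite cost."""
    return lambda w: f(w) if all(lower <= wi <= upper for wi in w) else float("inf")


def nelder_mead(f, x0, step=0.1, tol=1e-6, max_iter=500):
    """Simplified Nelder-Mead simplex search (illustrative variant)."""
    n = len(x0)
    simplex = [list(x0)]
    for i in range(n):
        point = list(x0)
        point[i] += step
        simplex.append(point)
    for _ in range(max_iter):
        simplex.sort(key=f)
        best, worst = simplex[0], simplex[-1]
        size = max(abs(p[i] - best[i]) for p in simplex for i in range(n))
        if size < tol:
            break
        centroid = [sum(p[i] for p in simplex[:-1]) / n for i in range(n)]
        reflected = [c + (c - w) for c, w in zip(centroid, worst)]
        if f(reflected) < f(best):
            expanded = [c + 2.0 * (c - w) for c, w in zip(centroid, worst)]
            simplex[-1] = expanded if f(expanded) < f(reflected) else reflected
        elif f(reflected) < f(simplex[-2]):
            simplex[-1] = reflected
        else:
            contracted = [c + 0.5 * (w - c) for c, w in zip(centroid, worst)]
            if f(contracted) < f(worst):
                simplex[-1] = contracted
            else:  # shrink every vertex towards the best one
                simplex = [best] + [
                    [b + 0.5 * (v - b) for b, v in zip(best, p)] for p in simplex[1:]
                ]
    simplex.sort(key=f)
    return simplex[0]


# Hypothetical goal: squared distance to a known pair of weights.
target = [0.8, 1.3]
cost = lambda w: sum((wi - ti) ** 2 for wi, ti in zip(w, target))
optimum = nelder_mead(bounded(cost, 0.0, 2.0), [1.0, 1.0])
```

Only function values are compared, so each iteration costs a few model evaluations and no derivative information is required.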
Results
The optimized weights are shown in table 5.11. Figure 5.15 shows the results before and
after the optimization step, table 5.12 shows the errors after optimization.
Figure 5.15 shows that the hybrid model concentration profiles follow the measurements more closely after optimization. The error in the yield is negligible (table 5.12). The overall error in the Kappa number at the reactor outlet has decreased, although for some runs (ID2 and ID5) the error is larger.
The optimization results show that the hybrid model behavior closely matches the static behavior of the EPM for runs ID1 to ID9. Table 5.13 shows that the hybrid model is consistent with respect to the mass balances within 3 %, the result of rounding errors. The liquor flow at the reactor outlet is assumed to be zero; it is assumed that the amount of liquor phase that leaves the reactor at the bottom is negligible.
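The mass balance check can be reproduced directly from the stream values listed in table 5.13. The short calculation below only sums those values; the variable names are chosen for illustration.

```python
# Stream values in kg/min, taken from table 5.13.
flows_in = {"Reaction": 748.5, "Liquor": 494.4, "Filtrate": 154.6}
flows_out = {"Reaction": 564.0, "Extraction": 879.4}

total_in = sum(flows_in.values())
total_out = sum(flows_out.values())

# Relative mismatch between outgoing and incoming mass flow.
closure_error = (total_out - total_in) / total_in
print(f"in: {total_in:.1f}  out: {total_out:.1f}  mismatch: {closure_error:.1%}")
```

The mismatch comes out at roughly 3 % of the incoming mass flow.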
5.6 Hybrid model analysis
The next step is to analyze the hybrid model in detail to determine whether the quality
requirements are met. The hybrid model of the digester will be analyzed with respect to static
performance, dynamic performance, complexity, interpretability and process independence.
Error (hm)               ID1    ID2    ID3    ID4    ID5    ID6    ID7    ID8    ID9
Kappa number, absolute   2.47   -3.86  -0.28  -0.82  -4.87  -1.09  -1.00  -2.61  2.23
Kappa number, relative   0.03   0.10   0.01   0.01   0.15   0.07   0.01   0.09   0.17
Yield, absolute          0.00   -0.01  -0.01  0.00   -0.01  -0.01  0.01   0.00   0.00
Yield, relative          0.01   0.03   0.02   0.01   0.02   0.02   0.01   0.01   0.00
Table 5.12: Hybrid model errors after submodel integration
[Figure 5.15: Simulation results run ID5 before (a,c,e) and after (b,d,f) optimization. Solid line hybrid model, dots measurements. Panels (a), (b): concentration of species (r,2) (kg/m^3); panels (c), (d): concentration of species (r,3) (kg/m^3); panels (e), (f): Kappa number (-); all against z (m), sections I to IV.]
Stream       Mass flow in (kg/min)   Mass flow out (kg/min)
Reaction     748.5                   564.0
Liquor       494.4                   -
Extraction   -                       879.4
Filtrate     154.6                   -
Total        1398                    1443
Table 5.13: Overall mass balance hybrid model
5.6.1 Static performance
The static performance of the hybrid model will be analyzed by evaluating the results of
the hybrid model for runs VAL1 to VAL8 (table 5.6) and by investigating the static extrapo-
lation behavior.
Table 5.14 shows the errors at the reactor outlet for the validation runs. In gure 5.16, parity
plots for # and are given. The overall static performance is good. The errors are slightly
larger than the errors for the identication runs, but still within acceptable limits. Similar to
the identication runs, the error in # is larger than the error in . The relative error in #
for VAL7 is large, while the absolute error is in the same order of magnitude as the errors in
VAL1-VAL4. The relative error is large since the # at the reactor outlet is relatively low for
VAL7 (gure 5.17).
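The relation between the two error measures can be checked with a short calculation. It assumes that the relative error is the magnitude of the absolute error divided by the reference outlet value, and that the first two rows of table 5.14 are the absolute and relative Kappa number errors; both are assumptions about the table layout, and `implied_reference` is a name introduced here.

```python
def implied_reference(abs_error, rel_error):
    """Reference value implied by an absolute and a relative error,
    assuming rel_error = |abs_error| / reference."""
    return abs(abs_error) / rel_error


# Kappa number errors for two validation runs (values from table 5.14).
kappa_val1 = implied_reference(7.55, 0.14)
kappa_val7 = implied_reference(-6.62, 0.55)
```

Under these assumptions the outlet Kappa number implied for VAL7 is several times lower than for VAL1, so an absolute error of similar size translates into a much larger relative error.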
The qualitative behavior is in accordance with the measurements. Figure 5.17 illustrates this for VAL1 (interpolation) and VAL7 (extrapolation). Details, however, are different. For example, in VAL7, the Kappa number of the hybrid model starts to decrease at a lower z than the Kappa number of the EPM, but in section II, the reaction rate in the hybrid model is lower than in the EPM, resulting in a higher Kappa number.
Because the error in the yield is similar to the error during the identification experiments (tables 5.12 and 5.14), it can be concluded that the fuzzy model for $R_{r,3}$ performs well for the validation runs. Since the fuzzy model for $R_{r,3}$ performs well, the larger error in the Kappa number is caused by the error in $R_{r,2}$, for which the Kappa number has the highest sensitivity. The influence of the other fuzzy models on the Kappa number is much smaller. As a result, the errors caused by these fuzzy models are much smaller than the error caused by the fuzzy model for $R_{r,2}$.
To investigate the interpolation and extrapolation properties in more detail, the hybrid model performance was calculated as a function of $Q_h$ (which is equivalent to the energy supply of the lower and upper heater, which can be derived from $T_{uh}$ and $T_{lh}$ in the EPM) and of the liquor flow rate (equivalent to the free liquor flow in the EPM).
Error (hm)               VAL1   VAL2   VAL3   VAL4   VAL5   VAL6   VAL7   VAL8
Kappa number, absolute   7.55   -4.37  -5.81  -2.00  -6.04  0.81   -6.62  -1.16
Kappa number, relative   0.14   0.17   0.12   0.10   0.12   0.04   0.55   0.18
Yield, absolute          0.03   -0.02  0.00   -0.01  0.00   0.00   0.01   0.01
Yield, relative          0.05   0.04   0.00   0.02   0.01   0.01   0.01   0.02
Table 5.14: Hybrid model errors validation runs
[Figure 5.16: Parity plots for the Kappa number (a) and the yield (b), hybrid model value against EPM value. Solid dots ID runs, open dots VAL runs.]
[Figure 5.17: Simulation results for the Kappa number, run VAL1 (a) and VAL7 (b), against z (m), sections I to IV. Solid line hybrid model, dots measurements.]
[Figure 5.18: Static extrapolation results for the Kappa number (a) and the yield (b) as a function of Q (kJ/min). Solid line hybrid model, dots EPM. Vertical lines indicate identification domain.]
Figure 5.18 (a) shows the Kappa number at the reactor outlet as a function of the heat input Q, which can be interpreted as a general production curve. As temperature rises, the production increases exponentially. At higher temperatures, exhaustion is observed, resulting in depletion of the reactants.
The liquor flow rate was kept constant at 2.34 m^3/min. Q was varied from 0 (heaters switched off) to 25 % of the identification range beyond the maximum identification value. Within the identification range (350000 ≤ Q ≤ 860000 kJ/min), the hybrid model error is relatively constant and small. For Q > 860000 kJ/min, the Kappa number of the EPM converges to zero due to exhaustion of components (r,1) and (r,2). Since the mass balances of the hybrid model are based on first principles, the hybrid model Kappa number will also converge to zero.
At low Q, the production as calculated by the hybrid model becomes constant. Under these conditions, only rule 2 is active in the fuzzy model for $R_{r,2}$. Although the reaction conditions differ slightly due to the changing Q, the overall reaction rate does not change much. The result is that the Kappa number remains constant.
[Figure 5.19: Static extrapolation results for the Kappa number (a) and the yield (b) as a function of the liquor flow rate (m^3/min). Solid line hybrid model, dots EPM. Vertical lines indicate identification domain.]
Figure 5.18 (b) shows the yield as a function of Q. The results are similar to figure 5.18 (a). However, at high Q, the yield of the EPM converges to a constant value, while the yield of the hybrid model continues to decrease. In the EPM, the conversion is limited by inert concentrations; a fraction of species (s, 3) and (s, 4) does not react, which means that the yield has a minimum value that is greater than zero. This limitation is not accounted for in the hybrid model, as a result of which the yield converges to zero.
To investigate extrapolation performance resulting from variations in the liquor flow rate, $Q_h$ was kept constant at 627000 kJ/min (the value for ID5). The liquor flow rate was varied from 1.46 m^3/min to 3.23 m^3/min. The results for the Kappa number and the yield are shown in figure 5.19.
Although the concentration of the liquor in the reaction phase (species (r,4)) rises as a result of the increase in the liquor flow rate, the reaction phase temperature decreases, since the heat supply of the heater remains constant. The net result is that conversion decreases with higher flow rates, as illustrated in figure 5.19. At low flow rates, the concentration of species (r,4) becomes the limiting factor, which also results in a decrease in conversion.
Changes in the liquor flow rate do not affect the production as much as changes in the heat supply of the heater. As a result, similar performance is observed throughout the extrapolation range. For the Kappa number, the error remains constant at approximately the same value as the error for ID5. The error is mainly caused by the error in the concentration of species (r,2), which in turn is the result of the performance of the fuzzy model for $R_{r,2}$. The performance for the yield is good over the entire extrapolation domain, indicating good performance of the fuzzy model for $R_{r,3}$.
5.6.2 Dynamic performance
Since the dynamic structure of the hybrid model is based on first principles, the model is expected to perform well dynamically. This will be analyzed using the responses to step changes in the heat supply of the heater $Q_h$ and in the liquor flow rate. In addition, frequency extrapolation will be investigated.
Figure 5.20 shows the results for a step change in the lower heater heat supply at t = 0 min. The change in $Q_{lh}$ of the EPM and $Q_h$ of the hybrid model was identical. The step was performed for the operating conditions of run ID5. The steady state offset between the EPM and the hybrid model can clearly be seen.
Table 5.15 shows the amplitudes of the changes in the Kappa number and the yield for the step change. Values for both a positive and a negative step change are given. Figure 5.20 (a) shows the response of the Kappa number at the reactor outlet for a change in $Q_h$. The change in Kappa number is approximately the same for both models. This means that under these conditions, the steady state offset remains constant. The dynamic behavior of both models is very similar; both show a second order response and similar dead times. Figure 5.20 (b) shows the response of the yield. The influence of the step on the yield is very small, but the performance of the hybrid model is similar to the performance of the EPM.
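The shape of a second order response with dead time can be visualized with a small simulation. The sketch below uses two equal first order lags in series and explicit Euler integration; the structure, the function name `step_response` and the round parameter values are assumptions for illustration and are not fitted to either model.

```python
def step_response(gain, tau, dead_time, t_end, dt=0.5):
    """Unit step response of two equal first order lags in series,
    delayed by `dead_time`. Times in minutes, output in units of `gain`."""
    x1 = x2 = 0.0
    times, outputs = [], []
    for k in range(int(t_end / dt) + 1):
        t = k * dt
        u = 1.0 if t >= dead_time else 0.0   # delayed unit step
        times.append(t)
        outputs.append(gain * x2)
        x1 += dt * (u - x1) / tau            # first lag
        x2 += dt * (x1 - x2) / tau           # second lag
    return times, outputs


# Hypothetical values: gain -5, time constant 40 min, dead time 90 min.
times, kappa_change = step_response(gain=-5.0, tau=40.0, dead_time=90.0, t_end=600.0)
```

The output stays at zero during the dead time, then follows an S-shaped curve towards the steady state change given by the gain.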
[Figure 5.20: Hybrid model step response results for the Kappa number (a) and the yield (b) for positive and negative step change in Q, against t (min). Dots EPM, lines hybrid model.]
Figure 5.21 shows the response of the Kappa number and the yield for a step change in the free liquor flow rate. The gain of the two models with respect to changes in this flow rate is different. This can be explained as follows. A step change in the liquor flow rate results in a change of the liquor concentration (species (r, 4)). It was found that there is not much difference between the concentrations of species (r, 4) in the EPM and the hybrid model. This indicates that diffusion is described in a similar fashion by both models and that the difference in gain is the result of different sensitivities of the reaction rates to changes in species (r, 4).
In addition, the figure shows that two effects play a role. Consider the response of the hybrid model. If the flow rate is increased, then the concentration of the liquor reactants in the reaction phase will increase over the entire reactor as a result of the increase in the driving force of the diffusion. This results in a decrease of the Kappa number. However, since the heat supply of the heaters is constant, the temperature will drop, resulting in an overall increase in the Kappa number. These effects have the same influence on the yield, but since the yield does not change much, the influence of the increase in concentration can be neglected.
The first effect is not observed in the EPM. Since the volume of the free liquor phase in the EPM is not constant due to compaction, the effect of a step change in the free liquor flow rate is different. The second effect has a similar influence as it has on the hybrid model. The overall Kappa number is increased as a result of the decrease in reaction temperature.
The model orders O, time constants and dead times indicate that the dynamic performance of the hybrid model is good (see table 5.15). The dead times were derived from the step responses, while the orders and time constants were determined using the Bode diagrams given in appendix D. The influence of the simplifications is observed in the hybrid model response to a change in the flow rate, but the overall response is in accordance with the EPM. The time constants and dead times of both models are approximately the same. There are some differences with respect to the orders of the models. The order of the hybrid model is slightly lower than the order of the EPM, since fewer PDEs are present. In addition, numerical inaccuracies during the construction of the Bode plots influence the results; the frequency range limits accurate determination of the order. However, the overall performance indicates that the dynamic behavior of both models is similar.
Model    Input              Order O   Order O   Time constant   Time constant   Dead time   Change in          Change in
                            (Kappa)   (yield)   (Kappa), min    (yield), min    (min)       Kappa number (-)   yield (-)
EPM      Q_{lh}             2.3       2.3       42              43              87          -5.51, 6.17        -0.022, 0.022
EPM      liquor flow (f)    2.9       3.3       35              35              113         4.71, -4.06        0.019, -0.020
Hybrid   Q_{lh}             1.8       1.9       40              45              84          -5.77, 6.40        -0.018, 0.018
Hybrid   liquor flow (f)    2.4       2.7       34              34              109         4.23, -3.69        0.014, -0.013
Table 5.15: Frequency response analysis results for hybrid model
[Figure 5.21: Hybrid model step response results for the Kappa number (a) and the yield (b) for positive and negative step change in the liquor flow rate, against t (min). Dots EPM, lines hybrid model.]
The analysis of the dynamic behavior of the hybrid model can also be viewed in the context of frequency extrapolation. Frequency extrapolation occurs when the model is simulated at different frequencies than the frequencies that were used during identification. In the case of the hybrid model, the identification was performed using static data. Within this context, all dynamic experiments presented in this section can be interpreted as frequency extrapolation experiments. From this, it can be concluded that the hybrid model possesses good frequency extrapolation properties.
5.6.3 Complexity
The flowsheet of the hybrid model contains four sections that describe the most important functional zones of the digester. The upper and lower heater are lumped and the heater recycle flows are omitted. The quench recirculation is also omitted; in the EPM, quench recirculation is set to zero (Wisnewski et al., 1997). The flowsheet is more straightforward than the flowsheet of the EPM, while it is still possible to distinguish the different zones in the digester.
The DFD of the section model (appendix D) clearly shows the reduced complexity of the relations between the modeled phenomena. In comparison to the EPM, the entrapped phase mass balance is not present, and neither is the bulk flow. Although the DFD is simpler, it does not show the relations between the different components. These relations are only illustrated by the model equations, which makes interpretation more cumbersome.
The complexity of the model equations is similar to the EPM. Although fewer states are modeled, the structure of the mass and energy balances is identical; in both models, the balances consist of partial differential equations, describing the states as a function of time and place. The connection between the balances in both models is also similar. The main difference is that the hybrid model contains fuzzy submodels.
The fuzzy submodel for the diffusion coefficient is a very simple two rule SISO model. The three fuzzy submodels for the reaction rates are more complex. The MISO models are each a collection of local linear models. The activation of these models is based on the values of the input variables. Although the structure of these submodels is clear, it is difficult to determine which of the local models is most important, or how the individual local models determine the behavior of the reaction rates.
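How a collection of local linear models combines into one output can be sketched as follows. The Gaussian membership functions, the way the rule weight scales the activation, and all numbers are assumptions made for illustration; the identified models are given in appendix D and may be constructed differently.

```python
import math


def gaussian(x, center, width):
    """Membership degree of x in a Gaussian fuzzy set (assumed shape)."""
    return math.exp(-0.5 * ((x - center) / width) ** 2)


def ts_output(inputs, rules):
    """Weighted average of local linear models.

    Each rule has one membership function (center, width) per input,
    a local model offset + sum(slope_k * input_k), and a rule weight
    that scales its activation.
    """
    numerator = denominator = 0.0
    for rule in rules:
        activation = rule["weight"]
        for x, (center, width) in zip(inputs, rule["memberships"]):
            activation *= gaussian(x, center, width)
        local = rule["offset"] + sum(s * x for s, x in zip(rule["slopes"], inputs))
        numerator += activation * local
        denominator += activation
    return numerator / denominator


# Hypothetical two rule SISO model with a temperature input in K.
rules = [
    {"weight": 1.0, "memberships": [(380.0, 15.0)], "slopes": [0.0], "offset": 0.1},
    {"weight": 1.0, "memberships": [(440.0, 15.0)], "slopes": [0.0], "offset": 0.4},
]
low = ts_output([380.0], rules)
middle = ts_output([410.0], rules)
```

Near a rule center the output follows that rule's local model, while between centers the output is a blend, which is why the contribution of a single local model is hard to read off from the overall behavior.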
The complexity of the fuzzy models is mainly determined by the number of rules and the rule
base. The clustering algorithm in combination with structure optimization provides a way to
minimize fuzzy model complexity, which yielded acceptable results. Other approaches for
evaluating fuzzy model complexity exist and more information can for example be found in
Setnes (1999).
5.6.4 Interpretability
The static behavior of the model can be explained physically by using the model flowsheet and section model DFD. For example, the interpretation of the temperature profile was based on the model flowsheet in order to distinguish the different functional zones. On a more detailed level, the development of the concentration profiles can be explained by using the section model DFD. A decrease in temperature results in a decrease in reaction, as a result of which the Kappa number rises.
The interpretation that is given to the simulation results or the model structure is based on the process hypotheses that were used to build the model. This results in a level of interpretation which is limited to the information that these hypotheses provide. In the case of the hybrid models, the level of interpretation is limited to relations in terms of accumulation, convection, reaction or diffusion. Physical interpretation on a more detailed level is more difficult and even impossible for the fuzzy submodels, which give an input-output mapping.
The fuzzy models do, however, provide information about characteristic operating regimes that can be interpreted; the reaction rate is high for high concentrations and high temperature. In addition, the distribution of the membership functions gives information about the nonlinearity of the model output with respect to an input. For both $R_{r,2}$ and $R_{r,3}$, a characteristic transition can be observed at around T = 425 K. Such a transition can also be observed for $R_{r,1}$, but is less clear. The nonlinearity with respect to the concentration of species (r,4) is comparable for all three reaction rate models. The membership functions of this input in the models for $R_{r,2}$ and $R_{r,3}$ are similar. The model for $R_{r,1}$ shows the same trends; the membership functions mf4 and mf4 of the (r,4) concentration in the model for $R_{r,1}$ correspond with mf3 and mf5 in the model for $R_{r,2}$, respectively. The nonlinearity with respect to the wood species differs for the three models, which indicates different kinetic characteristics.
To explain the differences in the behavior of the fuzzy models in more detail, the consequent parameters should be included in the analysis. However, the analysis of the observed behavior (section 5.5.3) and the premise part membership functions provides some a posteriori information about the reaction kinetics. This is partly possible due to the transparent structure of fuzzy models.
The data used to construct the fuzzy relations for the reaction rates and the diffusion coefficient is derived from the framework that was proposed a priori. The data was not obtained from measurements or lab analysis. The fuzzy relations are specifically designed for the physical framework that is presented. Therefore, the reaction rates that are calculated by the fuzzy models should be interpreted as a measure for the reaction rate, appropriate for the chosen model structure. This is also true for the diffusion coefficient.
The level of physical interpretation of the hybrid model is similar to the EPM. In the EPM,
the level of interpretation is also limited by the physical framework. Interpretation on a more
detailed level involves analysis of empirical relations. The EPM, however, has a more detailed
model structure, which means that it provides more information about the unit operation and
the relation between specific wood and liquor components.
5.6.5 Process independence
The level of process independence is based on the sources of information that are used. In
the case of the hybrid model, the main source of information was the EPM. This means that
in general, the hybrid model can be used for the same applications as the EPM, as long as the
model can provide the required information. Some comments, however, need to be made.
The EPM can be viewed as the process for which the hybrid model was built. The
hybrid model is completely based on this process, which makes it process dependent. It
has the same possibilities and limitations as the EPM. In addition, with respect to static
performance, it is only valid for operating conditions that were used during identification.
If the model is to be used for other conditions, for example other wood species, it can be
adjusted by re-identifying the fuzzy equations. New identification data is then required. The
rest of the model can remain unchanged.
The application possibilities of the hybrid model on different continuous pulp digester instal-
lations are the same as the possibilities the EPM has. The model may need adjustment of the
fuzzy relations or other model parameters to provide acceptable performance. However, due
to the first principles information, the main physical phenomena that play a role are accounted
for, which means that the adjustments will be minor.
Similar to the EPM, the hybrid model has limited process independence with respect to process
design. It provides little information about design parameters, except for reactor height
and width. In addition, the need for parameter adjustment in order to use the model for different
digester installations makes it more suited for studying process operation than for process
design.
5.6.6 Remarks
The hybrid model describes the key variables of the pulp digester acceptably (requirements
1 and 2). The maximum static error is around 5 units for the Kappa number # and around
0.01 for the yield.
The dynamic characteristics of the hybrid model and its reference are similar, although the
detected order of the hybrid model was somewhat lower than the order of the EPM.
The description of the variables is based on a simplified interpretation of the process, which
results in a model that is transparent and a model structure which can be interpreted physically
(requirements 3 and 4). Although it provides less detailed information than the EPM, it is still
possible to explain the digester behavior in terms of convection, reaction and diffusion.
The differences in model structure between the EPM and the hybrid model manifest them-
selves in the states that are modeled and hence the observed order (requirement 5). The
lumping of some of the species is based on physical assumptions and results in differences in
the equation structure (such as for the reaction rates) or in the model structure (the connec-
tions between the modeled effects).
Since the fuzzy models describe static relationships, no dynamical data was needed during
identification. The dynamic characteristics are determined by the first principles framework.
This makes the modeling approach useful if limited amounts of dynamic data are available
and first principles provide sufficient information to design the model framework.
5.7 Fuzzy model problem definition and design
To determine hybrid model properties in relation to first principles models and fuzzy
models, a third, fuzzy model for the digester is developed. This model is completely black
box and represents one extreme of the model spectrum (section 5.4.1). As with the hybrid
model, the EPM will serve as the reference. In order to be able to compare the fuzzy model
with the hybrid model, the input-output structure has to be similar. The fuzzy model should
describe the static as well as the dynamic behavior of the digester.
5.7.1 Model structure
To meet the model requirements, the fuzzy model will describe the key variables: the
Kappa number # and the yield at the reactor outlet. The simplest model structure is a
black box approach, in which the Kappa number and the yield are described as a function of
the inputs. This results in a model that does not provide information about concentration
profiles, as the EPM and the hybrid model do. However, the purpose of the fuzzy model is to
provide a black box description of the digester behavior. This means that a description of the
key variables based on concentration information is not required and that the fuzzy model can
describe the Kappa number and the yield directly instead.
Similar to the hybrid model, the fuzzy model will describe the kappa number and the yield
for a fixed wood chip flow rate and one type of wood. The input variables that are required
are the heat input $Q_h$ and the free liquor flow $\phi_l$. The model will consist of two Multiple
Input Single Output (MISO) models, one for each of the output variables.
The dynamics are incorporated by using an ARX structure. A second order structure will be
used (see section 5.3.1). In addition, the dead times with respect to the input variables need
to be incorporated. This yields the following model structure:
$$\#_k = f_{\text{fuzzy}}\left(\#_{k-1},\ \#_{k-2},\ Q_{h,k-\theta_Q},\ \phi_{l,k-\theta_l}\right) \tag{5.19}$$
$$\eta_k = f_{\text{fuzzy}}\left(\eta_{k-1},\ \eta_{k-2},\ Q_{h,k-\theta_Q},\ \phi_{l,k-\theta_l}\right) \tag{5.20}$$
in which # is the Kappa number, $\eta$ is the yield, $k$ denotes the time step and $\theta_Q$ and
$\theta_l$ denote the dead times with respect to the heat input $Q_h$ and the liquor flow $\phi_l$.
The fuzzy model will be of the Sugeno type. The MISO models can then be interpreted as a
collection of local linear ARX models.
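The interpretation as a collection of local linear ARX models can be illustrated with a minimal sketch. It assumes Gaussian membership functions combined with a product operator and affine consequents; the function name `ts_arx_predict` and all parameter names are hypothetical and do not reproduce the models of appendix A.

```python
import numpy as np

def ts_arx_predict(regressor, centers, widths, theta):
    """One-step prediction of a Sugeno-type fuzzy ARX model (illustrative).

    regressor : shape (n,), e.g. [y[k-1], y[k-2], u1[k-d1], u2[k-d2]]
    centers, widths : shape (rules, n), Gaussian membership parameters (assumed)
    theta : shape (rules, n + 1), local ARX coefficients plus an offset term
    """
    # Degree of fulfilment per rule: product of the Gaussian memberships.
    beta = np.exp(-0.5 * ((regressor - centers) / widths) ** 2).prod(axis=1)
    weights = beta / beta.sum()
    # Output of every local linear (affine) ARX model for this regressor.
    local = theta[:, :-1] @ regressor + theta[:, -1]
    # The model output is the fulfilment-weighted mean of the local outputs.
    return float(weights @ local)
```

Far outside the region covered by the membership functions, one rule dominates the weights, so the prediction reduces to a single local linear model. With Gaussian memberships the normalization can underflow for extreme inputs; a practical implementation would guard against that.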
5.7.2 Data acquisition
The identification data needs to represent the dynamic behavior of the pulp digester. To
accomplish this, identification and validation data sets have been designed based on ramped-
RMRI (Random Magnitude Random Interval) input signals. Such a signal consists of a series
of ramped steps of random magnitude, spaced at random intervals.
To design the input signals, a magnitude domain, an interval domain and a ramp domain
need to be specified. The input signals are allowed to vary between the standard operating
conditions +/- 10 %, similar to the identification data of the hybrid model.
To generate the input signal for $Q_h$, the EPM was subjected to ramped-RMRI signals for both
the upper heater heat supply $Q_{uh}$ and the lower heater heat supply $Q_{lh}$. The signal that is
used for $Q_h$ during identification of the fuzzy model consists of the sum of these two signals.
The length of the interval between two steps, $\tau_{is}$, is chosen at random using the following
rule of thumb:
$$\tfrac{1}{4}\,\tau_{ss} < \tau_{is} < \tfrac{5}{4}\,\tau_{ss} \tag{5.21}$$
with
$$\tau_{ss} = \theta + 5\tau \tag{5.22}$$
where $\tau$ is the first order time constant and $\theta$ is the dead time with respect to the input
signal. The ramp domain was arbitrarily set: the minimum duration was 2 min, the maximum
duration was 200 min. To acquire sufficient information, 25 different steps were performed,
resulting in a signal duration of approximately 8000 min.

Model            Dead time for Q_h (min)   Dead time for liquor flow (min)   k_0   cm    Number of rules
Kappa number #   87                        113                               6     0.7   3
Yield            87                        113                               6     0.4   5
Table 5.16: Settings and structure results fuzzy model
The EPM was simulated using the ramped-RMRI input signals. The resulting input-output
data sets for identication and validation are shown in appendix A.
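The construction of a ramped-RMRI signal can be sketched as follows. This is an illustration under stated assumptions, not the generator used for the data in appendix A: magnitudes, ramp durations and intervals are drawn from uniform distributions, the random interval is interpreted as the hold time after each ramp, and the function name `ramped_rmri` and its arguments are hypothetical.

```python
import numpy as np

def ramped_rmri(n_steps, nominal, tau_ss, rng, rel_range=0.10,
                ramp_range=(2.0, 200.0), dt=1.0):
    """Return (t, u) for a ramped random-magnitude random-interval signal."""
    values = [nominal]
    level = nominal
    for _ in range(n_steps):
        # Random magnitude within nominal +/- 10 % (magnitude domain).
        target = nominal * (1.0 + rng.uniform(-rel_range, rel_range))
        # Random ramp duration in minutes (ramp domain).
        ramp = rng.uniform(*ramp_range)
        # Random interval between 1/4 and 5/4 of tau_ss (rule of thumb 5.21).
        hold = rng.uniform(0.25 * tau_ss, 1.25 * tau_ss)
        n_ramp = max(int(round(ramp / dt)), 1)
        n_hold = max(int(round(hold / dt)), 1)
        values.extend(np.linspace(level, target, n_ramp + 1)[1:])
        values.extend([target] * n_hold)
        level = target
    u = np.asarray(values, dtype=float)
    t = dt * np.arange(u.size)
    return t, u

# Example call with an illustrative tau_ss of 300 min and a nominal value of 100.
t, u = ramped_rmri(25, 100.0, 300.0, np.random.default_rng(0))
```

With 25 steps the expected duration is of the same order as the approximately 8000 min mentioned in the text, although the exact length depends on the random draws.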
5.7.3 Model identication
Both MISO models were identified using GK-clustering in combination with structure
optimization (section 4.4.1). Since the large values of $Q_h$ can give computational problems
while evaluating the membership functions, the signal was scaled by a factor of 1e5. Settings
and fuzzy model structure results are shown in table 5.16. The fuzzy models are given in
appendix A.
Figure 5.22 shows the simulation results of a free run, in which the fuzzy model is simulated
in an autoregressive manner. For clarity, only some of the measurements are shown. The
performance of the fuzzy model is good for the kappa number as well as for the yield. Some
mismatch is observed during fast changes (for example around t = 7500 min). Overall, the
fuzzy model matches the identification data.
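A free run differs from one-step-ahead prediction in that the model is fed its own past outputs instead of measured ones. A minimal sketch, assuming a generic second order one-step predictor `one_step` and integer dead times expressed in samples (all names are hypothetical):

```python
def free_run(one_step, y_init, u1, u2, d1, d2):
    """Simulate a second order ARX-type model on its own past predictions.

    one_step(y1, y2, v1, v2) returns the prediction for step k from
    y[k-1], y[k-2], u1[k-d1] and u2[k-d2]. y_init must hold at least
    max(2, d1, d2) initial output samples.
    """
    start = max(2, d1, d2)
    y = list(y_init[:start])
    for k in range(start, len(u1)):
        # Past outputs are taken from the simulation, not from measurements.
        y.append(one_step(y[k - 1], y[k - 2], u1[k - d1], u2[k - d2]))
    return y
```

Because prediction errors are fed back, a free run is a stricter test of the dynamics than one-step-ahead prediction.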
For all rules in the fuzzy MISO models for the Kappa number, the consequent parameters
corresponding with $\#_{k-1}$ have approximately the value of 2. All consequent parameters
corresponding with $\#_{k-2}$ have the value of -1 (equation E.5). This indicates second order
behavior with approximately critical damping (Stephanopoulos, 1984).
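A short worked check of this statement, written for a generic output $y$ and assuming that the coefficient of the second lag is $-1$ (the sign that makes the critical damping claim consistent). The symbols $a_1$ and $a_2$ are introduced here for illustration only:

```latex
\begin{align*}
y_k &= a_1\, y_{k-1} + a_2\, y_{k-2} + \dots, \qquad a_1 \approx 2,\quad a_2 \approx -1 \\
z^2 - a_1 z - a_2 = 0 \;&\Rightarrow\; z_{1,2} = \tfrac{1}{2}\left(a_1 \pm \sqrt{a_1^2 + 4 a_2}\right) \\
a_1^2 + 4 a_2 \approx 4 - 4 = 0 \;&\Rightarrow\; z_1 \approx z_2 \approx 1
\end{align*}
```

A vanishing discriminant marks the boundary between two distinct real poles (overdamped) and a complex pair (underdamped), which corresponds to critical damping. Since the parameters are only approximately 2 and -1, the poles lie close to, rather than exactly at, $z = 1$.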
5.7.4 Model validation
To validate the fuzzy model, a validation input-output data set was designed, similar to
the identification data set (see appendix A). Simulation results are shown in figure 5.22 (c)
and (d). Similar to the results for the identification data set, model mismatch occurs for fast
changes. In addition, at around t = 3700 min, large errors are observed for the Kappa number #
and the yield. Here, a combination of input signals is encountered that was not present in the
identification data set, resulting in a large model error. This indicates poor extrapolation
properties. Otherwise, performance is similar to the performance for the identification data set.
[Figure 5.22, four panels: Kappa number # versus t (min) in (a) and (c), yield versus t (min) in (b) and (d)]
Figure 5.22: Fuzzy model simulation results for the identification data set (Kappa number # (a) and
yield (b)) and the validation data set (Kappa number # (c) and yield (d)). Dots: EPM, lines: fuzzy model
[Figure 5.23, two panels: Kappa number # (a) and yield (b) versus Q (kJ/min)]
Figure 5.23: Static extrapolation results fuzzy model for the Kappa number # (a) and the yield (b).
Lines indicate identification domain
5.8 Fuzzy model analysis
The fuzzy model will be analyzed with respect to the same aspects as the hybrid model:
static and dynamic performance, complexity, interpretability and process independence. This
makes comparison between the results possible. Of the quality aspects, the performance is
the most important.
5.8.1 Static performance
Figure 5.23 shows the performance of the fuzzy model as a function of $Q_h$. The liquor
flow $\phi_l$ was kept constant at the standard operating value (ID5). For both the Kappa number
and the yield, the fuzzy model performs well within the identification domain. Outside this
domain, the extrapolation behavior is approximately linear. This is mainly caused by the
linear nature of the identification data with respect to $Q_h$ in the identification domain. In
addition, if the model is extrapolated far enough, only one of the fuzzy rules will become
active. This means that an output variable is described by one linear ARX model. For the
Kappa number #, this results in negative values for high $Q_h$. This shows the black box nature
of the model; no information is incorporated stating that the Kappa number cannot become negative.
Evaluating the static performance of the fuzzy model as a function of $\phi_l$ (figure 5.24) gives
similar results. In the identification domain, the fuzzy model performs well. The fuzzy model
shows linear extrapolation behavior for changes in the liquor flow rate, mainly caused by the
linear nature of the data in the identification domain.
The results show that the static extrapolation performance of the fuzzy model for the yield
is comparable to the performance of the hybrid model. For the Kappa number, the hybrid
model performs slightly better, especially for high values of $Q_h$.
[Figure 5.24, two panels: Kappa number # (a) and yield (b) versus liquor flow (m^3/min)]
Figure 5.24: Static extrapolation results fuzzy model for the Kappa number # (a) and the yield (b).
Lines indicate identification domain
5.8.2 Dynamic performance
The behavior of the fuzzy model for a step change in $Q_h$ is different from the behavior
of the EPM (figure 5.25). The step was performed at t = 0 min, while the initial conditions
were taken from experiment ID5. A steady state offset can be observed, while the model
gain is approximately the same as the gain of the EPM. The fuzzy model shows overshoot;
the response is underdamped.
The results for a step change in the liquor flow $\phi_l$ are given in figure 5.26. The step was
performed at t = 0 min and the initial conditions were taken from ID5. For these conditions,
the steady state performance of the fuzzy MISO model for the yield is good (as was already
observed in figure 5.24). Table 5.17 shows the amplitudes of the Kappa number # and the yield.
Values for a positive and a negative step change, respectively, are given. The gains of the
fuzzy model are smaller than the gains of the EPM. The gains for step changes in $\phi_l$ are
approximately equal for positive and negative steps. This indicates that probably one of the
rules is active under those conditions. As a result, the behavior is given by one of the local
linear second order ARX models.
[Figure 5.25, two panels: Kappa number # (a) and yield (b) versus t (min)]
Figure 5.25: Fuzzy model step response results for the Kappa number # (a) and the yield (b) for
positive and negative step change in $Q_h$. Dots: EPM, lines: fuzzy model
[Figure 5.26, two panels: Kappa number # (a) and yield (b) versus t (min)]
Figure 5.26: Fuzzy model step response results for the Kappa number # (a) and the yield (b) for
positive and negative step change in $\phi_l$. Dots: EPM, lines: fuzzy model
Frequency response analysis shows that there are considerable differences between the system
orders O and time constants $\tau$ of the EPM and the fuzzy model (table 5.17). The system
orders and time constants were determined using the Bode diagrams. During model design,
different model orders were investigated and the second order structure provided the best
results. The simple model structure does not allow the orders that were observed from the EPM
to be described accurately. The fuzzy model imposes second order behavior for step changes
in both $Q_h$ and $\phi_l$. However, in the EPM, the order of the behavior for a step change in
$Q_{lh}$ is different than for $\phi_f$.
The fuzzy model describes the identification and validation data sets sufficiently accurately.
However, frequency response analysis reveals that there are differences between the dynamic
characteristics of the EPM and the fuzzy model. These differences will be more apparent for
high frequencies. To improve the dynamic characteristics of the fuzzy model, more information
needs to be incorporated during model identification.
5.8.3 Complexity, interpretability and process independence
The structure of the fuzzy model is very simple. It consists of two MISO fuzzy ARX
models, each with four inputs. The models represent a collection of linear ARX models that
describe the Kappa number and the yield as a function of the liquor flow and the heat supply
of the heater.

Model         Input      Order   Order   Time constant   Time constant   Dead time   Amplitude       Amplitude
                         Kappa   yield   Kappa (min)     yield (min)     (min)       Kappa           yield
EPM           $Q_{lh}$   2.3     2.3     42              43              87          -5.51, 6.17     -0.022, 0.022
EPM           $\phi_f$   2.9     3.3     35              35              113         4.71, -4.06     0.019, -0.020
Fuzzy model   $Q_h$      2.1     2.0     52              58              87          -4.54, 4.98     -0.018, 0.022
Fuzzy model   $\phi_l$   2.0     2.0     50              50              113         3.70, -3.68     0.019, -0.020
Table 5.17: Frequency response analysis results fuzzy model

                       Model                   Application
Quality aspect         EPM   Hybrid   Fuzzy    R&D   Design   Control
Static performance + +
Dynamic performance ++ 0
Complexity 0 + ++
Interpretability ++ ++
Process independence ++ 0
Table 5.18: Comparison of model quality aspects, ranging from poor quality (--) to good quality
(++)
Obviously, the simple structure does not provide much information about the physical processes
that take place. Interpretation of the fuzzy model can only be done in terms of input-output
behavior. With the EPM and the hybrid model, the model structure provided much
information about the relation of physical phenomena. Since the fuzzy model maps the process
inputs directly to the process outputs, such a structure is not present.
Since the information that was used to build the fuzzy model only consisted of process data,
the model is very process dependent. As is common for these black box models, transportability
is very limited. The performance analysis has already shown that there are differences
in the dynamic characteristics of the fuzzy model and the process it was built for. As
a result, the fuzzy model is only suitable for application under the process conditions that
are represented by the identification data. It is easy to improve performance by providing
more information during model identification, for example by providing additional measurements.
This, however, yields a new fuzzy model.
5.9 Model evaluation
In order to evaluate the properties of the hybrid model of the pulp digester, the modeling
results are summarized in table 5.18. This table shows how well each of the models performs
with respect to one of the five quality aspects. In addition, the relevance of a quality aspect
with respect to a general field of application is given (see also table 3.1).
The static performance of the hybrid and fuzzy model is comparable. Both describe the key
variables well within the identification domain; the fuzzy model performs slightly better for
the Kappa number # than the hybrid model. The fuzzy model is fitted directly to the Kappa
number and the yield and has, because of the high number of parameters in comparison to the
model equations, the flexibility to describe the key variables well for steady state conditions.
In addition, it is an input-output mapping, which means that error integration is not present,
as is the case for the hybrid model.
Dynamically, the hybrid model performs better than the fuzzy model. Time constants and
system orders match the EPM better than the fuzzy model. This is the result of the inclusion of
first principles information for the dynamics. In the hybrid model, the key variables are based
on the concentrations of several species, which in turn are described by physical phenomena.
This way, the observed behavior is matched better. The MIMO reference system shows
different orders for changes in the inputs (a change in heat supply $Q_h$ results in a second
order response, a change in the liquor flow rate results in a third order response), which
is accounted for in the hybrid model. The simple structure of the fuzzy model is not able
to describe this behavior, although the performance with respect to the identification and
validation data is acceptable.
Obviously, the hybrid model is less complex than the EPM and more complex than the fuzzy
model. The result is that the hybrid model is less interpretable than the EPM, but provides
more information about the behavior than the fuzzy model does. Although the fuzzy model
is transparent, it provides no physical information about the internal behavior. The hybrid
model can be interpreted physically; the observed behavior can be explained with the model
structure, in which the relation between the modeled physical phenomena is represented.
The level of interpretability of both the EPM and the hybrid model is comparable. In both
models, the structure provides the most explanation of the behavior. The model equations
provide information about the relative importance of the modeled effects. In both models,
some of these equations are based on empirical data. The reaction rates in the EPM are based
on empirical data, while the reaction rates in the hybrid model are derived from observed
behavior. The level of interpretability is limited by the information that the specific equations
provide.
Since part of the identification of the hybrid and fuzzy models was data driven, the models are
more process dependent than the EPM. The fuzzy model is the most process dependent, since
the observed behavior was the only information that was provided during identification. The
hybrid model contains more information from other sources, such as first principles, which
results in better transportability. The use of other sources in addition to process data also
means that the data does not have to provide full information about the behavior. If the model
is used for other installations, the fuzzy submodels may need to be adjusted, but the structure
can remain unchanged.
The level of process dependence, although limited, makes the hybrid model less useful for
process or equipment design studies on a detailed level. This is also true for research and
development applications. A hybrid model can be very useful in studies that have the goal
to gain more insight in the behavior of processes, but should not be used for applications
that require models that are process independent. Regarding control applications, the hybrid
model can be useful for specic control design or optimization problems. For use of the
model in online control applications, the computational effort should be carefully evaluated.
The level of complexity determines the amount of information that a model provides, but
also the computational effort that is required to solve it, which may be limited in online
applications.
The evaluation indicates that hybrid models are suited for applications that require models
that are transparent, provide sufficient explanation of the process behavior and focus on a
specific installation. Whether this is in the field of R&D, design or control, entirely depends
on the application.
5.10 Concluding remarks
The hybrid model of the continuous pulp digester meets the requirements that were set.
The behavior of unknown parts was derived from the observed behavior without making
a priori assumptions about the internal structure of the submodels. In addition, since the
fuzzy relations are static, there is no need for dynamic data during the identification of these
unknown parts.
The hybrid model contains more information than the complete fuzzy model. As a result,
it possesses better extrapolation properties. Although the error increases during extrapola-
tion, the hybrid model performs better than the fuzzy model. This was observed during the
evaluation of the static and dynamic performance.
The hybrid model was designed by incorporating the essential characteristics of the process,
given the model requirements. The number of states has been reduced from 19 to 13, while
the number of model sections has been reduced from 10 to 4. This yields a reduction in model
complexity of roughly 30 %. Still, the behavior of the hybrid model approaches the behavior
of the EPM closely. This is the result of the incorporation of the most important states that
describe the physical background of the key variables. It also provides an explanation for the
behavior of the model. If these states are not taken into account, it is not possible to combine
similar performance with a similar level of interpretation.
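As a quick check of the quoted figure (an interpretation, since the text does not state how the 30 % was computed), the relative reductions in states and in model sections are:

```latex
\begin{align*}
\text{states:}\quad & \frac{19 - 13}{19} \approx 0.32 \\
\text{sections:}\quad & \frac{10 - 4}{10} = 0.60
\end{align*}
```

The value of roughly 30 % therefore appears to correspond to the reduction in the number of states.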
It is difficult to determine exactly for which applications hybrid models can be beneficial
and for which not. The enormous amount of literature about process modeling illustrates
that each modeler presents a different approach for each application. This chapter has tried
to provide more insight in general aspects of hybrid modeling and hybrid model properties.
This has been done by presenting a detailed discussion on the design of a hybrid model and by
comparing this model to two other models. It thus provides a basis for modelers to determine
whether hybrid modeling is a suitable approach for their modeling problem.
The next chapter will deal with the development of a hybrid model for an experimental pro-
cess unit setup. Although the hybrid model will be simpler than the model for the pulp
digester, it will investigate the performance of the modeling approach in an actual process
environment where limited amounts of data are available.
6 Application
The aim of this chapter is to build a hybrid model, based on experimental data. A suitable
process, of which measurements were available, is a batch distillation column. In addition,
dynamic modeling of this column has been applied to characterize column operation (Betlem
et al., 1998). It is therefore interesting to see if a hybrid model can be applied for such a
purpose.
Two different hybrid models will be developed for the column. The models differ with respect
to the dynamics that are incorporated. The goal of this exercise is to illustrate the development
of the minimal dynamic framework to obtain a useful dynamic description. The best model
will be used to determine the optimal production rate for a batch.
First, the application of the hybrid model will be discussed, from which the key variables and
model requirements will be derived. Then, the two different hybrid models will be designed
and compared. Finally, the best model will be applied in a simulation study to determine the
optimal production rate.
6.1 Problem definition
An important parameter during batch operation is the duration of a batch or the batch
time. Often, the objective is to maximize the production within a certain amount of time, for a
given quality. A model that calculates the production as a function of the batch time provides
insight in this problem. Several approaches have been presented for the column under study,
ranging from rigorous to simplified reduced models. A simplified model has been used to
evaluate different basic control policies for a single distillation cut (which means distillation
of the feed stock without slop recycle) (Betlem et al., 1998):
- Constant reflux control
- Constant quality control
- Dynamic optimization according to Pontryagin's maximum principle
The model that was used does not take the start-up of the column into account. Start-up of
a batch requires about 20 to 30 min, while a typical batch run for a single cut takes about 3
h. This means that start-up comprises a significant part of the batch run. A model that can
describe the start-up of the column would therefore be beneficial for optimization studies.
Measurements of several batch runs with constant quality control were available. These in-
volve the separation of ethanol and 1-propanol for different feed stocks and equal product
qualities. It is assumed that for this separation, constant quality control is equivalent to dy-
namic optimal control (Betlem et al., 1998). The data includes a description of the start-up of
the column. Based on this data, a hybrid model will be developed that describes the produc-
tion for one quality and for the constant quality control strategy during start-up and normal
operation.
The measurements limit the level of detail of the hybrid model. No internal data is available,
only input-output measurements. This means that the hybrid model will be similar to the
reduced models in Betlem (1997) that describe overall dynamic behavior.
6.2 Hybrid model design
The design phase will follow the approach presented in chapter 4. First, the model structure
will be determined, after which the data is pre-processed and the parameters and fuzzy
models are identified. Subsequently, the fuzzy submodels will be integrated into the first
principles framework.
6.2.1 Basic modeling
Process description and information analysis
The experimental setup involves a batch distillation column with 21 bubble cap trays with
an internal diameter of 76 mm and a Murphree vapor tray efficiency of about 53 %. The only
manipulated variable is the reflux fraction R
[Figure 6.2, three panels versus t (h): (a) axis label not recoverable, (b) R*, (c) L_V (mol/h)]
Figure 6.2: Available measurements, run ID1
Process structure and basic equations
To describe the production dynamically, the overall dynamics of the column should be
included in the model. The short term dynamics are less important; they describe the tray
to tray behavior inside the column. In addition, the measurements do not provide sufficient
information to include short term dynamics.
To describe the overall dynamics of the column, a simplified model comprised of an overall
separation approximation, combined with first order dynamics for the exhaustion, has been
derived (Betlem, 1997):
$$\frac{dM_{col}}{dt} = -L_V\,(1 - R^*) \tag{6.1}$$
$$M_{col}\,\frac{dx_{col}}{dt} = -L_V\,(1 - R^*)\,(x^0_{n+1} - x_{col}) \tag{6.2}$$
$$x^0_{n+1} = f(R^*, x_{col}) \tag{6.3}$$
in which $M_{col}$ (mol) is the mass of the column, $L_V$ (mol/h) is the vapor flow rate,
$L_V(1 - R^*) = L_D$ (mol/h) is the distillate flow rate, $x_{col}$ is the average molar ethanol
fraction in the column, $x^0_{n+1}$ is the molar ethanol fraction of the product and $R^*$ is the
reflux fraction.
$$\frac{dL_V}{dt} = \frac{1}{\tau_V}\,(L_{V,ss} - L_V) \tag{6.5}$$
in which $\tau_V$ is the first order time constant that characterizes column heating and $L_{V,ss}$ is
the steady state vapor flow for the given heat supply (this is an approximation, since the
vapor flow changes during the batch run due to exhaustion). The actual behavior is more
complex, as explained before. After the vapor flow reaches the condenser, the behavior of
the flow can be approximated by the first order equation. For this approximation, an estimate
for the initial condition $L_{V,0}$ should be used such that the equation matches the observed
behavior. The start-up behavior of the model before the vapor flow reaches the condenser is
not incorporated and not represented by the measurements. As a result, the model describes
only the final stage of the start-up procedure.
Equation 6.3 describes the separation as a function of the reflux fraction and the column
quality. This relation is difficult to derive. Relations based on a so-called separation factor
have been used, but they only approximate the separation for steady state conditions as they
are valid for continuous operation (Shinskey, 1984).
Based on the process data, a fuzzy relation that describes the separation can be derived. Since
it is derived from the data, it can match the observed behavior without imposing a functional
relation a priori, as is the case with empirical separation factor relations. In addition, the
influence of the start-up on the product quality can be taken into account by incorporating
the vapor flow in the input space of the fuzzy model. This way, a hybrid model is obtained
that includes first order exhaustion dynamics, first order start-up dynamics and a static fuzzy
relation that describes the product quality. The structure of the fuzzy relation is therefore
given by:
$$x^0_{n+1} = f_{\text{fuzzy}}(R^*, x_{col}, L_V) \tag{6.6}$$
This model does not take top concentration dynamics, characterized by the dominant time
constant of the column (Betlem, 2000), into account. It is interesting to investigate to what
extent incorporating column dynamics influences the result. This illustrates the effect that
omitting characteristic dynamic information has on model performance. To incorporate the
dominant column dynamics, a first order state equation for the product quality is used:
$$\frac{dx^0_{n+1}}{dt} = \frac{1}{\tau_x}\,(x^* - x^0_{n+1}) \tag{6.7}$$
in which $\tau_x$ is the dominant time constant and $x^*$ is given by:
$$x^* = f_{\text{fuzzy}}(R^*, x_{col}, L_V) \tag{6.8}$$
Equation 6.8 can be interpreted as the forcing function for the product quality.
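How the state equations and the static relation fit together can be sketched with an explicit Euler integration. This is a minimal illustration, not the implementation used in this work: it assumes the signs and first order forms of equations 6.1, 6.2, 6.5 and 6.7 as read above, `f_sep` stands in for the fuzzy relation, and the default value of `tau_x` as well as all function and argument names are hypothetical.

```python
import numpy as np

def simulate_column(f_sep, reflux, M0, x0, xp0, LV0,
                    LV_ss=166.0, tau_V=0.2, tau_x=0.1, dt=0.001, t_end=2.0):
    """Euler integration of the simplified overall dynamic model (time in h).

    Returns an array with columns M_col, x_col, product quality and L_V.
    """
    n = int(round(t_end / dt))
    M, x_col, x_prod, LV = M0, x0, xp0, LV0
    out = np.empty((n + 1, 4))
    out[0] = (M, x_col, x_prod, LV)
    for i in range(n):
        R = reflux(i * dt)                     # manipulated reflux fraction
        LD = LV * (1.0 - R)                    # distillate flow (mol/h)
        dM = -LD                               # column holdup balance
        dx_col = -LD * (x_prod - x_col) / M    # ethanol balance of the column
        dLV = (LV_ss - LV) / tau_V             # first order start-up dynamics
        # First order lag driven by the static (fuzzy) forcing function.
        dx_prod = (f_sep(R, x_col, LV) - x_prod) / tau_x
        M, x_col = M + dt * dM, x_col + dt * dx_col
        LV, x_prod = LV + dt * dLV, x_prod + dt * dx_prod
        out[i + 1] = (M, x_col, x_prod, LV)
    return out
```

Setting the product quality equal to `f_sep(R, x_col, LV)` at every step, instead of integrating the lag, gives the variant without top concentration dynamics.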
Run       $\tau_V$ (h)   $L_{V,0}$ (mol/h)   $L_{V,ss}$ (mol/h)
ID1       0.2            120                 165
ID2       0.2            120                 165
ID3       0.2            120                 168
Average   0.2            120                 166
Table 6.2: Estimation results parameters start-up dynamics
The result is two different hybrid model structures: a model that does not incorporate top
concentration dynamics, denoted as the simplified overall static model, given by equations 6.1,
6.2, 6.5 and 6.6, and a model that incorporates top concentration dynamics, denoted as the
simplified overall dynamic model, given by equations 6.1, 6.2, 6.5, 6.7 and 6.8. Appendix F
shows the DFDs of both structures. Both models will be built and compared.
6.2.2 Simplified overall static model identification
Identification of the simplified overall static model involves the determination of \tau_V,
L_{V,0} and L_{V,ss}, and identification of the fuzzy relationship for the product quality
(equation 6.6).
Start-up dynamics
The time constant \tau_V and the steady state value L_{V,ss} were determined by fitting
equation 6.5 to the measurements of L_V. In addition, the initial condition L_{V,0} was
determined to provide a better approximation. Results are shown in figure 6.3. Table 6.2 shows
the values of \tau_V, L_{V,0} and L_{V,ss} for the three identification runs. The final values
that will be used in the model are the averages of the values for the three runs.
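As an illustration of how the averaged parameters could be used, the sketch below evaluates a first order start-up response for the vapor flow. Equation 6.5 is not reproduced in this section, so the exponential form used here is an assumption (the standard first order step response); the default values are the averages from table 6.2.

```python
import math


def vapor_flow(t, tau_v=0.2, l_v0=120.0, l_vss=166.0):
    """Assumed first order start-up response for L_V (mol/h), with t in hours."""
    return l_vss + (l_v0 - l_vss) * math.exp(-t / tau_v)


# Vapor flow at a few points in time after start-up.
profile = [(t, vapor_flow(t)) for t in (0.0, 0.2, 0.5, 1.0, 2.0)]
```

After one time constant the response has covered about 63 % of the distance between the initial and the steady state value.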
Product quality
For the identification of the fuzzy submodel, R, x_{col}, L_V and x^{0}_{n+1} need to be
available. The reflux fraction R, the vapor flow L_V and the product quality x^{0}_{n+1} are
measured. The column quality x_{col} can be obtained by solving equations 6.1 and 6.2. This is
done for each of the three identification runs. The combined runs result in an input-output
data set of 512 features.
The fuzzy model was identified using GK-clustering in combination with structure optimization
(see section 4.4.1). No data reduction was applied; it was found that data reduction did
not improve clustering results. Clustering settings and results are shown in table 6.3. The
table also gives the Root Mean Squared Error with respect to the identification data. The
fuzzy model is given in appendix F.
Figure 6.3: Model results for L_V (mol/h) as a function of t (h), run ID1 (a), ID2 (b) and ID3 (c)
Model validation
To analyze the performance of the model, the model will be solved for the initial conditions
of the identification and validation runs. The simulation will use the measurements of
the manipulated variable R
Figure 6.4: Model validation results for x^{0}_{n+1} as a function of t (h) of the simplified
overall static model. Dots indicate measurements, line indicates hybrid model results for
ID1 (a), ID2 (b), ID3 (c), VAL1 (d) and VAL2 (e)
Batch    K    \tau_I
ID1      5    0.04
ID2      5    0.07
ID3      4    0.1
Table 6.4: PI-estimator settings for x^{*}
, x_{col}, L_V and x
well, as can be seen in appendix F. This error is integrated and the result is that x^{0}_{n+1}
lags behind, as is the case for ID1 and VAL2. The overshoot in x^{0}_{n+1} is described slightly
better than with the static model. In addition, the simulation results for x^{0}_{n+1} contain
less noise. This is the result of the dynamic equation for the product quality, which has a
filtering effect.
Model    k 0 cm    # rules    RMSE
x^{*}    10 0.6    3          3.7e-3
Table 6.5: Clustering settings and fuzzy model results for x^{*}
Figure 6.5: Estimation results of x^{*} as a function of t (h) for batch run ID1 (a), ID2 (c)
and ID3 (e), and corresponding estimates of x^{0}_{n+1} ((b), (d) and (f), respectively).
Figure 6.6: Model validation results for x^{0}_{n+1} as a function of t (h) of the simplified
overall dynamic model. Dots indicate measurements, line indicates hybrid model results for
ID1 (a), ID2 (b), ID3 (c), VAL1 (d) and VAL2 (e)
6.3 Hybrid model evaluation
Both the static and dynamic models describe the overall behavior of the distillation column
and have a simple structure, which makes detailed evaluation of model interpretability
and complexity of less interest. The focus of the evaluation phase will therefore be on the
application of the models; the models will be used to determine the production curve, which
describes M_p in relation to the batch time t_b.
6.3.1 Production curve
Evaluating the production as a function of the batch time results in a production curve
(see also Rippin (1983)). The curve represents the amount of product produced as a function
of the duration of a batch. To construct such a curve, it is required that each batch produces
product with the same quality. Since the hybrid models calculate the production under constant
quality control, the product quality is constant over the course of a batch. A production
curve can then simply be constructed by calculating the production during the course of a
single batch run.
The production curves will be determined by simulating batch runs using the constant quality
control strategy. To accomplish this, a simple PI controller is constructed that controls the
quality x^{0}_{n+1}
by manipulating R
= 1.
Production curves were constructed for the conditions of the three identification batch runs.
This enables comparison to the experimental results. Both hybrid models were used.
Controller settings for both models are given in table 6.6. Figure 6.7 shows controller
performance for the conditions of batch ID1; in both cases, the quality x^{0}_{n+1} is
controlled well. The dynamic model shows some overshoot. The presence of overshoot was found to
be independent of controller tuning, which indicates that it is the result of general model
behavior, similar to the experimental results.
The controller gain K is negative for the static model. This means that in order to increase
product quality, the reflux fraction R
) are negative,
which locally results in an increase of x^{0}_{n+1} if R is decreased.
Model      K      \tau_I
Static     -50    0.1
Dynamic    100    0.1
Table 6.6: PI-controller settings for x^{0}_{n+1}
Figure 6.7: Quality control results (x^{0}_{n+1} as a function of t (h)) for the static (a) and
dynamic (b) model, settings run ID1.
The negative parameters are the result of the fit to the experimental data. The fuzzy model
maps the reflux fraction, the column composition and the vapor flow to the product quality.
With the exception of start-up, the product quality x^{0}_{n+1} is constant throughout the
batch run. This means that the fuzzy model essentially describes a hyperplane with slope 0.
Due to the noise that is present, the least squares fit results in a slightly negative slope
with respect to R.
Although it was found that the controller of the static model achieves constant quality, the
reflux fraction R
, x_{col} and L_V. This can clearly be seen during start-up. Consider figure 6.2. In
figure 6.2 (b), the reflux fraction R
.
The result is that the dynamic model approximates the experimental situation much more
accurately. In addition, the extrapolation behavior (t_b > 3 h) is in accordance with the
expectation that the maximum exhaustion is below its theoretically achievable value.
The dynamic model includes the top concentration dynamics using a first order approximation
characterized by the dominant time constant. An alternative would be to derive the
dynamic behavior of the product quality from experimental data and incorporate this behavior
in the form of a dynamic fuzzy submodel. The structure of such a hybrid model is similar to
the structure of the static model and omits the need for a first order approximation. However,
as shown in chapter 5, the experimental data needs to contain sufficient information about
the dynamics in order to obtain a good model. In this case, experimental data is limited and
noisy. The incorporation of the dominant time constant can be viewed as the use of an
additional source of information that provides sufficient information about the dynamics
without the need for more experimental data.
Figure 6.8: Production curves (M_p (mol) as a function of t (h)) for the conditions of batch
run ID1 (a), ID2 (b) and ID3 (c). Solid line: experimental results, dotted line: static model,
dashed line: dynamic model
The results show that the performance of the static model is acceptable if the desired behavior
of R is imposed on the model. During model validation, the desired behavior of the
reflux fraction R
\mu_S : X \rightarrow \{0, 1\}    (A.1)
such that for any element x of the universe, \mu_S(x) = 1 if x is a member of S and
\mu_S(x) = 0 if x is not a member of S. Figure A.1 illustrates this.
A fuzzy subset is described by a characteristic function that differs from equation A.1 in that
it is extended from the unit binary pair {0, 1} to the unit interval [0, 1]:
\mu_S : X \rightarrow [0, 1]    (A.2)
Equation A.2 is called a membership function. The membership function describes to which
degree an element is a member of the fuzzy subset. Let S be a fuzzy subset of the universe
of discourse X. The membership of S for an element x is calculated with the membership
function for S and is denoted the membership grade \mu_S:
\mu_S = mf_S(x)    (A.3)
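The difference between a crisp characteristic function and a membership function can be sketched as follows. The interval bounds and the triangular shape are illustrative assumptions; the text does not prescribe a particular shape for the membership function.

```python
def crisp_membership(x, lower, upper):
    """Characteristic function of a crisp set: the result is either 0 or 1."""
    return 1.0 if lower <= x <= upper else 0.0


def triangular_membership(x, a, b, c):
    """Membership grade in [0, 1] for a triangular fuzzy set with a < b < c."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


# Hypothetical universe of discourse [0, 10] with the fuzzy set centered at 5.
grades = [triangular_membership(x, 0.0, 5.0, 10.0) for x in (0.0, 2.5, 5.0, 7.5, 10.0)]
```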
Fuzzy logic
y = \frac{1}{m} \sum_{y_i \in G} y_i    (A.9)
in which G is the set of elements which attain the maximum value of the membership grades
\mu_{F,y_i} of the fuzzy subset F and m is the cardinality of G.
The MOM method does not take the shape of the fuzzy subset into account. The Center Of
Area method does not have this problem. COA defuzzification is calculated as follows:
y = \frac{\int y \, \mu_{F,y} \, dy}{\int \mu_{F,y} \, dy}    (A.10)
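Both defuzzification methods can be sketched on a sampled output universe. The sketch assumes a uniformly spaced grid, so that the integrals of the COA method reduce to sums in which the grid spacing cancels; the function names and sample values are hypothetical.

```python
def mean_of_maxima(ys, mus):
    """MOM: average of the elements that attain the maximum membership grade."""
    peak = max(mus)
    g = [y for y, mu in zip(ys, mus) if mu == peak]
    return sum(g) / len(g)


def center_of_area(ys, mus):
    """COA on a uniform grid: membership-weighted average of the elements."""
    total = sum(mus)
    if total == 0.0:
        raise ValueError("fuzzy subset is empty")
    return sum(y * mu for y, mu in zip(ys, mus)) / total


ys = [0.0, 1.0, 2.0, 3.0, 4.0]
mus = [0.0, 0.5, 1.0, 1.0, 0.0]
y_mom = mean_of_maxima(ys, mus)   # only the two maxima at 2.0 and 3.0 count
y_coa = center_of_area(ys, mus)   # the grade 0.5 at 1.0 pulls the result down
```

In this example the COA result is lower than the MOM result because COA also weighs the element with membership grade 0.5.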
A.2 Fuzzy models
Fuzzy models, a representation of the essential features of a system by the apparatus of
fuzzy set theory, basically fall into two categories, which differ fundamentally in their ability
to represent different types of information. The first includes linguistic models, of which the
Mamdani-type are the most common. These models are based on collections of IF-THEN
rules with vague predicates and fuzzy reasoning.
The second category of fuzzy models is based on the Takagi-Sugeno-Kang (TSK) method of
reasoning (Takagi and Sugeno, 1985)¹. These models are formed by logical IF-THEN rules
that have a fuzzy antecedent part and a functional consequent part. TSK models integrate
the ability of linguistic models for qualitative knowledge representation with an effective
potential for expressing quantitative knowledge as well.
A.2.1 Linguistic models
Generally speaking, linguistic models describe systems using a set of IF-THEN rules, in
which the rules take the place of the usual set of equations used to characterize a system. The
linguistic model is a knowledge-based system; it contains rules which incorporate inherently
fuzzy or effectively fuzzifiable real-world knowledge. The decision making ability of the
linguistic model depends on the existence of a rule-base and fuzzy reasoning mechanism.
Consider a double-input single-output system. Knowledge about this system can be encoded
by a set of IF-THEN rules with two antecedent variables and one consequent variable:
IF u_1 = mf_{u_1,i} AND u_2 = mf_{u_2,j} THEN y = mf_{y,k}    (A.11)
in which the fuzzy sets are represented by the membership functions mf_{\cdot,\cdot}. The
IF-part of a rule is called the premise or antecedent part of the rule, the THEN part is called
the consequent part.
The number of rules and the structure depend on the knowledge that the model represents.
The maximum number of rules of the fuzzy model is determined by the maximum number of
combinations of the antecedent membership functions. Consider the rule in equation A.11.
If 3 fuzzy sets for both u_1 and u_2 are available, the maximum number of rules for the fuzzy
model is 9. If a fuzzy set occurs in more than one rule, the rules are dependent; changing the
fuzzy set will affect the behavior of several rules. If a fuzzy set occurs only in one rule, the
rules are independent.
¹ In the literature, fuzzy models based on the TSK method of reasoning are called TSK models,
Sugeno models or TS models. They all denote the same type of model.
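The maximum number of rules follows from enumerating all combinations of antecedent fuzzy sets. A minimal sketch, with hypothetical linguistic labels for the fuzzy sets:

```python
from itertools import product

# Hypothetical labels for the 3 fuzzy sets available for each antecedent variable.
sets_u1 = ["low", "medium", "high"]
sets_u2 = ["low", "medium", "high"]

# Every combination of one set for u_1 and one set for u_2 is a possible premise.
premises = list(product(sets_u1, sets_u2))
max_rules = len(premises)
```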
A.2.2 Fuzzy inference
The following algorithm is used for calculation of the output that is inferred by a
linguistic model, via the Mamdani method. Assume a linguistic model for a 2-by-1 system as
given in equation A.11 with 2 independent rules:
IF u_1 = mf_{u_1,1} AND u_2 = mf_{u_2,1} THEN y = mf_{y,1}
IF u_1 = mf_{u_1,2} AND u_2 = mf_{u_2,2} THEN y = mf_{y,2}    (A.12)
Also assume a crisp input u_1 = x_1 and u_2 = x_2. The first step is to determine the influence
the different rules of the model have on the output of the model. The influence of a rule i is
measured by its degree of firing (DOF) \mu_{DOF,i}, which represents the degree of truth of the
premise part of a rule, given the premise u_1 = x_1 and u_2 = x_2. The DOF is calculated as:
\mu_{DOF,i} = mf_{u_1,i}(x_1) \wedge mf_{u_2,i}(x_2)    (A.13)
The DOF of a rule can be used to determine the fuzzy set for the output of that rule, given the
premise u_1 = x_1 and u_2 = x_2. This is the inference step:
mf_{y,i}(y) = \mu_{DOF,i} \wedge mf_{y,i}    (A.14)
The next step is to aggregate the inferred fuzzy sets mf_{y,i} by using the max operator:
mf_y(y) = \bigvee_{i=1}^{2} mf_{y,i}    (A.15)
The final step involves defuzzification of the aggregated fuzzy set for the output y, which
results in a crisp output value. Figure A.3 shows the inference mechanism.
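The four steps (degree of firing, inference, aggregation, defuzzification) can be sketched for a 2-by-1 system with two rules. The sketch makes several assumptions that the text does not fix: the AND connective and the inference step both use the min operator, defuzzification uses the center of area on a sampled grid, and all membership functions are triangular with hypothetical parameters.

```python
def tri(x, a, b, c):
    """Triangular membership function with a < b < c."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def mamdani(x1, x2, rules, y_grid):
    """Crisp output of a linguistic model; rules are (mf_u1, mf_u2, mf_y) triples."""
    aggregated = [0.0] * len(y_grid)
    for mf_u1, mf_u2, mf_y in rules:
        dof = min(mf_u1(x1), mf_u2(x2))            # degree of firing (min as AND)
        for n, y in enumerate(y_grid):
            inferred = min(dof, mf_y(y))           # inference: clip the consequent
            aggregated[n] = max(aggregated[n], inferred)  # aggregation: max
    total = sum(aggregated)
    if total == 0.0:
        raise ValueError("no rule fires for this input")
    return sum(y * m for y, m in zip(y_grid, aggregated)) / total  # COA


# Hypothetical rule base: "both inputs low -> y small", "both inputs high -> y large".
low = lambda v: tri(v, -10.0, 0.0, 10.0)
high = lambda v: tri(v, 0.0, 10.0, 20.0)
rules = [
    (low, low, lambda y: tri(y, 0.0, 2.0, 4.0)),
    (high, high, lambda y: tri(y, 6.0, 8.0, 10.0)),
]
y_grid = [0.1 * n for n in range(101)]
y_low = mamdani(0.0, 0.0, rules, y_grid)   # only the first rule fires
y_mid = mamdani(5.0, 5.0, rules, y_grid)   # both rules fire with the same DOF
```

Because the two consequent sets are mirror images around y = 5, equal degrees of firing give an output halfway between them.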
Figure A.3: Fuzzy inference mechanism for linguistic models
A.2.3 TSK models
A known disadvantage of the linguistic models is that they do not incorporate specific
knowledge in an explicit form, if that kind of information is available but cannot be
expressed within the framework of fuzzy set theory. This kind of information is often available
in the form of mathematical equations. Sugeno and co-workers proposed an alternative form
of fuzzy reasoning, which provides a possibility to incorporate such information. The TSK
reasoning method is associated with a rule-base of a special format that is characterized by
functional type consequence parts instead of the fuzzy consequence parts used in the
linguistic model. For a 2-by-1 system with two independent rules this can be represented as
follows:
IF u_1 = mf_{u_1,1} AND u_2 = mf_{u_2,1} THEN y = a_1 u_1 + b_1 u_2 + c_1
IF u_1 = mf_{u_1,2} AND u_2 = mf_{u_2,2} THEN y = a_2 u_1 + b_2 u_2 + c_2    (A.16)
In this example, the consequent part incorporates linear equations, but in essence, any
function can be incorporated. The crisp output y, inferred by the fuzzy model, is defined as the
weighted average of the crisp outputs y_i of each rule:
y = \frac{\sum_{i=1}^{2} \mu_{DOF,i} \, y_i}{\sum_{i=1}^{2} \mu_{DOF,i}}    (A.17)
The great advantage of TSK models is the power to describe complex technological pro-
cesses. The problem is decomposed into simpler sub-systems, in many cases linear.
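The weighted average of the rule outputs can be sketched for a 2-by-1 system with two rules and linear consequents. The min operator for the AND connective, the triangular membership functions and all parameter values are assumptions chosen for illustration.

```python
def tri(x, a, b, c):
    """Triangular membership function with a < b < c."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def tsk_output(u1, u2, rules):
    """Crisp TSK output; rules are (mf_u1, mf_u2, (a, b, c)) triples."""
    numerator = 0.0
    denominator = 0.0
    for mf_u1, mf_u2, (a, b, c) in rules:
        dof = min(mf_u1(u1), mf_u2(u2))     # degree of firing (min as AND)
        y_i = a * u1 + b * u2 + c           # linear consequent of the rule
        numerator += dof * y_i
        denominator += dof
    if denominator == 0.0:
        raise ValueError("no rule fires for this input")
    return numerator / denominator


# Hypothetical rule base with two local linear models.
low = lambda v: tri(v, -10.0, 0.0, 10.0)
high = lambda v: tri(v, 0.0, 10.0, 20.0)
rules = [
    (low, low, (1.0, 1.0, 0.0)),
    (high, high, (2.0, 0.0, 1.0)),
]
y_blend = tsk_output(5.0, 5.0, rules)   # both rules fire with DOF 0.5
```

At the input (5, 5) the local models give 10 and 11, and the equal degrees of firing blend them to the midpoint.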
B
Bioreactor model
c_{Lm}    = 0.0084 h^{-1}
K         = 0.01 h^{-1}
K_I       = 1.0 g/l
K_L       = 0.05
K_p       = 0.0001 g/l
K_X       = 0.3
m_{xm}    = 0.029 h^{-1}
q_{pm}    = 0.004 h^{-1}
Y_{p/s}   = 1.2
Y_{x/s}   = 0.47
\mu_m     = 0.11 h^{-1}
Table B.1: Bioreactor model parameters
This appendix describes the fed-batch bioreactor reference model which is used in chapter 4.
B.1 Model equations
The model describes four states: the biomass concentration X (g/l), the substrate
concentration S (g/l), the product concentration P (g/l) and the volume V (l). The state
equations are given by:
\frac{dX}{dt} = X (\mu - D - c_L)    (B.1)
\frac{dS}{dt} = -\sigma X + (S_f - S) D    (B.2)
\frac{dP}{dt} = q_p X - P (D + K)    (B.3)
\frac{dV}{dt} = F    (B.4)
in which S_f (g/l) is the feed substrate concentration and K (h^{-1}) is the product decay
constant.
Dilution rate:
D = \frac{F}{V}    (B.5)
Growth rate:
\mu = \frac{\mu_m S}{K_X X + 10}    (B.6)
Cell lysis rate:
c_L = \frac{c_{Lm} X}{K_L + X + 1} \exp(-S/100)    (B.7)
Product formation rate:
q_p = \frac{1.5 \, q_{pm} S X}{4 K_p + X S \left( 1 + \frac{S}{3 K_I} \right)}    (B.8)
Variable    Noise amplitude
X           0.2 g/l
S           0.1 g/l
P           0.1 g/l
V           0 g/l
F           0.005 l/h
Table B.2: Process and input noise amplitudes
Substrate consumption rate:
\sigma = \frac{\mu}{Y_{x/s}} + \frac{q_p}{Y_{p/s}} + m_x    (B.9)
with
m_x = \frac{m_{xm} X}{X + 10}    (B.10)
Model parameters are given in table B.1.
For the simulations, a discrete version of the model is implemented. White noise of a certain
amplitude is added to the model process state and input at time step k. Since the state of
time step k is used to calculate the state at time step k + 1, some correlation occurs. Noise
amplitude settings are given in table B.2.
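A discrete version of the model can be sketched with an explicit Euler step. This is an illustration under stated assumptions, not the implementation used for the simulations: the Euler scheme, the omission of the noise terms, the clamp on S, the signs of the terms as read from equations B.1 to B.10, and the feed rate, feed concentration, step size and initial state in the usage lines are all hypothetical. The parameter values are those of table B.1.

```python
import math

# Parameter values from table B.1.
PARAMS = {
    "c_Lm": 0.0084, "K": 0.01, "K_I": 1.0, "K_L": 0.05, "K_p": 0.0001,
    "K_X": 0.3, "m_xm": 0.029, "q_pm": 0.004, "Y_ps": 1.2, "Y_xs": 0.47,
    "mu_m": 0.11,
}


def derivatives(state, feed_rate, s_feed, p=PARAMS):
    """Right-hand sides of B.1 to B.4 for state = (X, S, P, V)."""
    x, s, prod, v = state
    d = feed_rate / v                                                   # B.5
    mu = p["mu_m"] * s / (p["K_X"] * x + 10.0)                          # B.6
    c_l = p["c_Lm"] * x / (p["K_L"] + x + 1.0) * math.exp(-s / 100.0)   # B.7
    q_p = 1.5 * p["q_pm"] * s * x / (
        4.0 * p["K_p"] + x * s * (1.0 + s / (3.0 * p["K_I"])))          # B.8
    m_x = p["m_xm"] * x / (x + 10.0)                                    # B.10
    sigma = mu / p["Y_xs"] + q_p / p["Y_ps"] + m_x                      # B.9
    return (
        x * (mu - d - c_l),              # B.1
        -sigma * x + (s_feed - s) * d,   # B.2
        q_p * x - prod * (d + p["K"]),   # B.3
        feed_rate,                       # B.4
    )


def euler_step(state, feed_rate, s_feed, dt):
    """State at time step k + 1 from the state at time step k."""
    rates = derivatives(state, feed_rate, s_feed)
    x, s, prod, v = (value + dt * rate for value, rate in zip(state, rates))
    return (x, max(s, 0.0), prod, v)     # clamp on S is an added safeguard


# Hypothetical run: 10 h with a constant feed of 0.05 l/h at 200 g/l substrate.
state = (1.0, 0.5, 0.0, 5.0)
for _ in range(100):
    state = euler_step(state, feed_rate=0.05, s_feed=200.0, dt=0.1)
```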
B.2 Notation
D          Dilution rate (h^{-1})
F          Feed flow rate (l/h)
K          Constant (h^{-1})
K_L        Constant (-)
K_X        Constant (-)
P          Product concentration (g/l)
S          Substrate concentration (g/l)
S_f        Substrate concentration in the feed (g/l)
V          Volume (l)
X          Biomass concentration (g/l)
Y_{x/s}    Constant (-)
Y_{p/s}    Constant (-)
c_L        Cell lysis rate (h^{-1})
c_{Lm}     Constant (h^{-1})
m_x        Maintenance energy (h^{-1})
m_{xm}     Constant (h^{-1})
q_p        Product formation rate (h^{-1})
           Net growth rate (hybrid model) (h^{-1})
\mu        Growth rate (h^{-1})
\mu_m      Constant (h^{-1})
\sigma     Substrate consumption rate (h^{-1})
C
Extended Purdue Model
Section    Height (m)    Cross sectional area S (m^2)
A          5.50          18.0
B          1.34          18.0
C          1.83          18.7
D          1.55          18.7
E          9.81          18.7
F          1.46          18.7
G          5.18          18.7
H          1.58          19.9
I          1.74          19.9
J          2.01          21.1
Table C.1: EPM section dimensions
This appendix describes the Extended Purdue Model. The model flowsheet, section
model data flow diagram and model parameters are given. Other information can be found in
Wisnewski et al. (1997).
C.1 Model flowsheet
The model flowsheet consists of a series of tubular reactors or sections that are connected
to form the digester model. Mixers, splitters and heaters are also added. The flowsheet
is shown in figure C.1. Table C.1 shows section dimensions.
C.2 Section model
All model equations of the section model are listed here. Figure C.2 shows the context
diagram of the section model, while figure C.3 shows the Data Flow Diagram for the section
model. This DFD represents the physical effects that are modeled in relation to the states.
Mass balance solid phase:
s,i
t
=
c
S(1 )
s,i
z
+R
s,i
for i = 1, . . . , 5 (C.1)
Mass balance entrapped phase:
e,i
t
=
c
S(1 )
e,i
z
+D
ce
(
f,i
e,i
)
+R
s,i
+
f,i
for i = 1, . . . , 6 (C.2)
Figure C.1: Digester unit operation (a) and model flowsheet (b). The unit operation spans a
height of 0 m to 32 m with impregnation, heat, cooking, wash and cool zones; the flowsheet
consists of sections A to J with upper, lower and wash heaters.
Figure C.2: Digester section model context diagram
Figure C.3: Digester section model Data Flow Diagram (b)
Mass balance free liquor phase:
f,i
t
=
1
S
f,i
f
z
+D
cf
(
e,i
f,i
)
f,i
for i = 1, . . . , 6 (C.3)
Energy balance solid and entrapped phase:
t
(Cp
s
M
s
+Cp
e
M
e
)T
c
=
c
S(1 )
z
(Cp
s
M
s
+Cp
e
M
e
)T
c
+ H
R
5
i=1
R
s,i
+U(T
f
T
c
) +
b
Cp
f
M
f
T
f
+D
ce
D
E
(C.4)
Energy balance free liquor phase:
t
(Cp
f
M
f
T
f
) =
1
S(S)
z
(
f
Cp
f
M
f
T
f
) +
1
U(T
c
T
f
)
+
1
b
Cp
f
M
f
T
f
+D
cf
D
F
(C.5)
Reaction rate:
R
s,i
= e
f
(k
1,i
e,1
+k
2,i
e,1
e,3
)(
s,i
s,i
) for i = 1, . . . , 5 (C.6)
Reaction rate constants:
k_{1,i} = A_{1,i} \exp\left( \frac{-E_{1,i}}{R T_c} \right)    for i = 1, ..., 5    (C.7)
k_{2,i} = A_{2,i} \exp\left( \frac{-E_{2,i}}{R T_c} \right)    for i = 1, ..., 5    (C.8)
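The Arrhenius form of the rate constants can be evaluated with a short sketch. It assumes the usual negative exponent, the activation energy in kJ/mol and the gas constant in kJ/(mol K) as listed in table C.6; the function name is hypothetical, and the example values are the hardwood (s, 1) entries of table C.4 at the lower heater recirculation temperature of table C.5.

```python
import math

R_GAS = 0.0083145  # universal gas constant in kJ/(mol K), table C.6


def rate_constant(pre_exponential, activation_energy, temperature):
    """Arrhenius rate constant; activation energy in kJ/mol, temperature in K."""
    return pre_exponential * math.exp(-activation_energy / (R_GAS * temperature))


# Hardwood, species (s, 1): A_1 = 0.3954 and E_1 = 29.3 kJ/mol, evaluated at 438.64 K.
k_1 = rate_constant(0.3954, 29.3, 438.64)
```

A higher chip temperature gives a larger rate constant, which is what couples the energy balance to the reaction rates.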
Reaction stoichiometry:
R_{e,i} = \sum_{j=1}^{5} b_{i,j} R_{s,j}    for i = 1, ..., 6    (C.9)
Volumetric bulk ow:
b
=
5
i=1
R
s,i
s
for i = 1, . . . , 5 (C.10)
Diffusion coefficient:
D_{ce} = 6.1321 \sqrt{T_c} \exp\left( \frac{-4870}{1.98 \, T_c} \right)    (C.11)
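The temperature dependence of the diffusion coefficient can be checked numerically with a small sketch. It assumes the square root and the negative exponent as read from equation C.11; the function name and the example temperatures (the entering woodchip temperature and the lower heater recirculation temperature of table C.5) are chosen for illustration.

```python
import math


def diffusion_coefficient(chip_temperature):
    """D_ce as a function of the wood chip temperature T_c in K."""
    return 6.1321 * math.sqrt(chip_temperature) * math.exp(-4870.0 / (1.98 * chip_temperature))


d_cold = diffusion_coefficient(395.0)
d_hot = diffusion_coefficient(438.64)
```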
Volumetric correction for diffusion coefcient free liquor phase:
D
cf
= D
ce
(1 )
(C.12)
Net energy transported into woodchips per volume of diffusing mass:
D
E
= T
p
Cp
l
4
i=1
(
f,i
e,i
) +T
p
Cp
s
6
i=5
(
f,i
e,i
) (C.13)
in which
T
p
=
_
T
f
if
f,i
>
e,i
T
c
if
f,i
<
e,i
(C.14)
D
F
= D
E
(C.15)
Porosity:
= 1
5
i=1
s,i
s
(C.16)
Heat capacity entrapped phase:
Cp_e = Cp_l \frac{M_{el}}{M_e} + Cp_s \frac{M_{es}}{M_e}    (C.17)
Total density solid phase:
M_s = \sum_{i=1}^{5} \rho_{s,i}    (C.18)
Total density entrapped phase:
M_e = M_{el} + M_{es}    (C.19)
Heat capacity free liquor phase:
Cp_f = Cp_l \frac{M_{fl}}{M_f} + Cp_s \frac{M_{fs}}{M_f}    (C.20)
Total density free liquor phase:
M_f = M_{fl} + M_{fs}    (C.21)
Partial liquid components density entrapped phase:
M_{el} = \rho_{water} + \sum_{i=1}^{4} \rho_{e,i}    (C.22)
Partial solid components density entrapped phase:
M_{es} = \sum_{i=5}^{6} \rho_{e,i}    (C.23)
Partial liquid components density free liquor phase:
M_{fl} = \rho_{water} + \sum_{i=1}^{4} \rho_{f,i}    (C.24)
Partial solid components density free liquor phase:
M_{fs} = \sum_{i=5}^{6} \rho_{f,i}    (C.25)
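The density-weighted heat capacity of a liquor phase can be worked out with a short sketch. The function name is hypothetical; the default constants are the values of table C.6 and the example concentrations are the entering white liquor values of table C.5.

```python
def mixture_heat_capacity(liquid_rhos, solid_rhos, rho_water=999.0, cp_l=4.19, cp_s=1.47):
    """Return (heat capacity in kJ/(kg K), total density in kg/m^3) of a liquor phase."""
    m_liquid = rho_water + sum(liquid_rhos)   # water plus species 1 to 4
    m_solid = sum(solid_rhos)                 # dissolved solids, species 5 and 6
    m_total = m_liquid + m_solid
    cp = cp_l * m_liquid / m_total + cp_s * m_solid / m_total
    return cp, m_total


# Entering white liquor: species 1 to 4 count as liquid, species 5 and 6 as solid.
cp_f, m_f = mixture_heat_capacity([78.8, 0.0, 13.3, 0.0], [59.6, 59.6])
```

The result lies between the heat capacities of the wood substance and the liquor, closer to the liquor value because the liquid components dominate the total density.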
Kappa number:
# =
s,1
+
s,2
0.00153
5
i=1
s,i
(C.26)
Yield:
=
5
i=1
s,i,exiting
5
i=1
s,i,entering
(C.27)
C.3 Heater model
In the hybrid model, the EPM upper and lower heater are lumped to form a preheater.
To make comparisons between the two models, the preheater has to provide the same amount
of energy as the upper and lower heater do. The resulting outlet temperature depends on this
energy, the composition of the flow and the flow rate. Therefore, in the EPM, the heaters are
modeled using a simple static energy balance. The heater model can be described as follows:
\phi_{out} = \phi_{in}    (C.28)
Q_h = \phi_{in} Cp_{in} M_{in} (T_{out} - T_{in})    (C.29)
M_{in} = M_{l,in} + M_{s,in}    (C.30)
in which
M_{l,in} = \rho_{water} + \sum_{i=1}^{4} \rho_{f,i,in}    (C.31)
and
M_{s,in} = \sum_{i=5}^{6} \rho_{f,i,in}    (C.32)
Cp_{in} = Cp_l \frac{M_{l,in}}{M_{in}} + Cp_s \frac{M_{s,in}}{M_{in}}    (C.33)
i    Softwood (-)    Hardwood (-)
1    0               0
2    0               0
3    0.71            0.65
4    0.25            0.25
5    0               0
Table C.2: Non-reactive fractions of solid phase components
C.4 Typical operating values and parameters
The non-reactive portions of the solid components are related to the composition of the
wood entering the first section of the digester and are the same in any following digester
section. The non-reactive portions are independent of the exact wood composition, but do
depend on the type of wood. The non-reactive portions are related to the densities of solid
components entering the digester in the following way:
s,i
=
s,i
s,i,in
(C.34)
in which
s,i
is the non-reactive portion of the component,
s,i
is the non-reactive fraction
and
s,i,in
is the initial density of the component. The fraction for the solid phase components
are given in table C.2.
The stoichiometric matrix b, which links the reaction rates of the solid phase components to
the reaction rates of the entrapped phase components, is defined as:
b =
_
OHL
0.5
HSL
OHL
0.5
HSL
OHC
OHC
OHC
(
OHL
0.5
HSL
) (
OHL
0.5
HSL
)
OHC
OHC
OHC
0.5
HSL
0.5
HSL
0 0 0
0.5
HSL
0.5
HSL
0 0 0
1 1 0 0 0
0 0 1 1 1
_
_
(C.35)
The stoichiometric coefficients for the consumption of Effective Alkali and Hydrosulfide are
stated in table C.3. Tables C.3 through C.6 show various other model parameters.
Coefficient    Softwood                        Hardwood
OHL            0.166 kg OH/kg lignin           0.21 kg OH/kg lignin
OHC            0.395 kg OH/kg carbohydrate     0.49 kg OH/kg carbohydrate
HSL            0.039 kg HS/kg lignin           0.05 kg HS/kg lignin
Table C.3: Stoichiometric coefficients
                  (s, 1)    (s, 2)      (s, 3)    (s, 4)    (s, 5)
A_1 (hardwood)    0.3954    1.457E11    28.09     7.075     5.8267E3
A_1 (softwood)    0.2809    6.035E10    6.4509    1.5607    1.0197E4
A_2 (hardwood)    12.49     1.873       124.9     47.86     3.225E16
A_2 (softwood)    9.26      0.489       28.09     10.41     5.7226E16
E_1               29.3      115         34.7      25.1      73.3
E_2               31.4      37.7        41.9      37.7      167
Table C.4: Pre-exponential factors and activation energies
The implemented compaction profile is based upon (Wisnewski et al., 1997), which uses
different, fixed compaction values for each CSTR. In the PFR approach used in this work,
those compaction values are linearized along the spatial domain. The following relations
describe the linear compaction profile. Two discontinuities exist at places in the digester
where the diameter alters.
=
_
_
_
0.0135z + 0.6966 for z 10.9195
0.0122z + 0.7227 for 10.9195 < z 20.2502
0.0101z + 0.7002 for 20.2502 < z
(C.36)
Parameter            Value               Definition
\phi_c               1.3964 m^3/min      woodchip flow rate
\phi_{f,in}          2.3497 m^3/min      entering white liquor flow rate
\phi_{uh}            7.2728 m^3/min      upper heater recirculation flow rate
T_{uh}               418.83 K            upper heater recirculation temperature
\phi_{lh}            6.7785 m^3/min      lower heater recirculation flow rate
T_{lh}               438.64 K            lower heater recirculation temperature
\phi_{quench}        0 m^3/min           quench recirculation flow rate
\phi_{wash}          0.2067 m^3/min      wash zone recirculation flow rate
T_{wash}             363.4 K             wash heater temperature
\phi_{cheat}         0.3514 m^3/min      cheater flow rate
\phi_{cool}          0.5 m^3/min         cooling zone flow rate
\phi_{filt}          1.6312 m^3/min      entering filtrate flow rate
T_{filt}             360.93 K            entering filtrate temperature
T_{c,in}             395 K               entering woodchip temperature
T_{f,in}             382 K               entering white liquor temperature
\rho_{s,1,in}        25.37 kg/m^3        high reactivity lignin density of entering woodchip
\rho_{s,2,in}        101.49 kg/m^3       low reactivity density of entering woodchip
\rho_{s,3,in}        270.53 kg/m^3       cellulose density of entering woodchip
\rho_{s,4,in}        12 kg/m^3           galactoglucomannan density of entering woodchip
\rho_{s,5,in}        129.03 kg/m^3       araboxylan density of entering woodchip
\rho_{e,1,in}        0.01 kg/m^3         active effective alkali concentration in entrapped liquor phase of entering woodchip
\rho_{e,2,in}        0.01 kg/m^3         passive effective alkali concentration in entrapped liquor phase of entering woodchip
\rho_{e,3,in}        0.01 kg/m^3         active hydrosulfide concentration in entrapped liquor phase of entering woodchip
\rho_{e,4,in}        0.01 kg/m^3         passive hydrosulfide concentration in entrapped liquor phase of entering woodchip
\rho_{e,5,in}        0.01 kg/m^3         dissolved lignin concentration in entrapped liquor phase of entering woodchip
\rho_{e,6,in}        0.01 kg/m^3         dissolved carbohydrates concentration in entrapped liquor phase of entering woodchip
\rho_{f,1,in}        78.8 kg/m^3         active effective alkali concentration of entering white liquor flow
\rho_{f,2,in}        0 kg/m^3            passive effective alkali concentration of entering white liquor flow
\rho_{f,3,in}        13.3 kg/m^3         active hydrosulfide concentration of entering white liquor flow
\rho_{f,4,in}        0 kg/m^3            passive hydrosulfide concentration of entering white liquor flow
\rho_{f,5,in}        59.6 kg/m^3         dissolved lignin concentration of entering white liquor flow
\rho_{f,6,in}        59.6 kg/m^3         dissolved carbohydrates concentration of entering white liquor flow
\rho_{f,1,filtr}     4.8055 kg/m^3       active effective alkali concentration of entering filtrate flow
\rho_{f,2,filtr}     0.01 kg/m^3         passive effective alkali concentration of entering filtrate flow
\rho_{f,3,filtr}     0.01 kg/m^3         active hydrosulfide concentration of entering filtrate flow
\rho_{f,4,filtr}     0.01 kg/m^3         passive hydrosulfide concentration of entering filtrate flow
\rho_{f,5,filtr}     45 kg/m^3           dissolved lignin concentration of entering filtrate flow
\rho_{f,6,filtr}     45 kg/m^3           dissolved carbohydrates concentration of entering filtrate flow
Table C.5: Standard operating parameters
Parameter       Value                     Definition
e_f             0.9                       reaction rate multiplier
\rho_s          1666 kg/m^3               density of solid wood material
\rho_{water}    999 kg/m^3                density of water
\Delta H        -518 kJ/kg                heat of reaction
Cp_l            4.19 kJ/(kg K)            heat capacity liquor
Cp_s            1.47 kJ/(kg K)            heat capacity wood substance
R               0.0083145 kJ/(mol K)      universal gas constant
U               827 kJ/(min m^3 K)        heat transfer coefficient
Table C.6: Digester unit operation parameters
C.5 Frequency analysis
Figures C.4-C.7 show the Bode diagrams for the Kappa number and the yield as a function
of the lower heater heat output Q_{lh} and the liquor flow rate \phi_f. Table C.7 shows the
frequency analysis results; the amplitudes of the Kappa number and the yield are the gain for
a 10 % and 7 % step change in Q_{lh} and \phi_l, respectively. The amplitudes for a positive
and a negative step change are presented. In the table, O denotes the order of the system
response, \tau the time constant and \theta the dead time.
Input     O (Kappa)   O (yield)   Time constant   Time constant   Dead time   Amplitude       Amplitude
                                  Kappa (min)     yield (min)     (min)       Kappa (-)       yield (-)
Q_{lh}    2.3         2.3         42              43              87          -5.51, 6.17     -0.022, 0.022
\phi_f    2.9         3.3         35              35              113         4.71, -4.06     0.019, -0.020
Table C.7: Frequency response analysis results for EPM
Figure C.4: Bode diagram of the Kappa number as a function of Q_{lh} (phase and amplitude
ratio AR against frequency in rad/min)
Figure C.5: Bode diagram of the yield as a function of Q_{lh} (phase and amplitude ratio AR
against frequency in rad/min)
Figure C.6: Bode diagram of the Kappa number as a function of \phi_f (phase and amplitude
ratio AR against frequency in rad/min)
Figure C.7: Bode diagram of the yield as a function of \phi_f (phase and amplitude ratio AR
against frequency in rad/min)
C.6 Notation
A_{1,i}       Arrhenius constant species i (m^3/kg min)
A_{2,i}       Arrhenius constant species i (m^3/kg min)
Cp_e          Heat capacity, entrapped phase (kJ/kg K)
Cp_f          Heat capacity, free liquor phase (kJ/kg K)
Cp_l          Heat capacity, liquor (kJ/kg K)
Cp_s          Heat capacity, solid phase (kJ/kg K)
D_E           Energy transported into wood chip (kJ/m^3)
D_F           Energy transported into free liquor phase (kJ/m^3)
D_{ce}        Diffusion coefficient, entrapped phase (min^{-1})
D_{cf}        Diffusion coefficient, free liquor phase (min^{-1})
E_{1,i}       Activation energy species i (kJ/mol)
E_{2,i}       Activation energy species i (kJ/mol)
M_e           Mass entrapped phase (kg/m^3)
M_{el}        Mass liquor components, entrapped phase (kg/m^3)
M_{es}        Mass solid components, entrapped phase (kg/m^3)
M_f           Mass free liquor phase (kg/m^3)
M_{fl}        Mass liquor components, free liquor phase (kg/m^3)
M_{fs}        Mass solid components, free liquor phase (kg/m^3)
M_{in}        Entering mass, heater (kg/m^3)
M_{l,in}      Entering mass, liquor components, heater (m^3/min)
M_s           Mass solid phase (kg/m^3)
M_{s,in}      Entering mass, solid components, heater (m^3/min)
Q_h           Heat supply heater (kJ/min)
Q_{lh}        Heat supply lower heater (kJ/min)
Q_{uh}        Heat supply upper heater (kJ/min)
R             Universal gas constant (kJ/mol K)
R_{e,i}       Reaction rate species (e, i) (m^3/min)
R_{s,i}       Reaction rate species (s, i) (m^3/min)
S             Cross sectional area (m^2)
T_c           Wood chip temperature (K)
T_f           Free liquor phase temperature (K)
T_{in}        Temperature entering flow rate, heater (K)
T_{out}       Temperature exiting flow rate, heater (K)
T_p           Temperature used in heat transfer calculation (K)
U             Heat transfer coefficient (kJ/min m^3 K)
b             Reaction stoichiometry (-)
e_f           Reaction rate multiplier (-)
k_{1,i}       Kinetic constant species i (m^3/kg min)
k_{2,i}       Kinetic constant species i (m^3/kg min)
z             Location in the reactor, top = 0 (m)
\Delta H_R    Reaction enthalpy (kJ/kg)
s,i
Non-reactive fraction species (s, i) (kg/m
3
)
Stochiometric coefcients ()
Yield ()
Porosity ()
Ratio free liquor volume - reactor volume ()
Q
Dead time with respect to Q
h
(min)
e,i
Concentration species i, entrapped phase (kg/m
3
)
f,i
Concentration species i, free liquor phase (kg/m
3
)
s,i
Concentration species i, solid phase (kg/m
3
)
s,i
Non-reactive portion species (s, i) (kg/m
3
)
s
Density of solid material (kg/m
3
)
water
Density of water (kg/m
3
)
[input],[output]
Time constant for [output] with respect to [input] (min)
b
Bulk ow (m
3
/min)
c
Wood chip ow rate (m
3
/min)
in
Entering ow rate, heater (m
3
/min)
l
Liquor phase ow rate (m
3
/min)
out
Exiting ow rate, heater (m
3
/min)
D
Pulp Digester Hybrid Model
Figure D.1: Digester unit operation (a) and hybrid model flowsheet (b). The unit operation
spans a height of 0 m to 32 m with impregnation, cooking, wash and cool zones; the flowsheet
consists of sections I to IV with a heater and a wash heater.
This appendix describes the hybrid model of the continuous pulp digester that is described
in chapter 5. The model flowsheet, section model data flow diagram and model
parameters are given.
D.1 Model flowsheet
The model flowsheet consists of a series of tubular reactors or sections that are connected
to form the digester model. Mixers, splitters and heaters are also added. The flowsheet
is shown in figure D.1. Table D.1 shows section dimensions.
Section   Height (m)   Cross sectional area S (m^2)
I          5.50        18.0
II        15.99        18.7
III        5.18        18.7
IV         5.33        21.1
Table D.1: Hybrid model section dimensions
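As a rough illustration of what the dimensions in table D.1 imply, the sketch below computes section volumes and a plug flow residence time for the chips. It assumes that the chips occupy the fraction 1 - eta of each section (with eta = 0.54 from equation D.45) and move at the reaction phase flow rate of table D.4; the function names are hypothetical and not part of the original model code.

```python
# Section dimensions from Table D.1: (name, height in m, cross-sectional area in m^2)
SECTIONS = [("I", 5.50, 18.0), ("II", 15.99, 18.7), ("III", 5.18, 18.7), ("IV", 5.33, 21.1)]
ETA = 0.54       # liquor volume / reactor volume (equation D.45)
PHI_R = 1.3964   # reaction phase flow rate in m^3/min (Table D.4)


def section_volume(height, area):
    """Geometric volume of one section in m^3."""
    return height * area


def chip_residence_time(height, area, eta=ETA, phi_r=PHI_R):
    """Assumed plug flow: chip volume in the section divided by chip flow rate (min)."""
    return section_volume(height, area) * (1.0 - eta) / phi_r


def summarize(sections=SECTIONS):
    """Return per-section (name, volume, residence time), total height and total time."""
    rows = [(n, section_volume(h, s), chip_residence_time(h, s)) for n, h, s in sections]
    total_height = sum(h for _, h, _ in sections)
    total_time = sum(t for _, _, t in rows)
    return rows, total_height, total_time


if __name__ == "__main__":
    rows, total_height, total_time = summarize()
    for name, volume, time in rows:
        print(f"section {name}: volume {volume:.1f} m^3, chip residence time {time:.1f} min")
    print(f"total height {total_height:.2f} m, total chip residence time {total_time:.1f} min")
```

The section heights add up to the 32 m digester height indicated in figure D.1, which is a quick consistency check on the table.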
[Context diagram: the pulp digester section model receives the entering wood and entering liquor streams (flow rate, concentrations, temperature) and returns the exiting wood and exiting liquor streams.]
Figure D.2: Hybrid model section context diagram
D.2 Section model
All model equations of the section model are listed here. Figure D.2 shows the context diagram of the section model, while figure D.3 shows the Data Flow Diagram (DFD) for the section model. This DFD represents the physical effects that are modeled in relation to the states.
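Mass balance reaction phase:
\[ \frac{\partial \rho_{r,i}}{\partial t} = -\frac{\phi_r}{S(1-\eta)} \frac{\partial \rho_{r,i}}{\partial z} + R_{r,i} \quad \text{for } i = 1, \ldots, 3 \tag{D.1} \]
\[ \frac{\partial \rho_{r,i}}{\partial t} = -\frac{\phi_r}{S(1-\eta)} \frac{\partial \rho_{r,i}}{\partial z} + R_{r,i} + D_{cr} \left( \rho_{l,i} - \frac{1}{\varepsilon} \rho_{r,i+3} \right) \quad \text{for } i = 4, \ldots, 7 \tag{D.2} \]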
The difference between equations D.1 and D.2 is that species (r, 1) to (r, 3) cannot diffuse to the liquor phase, while species (r, 4) to (r, 7) can. The porosity $\varepsilon$ is introduced in the diffusion term to compensate for the lumping of the solid and entrapped phases. In the EPM, the concentrations of species (r, 4) to (r, 7) are based on the entrapped phase volume. In the hybrid model, they are based on the chip volume. Dividing by $\varepsilon$ corrects this. An alternative is to build a fuzzy equation that calculates $D$ as a function of $T_r$, $\rho_{l,i}$ and $\rho_{r,i+3}$, but introducing $\varepsilon$ is simpler. The corrected concentration can be interpreted as an effective concentration that determines the driving force behind the diffusion. The porosity $\varepsilon$ is defined in equation D.13.
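Mass balance liquor phase:
\[ \frac{\partial \rho_{l,i}}{\partial t} = -\frac{1}{S \eta} \frac{\partial (\rho_{l,i} \phi_l)}{\partial z} + D_{cl} \left( \frac{1}{\varepsilon} \rho_{r,i+3} - \rho_{l,i} \right) \quad \text{for } i = 1, \ldots, 4 \tag{D.3} \]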
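Energy balance reaction phase:
\[ \frac{\partial}{\partial t} (Cp_r M_r T_r) = -\frac{\phi_r}{S(1-\eta)} \frac{\partial}{\partial z} (Cp_r M_r T_r) + \Delta H_R \sum_{i=1}^{3} R_{r,i} + U (T_l - T_r) + D_{cr} D_R \tag{D.4} \]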
[Data flow diagram with the blocks: energy balance reaction phase, energy balance liquor phase, mass balances reaction phase, mass balances liquor phase, reaction and diffusion, connected through temperatures, concentrations, reaction rates and diffusion terms.]
Figure D.3: Hybrid model section Data Flow Diagram
Energy balance liquor phase:
t
(Cp
l
M
l
T
l
) =
1
S(S)
z
(
l
Cp
l
M
l
T
l
) +
1
U(T
r
T
l
) + D
cl
D
L
(D.5)
Reaction rate:
\[ R_{r,i} = f_{fuzzy}(\rho_{r,i}, \rho_{r,4}, T_r) \quad \text{for } i = 1, \ldots, 3 \tag{D.6} \]
Reaction stoichiometry:
\[ R_{r,i} = \sum_{j=1}^{3} b_{i,j} R_{r,j} \quad \text{for } i = 4, \ldots, 7 \tag{D.7} \]
Diffusion coefficient:
\[ D_{cr} = f_{fuzzy}(T_r) \tag{D.8} \]
Volumetric correction for diffusion coefcient liquor phase:
D
cl
= D
cr
1
(D.9)
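Net energy transported into woodchips per volume of diffusing mass:
\[ D_R = T_p Cp_w \sum_{i=1}^{2} \left( \rho_{l,i} - \frac{1}{\varepsilon} \rho_{r,i+3} \right) + T_p Cp_s \sum_{i=3}^{4} \left( \rho_{l,i} - \frac{1}{\varepsilon} \rho_{r,i+3} \right) \tag{D.10} \]
in which
\[ T_p = \begin{cases} T_l & \text{if } \rho_{l,i} > \rho_{r,i} \\ T_r & \text{if } \rho_{l,i} < \rho_{r,i} \end{cases} \tag{D.11} \]
\[ D_L = -D_R \tag{D.12} \]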
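Porosity:
\[ \varepsilon = 1 - \frac{\sum_{i=1}^{3} \rho_{r,i}}{\rho_s} = \frac{V_e}{V_c} \tag{D.13} \]
Heat capacity reaction phase:
\[ Cp_r = Cp_w \frac{M_{rl}}{M_r} + Cp_s \frac{M_{rr}}{M_r} \tag{D.14} \]
Total density reaction phase:
\[ M_r = M_{rl} + M_{rr} \tag{D.15} \]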
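with
\[ M_{rr} = \sum_{i=1}^{3} \rho_{r,i} + \sum_{i=6}^{7} \rho_{r,i} \tag{D.16} \]
and
\[ M_{rl} = \rho_{water} + \sum_{i=4}^{5} \rho_{r,i} \tag{D.17} \]
Heat capacity liquor phase:
\[ Cp_l = Cp_w \frac{M_{ll}}{M_l} + Cp_s \frac{M_{lr}}{M_l} \tag{D.18} \]
Total density liquor phase:
\[ M_l = M_{ll} + M_{lr} \tag{D.19} \]
with
\[ M_{ll} = \rho_{water} + \sum_{i=1}^{2} \rho_{l,i} \tag{D.20} \]
and
\[ M_{lr} = \sum_{i=3}^{4} \rho_{l,i} \tag{D.21} \]
Kappa number:
\[ \kappa\# = \frac{\rho_{r,1} + \rho_{r,2}}{0.00153 \sum_{i=1}^{3} \rho_{r,i}} \tag{D.22} \]
Yield:
\[ \text{Yield} = \frac{\sum_{i=1}^{3} \rho_{r,i,exiting}}{\sum_{i=1}^{3} \rho_{r,i,entering}} \tag{D.23} \]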
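The algebraic quality relations above are simple enough to check by hand. The sketch below evaluates the porosity (D.13), the Kappa number (D.22) and the yield (D.23) for a list of reaction phase concentrations in which species 1 to 3 come first. The function names are hypothetical; the numerical values in the usage example are the entering wood chip concentrations of table D.4 and the solid density of table D.5.

```python
RHO_S = 1666.0  # density of solid wood material in kg/m^3 (Table D.5)


def porosity(rho_r, rho_s=RHO_S):
    """Equation D.13: one minus the solid species (1..3) divided by the solid density."""
    return 1.0 - sum(rho_r[:3]) / rho_s


def kappa_number(rho_r):
    """Equation D.22: lignin (species 1 and 2) over 0.00153 times the solids (species 1..3)."""
    return (rho_r[0] + rho_r[1]) / (0.00153 * sum(rho_r[:3]))


def pulp_yield(rho_r_exiting, rho_r_entering):
    """Equation D.23: exiting solids divided by entering solids."""
    return sum(rho_r_exiting[:3]) / sum(rho_r_entering[:3])


if __name__ == "__main__":
    entering = [25.37, 101.49, 411.56]  # kg/m^3, entering wood chips (Table D.4)
    print(f"porosity of entering chips: {porosity(entering):.3f}")
    print(f"Kappa number of entering chips: {kappa_number(entering):.1f}")
```

For uncooked chips the Kappa number is naturally far above the product range shown in the identification data, since no lignin has been removed yet.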
D.3 Fuzzy models
The hybrid model of the pulp digester contains four fuzzy models:
\[ R_{r,i} = f_{fuzzy}(\rho_{r,i}, \rho_{r,4}, T_r) \quad \text{for } i = 1, \ldots, 3 \tag{D.24} \]
\[ D_{cr} = f_{fuzzy}(T_r) \tag{D.25} \]
The models were identified with GK-clustering in combination with structure optimization. The weights of the rules were determined in the submodel integration step. Clustering settings and modeling results are shown in table D.2. Using the input-output data that was generated with the EPM, this results in the fuzzy models given below. The premise part membership functions are given in figures D.4 to D.7.
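\[ \text{IF } \rho_{r,1} = mf_i \text{ AND } \rho_{r,4} = mf_i \text{ AND } T_r = mf_i \text{ THEN } R_{r,1} = w_i^{R_{r,1}} y_i \quad \text{for } i = 1, \ldots, 4 \tag{D.26} \]
in which
\[ y_i = p_{i,1}^{R_{r,1}} \rho_{r,1} + p_{i,2}^{R_{r,1}} \rho_{r,4} + p_{i,3}^{R_{r,1}} T_r + p_{i,4}^{R_{r,1}} \tag{D.27} \]
with
\[ p^{R_{r,1}} = \begin{bmatrix} 0.0088 & 0.0011 & 0.0002 & 0.0933 \\ 0.0252 & 0.0019 & 0.0011 & 0.5478 \\ 0.0004 & 0.0059 & 0.0029 & 1.1060 \\ 0.0233 & 0.0065 & 0.0060 & 2.7791 \end{bmatrix} \qquad w^{R_{r,1}} = \begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \end{bmatrix} \tag{D.28} \]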
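\[ \text{IF } \rho_{r,2} = mf_i \text{ AND } \rho_{r,4} = mf_i \text{ AND } T_r = mf_i \text{ THEN } R_{r,2} = w_i^{R_{r,2}} y_i \quad \text{for } i = 1, \ldots, 6 \tag{D.29} \]
in which
\[ y_i = p_{i,1}^{R_{r,2}} \rho_{r,2} + p_{i,2}^{R_{r,2}} \rho_{r,4} + p_{i,3}^{R_{r,2}} T_r + p_{i,4}^{R_{r,2}} \tag{D.30} \]
with
\[ p^{R_{r,2}} = \begin{bmatrix} 0.0291 & 0.0154 & 0.0220 & 9.8500 \\ 0.0238 & 0.0022 & 0.0116 & 2.1300 \\ 0.0185 & 0.0367 & 0.1045 & 45.7436 \\ 0.0092 & 0.0085 & 0.0385 & 16.2819 \\ 0.0003 & 0.0042 & 0.0001 & 0.0000 \\ 0.0307 & 0.0235 & 0.0585 & 26.2124 \end{bmatrix} \qquad w^{R_{r,2}} = \begin{bmatrix} 0.73 \\ 0.93 \\ 0.47 \\ 1.19 \\ 1.14 \\ 1.70 \end{bmatrix} \tag{D.31} \]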
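\[ \text{IF } \rho_{r,3} = mf_i \text{ AND } \rho_{r,4} = mf_i \text{ AND } T_r = mf_i \text{ THEN } R_{r,3} = w_i^{R_{r,3}} y_i \quad \text{for } i = 1, \ldots, 6 \tag{D.32} \]
in which
\[ y_i = p_{i,1}^{R_{r,3}} \rho_{r,3} + p_{i,2}^{R_{r,3}} \rho_{r,4} + p_{i,3}^{R_{r,3}} T_r + p_{i,4}^{R_{r,3}} \tag{D.33} \]
with
\[ p^{R_{r,3}} = \begin{bmatrix} 0.0111 & 0.0245 & 0.0813 & 38.2907 \\ 0.0132 & 0.0364 & 0.0909 & 43.4828 \\ 0.0052 & 0.0227 & 0.0208 & 10.5408 \\ 0.0026 & 0.0114 & 0.0128 & 6.0335 \\ 0.0002 & 0.0041 & 0.0011 & 0.4765 \\ 0.0083 & 0.0154 & 0.0546 & 25.5889 \end{bmatrix} \qquad w^{R_{r,3}} = \begin{bmatrix} 0.99 \\ 1.38 \\ 1.04 \\ 1.95 \\ 1.25 \\ 1.12 \end{bmatrix} \tag{D.34} \]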
Fuzzy model   Inputs                                dr   # data features   k_0   cm    # rules   RMSE
$R_{r,1}$     $\rho_{r,1}$, $\rho_{r,4}$, $T_r$     2    179               10    0.7   4         0.024
$R_{r,2}$     $\rho_{r,2}$, $\rho_{r,4}$, $T_r$     1    333               10    0.7   6         0.14
$R_{r,3}$     $\rho_{r,3}$, $\rho_{r,4}$, $T_r$     5    198               10    0.7   6         0.12
$D_{cr}$      $T_r$                                 1    36                5     0.2   2         0.026
Table D.2: Fuzzy model identification settings and results
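\[ \text{IF } T_r = mf_i \text{ THEN } D_{cr} = w_i^{D_{cr}} y_i \quad \text{for } i = 1, \ldots, 2 \tag{D.35} \]
in which
\[ y_i = p_{i,1}^{D_{cr}} T_r + p_{i,2}^{D_{cr}} \tag{D.36} \]
with
\[ p^{D_{cr}} = \begin{bmatrix} 0.0014 & 0.5092 \\ 0.0062 & 2.3676 \end{bmatrix} \qquad w^{D_{cr}} = \begin{bmatrix} 1 \\ 1 \end{bmatrix} \tag{D.37} \]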
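To make the rule notation concrete, the sketch below evaluates a single-input TSK model of the form of equations D.35 to D.37. Two things are assumptions rather than part of the appendix: the membership functions are taken as Gaussians with made-up centres and widths (the real shapes are only given graphically in figure D.7), and the rule outputs are combined with the usual fuzzy-mean formula. The consequent parameters are copied as printed; their signs may not have survived typesetting.

```python
import math

# Consequent parameters and rule weights as printed in equation D.37.
P_DCR = [(0.0014, 0.5092), (0.0062, 2.3676)]
W_DCR = [1.0, 1.0]

# Hypothetical Gaussian membership functions (centre, width) in K.
MF_DCR = [(400.0, 15.0), (440.0, 15.0)]


def gaussian(x, centre, width):
    """Degree of membership of x in a Gaussian fuzzy set."""
    return math.exp(-0.5 * ((x - centre) / width) ** 2)


def tsk_output(t_r, params=P_DCR, weights=W_DCR, mfs=MF_DCR):
    """Fuzzy-mean combination of the weighted local linear models w_i * y_i."""
    degrees = [gaussian(t_r, c, s) for c, s in mfs]
    total = sum(degrees)
    if total == 0.0:
        raise ValueError("input lies outside the support of all membership functions")
    local = [w * (a * t_r + b) for w, (a, b) in zip(weights, params)]
    return sum(d * y for d, y in zip(degrees, local)) / total


if __name__ == "__main__":
    for temperature in (400.0, 420.0, 440.0):
        print(temperature, round(tsk_output(temperature), 4))
```

Because the output is a convex combination of the local models, it always lies between the smallest and largest rule output at the given temperature, which is what makes the local linear interpretation of TSK models possible.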
[Membership function plots, membership degree between 0 and 1 on the vertical axis; panels (a) $\rho_{r,1}$, (b) $\rho_{r,4}$, (c) $T_r$; functions mf1 to mf4.]
Figure D.4: Premise part membership functions fuzzy model $R_{r,1}$
[Membership function plots; panels (a) $\rho_{r,2}$, (b) $\rho_{r,4}$, (c) $T_r$; functions mf1 to mf6.]
Figure D.5: Premise part membership functions fuzzy model $R_{r,2}$
[Membership function plots; panels (a) $\rho_{r,3}$, (b) $\rho_{r,4}$, (c) $T_r$; functions mf1 to mf6.]
Figure D.6: Premise part membership functions fuzzy model $R_{r,3}$
[Membership function plot for $T_r$; functions mf1 and mf2.]
Figure D.7: Premise part membership functions fuzzy model $D_{cr}$
                  Softwood                        Hardwood
$\beta_{OHL}$     0.166 kg OH/kg lignin           0.21 kg OH/kg lignin
$\beta_{OHC}$     0.395 kg OH/kg carbohydrate     0.49 kg OH/kg carbohydrate
Table D.3: Stoichiometric coefficients
D.4 Heater model
In the hybrid model, the EPM upper and lower heater are lumped to form a preheater. To make comparisons between the two models, the preheater has to provide the same amount of energy as the upper and lower heater do. The resulting outlet temperature depends on this energy, the composition of the flow and the flow rate. Therefore, in the hybrid model, the heaters are modeled using a simple static energy balance. The heater model can be described as follows:
\[ \phi_{out} = \phi_{in} \tag{D.38} \]
\[ Q_h = \phi_{in} Cp_{in} M_{in} (T_{out} - T_{in}) \tag{D.39} \]
\[ M_{in} = M_{l,in} + M_{r,in} \tag{D.40} \]
in which
\[ M_{l,in} = \rho_{water} + \sum_{i=1}^{2} \rho_{l,i,in} \tag{D.41} \]
and
\[ M_{r,in} = \sum_{i=3}^{4} \rho_{l,i,in} \tag{D.42} \]
\[ Cp_{in} = Cp_w \frac{M_{l,in}}{M_{in}} + Cp_s \frac{M_{r,in}}{M_{in}} \tag{D.43} \]
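Since the heater model is a static energy balance, equation D.39 can be solved directly for the outlet temperature. The sketch below does this for a given heat input and entering liquor composition. The function name is hypothetical, and the usage example simply plugs in the white liquor values of table D.4 for illustration; it does not claim to reproduce the stream that actually enters the preheater in the flowsheet.

```python
RHO_WATER = 999.0  # density of water in kg/m^3 (Table D.5)
CP_W = 4.19        # heat capacity liquor in kJ/(kg K) (Table D.5)
CP_S = 1.47        # heat capacity wood substance in kJ/(kg K) (Table D.5)


def heater_outlet_temperature(q_h, phi_in, t_in, rho_l_in):
    """Solve equation D.39 for T_out.

    q_h      heat supply in kJ/min
    phi_in   entering flow rate in m^3/min
    t_in     entering temperature in K
    rho_l_in entering liquor concentrations [species 1..4] in kg/m^3
    """
    m_l_in = RHO_WATER + rho_l_in[0] + rho_l_in[1]           # equation D.41
    m_r_in = rho_l_in[2] + rho_l_in[3]                       # equation D.42
    m_in = m_l_in + m_r_in                                   # equation D.40
    cp_in = CP_W * m_l_in / m_in + CP_S * m_r_in / m_in      # equation D.43
    return t_in + q_h / (phi_in * cp_in * m_in)


if __name__ == "__main__":
    t_out = heater_outlet_temperature(627099.0, 2.3497, 382.0, [92.1, 0.0, 59.6, 59.6])
    print(f"outlet temperature: {t_out:.1f} K")
```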
D.5 Typical operating values and parameters
The stoichiometric matrix b, which links the reaction rates of the reaction phase components to the reaction rates of the liquor phase components, can be derived from equation C.35. Lumping species (s, 3) to (s, 5) corresponds with removing columns 4 and 5 from b, and lumping species (e, 1) and (e, 3) corresponds with summing rows 1 and 3 of b. For the reaction products, rows 2 and 3 also have to be summed. This results in:
b =
_
OHL
OHL
OHC
OHL
OHL
OHC
1 1 0
0 0 1
_
(D.44)
The stoichiometric coefficients for the consumption of Effective Alkali and Hydrosulfide are stated in table D.3. Tables D.4 and D.5 show various other model parameters.
Parameter              Value              Definition
$\phi_r$               1.3964 m^3/min     reaction phase (woodchip) flow rate
$\phi_{l,in}$          2.3497 m^3/min     entering white liquor flow rate
$Q_h$                  627099 kJ/min      preheater energy input
$\phi_{quench}$        0 m^3/min          quench recirculation flow rate
$T_{wash}$             365.133 K          wash heater temperature
$\phi_{filt}$          1.6312 m^3/min     entering filtrate flow rate
$T_{filt}$             360.93 K           entering filtrate temperature
$T_{r,in}$             395 K              entering woodchip temperature
$T_{l,in}$             382 K              entering white liquor temperature
$\rho_{r,1,in}$        25.37 kg/m^3       high reactivity lignin density of entering woodchip (reaction phase)
$\rho_{r,2,in}$        101.49 kg/m^3      low reactivity lignin density of entering woodchip (reaction phase)
$\rho_{r,3,in}$        411.56 kg/m^3      carbohydrates density of entering woodchip
$\rho_{r,4,in}$        0.02 kg/m^3        active liquor concentration in entering reaction phase
$\rho_{r,5,in}$        0.02 kg/m^3        passive liquor concentration in entering reaction phase
$\rho_{r,6,in}$        0.01 kg/m^3        dissolved lignin concentration in entering reaction phase
$\rho_{r,7,in}$        0.01 kg/m^3        dissolved carbohydrates concentration in entering reaction phase
$\rho_{l,1,in}$        92.1 kg/m^3        active liquor concentration of entering white liquor flow
$\rho_{l,2,in}$        0 kg/m^3           passive liquor concentration of entering white liquor flow
$\rho_{l,3,in}$        59.6 kg/m^3        dissolved lignin concentration of entering white liquor flow
$\rho_{l,4,in}$        59.6 kg/m^3        dissolved carbohydrates concentration of entering white liquor flow
$\rho_{f,1,filtr}$     4.8055 kg/m^3      active liquor concentration of entering filtrate flow
$\rho_{f,2,filtr}$     0.02 kg/m^3        passive liquor concentration of entering filtrate flow
$\rho_{f,5,filtr}$     45 kg/m^3          dissolved lignin concentration of entering filtrate flow
$\rho_{f,6,filtr}$     45 kg/m^3          dissolved carbohydrates concentration of entering filtrate flow
Table D.4: Standard operating parameters
Parameter        Value                   Definition
$\rho_s$         1666 kg/m^3             density of solid wood material
$\rho_{water}$   999 kg/m^3              density of water
$\Delta H_R$     -518 kJ/kg              heat of reaction
$Cp_w$           4.19 kJ/(kg K)          heat capacity liquor
$Cp_s$           1.47 kJ/(kg K)          heat capacity wood substance
$U$              827 kJ/(min m^3 K)      heat transfer coefficient
Table D.5: Digester unit operation parameters
Model          Input       O (Kappa)   O (yield)   tau (Kappa) (min)   tau (yield) (min)   theta (min)   Gain Kappa number (-)   Gain yield (-)
EPM            $Q_{lh}$    2.3         2.3         42                  43                  87            -5.51, 6.17             -0.022, 0.022
EPM            $\phi_f$    2.9         3.3         35                  35                  113           4.71, -4.06             0.019, -0.020
Hybrid model   $Q_{lh}$    1.8         1.9         40                  45                  84            -5.77, 6.40             -0.018, 0.018
Hybrid model   $\phi_f$    2.4         2.7         34                  34                  109           4.23, -3.69             0.014, -0.013
Table D.6: Frequency response analysis results for hybrid model
The implemented compaction profile is based upon Wisnewski et al. (1997), who use different, fixed compaction values for each CSTR. In the PFR approach for the hybrid model, compaction is neglected. The ratio $\eta$ of the volume of the liquor phase to the volume of the reactor is assumed to be constant. This means that the volumes of the reaction phase and the liquor phase do not change.
\[ \eta = 0.54 \tag{D.45} \]
D.6 Frequency analysis
Figures D.8-D.11 show the Bode diagrams for the Kappa number and the yield as a function of the lower heater heat output $Q_{lh}$ and the liquor flow rate $\phi_f$. Table D.6 shows the frequency analysis results; the gains listed for the Kappa number and the yield are the gains for steps in $Q_{lh}$ and $\phi_l$. The step size of the steps in $\phi_l$ was 7 %. The step size of the steps in $Q_h$ was equivalent to a 10 % change in the lower heater heat output $Q_{lh}$ of the EPM. The gains for a positive and a negative step are presented. In the table, $O$ denotes the order of the system response, $\tau$ the time constant and $\theta$ the dead time.
[Bode plots: phase and amplitude ratio (AR) against frequency in rad/min.]
Figure D.8: Bode diagram of the Kappa number as a function of $Q_{lh}$. Dots EPM, line hybrid model.
Figure D.9: Bode diagram of the yield as a function of $Q_{lh}$. Dots EPM, line hybrid model.
Figure D.10: Bode diagram of the Kappa number as a function of $\phi_f$. Dots EPM, line hybrid model.
Figure D.11: Bode diagram of the yield as a function of $\phi_f$. Dots EPM, line hybrid model.
D.7 Notation
$Cp_l$          Heat capacity, liquor phase (kJ/kg K)
$Cp_r$          Heat capacity, reaction phase (kJ/kg K)
$Cp_w$          Heat capacity, liquor (kJ/kg K)
$D_R$           Energy transported into wood chip (kJ/m^3)
$D_L$           Energy transported into liquor phase (kJ/m^3)
$D_{cr}$        Diffusion coefficient, reaction phase (min^-1)
$D_{cl}$        Diffusion coefficient, liquor phase (min^-1)
$M_{rl}$        Mass liquor components, reaction phase (kg/m^3)
$M_{rr}$        Mass reaction phase components, reaction phase (kg/m^3)
$M_l$           Mass liquor phase (kg/m^3)
$M_{ll}$        Mass liquor components, liquor phase (kg/m^3)
$M_{lr}$        Mass reaction phase components, liquor phase (kg/m^3)
$M_{in}$        Entering mass, heater (kg/m^3)
$M_{l,in}$      Entering mass, liquor components, heater (m^3/min)
$M_r$           Mass reaction phase (kg/m^3)
$M_{r,in}$      Entering mass, reaction phase components, heater (m^3/min)
$Q_h$           Heat supply heater (kJ/min)
$R_{r,i}$       Reaction rate species (r, i) (m^3/min)
$S$             Cross sectional area (m^2)
$T_r$           Reaction phase temperature (K)
$T_l$           Liquor phase temperature (K)
$T_{in}$        Temperature entering flow rate, heater (K)
$T_{out}$       Temperature exiting flow rate, heater (K)
$T_p$           Temperature used in heat transfer calculation (K)
$U$             Heat transfer coefficient (kJ/min m^3 K)
$b$             Reaction stoichiometry (-)
$z$             Location in the reactor, top = 0 (m)
$p^{[]}$        Consequent part parameters fuzzy models
$w^{[]}$        Rule weights fuzzy models
$\Delta H_R$    Reaction enthalpy (kJ/kg)
$\beta$         Stoichiometric coefficients (-)
                Yield (-)
$\varepsilon$   Porosity (-)
$\eta$          Ratio liquor volume - reactor volume (-)
$\theta_Q$      Dead time with respect to $Q_h$ (min)
$\rho_{l,i}$    Concentration species i, liquor phase (kg/m^3)
$\rho_{r,i}$    Concentration species i, reaction phase (kg/m^3)
$\rho_s$        Density of solid material (kg/m^3)
$\rho_{water}$  Density of water (kg/m^3)
$\tau_{[input],[output]}$   Time constant for [output] with respect to [input] (min)
$\phi_r$        Reaction phase flow rate (m^3/min)
$\phi_{in}$     Entering flow rate, heater (m^3/min)
$\phi_l$        Liquor phase flow rate (m^3/min)
$\phi_{out}$    Exiting flow rate, heater (m^3/min)
E Pulp Digester Fuzzy Model

[Context diagram: the pulp digester fuzzy model maps the operating conditions ($Q_h$, $\phi_l$) to the product quality (Kappa number and yield).]
Figure E.1: Hybrid model section context diagram
This appendix describes the dynamic fuzzy model of the pulp digester. The fuzzy models, identification and validation data and the frequency analysis are presented.
E.1 Fuzzy models
The fuzzy model of the pulp digester describes the Kappa number ($\kappa\#$) and the yield at the reactor outlet as a function of the heat supply $Q_h$ and the liquor flow $\phi_l$ by using two second order MISO models. Figure E.1 shows the context diagram of the complete model. Since the fuzzy model of the digester only consists of two fuzzy equations, no DFD was constructed.
The model equation structure is as follows:
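\[ \kappa\#_k = f_{fuzzy}(\kappa\#_{k-1}, \kappa\#_{k-2}, Q_{k-\theta_Q}, \phi_{l,k-\theta_l}) \tag{E.1} \]
\[ \mathrm{Yield}_k = f_{fuzzy}(\mathrm{Yield}_{k-1}, \mathrm{Yield}_{k-2}, Q_{k-\theta_Q}, \phi_{l,k-\theta_l}) \tag{E.2} \]
in which $k$ denotes the time step and $\theta_Q$ and $\theta_l$ denote the dead times with respect to $Q_h$ and $\phi_l$.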
The models were identified with GK-clustering in combination with structure optimization. Clustering settings and model results are shown in table E.1. This results in the following fuzzy models:
\[ \text{IF } \kappa\#_{k-1} = mf_i \text{ AND } \kappa\#_{k-2} = mf_i \text{ AND } Q_{h,k-\theta_Q} = mf_i \text{ AND } \phi_{l,k} = mf_i \text{ THEN } \kappa\#_k = y_i \quad \text{for } i = 1, \ldots, 3 \tag{E.3} \]
in which
\[ y_i = p_{i,1} \kappa\#_{k-1} + p_{i,2} \kappa\#_{k-2} + p_{i,3} Q_{h,k-\theta_Q} + p_{i,4} \phi_{l,k} + p_{i,5} \tag{E.4} \]
with
\[ p = \begin{bmatrix} 1.98 & 0.98 & 0.0046 & 0.0070 & 0.0230 \\ 1.98 & 0.98 & 0.0058 & 0.0067 & 0.0293 \\ 1.98 & 0.98 & 0.0068 & 0.0098 & 0.0303 \end{bmatrix} \tag{E.5} \]
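The second order MISO structure of equations E.1 and E.4 amounts to building a regressor from two past outputs and two delayed inputs and feeding it to each local linear model. The sketch below shows only that bookkeeping. The sample time is not stated in this appendix, so the conversion of a dead time in minutes to a number of samples is an assumption, and all function names are hypothetical.

```python
def delay_in_samples(dead_time_min, sample_time_min):
    """Round a dead time in minutes to a whole number of samples (assumed convention)."""
    return int(round(dead_time_min / sample_time_min))


def regressor(y, q_h, phi_l, k, d_q, d_l):
    """Return [y[k-1], y[k-2], q_h[k-d_q], phi_l[k-d_l]] for time step k."""
    if k < max(2, d_q, d_l):
        raise IndexError("not enough history for time step k")
    return [y[k - 1], y[k - 2], q_h[k - d_q], phi_l[k - d_l]]


def local_model(params, x):
    """Affine consequent y_i = p_i1*x1 + ... + p_i4*x4 + p_i5, as in equation E.4."""
    return sum(p * v for p, v in zip(params[:-1], x)) + params[-1]


if __name__ == "__main__":
    d_q = delay_in_samples(87, 1.0)    # dead time for Q_h from table E.1, 1 min sampling assumed
    d_l = delay_in_samples(113, 1.0)   # dead time for the liquor flow from table E.1
    print(d_q, d_l)
```

The full model output would then be the membership-weighted mean of the local model outputs, which requires the membership functions that are only available graphically.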
Model          $\theta_Q$ (min)   $\theta_l$ (min)   k_0   cm    # rules
Kappa number   87                 113                6     0.7   3
Yield          87                 113                6     0.4   5
Table E.1: Settings and structure results fuzzy model
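and
\[ \text{IF } \mathrm{Yield}_{k-1} = mf_i \text{ AND } \mathrm{Yield}_{k-2} = mf_i \text{ AND } Q_{h,k-\theta_Q} = mf_i \text{ AND } \phi_{l,k} = mf_i \text{ THEN } \mathrm{Yield}_k = y_i \quad \text{for } i = 1, \ldots, 5 \tag{E.6} \]
in which
\[ y_i = p_{i,1} \mathrm{Yield}_{k-1} + p_{i,2} \mathrm{Yield}_{k-2} + p_{i,3} Q_{h,k-\theta_Q} + p_{i,4} \phi_{l,k} + p_{i,5} \tag{E.7} \]
with
\[ p = \begin{bmatrix} 1.98 & 0.98 & 0.000016 & 0.000020 & 0.000192 \\ 1.98 & 0.97 & 0.000017 & 0.000026 & 0.000206 \\ 1.98 & 0.97 & 0.000017 & 0.000023 & 0.000236 \\ 1.98 & 0.97 & 0.000026 & 0.000037 & 0.000342 \\ 1.98 & 0.98 & 0.000016 & 0.000034 & 0.000190 \end{bmatrix} \tag{E.8} \]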
The membership functions for the two fuzzy MISO models are presented in figures E.2 and E.3.
[Figures E.2 and E.3: premise part membership function plots, membership degree between 0 and 1 on the vertical axis.]
The identification data set is shown in figure E.4, the validation data set is shown in figure E.5. The value of $Q_h$ is divided by a factor of 1e5 to prevent membership function evaluation problems.
[Time series plots over 0 to 8000 min: (a) $Q_h$ (kJ/min), (b) liquor flow (m^3/min), (c) Kappa number (-), (d) yield (-).]
Figure E.4: Identification data for $Q_h$ (a), $\phi_l$ (b), Kappa number (c) and yield (d)
Figure E.5: Validation data for $Q_h$ (a), $\phi_l$ (b), Kappa number (c) and yield (d)
Model         Input       O (Kappa)   O (yield)   tau (Kappa) (min)   tau (yield) (min)   theta (min)   Gain Kappa number (-)   Gain yield (-)
EPM           $Q_{lh}$    2.3         2.3         42                  43                  87            -5.51, 6.17             -0.022, 0.022
EPM           $\phi_f$    2.9         3.3         35                  35                  113           4.71, -4.06             0.019, -0.020
Fuzzy model   $Q_h$       2.1         2.0         52                  58                  87            -4.54, 4.98             -0.018, 0.022
Fuzzy model   $\phi_l$    2.0         2.0         50                  50                  113           3.70, -3.68             0.019, -0.020
Table E.2: Frequency response analysis results for EPM
E.3 Frequency analysis
Figures E.6-E.9 show the Bode diagrams for the Kappa number and the yield as a function of the lower heater heat output $Q_{lh}$ and the liquor flow rate $\phi_f$. Table E.2 shows the frequency analysis results; the gains listed for the Kappa number and the yield are the gains for steps in $Q_{lh}$ and $\phi_l$. The step size of the steps in $\phi_l$ was 7 %. The step size of the steps in $Q_h$ was equivalent to a 10 % change in the lower heater heat output $Q_{lh}$ of the EPM. The gains for a positive and a negative step are presented. In the table, $O$ denotes the order of the system response, $\tau$ the time constant and $\theta$ the dead time.
[Bode plots: phase and amplitude ratio (AR) against frequency in rad/min.]
Figure E.6: Bode diagram of the Kappa number as a function of $Q_{lh}$. Dots EPM, line fuzzy model.
Figure E.7: Bode diagram of the yield as a function of $Q_{lh}$. Dots EPM, line fuzzy model.
Figure E.8: Bode diagram of the Kappa number as a function of $\phi_f$. Dots EPM, line fuzzy model.
Figure E.9: Bode diagram of the yield as a function of $\phi_f$. Dots EPM, line fuzzy model.
E.4 Notation
$Q_h$           Heat supply heater (kJ/min)
$p^{[]}$        Consequent part parameters fuzzy models
$w^{[]}$        Rule weights fuzzy models
                Yield (-)
$\theta_Q$      Dead time with respect to $Q_h$ (min)
$\tau_{[input],[output]}$   Time constant for [output] with respect to [input] (min)
$\phi_l$        Liquor phase flow rate (m^3/min)
F Distillation column hybrid models

Parameter          Value        Definition
$\tau_V$           0.2 h        Column heating time constant
$L_{V,0}$          120 mol/h    Initial vapor flow
$L_{V,ss}$         166 mol/h    Steady state vapor flow approximation
$x^0_{n+1,sp}$     0.986        Product quality for which the model is valid
Table F.1: Simplified overall static hybrid model parameters
This appendix describes the simplified overall static and dynamic hybrid models. The data flow diagram and model parameters are given. In addition, the performance of the fuzzy submodel is presented.
F.1 Simplified overall static model
F.1.1 Model equations
The simplified overall static hybrid model is given by the following set of equations:
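\[ \frac{dM_{col}}{dt} = -L_V (1 - R^*) \tag{F.1} \]
\[ M_{col} \frac{dx_{col}}{dt} = -L_V (1 - R^*)(x^0_{n+1} - x_{col}) \tag{F.2} \]
\[ \frac{dL_V}{dt} = \frac{1}{\tau_V} (L_{V,ss} - L_V) \tag{F.3} \]
\[ x^0_{n+1} = f_{fuzzy}(R^*, x_{col}, L_V) \tag{F.4} \]
\[ M_p = M_{col,0} - M_{col} \tag{F.5} \]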
Model parameters are given in table F.1. Figure F.1 shows the DFD of the model.
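The vapor flow equation is a first-order lag toward the steady state value, so its response can be written in closed form. The sketch below compares that closed form with a simple explicit Euler integration, using the parameters of table F.1 and assuming that the vapor flow starts at its initial value at t = 0. The function names and the step size are illustrative choices, not taken from the original implementation.

```python
import math

TAU_V = 0.2     # column heating time constant in h (Table F.1)
L_V0 = 120.0    # initial vapor flow in mol/h (Table F.1)
L_VSS = 166.0   # steady state vapor flow approximation in mol/h (Table F.1)


def vapor_flow(t_h):
    """Closed-form solution of dL_V/dt = (L_V,ss - L_V) / tau_V with L_V(0) = L_V,0."""
    return L_VSS + (L_V0 - L_VSS) * math.exp(-t_h / TAU_V)


def vapor_flow_euler(t_end_h, dt_h=0.001):
    """Explicit Euler integration of the same equation, for comparison."""
    l_v = L_V0
    for _ in range(int(round(t_end_h / dt_h))):
        l_v += dt_h * (L_VSS - l_v) / TAU_V
    return l_v


if __name__ == "__main__":
    for t in (0.0, 0.2, 0.5, 1.0):
        print(f"t = {t:.1f} h: analytic {vapor_flow(t):.2f}, Euler {vapor_flow_euler(t):.2f} mol/h")
```

After one time constant (0.2 h) the vapor flow has covered about 63 % of the distance between its initial and steady state values, which is the usual reading of a first-order time constant.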
F.1.2 Fuzzy submodel
The premise part membership functions of the fuzzy submodel for $x^0_{n+1}$ are given in figure F.2. The model can be represented as follows:
\[ \text{IF } R^* = mf_i \text{ AND } x_{col} = mf_i \text{ AND } L_V = mf_i \text{ THEN } x^0_{n+1} = w_i^{x^0_{n+1}} y_i \quad \text{for } i = 1, \ldots, 3 \tag{F.6} \]
in which
\[ y_i = p_{i,1}^{x^0_{n+1}} R^* + p_{i,2}^{x^0_{n+1}} x_{col} + p_{i,3}^{x^0_{n+1}} L_V + p_{i,4}^{x^0_{n+1}} \tag{F.7} \]
[Data flow diagram with the blocks: mass balance column, vapor flow, component balance column, production and product quality.]
Figure F.1: Simplified overall static model Data Flow Diagram
[Membership function plots for $R^*$ (a), $x_{col}$ (b) and $L_V$; functions mf1 to mf3.]
[Time series plots of $x^0_{n+1}$ (-) against t (h), panels (a) to (c).]
Figure F.3: Performance fuzzy model $x^0_{n+1}$ with respect to identification data ID1 (a), ID2 (b) and ID3 (c)
Parameter          Value        Definition
$\tau_V$           0.2 h        Column heating time constant
$L_{V,0}$          120 mol/h    Initial vapor flow
$L_{V,ss}$         166 mol/h    Steady state vapor flow approximation
$x^0_{n+1,sp}$     0.986        Product quality for which the model is valid
$\tau_x$           0.42 h       Dominant column time constant
Table F.2: Simplified overall dynamic hybrid model parameters
F.2 Simplified overall dynamic model
F.2.1 Model equations
The simplified overall dynamic hybrid model is given by the following set of equations:
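\[ \frac{dM_{col}}{dt} = -L_V (1 - R^*) \tag{F.9} \]
\[ M_{col} \frac{dx_{col}}{dt} = -L_V (1 - R^*)(x^0_{n+1} - x_{col}) \tag{F.10} \]
\[ \frac{dL_V}{dt} = \frac{1}{\tau_V} (L_{V,ss} - L_V) \tag{F.11} \]
\[ \frac{dx^0_{n+1}}{dt} = \frac{1}{\tau_x} (x^* - x^0_{n+1}) \tag{F.12} \]
\[ x^* = f_{fuzzy}(R^*, x_{col}, L_V) \tag{F.13} \]
\[ M_p = M_{col,0} - M_{col} \tag{F.14} \]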
Model parameters are given in table F.2. Figure F.4 shows the DFD of the model.
F.2.2 Fuzzy submodel
The premise part membership functions of the fuzzy submodel for $x^*$ are given in figure F.5. The model can be represented as follows:
\[ \text{IF } R^* = mf_i \text{ AND } x_{col} = mf_i \text{ AND } L_V = mf_i \text{ THEN } x^* = w_i^{x^*} y_i \quad \text{for } i = 1, \ldots, 3 \tag{F.15} \]
in which
\[ y_i = p_{i,1}^{x^*} R^* + p_{i,2}^{x^*} x_{col} + p_{i,3}^{x^*} L_V + p_{i,4}^{x^*} \tag{F.16} \]
with
\[ p^{x^*} = \begin{bmatrix} 0.0250 & 0.0182 & 0.0001 & 0.9767 \\ 0.0582 & 0.0477 & 0.0003 & 0.9151 \\ 0.0194 & 0.0147 & 0.0000 & 0.9613 \end{bmatrix} \qquad w^{x^*} = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} \tag{F.17} \]
[Data flow diagram with the blocks: mass balance column, vapor flow, component balance column, production, product quality forcing and column dynamics.]
Figure F.4: Simplified overall dynamic model Data Flow Diagram
[Membership function plots for $R^*$ (a), $x_{col}$ (b) and $L_V$; functions mf1 to mf3.]
[Time series plots of $x^*$ (-) against t (h), panels (a) to (c).]
Figure F.6: Performance fuzzy model $x^*$ with respect to identification data ID1 (a), ID2 (b) and ID3 (c)
F.2.3 Fuzzy model performance
The performance of the fuzzy submodel for x
$R^*$       Reflux fraction (-)
$p^{[]}$    Consequent part parameters fuzzy models
$w^{[]}$    Rule weights fuzzy models
$\tau_V$    Time constant characterizing column heating (h)
$\tau_x$    Dominant time constant (h)
Summary
Dynamic process modeling in chemical engineering is often based on a combination of first principles and empirical relations. These models are interpretable, in the sense that, by analyzing the model, there is a physical understanding of the process behavior. Many process models consist of a framework of mass, component and energy balances describing the essential process accumulation. Within this framework, phenomena such as chemical reaction or mass transfer can be described by static empirical relations. However, for many processes, information about - or understanding of - these phenomena can be limited. In addition, the behavior can be complex. This may result in difficulties during the derivation of empirical relations.
Hybrid fuzzy-first principles models can be a useful alternative in these situations. By combining fuzzy logic submodels with a physical model framework, hybrid fuzzy-first principles models are obtained that combine a high level of interpretability with the ability to deal with complex behavior. Hybrid fuzzy-first principles models are especially suited to describe highly nonlinear behavior over a large operating domain. Examples are models of batch or fed-batch processes, cyclic processes or distributed parameter processes, such as plug flow reactors.
This work deals with the development and analysis of hybrid fuzzy-first principles models. A model structure is proposed in which hybrid models consist of a framework of dynamic mass and energy balances, formulated in state-space form, supplemented with algebraic and fuzzy equations. The fuzzy equations describe physical phenomena that are poorly understood or difficult to model using first principles. The proposed structure guarantees a certain level of transparency, since internal variables and their behavior can be interpreted. The quality of hybrid models is assessed by evaluating dynamic performance, static performance, complexity, interpretability and process independence.
The main sources of information during hybrid model development are first principles, process data and human expertise. First principles are used to derive the model structure. Although fuzzy logic is extremely suitable to quantify expert knowledge, it is often difficult to integrate this knowledge in a predefined hybrid model structure. Therefore, expert knowledge is only used to provide structural information. Process data is used to determine model parameters and identify the fuzzy equations.
The type of fuzzy model which is used is the Takagi-Sugeno-Kang (TSK) type. This type
can be interpreted as a collection of local linear models. TSK fuzzy models are extremely
suitable for describing highly nonlinear relations. Although linguistic fuzzy models can be
interpreted better by humans, they require a more complex structure to describe a relation
than TSK models. The advantage of using TSK type fuzzy models is reduced complexity in
combination with structural transparency.
For the development of hybrid models, a structured modeling approach is presented. This approach consists of several independent steps, divided into three phases. In the first phase, the model objective and quality requirements are formulated. The second phase involves the design of the model. In the design phase, the modeling problem is divided into several smaller problems, which can be solved sequentially: structure design, behavior estimation, identification and optimization. In the third phase, the hybrid model is evaluated with respect to the model requirements.
The hybrid model structure is determined by formulating process behavior hypotheses. These
hypotheses describe the essential characteristics of the process behavior. Based on these
hypotheses, basic model equations can be derived. In addition, the phenomena that will be
described with fuzzy logic are distinguished.
Subsequently, the process data required for identification is obtained. Most fuzzy model identification techniques require input-output data. To acquire input-output data, appropriate experiments need to be designed. Estimation techniques can be used to estimate behavior which is not directly measurable. For this purpose, Kalman filtering and PI-estimation were compared. The PI-estimator is structurally similar to a PI feedback controller. Its performance matches the Kalman filter, but the PI-estimator is easier to set up.
Three different approaches for the identification of fuzzy models were compared: fuzzy clustering, which searches for linear subspaces in data; genetic algorithms, a probabilistic optimization technique which can be used to determine fuzzy model parameters; and neuro-fuzzy methods, in which the fuzzy model is interpreted as an artificial neural network. All approaches yielded acceptable results. However, fuzzy clustering was preferred, since it requires less a priori structure information. This makes the approach useful in situations where little information about the modeled phenomenon is available.
The hybrid model is formed by combination of the fuzzy models and the physical model framework. Since the fuzzy models are identified individually, it may be necessary to optimize the fuzzy model parameters with respect to the model output in order to improve the overall hybrid model performance. The best results are obtained if the optimization is focused on the consequent parameters of the fuzzy model; these are the parameters of the local linear models. The number of parameters to be optimized can be reduced by optimizing rule or model weights.
The modeling approach was illustrated using three different cases. A simple simulated fed-batch reactor was used for the development of the approach. To analyze hybrid model properties, a hybrid model of a simulated continuous pulp digester was built and compared with a first principles model and a fuzzy model. A batch distillation column was used to illustrate hybrid modeling of an experimental setup.
The hybrid models have shown that the use of fuzzy logic in hybrid modeling introduces flexibility, which enables the description of complex behavior with a predefined, interpretable overall model structure. This is accomplished since the fuzzy submodels describe complex behavior in a transparent way without the need for an a priori fuzzy model structure. The need for such a structure would reduce flexibility. The result of this flexibility is that hybrid fuzzy-first principles models can match the desired behavior if the model structure represents the essential dynamic characteristics of the process.
Since process dynamics are modeled with a similar structure, dynamic performance, complexity and interpretability of hybrid models and first principles models are comparable. With respect to static performance and process independence, hybrid models are more comparable with fuzzy models. Depending on the number of fuzzy equations, the static performance of the hybrid model is based on observed behavior, similar to fuzzy models. The fuzzy equations in hybrid models are data driven, which imposes a level of process dependence. The fuzzy models are valid in the operating regime that is represented by the identification data. The first principles part can only partly compensate for limited validity of the fuzzy models outside this operating regime.
The result is that hybrid fuzzy-first principles models are useful in applications which focus on a specific installation and require dynamic models that are transparent and provide a general explanation of the process behavior.
Samenvatting
Within chemical engineering, dynamic modeling is often based on a combination of physical principles and empirical relations. These models are interpretable: by analyzing the model, a physical understanding of the behavior of the model can be obtained. Many models consist of a framework of mass, energy and component balances that describe the essential process accumulation. Within this framework, phenomena such as chemical reaction and mass transfer can be described with empirical relations. However, for many processes, information about - or insight into - these phenomena can be limited. In addition, the behavior of the process can be complex, which can lead to difficulties in setting up empirical relations.
Hybrid fuzzy/physical models can be a useful alternative in these situations. By combining fuzzy submodels with a physical framework, hybrid fuzzy/physical models are obtained that combine a high degree of interpretability with the ability to describe complex behavior. Hybrid fuzzy/physical models are particularly suited to describing nonlinear behavior within a large operating domain. Examples are batch processes, cyclic processes or distributed parameter processes, such as tubular reactors.
This work deals with the design and analysis of hybrid fuzzy-first principles models. A
model structure is proposed in which hybrid models consist of a framework of dynamic mass
and energy balances, formulated as state equations, supplemented with algebraic and fuzzy
equations. The fuzzy equations describe physical phenomena that are poorly understood or
difficult to model from first principles. The proposed structure guarantees a certain
transparency because internal variables and their behavior can be interpreted. The quality
of hybrid models is determined by assessing the dynamic and static performance, the
complexity, the interpretability and the process independence.
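As an illustration of this structure, the following minimal Python sketch combines one state
equation (a component balance) with one fuzzy equation for a reaction rate. The reactor, the
function names fuzzy_rate and simulate, the membership functions and all numerical values are
assumptions made for this example and are not taken from the thesis.

```python
import numpy as np

def fuzzy_rate(c):
    """Hypothetical two-rule TSK submodel for a reaction rate r(c)."""
    # Membership degrees of 'low' and 'high' concentration (they sum to 1).
    mu_high = np.clip(c / 2.0, 0.0, 1.0)
    mu_low = 1.0 - mu_high
    # Local linear consequents (illustrative numbers only).
    r_low = 0.8 * c
    r_high = 0.2 * c + 1.2
    return mu_low * r_low + mu_high * r_high

def simulate(c0=0.0, c_in=2.0, dilution=0.5, dt=0.01, n_steps=2000):
    """Integrate dc/dt = D * (c_in - c) - r(c) with the explicit Euler method."""
    c = c0
    trajectory = [c]
    for _ in range(n_steps):
        # State equation: component balance with the fuzzy rate as sink term.
        dcdt = dilution * (c_in - c) - fuzzy_rate(c)
        c = c + dt * dcdt
        trajectory.append(c)
    return np.array(trajectory)

trajectory = simulate()
print(f"final concentration: {trajectory[-1]:.4f}")
```

Only the rate expression is data driven in this sketch; the balance keeps its physical
meaning, which is what keeps an internal variable such as the concentration interpretable.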
The most important sources of information for building hybrid models are first principles,
process data and human expertise. First principles are used to determine the model
structure. Although fuzzy logic is well suited to quantifying expert knowledge, it is often
difficult to integrate this knowledge into a fixed model structure. Therefore, expert
knowledge is used only to indicate model structures. Process data are used to determine the
model parameters and the fuzzy submodels.
The type of fuzzy model used is the Takagi-Sugeno-Kang (TSK) type. This type can be
interpreted as a collection of local linear models. TSK fuzzy models are well suited to
describing highly nonlinear behavior. Although linguistic fuzzy models are easier for humans
to interpret, they need a more complex structure to describe a relation than TSK fuzzy
models do. The advantage of using TSK models is reduced complexity combined with a
transparent structure.
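The interpretation of a TSK model as a collection of local linear models can be sketched as
follows. The Gaussian membership functions, the rule parameters and the function name
tsk_output are illustrative assumptions, not the models identified in this work.

```python
import numpy as np

def tsk_output(x, centers, widths, slopes, offsets):
    """Evaluate a single-input TSK model as a weighted mean of local linear models."""
    # Degree of fulfilment of each rule: Gaussian membership of x (assumed shape).
    mu = np.exp(-0.5 * ((x - centers) / widths) ** 2)
    # Consequent of rule i is the local linear model y_i = a_i * x + b_i.
    y_local = slopes * x + offsets
    # Model output: normalized weighted average of the rule outputs.
    return float(np.sum(mu * y_local) / np.sum(mu))

# Three hypothetical rules centered at x = 0, 5 and 10.
centers = np.array([0.0, 5.0, 10.0])
widths = np.array([2.0, 2.0, 2.0])
slopes = np.array([1.0, 0.0, -1.0])
offsets = np.array([0.0, 5.0, 10.0])

# At x = 5 every local model evaluates to 5, so the blended output is 5 as well.
y_mid = tsk_output(5.0, centers, widths, slopes, offsets)
print(y_mid)
```

Because the output is a convex combination of the rule outputs, each rule can be read as a
linear model that dominates near its own center, which is the source of the transparency
mentioned above.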
A structured modeling approach has been presented for the development of hybrid models.
This approach consists of several independent steps, divided over three phases. In the first
phase, the modeling objective and quality requirements are formulated. The second phase
concerns the design of the model. In the third phase, the model is evaluated against the
quality requirements.
The structure of the hybrid model is determined by formulating behavioral hypotheses.
These hypotheses describe the essential characteristics of the process behavior. Starting
from these hypotheses, the basic equations of the model can be derived. The phenomena that
are to be described with fuzzy logic are also identified at this stage.
Next, the required process data are collected. Most identification techniques for fuzzy
models need input and output data. To obtain the data, suitable process experiments must be
designed. In addition, estimation techniques can be used to estimate behavior that cannot be
measured directly. For this application, the Kalman filter and the PI estimator have been
compared. The PI estimator is structurally similar to a PI controller. Its performance is
comparable to that of the Kalman filter, but the PI estimator is much easier to set up.
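To make the analogy with a PI controller concrete, the sketch below estimates an unknown rate
constant by feeding the mismatch between a model output and the measured output through a
proportional and an integral term. The first-order process, the gains and the name
pi_estimate are hypothetical choices for illustration; the estimator used in the thesis may
differ in its details.

```python
def pi_estimate(k_true=0.5, k_init=0.2, gain_p=1.0, gain_i=1.0,
                dt=0.01, n_steps=4000):
    """Estimate k in dx/dt = -k * x + u by PI feedback on the model mismatch."""
    u = 1.0          # constant, known input
    x_meas = 1.0     # "measurement", generated here by simulating the true process
    x_model = 1.0    # state of the model that runs in parallel
    integral = 0.0
    k_hat = k_init
    for _ in range(n_steps):
        # A positive error means the model decays too slowly, so k_hat must rise.
        error = x_model - x_meas
        integral += error * dt
        k_hat = k_init + gain_p * error + gain_i * integral
        # Explicit Euler step for the true process and for the model.
        x_meas += dt * (-k_true * x_meas + u)
        x_model += dt * (-k_hat * x_model + u)
    return k_hat

k_estimate = pi_estimate()
print(f"estimated k: {k_estimate:.4f}")
```

In this sketch the only tuning parameters are the two gains, which illustrates why such an
estimator can be simpler to set up than a Kalman filter with its noise covariance matrices.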
Three different techniques have been compared for the identification of the fuzzy
submodels: fuzzy clustering, which searches for linear subspaces; genetic algorithms, a
probabilistic optimization technique that can be used to determine the parameters of a fuzzy
model; and neurofuzzy methods, in which the fuzzy model is regarded as a neural network. All
techniques gave acceptable results. However, fuzzy clustering is preferred because this
technique needs less a priori information about the structure of the fuzzy model. This makes
the approach suitable in situations where little information is available about the
phenomenon to be modeled.
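A minimal sketch of fuzzy clustering is given below. For brevity it uses the basic fuzzy
c-means algorithm with point prototypes, which is simpler than the clustering variants that
search for linear subspaces. The synthetic data, the function name fuzzy_c_means and the
settings are assumptions for illustration only.

```python
import numpy as np

def fuzzy_c_means(data, n_clusters, m=2.0, n_iter=100, seed=0):
    """Basic fuzzy c-means: returns cluster centers and the membership matrix."""
    rng = np.random.default_rng(seed)
    # Random initial memberships, normalized so each data point sums to 1.
    u = rng.random((n_clusters, data.shape[0]))
    u /= u.sum(axis=0)
    for _ in range(n_iter):
        # Centers are means of the data weighted by memberships to the power m.
        w = u ** m
        centers = (w @ data) / w.sum(axis=1, keepdims=True)
        # Euclidean distance of every point to every center, shape (clusters, points).
        dist = np.linalg.norm(data[None, :, :] - centers[:, None, :], axis=2)
        dist = np.fmax(dist, 1e-12)  # guard against division by zero
        # Membership update: inversely related to the relative distance.
        inv = dist ** (-2.0 / (m - 1.0))
        u = inv / inv.sum(axis=0)
    return centers, u

# Synthetic two-cluster data around (0, 0) and (5, 5).
rng = np.random.default_rng(1)
data = np.vstack([rng.normal(0.0, 0.3, size=(50, 2)),
                  rng.normal(5.0, 0.3, size=(50, 2))])
centers, memberships = fuzzy_c_means(data, n_clusters=2)
print(np.round(centers, 2))
```

Apart from the number of clusters and the fuzziness exponent m, nothing about the shape of
the relation has to be specified beforehand, which matches the argument that clustering
needs little a priori structural information.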
Combining the fuzzy models with the first principles framework gives the hybrid model.
Because the fuzzy models are determined separately, it may be necessary to optimize the
parameters of the fuzzy models to improve the performance of the hybrid model. The best
results are obtained when the THEN-part (consequent) parameters of the fuzzy models are
optimized; these are the parameters of the local linear models. The number of parameters
that has to be optimized can be reduced by optimizing rule weights or model weights.
The modeling approach has been illustrated on three different processes. A simple simulated
fed-batch reactor was used to develop the approach. To analyze the properties of hybrid
models, a hybrid model of a (simulated) continuous pulp digester was built. This model was
compared with a first principles model and a fuzzy model. Finally, a batch distillation
column was used to illustrate hybrid modeling of an experimental setup.
The hybrid models have shown that the use of fuzzy logic in hybrid modeling introduces
flexibility, so that complex behavior can be described with a predefined, interpretable
model structure. This is achieved because the fuzzy submodels describe complex behavior in a
transparent way, without an a priori model structure having to be fixed for the fuzzy
models, which would reduce the flexibility. The result of this flexibility is that hybrid
fuzzy-first principles models can describe the desired behavior, provided that the model
structure represents the essential dynamic characteristics of the process.
Because the dynamics in first principles models and hybrid models are described in the same
way, the dynamic performance, complexity and interpretability of these models are
comparable. With respect to static performance and process independence, hybrid models are
more comparable with fuzzy models. Depending on the number of fuzzy submodels, the static
performance of hybrid models is based on observed behavior, similar to fuzzy models. The
fuzzy elements of hybrid models are based on process data, which imposes a degree of process
dependence. The fuzzy submodels are valid in the operating domain that the process data
represent. The first principles part can only partly compensate for limited validity outside
this domain.
The result is that hybrid fuzzy-first principles models are suitable for applications that
focus on a specific installation and that require dynamic models that are transparent and
give a general explanation of the behavior.
Acknowledgments
A PhD project like this is quite an undertaking. The thesis that now lies before you could
not have come about without the support of a group of enthusiastic people, whom I would
therefore like to thank.
Brian Roffel and Ben Betlem supervised me during the project. Brian, the idea for the
project came from you. As early as the start of my graduation assignment you got me
interested in research, and you soon gave me the opportunity to shape the project further.
Your pragmatic approach and result-oriented thinking cleared away many obstacles and led to
a number of nice results. Ben, you were more the philosopher, and the discussions with you
were of great importance in determining the direction. Digressing during our conversations
made sure that many subjects could be placed in a proper context, even though we often ended
up back where we started.
The thesis could never have been completed without the large amount of work done by the big
group of MSc students. By subject, these were: Pieter Jansen van der Sligte (recording
experiential knowledge), Peter Deurenberg (TN, processing experiential knowledge), Henri
Witteveen (parameter estimation, but also a colleague for a year and a formidable squash
opponent!), Coen Noppert (genetic algorithms), Yücel Kök (Rijksuniversiteit Groningen,
neurofuzzy systems), Freek Stoffelen (TN, submodel integration, "de Pot van Freek"), Maarten
Jansen (model properties), Simone Bijl (model properties) and Barry Meddeler (model
properties). Unfortunately, the work of Tobias op den Brouw has not been included directly
in the thesis. Tobias, although the AWZI (wastewater treatment plant) turned out not to be a
suitable case for the project, your work did provide more insight into the possibilities and
impossibilities of using industrial data, and in this way it has been valuable in
determining where hybrid modeling can be applied. Guys, the various subjects could never
have been covered without your efforts. I have always found working with you one of the
finest aspects of doing a PhD. I also appreciate the contact that we had, or still have,
outside work. Thank you!
I would also like to thank the members of the various D-committees for their contributions:
Henk van den Beld (CT), Ties Bos (CT), Herman Hemmes (TN), Hans Kuipers (CT), Sietse van der
Meulen (TN), Ruud Visscher (Parenco BV), Sjoerd de Vries (TO), Theo de Vries (EL) and Jan
Wattenberg (Parenco BV).
My paranymphs Jan Jelle Sijbesma and Ralf Heijkants helped me through the final stretch.
Guys, thank you for standing by me. I would also like to thank everyone in the research
group for the pleasant time over the past years. Bartie, you will always be my sweetheart.
Finally, I would like to thank my parents for their unconditional support. Although you were
not always aware of what exactly I was working on, you have always encouraged me and given
me the opportunity to continue learning. Without you I could not have achieved this.
Bibliography
Alexandridis, A., H. Sarimveis and T. Retsina (2001). Modeling and control of continuous
digesters using the pls methodology. In: Pulp Digester Modeling and Control Workshop.
Annapolis (USA).
Atkinson, K. E. (1989). Numerical analysis. 2nd ed.. John Wiley and Sons. New York (USA).
Babu ska, R. (1996). Fuzzy modeling and identication. Ph.d. thesis. Delft Technical Univer-
sity. Delft (The Netherlands).
Babu ska, R. and H. B. Verbruggen (1994). Applied fuzzy modeling. In: Proceedings IFAC
Symposium on Articial Intelligence in Real Time Control. Valencia (Spain). pp. 6166.
Babu ska, R., H. J. L. Van Can and H. B. Verbruggen (1996). Fuzzy modeling of enzymatic
penicillin-g conversion. In: Preprints 13th IFAC world congress. Vol. N. San Francisco
(USA). pp. 479484.
Back, T. and F. Kursawe (1995). Evolutionary algorithms for fuzzy logic: a brief overview.
In: Advances in fuzzy systems - applications and theory (B. Bouchon-Meunier, R. R.
Yager and L. A. Zadeh, Eds.). Vol. 4. pp. 310. World Scientic. Singapore.
Baranyi, P., I. Bavelaar, L. T. Koczy and A. Titli (1997). Inverse rule base of various fuzzy
interpolation techniques. In: Seventh IFSA World Congress. Prague. pp. 121126.
Baranyi, P., I. M. Bavelaar, R. Babuska, L. T. Koczy, A. Titli and H. B. Verbruggen (1998).
A method to invert a linguistic fuzzy model. International Journal of Systems Science
29(7), 711721.
Bastin, G. and D. Dochain (1990). On-line estimation and adaptive control of bioreactor.
Vol. 1 of Process Measurement and Control. Elsevier. Amsterdam.
Betlem, B. H. L. (1997). Hierarchical control of batch distillation. Ph.d. thesis. University of
Twente. Enschede (The Netherlands).
Betlem, B. H. L. (2000). Batch distillation column low-order models for quality program
control. Chemical Engineering Science 55, 31873194.
Betlem, B. H. L., H. Krijnsen and H. Huijnen (1998). Optimal batch distillation control based
on specic measures. Chemical Engineering Journal 71, 111126.
Bezdek, J. C. (1981). Pattern Recognition with Fuzzy Objective Function Algorithms. Plenum
Press. New York.
Bezdek, J. C., C. Coray, R. Gunderson and J. Watson (1981). Detection and characterization
of cluster substructure, ii. fuzzy c-varieties and convex combinations thereof. SIAM
Journal of Applied Mathematics 40(2), 339357.
Boehm, B. W. (1973). Characteristics of software quality. Technical Report TRW-SS-73-
09. Thompson Ramo Wooldridge Systems Group, Systems Engineering and Integration
Division.
Bibliography
Bohlin, T. (1994). A csae study of grey box identication. Automatica 30(2), 307318.
Box, G. E. P. and G. M. Jenkins (1970). Time series analysis; forecasting and control.
Holden-Day series in time series analysis. Holden-Day. San Francisco.
Box, G. E. P., W. G. Box and J. S. Hunter (1978). Statistics for experimenters. Wiley series
in probability and mathematical statistics. John Wiley and Sons. New York (USA).
Branch, M. A., T. F. Coleman and Y. Li (1999). A subspace, interior and conjugate gradi-
ent method for large-scale bound-constrained minimization problem. SIAM Journal on
Scientic Computing 21(1), 123.
Bristol, E. H. (1966). On a newmeasure of interaction for multivariable process control. IEEE
Transactions on Automation and Control AC-11, 133.
Brown, R. G. and P. Y. C. Hwang (1992). Introduction to randomsignals and applied Kalman
ltering. 2nd ed.. John Wiley and Sons. New York (USA).
Byrd, R. H., R. B. Schnabel and G. A. Shultz (1988). Approximate solution of the trust region
problemby minimization over two-dimansional subspaces. Mathematical programming
40, 247263.
Cagan, J., I. E. Grossman and J. Hooker (1996). Combining articial intelligence and opti-
mization in engineering design: a brief survey. AIChE Symposium series 92(312), 110
118.
Caracotsios, M. and W. E. Stewart (1985). Sensitivity analysis of initial value problems with
mixed odes and algebraic equations. Computers in Chemical Engineering 9(4), 359
365.
Crosby, P. B. (1979). Quality is free: The art of making quality certain. McGraw-Hill. New
York (USA).
da Costa Sousa, J. M. (1998). A fuzzy approach to model-based control. Ph.d. thesis. Tech-
nical University Delft. Delft (The Netherlands).
da Costa Sousa, J. M., R. Babuska and H. B. Verbruggen (1997). Fuzzy predictive control
applied to an air-conditioning system. Control engineering practice 5(10), 13951406.
De Bruin, H. A. E. and B. Roffel (1996). A new identication method for fuzzy linear models
of nonlinear dynamic systems. Journal of Process Control 6(5), 227293.
Dubois, D. and H. Prade (1993). Fuzzy sets: a survey of engineering applications. Computers
and chemical engineering 17(Supp), S373S380.
Dufour, P., S. Bhartiya, P. S. Dhurjati and F. J. Doyle III (2001). A neural network approach
for diagnosis in a continuous pulp digester. In: Pulp Digester Modeling and Control
Workshop. Annapolis (USA).
Dunn, I. J., J. Heinzle, J. Ingham and J. E. Prenosil (1992). Biological reaction engineering,
principles, applications and modelling with PC simulation. VCH Verlagsgesellschaft.
Weinheim (Germany).
260
Dunn, J. C. (1974). A graph theoretic analysis for pattern classication via tamuras fuzzy
relation. IEEE Transactions on Systems, Man and Cybernetics 4(3), 310313.
Edgar, T. F. and D. M. Himmelblau (1988). Optimization of chemical processes. McGraw-
Hill. New York (USA).
Eykhoff, P. (1974). System Identication. John Wiley and Sons. London (UK).
Fischer, M. and R. Isermann (1996). Robust hybrid control based on inverse fuzzy process
models. In: Proceedings of the Fifth IEEE International Conference on Fuzzy Systems.
New Orleans (USA). pp. 12101216.
Funkquist, J. (1997). Grey-box identication of a continuous digetser - a distributed-
parameter process. Control Engineering Practice 5(7), 919930.
Godasi, S. and A. Palazoglu (2001). Recursive identication of a wiener model to describe
grade transitions in a continuous pulp digester. In: Pulp Digester Modeling and Control
Workshop. Annapolis (USA).
Goldberg, D. E. (1989). Genetic algorithms in search, optimization, and machine learning.
Addison-Wesley. Reading, MA (USA).
Gonzalez, A. J. and D. D. Dankel (1993). The Engineering of Knowledge-based Systems -
Theory and Practice. Prentice Hall. Engelwood Cliffs, NJ (USA).
Gupta, S., P.-H. Liu, S. A. Svoronos, R. Sharma, N. A. Abdel-Khalek, Y. Cheng and H. El-
Shall (1999). Hybrid rst-principles/neural networks model for column otation. AIChE
Journal 45(3), 557566.
Hart, A. (1992). Knowledge acquisition for expert systems. McGraw-Hill. New York (USA).
Hathaway, R. J. and J. C. Bezdek (1993). Switching regression models and fuzzy clustering.
IEEE transactions on Fuzzy Systems 71(3), 503516.
Hertz, J., A. Krogh and R. G. Palmer (1991). Intorduction to the theory of neural computation.
Addison-Wesley. Redwood (USA).
Holst, J., U. Holst, H. Madsen and H. Melgaard (1992). Validation of grey box models. In:
IFAC Adaptive systems in control and design processing. Grenoble (France). pp. 5360.
Hwang, G.-J. (1995). Knowledge acquisition for fuzzy expert systems. International Journal
of Intelligent Systems 10, 541560.
Jahn, E. C. and R. W. Strauss (1992). Industrial chemistry of wood. In: Riegels Handbook
of industrial chemistry (E. R. Riegel and J. A. Kent, Eds.). 9th ed.. pp. 435485. Van
Nostrand Reinhold. New York (USA).
Jain, A. K. and R. C. Dubes (1988). Algorithms for Clustering Data. Prentice Hall. Engle-
wood Cliffs, NJ (USA).
Jang, J.-S. R. (1993). Ans: Adaptive-network -based fuzzy inference system. IEEE Trans-
actions on Systems, Man and Cybernetics 23(3), 665685.
Jansen, M. (2000). Modeling and simulation of a continuous pulp digester. M.sc. thesis. Uni-
versity of Twente. Enschede (The Netherlands).
261
Bibliography
Jansen van der Sligte, P. (1999). Using expert knowledge in fuzzy modeling. M.sc. thesis.
University of Twente. Enschede (The Netherlands).
Johansen, T. A. and B. A. Foss (1997). Operating regime based process modeling and identi-
cation. Computers and chemical engineering 21(2), 159176.
Jordan, M. I. and D. E. Rumelhart (1992). Forward models: Supervised learning with a distal
teacher. Cognitive Science 16(3), 307354.
Kalman, R. E. (1960). A new approach to linear ltering and prediction theory. Journal of
Basic Engineering 82D, 3545.
Kan, S. H. (1994). Metrics and models in software quality engineering. Addison-Wesley.
Reading (USA).
Kavli, T. (1993). Asmodan algorithm for adaptive spline modelling of observation data.
International journal of control 58(4), 947968.
Kaymak, U. and R. Babuska (1995). Compatible cluster merging for fuzzy modeling. In:
FUZZ-IEEE/IFES95. Yokohama, Japan. pp. 897904.
Kendall, M. and A. Stuart (1979). The advanced theory of statistics. Vol. 2. 4th ed.. Charles
Grifn and company. London (UK).
KrishnaKumar, K., P. Gonsalves, A. Satyadas and G. Zacharias (1995). Hybrid fuzzy logic
ight controller synthesis via pilot modeling. Journal of Guidance, Control and Dynam-
ics 18(5), 10981105.
Luyben, W. L. (1990). Process modeling, simulation, and control for chemical engineers. 2nd
ed.. McGraw-Hill. New York (USA).
Marsh, S. (1994). Fuzzy Logic Program. Cortex Communications. Austin, TX (USA).
Nauck, D. and R. Kruse (1997). Function approximation by nefprox. In: Second European
Workshop on Fuzzy Decision Analysis and Neural Networks for Management, Planning
and Optimization. Dortmund, Germany.
Nelles, O. (1997). Lolimot - lokale, lineare modelle zur identikation nichtlinearer, dynamis-
cher systeme. Automatisierungstechnik 45(4), 163174.
Nonaka, I. and H. Takeuchi (1995). The Knowledge-creating Company; How Japanese Com-
panies Create the Dynamics of Innovation. Oxford University Press. New York.
Otto, O. and K. Hartmann (1996). Hybrides modell eines sichters auf der basis kunstlicher
neuronaler netze. Chemie Ingenieur Technik 68(12), 15781581.
Pal, N. R., K. Pal, J. C. Bezdek and T. A. Runkler (1997). Some issues in system identica-
tion using clustering. In: The 1997 IEEE international conference on neural networks.
Houston (USA). pp. 25242529.
Piron, E., E. Latrille and F. Rene (1997). Application of articial neural networks for cross-
ow microltration modelling: black box and semi-physical approaches. Computers
and chemical engineering 21(9), 10211030.
262
Porru, G., C. Aragonese, R. Baratti and A. Servida (2000). Monitoring of a co oxidation re-
actor through a grey-model based ekf observer. Chemical engineering science 55, 331
338.
Psichogios, D. C. and L. H. Ungar (1992). A hybrid neural network-rst principles approach
to process modeling. AIChE Journal 38(10), 14991511.
Qi, H., X.-G. Zhou, L.-H. Liu and W.-K. Yuan (1999). A hybrid neural netwrok-rst princi-
ples model for xed-bed reactor. Chemical Engineering Science 54, 25212526.
Ramirez, W. F. (1994). Process control and identication. Academic Press. San Diego (USA).
Rausis, K. (1998). Verication des lois de commande oue. Ph.d. thesis. Ecole Polytechnique
Federale de Lausanne. Lausanne (Switzerland).
Rippin, D. W. T. (1983). Simulation of single- and multiproduct batch chemical plants for
optimal desing and operation. Computers and chemical engineering 7(3), 137156.
Roffel, B. and P. Chin (1987). Computer control in the process industries. Lewis Publishers.
Chelsea, MI (USA).
Roubos, J. A., P. Krabben, M. Setnes, R. Babuska, J. J. Heijnen and H. B. Verbruggen (1999).
Hybrid model development for fed-batch bioprocesses; combining physical equations
with the metabolic network and black-box kinetics. In: 6th UK Workshop on Fuzzy
Systems. Brunel University, Uxbridge (UK). pp. 231239.
Seinfeld, J. H. and L. Lapidus (1974). Process modeling, estimation, and identication. Vol. 3
of Mathematical methods in chemical engineering. Prentice Hall. Englewood Cliffs
(USA).
Sestito, S. and T. S. Dillon (1994). Automated Knowledge acquisition. Prentice Hall. New
York (USA).
Setnes, M. (1999). Complexity reduction methods for fuzzy systems design. In: Fuzzy Logic
Control Advances in Applications (H. B. Verbruggen and R. Babuska, Eds.). pp. 87
111. World Scientic. Singapore.
Shinskey, F. G. (1984). Distillation control for productivity and energy conservation. 2nd ed..
McGraw-Hill. New York (USA).
Simutis, R., R. Oliveira, M. Manikowski, S. Feyo de Azevedo and A. Lubbert (1997). How
to increase the performance of models for process optimization and control. Journal of
biotechnology 59, 7389.
Smith, C. C. and T. J. Williams (1974). Mathematical modeling, simulation and control of the
operation of kamyr continuous digester for kraft process. Technical Report 64. Purdue
University, PLAIC.
Sneddon, I. N. (1976). Encyclopaedic dictionary of mathematics for engineers and applied
scientists. Pergamon Press. Oxford (UK).
Sohlberg, B. (1998). Supervision and control for industrial processes. Springer Verlag. Berlin
(Germany).
263
Bibliography
Stephanopoulos, G. (1984). Chemical process control: an introduction to theory and practice.
Prentice Hall. Englewood Cliffs, NJ (USA).
Stephanopoulos, G. and C. Han (1996). Intelligent systems in process engineering: a review.
Computers and chemical engineering 20(6/7), 743791.
Takagi, T. and M. Sugeno (1985). Fuzzy identication of systems and its application to mod-
eling and control. IEEE Transactions on Man, Systems and Cybernetics 15(1), 116132.
Tansley, D. S. W. and C. C. Hayball (1993). Knowledge-Based Systems Analysis and Design
- A KADS Developers Handbook. BNR Europe Ltd.. Hertfordshire (UK).
Thompson, M. L. and M. A. Kramer (1994). Modeling chemical processes using prior knowl-
edge and neural networks. AIChE Journal 40(8), 13281340.
Van Can, H. J. L., C. Hellinga, K. Ch. A. M. Luyben, J. J. Heijnen and H. A. B. Te Braake
(1996). Strategy for dynamic process modeling based on neural networks in macro-
scopic balances. AIChE Journal 42(12), 34033418.
Van Can, H. J. L., H. A. B. Te Braake, C. Hellinga, K. Ch. A. M. Luyben and J. J. Heij-
nen (1997). An efcient model development strategy for bioprocesses based on neural
networks in macroscopic balances. Biotechnology and Bioengineering 54(6), 549566.
Van Can, H. J. L., H. A. B. Te Braake, S. Dubbelman, C. Hellinga, K. Ch. A. M. Luyben and
J. J. Heijnen (1998). Understanding and applying the extrapolation properties of serial
gray-box models. AIChE Journal 44(5), 10711089.
Van Lith, P. F., H. Witteveen, B. H. L. Betlemand B. Roffel (2001). Multiple nonlinear param-
eter estimation using pi feedback control. Control Engineering Practice 9(5), 517531.
Venkatasubramanian, V. and S. H. Rich (1988). An object-oriented two-tier architecture for
integrating compiled and deep-level knowledge for process diagnosis. Computational
Chemical Engineering 12, 903921.
Westerterp, K. R., W. P. M. van Swaaij and A. A. C. M. Beenackers (1984). Chemical reactor
design and operation. John Whiley and Sons. Chichester (UK).
Wisnewski, P. A., F. J. Doyle and F. Kayihan (1997). Fundamental continuous-pulp-digester
model for simulation and control. AIChE Journal 43(12), 31753192.
Yager, R. R. and D. P. Filev (1994). Essentials of fuzzy modeling and control. John Wiley and
Sons. New York (USA).
Yang, M. S. (1993). A survey of fuzzy clustering. Mathematical and computer modelling
18(11), 116.
Ying, H. (1994). Sufcient conditions on general fuzzy systems as function approximators.
Automatica 30(3), 521525.
Yourdon, E. (1989). Modern structured analysis. Prentice Hall. Englewood Cliffs (USA).
Zadeh, L. A. (1994). Soft computing and fuzzy logic. IEEE Software 11(6), 4856.
Zhao, J., V. Wertz and R. Gorez (1994). A fuzzy clustering method for the identication of
fuzzy models for dynamical systems. In: 9th IEEE International Symposium on Intelli-
gent Control. Columbus, Ohio (USA). p. 172.
264