Computational Techniques for Process Simulation and Analysis Using MATLAB®
Niket S. Kaisare
MATLAB ® is a trademark of The MathWorks, Inc. and is used with permission. The MathWorks does not warrant the accuracy
of the text or exercises in this book. This book’s use or discussion of MATLAB ® software or related products does not constitute
endorsement or sponsorship by The MathWorks of a particular pedagogical approach or particular use of the MATLAB ® software.
CRC Press
Taylor & Francis Group,
6000 Broken Sound Parkway NW, Suite 300,
Boca Raton, FL 33487-2742
This book contains information obtained from authentic and highly regarded sources. Reasonable efforts have been made to pub-
lish reliable data and information, but the author and publisher cannot assume responsibility for the validity of all materials or the
consequences of their use. The authors and publishers have attempted to trace the copyright holders of all material reproduced in
this publication and apologize to copyright holders if permission to publish in this form has not been obtained. If any copyright
material has not been acknowledged please write and let us know so we may rectify in any future reprint.
Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, transmitted, or utilized in any
form by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying, microfilming,
and recording, or in any information storage or retrieval system, without written permission from the publishers.
For permission to photocopy or use material electronically from this work, please access www.copyright.com (http://www.
copyright.com/) or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400.
CCC is a not-for-profit organization that provides licenses and registration for a variety of users. For organizations that have been
granted a photocopy license by the CCC, a separate system of payment has been arranged.
Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used only for identification
and explanation without intent to infringe.
Preface, xix
Author, xxiii
Chapter 7 ◾ Special Methods for Linear and Nonlinear Equations 273
7.1 GENERAL SETUP 273
7.1.1 Ordinary Differential Equation–Boundary Value Problems 274
7.1.2 Elliptic PDEs 274
7.1.3 Outlook of This Chapter 275
7.2 TRIDIAGONAL AND BANDED SYSTEMS 275
7.2.1 What Is a Banded System? 275
7.2.1.1 Tridiagonal Matrix 276
7.2.2 Thomas Algorithm, a.k.a. TDMA 276
7.2.2.1 Heat Conduction Problem 277
7.2.2.2 Thomas Algorithm 281
7.2.3 ODE-BVP with Flux Specified at Boundary 285
7.2.4 Extension to Banded Systems 288
7.2.5 Elliptic PDEs in Two Dimensions 289
7.3 ITERATIVE METHODS 290
7.3.1 Gauss-Seidel Method 291
7.3.2 Iterative Method with Under-Relaxation 295
7.4 NONLINEAR BANDED SYSTEMS 296
7.4.1 Nonlinear ODE-BVP Example 296
7.4.1.1 Heat Conduction with Radiative Heat Loss 297
7.4.2 Modified Successive Linearization–Based Approach 298
7.4.3 Gauss-Seidel with Linearization of Source Term 302
7.4.4 Using fsolve with Sparse Systems 304
7.5 EXAMPLES 304
7.5.1 Heat Conduction with Convective or Radiative Losses 304
7.5.2 Diffusion and Reaction in a Catalyst Pellet 305
BIBLIOGRAPHY, 527
INDEX, 529
Preface
Students today are expected to know one or more of several computing or simulation tools as part of their curriculum, owing to their widespread use in industry.
MATLAB® has become one of the prominent languages used in research and industry.
MATLAB is a numerical computing environment that is based on a MATLAB scripting
language. MathWorks, the makers of MATLAB, describe it as “the language of technical
computing.” The focus of this book will be to highlight the use of MATLAB in technical
computing or, more specifically, in solving problems in the analysis and simulation of pro-
cesses of interest to engineers.
This is intended to be an intermediate-level book, geared toward postgraduate students,
practicing engineers, and researchers who use MATLAB. It provides advanced treatment
of topics relevant to modeling, simulation, and analysis of dynamical systems. Although
this is not an introductory MATLAB or numerical techniques textbook, it may be used as a companion book for introductory courses. For the sake of completeness, a primer on MATLAB as well as an introduction to some numerical techniques is provided in the Appendices. Since the mid-2000s, we have used MATLAB in elective courses at IIT Madras. The popularity of MATLAB among students led us to start a core undergraduate (sophomore) and a postgraduate (first-year masters) laboratory. In 2016, I started teaching a massive open online course (MOOC) on MATLAB programming on the NPTEL platform.* The first two years of this course had over 10,000 enrolled students. Needless to
say, MATLAB has become an important tool in teaching and research. The focus of all the
above courses is to introduce students to MATLAB as a numerical methods tool. Some of
the students who complete these courses inquire about the next-level courses that would
help them apply MATLAB skills to solve engineering problems. This book may also be used
for this purpose. In introductory courses, a significant amount of time is spent developing the background for the numerical methods themselves. In our effort to make the treatment general and at a beginner's level, we eschew real-world examples in favor of abstracted ones.
For example, we would often introduce a second-order ODE using a generic formulation,
such as y″ + ay′ + b(y − c) = 0. A sophomore who hasn’t taken a heat transfer course may
not yet appreciate a “heating in a rod” problem. In an intermediate-level text, it is more valuable to use a real example, such as T″ + r⁻¹T′ + β(T − Tₐ) = 0. The utility of such
* NPTEL stands for National Programme for Technology Enhanced Learning and is a Government of India−funded initiative
to bring high-quality engineering and science courses on an online (MOOC) platform to enhance students’ learning.
an approach cannot be overstated, since it allows the freedom to introduce some of the
complexity that engineers, scientists, and researchers face in their work.
The value of using real-world examples was highlighted during my experience in industrial R&D, where we used MATLAB extensively. We needed to interface with cross-functional teams in engineering, implementation, and software development, whose members came from a wide range of backgrounds. These interactions exposed me to a new experience: your work must be understood by people with very different backgrounds, who may not speak the same technical language. The codes had to bridge the “language barrier” between teams, and they had to be combined with a reasonably intuitive interface.
interface. I have tried to adopt some of these principles in this book, without moving too far
from the more common pedagogy in creating such a book.
Thus, a practically oriented text that caters to an intermediate-level audience is my objec-
tive in writing this book.
systems. The “process” is the focus. Numerical methods are introduced insofar as is essen-
tial to make a judicious choice of algorithms for simulation and analysis.
PREREQUISITES
Since this is a postgraduate-level text, some familiarity with an undergraduate-level numerical techniques course, or an equivalent, is assumed, though we will review all the relevant concepts at the appropriate stage. Thus, students are not expected to remember the details or nuances of the Newton-Raphson or Runge-Kutta methods, but this book should not be the first time they hear these terms.
Some familiarity with coding (in MATLAB, Fortran, C++, Python, or any other language) will be useful but is not a prerequisite. A MATLAB primer is provided in the Appendix for first-time users of MATLAB. Finally, with respect to writing MATLAB code, I focus on a “doing it right the first time” approach, by bringing in good programming practices that I have learnt over the years. Topics such as commenting and structuring your code, scoping of variables, and so on are covered, not as an afterthought but as an integral part of the discussion. However, these are dealt with more informally than in a “programming language” course.
CHAPTER 1
Introduction
1.1 OVERVIEW
1.1.1 A General Model
This book is targeted toward postgraduate students, senior undergraduates, researchers, and
practicing engineers to provide them with a practical guide for using MATLAB® for process
simulation and numerical analysis. MATLAB was listed among the top ten programming
languages by the IEEE Spectrum magazine in 2015 (a list that was topped by Java, followed by
C and C++). While the basics of MATLAB can be learnt through various sources, the focus
of this book is on the analysis and simulation of processes of interest to engineers.
The terms “analysis” and “simulation” are generic and cover a rather broad spectrum of problems and solution techniques. Engineering is a discipline that deals with the transformation of raw materials, momentum, or energy. Thus, this book will focus on process examples where the variables of interest vary with time and/or space, including the relationship
of these state variables with their properties. I will use an example of a reactor-separator process
in Section 1.1.2 to illustrate this. While this is a chemical engineering example, the treatment in
this book is general enough for other engineering and science disciplines to also find it useful.
The problems considered in this book include ordinary and partial differential equations (ODEs and PDEs), algebraic equations (linear or nonlinear), and combinations thereof. The three sections of this book are organized based on the
computational methodology and analysis tools that will be used for the respective problems.
Section I of this book includes Chapters 2 through 5 and deals with ODE-IVPs (initial
value problems) as well as the problems that can be converted into a standard form that can
be solved with ODE-IVP tools. A generic ODE-IVP is of the type
dy/dt = f(t, y; ϕ)    (1.1)
where
t is an independent variable
y ∈ R n is a vector of dependent variables
ϕ represents parameters
2 ◾ Computational Techniques for Process Simulation and Analysis Using MATLAB®
0 = g(x; ϕ)    (1.2)
where
x ∈ R m is a vector of dependent variables
ϕ represents parameters
Nonlinear algebraic equations, such as Equation 1.2, fall under this category. Moreover,
ODE-BVPs (boundary value problems) and several PDEs are also converted into the form
of Equation 1.2. Section II will not only cover techniques to solve algebraic equations but
also expound methods to convert ODEs/PDEs to this form. A combination of Equations 1.1
and 1.2, called differential algebraic equations (DAEs), is covered in Chapter 8. Chapters 5
and 9 are the concluding chapters of the first two sections. They build on the concepts from
the preceding chapters in the respective sections for the analysis of dynamical systems and
provide an introduction to advanced topics in simulations.
Finally, Chapter 10, included in Section III, deals with the parameter estimation prob-
lem, that is, to compute the parameter vector, ϕ, that best fits the experimental data.
F dxA/dV = −r(xA),  xA|V=0 = xin    (1.3)
FIGURE 1.1 A typical process consisting of a reactor and a separator, with a recycle.
The reactor outlet conditions are obtained by solving the ODE-IVP above. ODE-IVP prob-
lems are covered in Chapter 3 of this book. If a dynamic response of the PFR is required,
the resulting model is a PDE, where the state variable of interest varies in both space and
time. Solutions to transient PDEs are covered in Chapter 4. Advanced topics in simulation
are presented in Chapter 5, for example, when the inlet conditions or model parameters
vary with time and/or space.
The distillation column model consists of N nonlinear algebraic equations in N unknowns (mole fractions on each tray). For example, the model equation for the ith tray is given by

0 = (Lᵢ₋₁xᵢ₋₁ − Lᵢxᵢ) + (Vᵢ₊₁yᵢ₊₁ − Vᵢyᵢ),  where  yᵢ = αxᵢ / (1 + (α − 1)xᵢ)    (1.4)
Such balance equations are written for each ideal stage of the distillation column, resulting
in N nonlinear algebraic equations that need to be solved simultaneously to obtain N vari-
ables. These are further discussed in Chapter 6.
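As an illustration of how such stage-wise residuals might be assembled, the following sketch evaluates the balance of Equation 1.4 for the interior stages. The function name, argument layout, and restriction to interior trays are assumptions for illustration; the boundary stages would additionally need feed, condenser, and reboiler terms.

```matlab
function g = trayResiduals(x, L, V, alpha)
% TRAYRESIDUALS  Residuals of interior-stage balances (illustrative sketch)
y = alpha*x./(1 + (alpha - 1)*x);   % equilibrium mole fractions, Eq. (1.4)
N = length(x);
g = zeros(N,1);
for i = 2:N-1                       % interior stages only
    g(i) = (L(i-1)*x(i-1) - L(i)*x(i)) + (V(i+1)*y(i+1) - V(i)*y(i));
end
end
```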
Axial dispersion is neglected while deriving the model (Equation 1.3). Inclusion of the
axial dispersion term converts this IVP to a BVP, which is covered in Chapter 7. Discretizing
the ODE-BVP results in a set of equations with a special matrix structure. Mass transfer
limitations result in DAEs, which are covered in Chapter 8.
A primer on MATLAB is instead provided in Appendix A. This book follows the principle of “learn it right the first time.” Good programming hygiene in writing MATLAB codes is evangelized and implemented right from the first example. The book follows another principle,
that the best way to learn programming is through extensive practice. MathWorks, the
parent company that develops MATLAB, has good introductory video tutorials, available
at: http://in.mathworks.com/products/matlab/videos.html.* A beginner may want to start
with their “Getting Started” videos.† I also have an introductory MOOC course on using
MATLAB for numerical computations on National Programme for Technology Enhanced
Learning.‡
Figure 1.2 shows a screenshot of the MATLAB window. The main section contains two windows: the MATLAB editor at the top and the MATLAB command window at the bottom. The
MATLAB editor currently shows the MATLAB file firstFlowSheet.m, which is a
“driver script” to simulate the reactor-separator flow sheet described above. Line number
13 shows the following statement:
[F,x,err] = solveFlowSheet(Ffeed,Vpfr,purge,initVal);
A careful look at the directory listing (left-top window) shows a file solveFlowSheet.m.
This is a MATLAB function file. A brief description of MATLAB script and function files
is presented in Appendix A (Section A.6). In this example, this function takes in the input
and initial conditions for the flow sheet and computes all the flowrates and mole fractions.
These are returned to the calling program and are captured in variables F and x (which are
standard notations for flowrate and mole fraction, respectively). Some of the basic principles
of programming in MATLAB, illustrated in Figure 1.2, include the following:
• Codes are sectioned using the sections feature* of the MATLAB editor. Each program has input, executing, and output blocks. The sanctity of these blocks is maintained, as far as possible.
• Codes are well-commented.
• Codes are modular, where each MATLAB function is intended to perform a
specific task.
• Appropriate use of MATLAB functions and scripts.
• Using standard or descriptive names for variables.
• Careful attention to variable definitions and scoping.
These aspects of MATLAB coding are described in this section. Another aspect discussed presently is the use of MATLAB's powerful matrix and linear algebra capabilities to write efficient and highly readable codes. Some of these features will be described using the factorial and the Maclaurin series expansion of the exponential function:
e^x = 1 + x + x^2/2! + ⋯ + x^n/n! + ⋯    (1.5)
%% User Inputs
n = input('Please enter a number: ');
%% Computing factorial
fact = 1;
for i = 1:n
    fact = fact*i;
end
%% Display Results
disp([num2str(n), '! is ', num2str(fact)]);
The code consists of three sections: (i) user input section, where the input parameters for
the code are specified; (ii) computing section, which forms the main core of the code; and
(iii) results/output section, that is, the last line where the result is displayed. The lines start-
ing with “%%” (two percentage signs) as the first two characters in a line indicate the start
of a new section. Sectioning helps keep the code clean and easy to understand. Additionally,
comments are provided in the code to help readability.
PERSONAL EXPERIENCE
While this may seem unnecessary for short codes, good coding habits need to be devel-
oped right from the start. It is generally accepted that commenting makes codes readable
and shareable with others. However, the advantages of structuring and commenting your code go beyond that. Comments help us understand our own codes better and faster. They aid coding by making one think about the skeletal structure of the code before starting to write it, thus reducing the number of errors. They also reduce the time taken to debug a code.
The above code is how most students write their first code in MATLAB. However, this
does not take advantage of the powerful array operations provided by MATLAB. For
example, the command prod(1:n) will calculate the product of vector [1:n], which
is nothing but the factorial itself. Indeed, the entire computing section can be replaced
by a single line:
fact = prod(1:n);
If factorials for all integers from 1 through n are desired, then it is more efficient to use
cumprod(1:n). This will return a 1 × n vector, whose ith element is i!. At the command
prompt, type ‘help cumprod’ to understand its use.
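As a quick illustration of both commands at the command prompt (the value n = 5 is chosen for illustration):

```matlab
n = 5;
disp(prod(1:n));      % 120, i.e., 5!
disp(cumprod(1:n));   % 1  2  6  24  120, i.e., 1!, 2!, ..., 5!
```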
1.2.2 MATLAB® Functions
A MATLAB script, shown in Example 1.1, is a MATLAB file that contains statements that
are executed sequentially by MATLAB (as if they were typed on the command prompt).
The script is executed by calling it using the file name. The script shares workspace with
the entity that calls it; in most cases, that is the command window workspace. MATLAB
functions, on the other hand, are intended to execute commands (which may call other
functions or subfunctions) for a specific purpose, with a set of input and output parameters.
The variables defined in a function have a local scope. In other words, these variables are
not available in the MATLAB workspace. The variables defined in the script, on the other
hand, are available in the main MATLAB workspace.
The first line in a MATLAB function is the function definition:
function [out1, out2, ...] = funName(in1, in2, ...)
Here, function is a keyword that defines function and funName is the name of the
function. The function command also defines a comma-separated list of input and output
variables. The name of the m-file containing the function should be the same as the function name.* If the file name differs from the function name, the function is known to MATLAB by its file name. It is good programming practice to use the same name for the function and the m-file.
Let us redo the factorial problem of Example 1.1 using the function.
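Based on the loop of Example 1.1, such a function may be sketched as follows (this is a minimal version; the full Example 1.2 listing may differ in commenting and error checks):

```matlab
function fact = myFact(n)
% MYFACT  Compute the factorial of a nonnegative integer n
fact = 1;
for i = 1:n
    fact = fact*i;
end
end
```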
The above function is saved in a file, myFact.m, which has the same name (without the
extension “.m”) as the function name. myFact calculates the factorial of n when invoked
at command prompt:
>> fact = myFact(5)
fact =
120
* An exception is a nested or daughter function, which will not be considered in this book.
When the first example is executed, the commands in the script file are executed in the same
workspace as the MATLAB command workspace. Thus, all the variables, n, fact, and i,
are available in the command workspace. If, on the other hand, the function myFact is used, only the variable fact appears in the MATLAB workspace. If user interaction is required,
similar to that in Example 1.1, a separate MATLAB script may be written as
% File: factDriver.m
% User Input
n = input('Please enter a number: ');
% Calculate factorial and display
disp([num2str(n), '! is ', num2str(myFact(n))]);
Following are a few tips regarding functions and scripts. Let us take an analogy with C++. When it comes to large projects, I treat a “driver” script like the main() function of C++. While this analogy is not exact, it is a good programming practice in MATLAB to have a single driver script, with all other tasks executed in individual functions.
Alternatively, the way to think about scripts vs. functions (usually for stand-alone prob-
lems) is to look at the purpose of the file. If the file executes a sequence of commands toward
a particular end, a script is perhaps more suitable; for example, a script fits the computation of the factorial when the sole purpose is to compute the factorial. A function, on the other hand, may be used when the computation is to be reused elsewhere.
The above are good practices and not really a syntactic requirement from MATLAB. Thus, if
one’s purpose is to compute a factorial, Example 1.1 should be used; instead, if the purpose
is to use the factorial in Maclaurin series expansion, Example 1.2 is the chosen alternative.
With this background, the following example will show the computation of Maclaurin
series expansion of ex as per Equation 1.5.
%% Input values
x = 2;
n = 10;
%% Calculation of e^x
expVal = 1.0;
for i = 1:n
currTerm = x^i/myFact(i); % Calculate i-th term
expVal = expVal + currTerm;
end
%% Displaying results
disp(['exp(2) = ', num2str(expVal)]);
exp(2) = 7.3889
The above example is shown for pedagogical purposes. This is a very inefficient way of
coding; the example is primarily used to introduce the reader to the difference between
functions and scripts and to show how to use functions (myFact.m in this example) in
MATLAB. The inefficiency is that a large number of computations are unnecessarily repeated in calculating the value of currTerm in each for loop iteration. A better way to do this is a recursive computation:
currTerm = currTerm*(x/i);
An astute reader will also recognize the use of brackets here, to reduce the chance of round-
off error.
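With this recursive update, the computing section of the script may be rewritten as the following sketch (x and n are assumed to be defined in the input section, as before):

```matlab
%% Calculation of e^x (recursive terms)
expVal = 1.0;
currTerm = 1.0;
for i = 1:n
    currTerm = currTerm*(x/i);   % obtain i-th term from (i-1)-th term
    expVal = expVal + currTerm;
end
```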
v = [ x/1  x/2  x/3  ⋯  x/n ]
The MATLAB function cumprod (see Appendix A for a MATLAB primer) computes the cumulative product. Thus, the first element is the element itself, the second is the product of the first two elements, the third is the product of the first three elements, and so on:
cumprod(v) = [ x/1  x^2/(1·2)  x^3/(1·2·3)  ⋯  x^n/n! ]
The term in the inner brackets, x./[1:n], generates the vector v. Note the use of the element-wise division operator “./”, a unique feature of MATLAB's matrix capabilities. The result of cumprod contains all the terms of the series expansion except the leading 1.
As an exercise, compute the Maclaurin series expansions of
sin(x) = x − x^3/3! + x^5/5! − ⋯

cos(x) = 1 − x^2/2! + x^4/4! − ⋯
cumprod(−v) = [ −x/1  x^2/(1·2)  −x^3/(1·2·3)  ⋯  (−1)^n x^n/n! ]
x^(i+1) = (1/2) ( x^(i) + 2/x^(i) )
%% Initial guess
x = 2;
%% Iterations
for i = 1:10
    x(i+1) = (x(i) + 2/x(i))/2;
end
%% Display results
disp(['Herons Square Root: ', num2str(x(end))]);
>> disp(x)
2.0000 1.5000 1.4167 1.4142 1.4142 1.4142
Starting with an initial guess of 2.0, the value of x rapidly converges to 1.4142 in the
fifth iteration. Taking the difference of the consecutive terms of x gives us
>> disp(diff(x));
-0.50 -0.0833 -0.0025 -2.124e-6 -1.595e-12 0
The fourth iteration result of 1.4142 differs from the previous by 2.124 × 10−6. Often,
this is considered as sufficiently accurate for a numerical technique.
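The same recursion computes √a for any positive a by replacing 2 with a in the update. A sketch (the value of a, the initial guess, and the iteration count are illustrative):

```matlab
a = 7;                      % number whose square root is sought
x = a;                      % initial guess
for i = 1:20
    x = (x + a/x)/2;        % Heron's update
end
disp(x - sqrt(a));          % difference from MATLAB's built-in sqrt
```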
1.2.5 Section Recap
We started this section with a quick view of the structure of a typical MATLAB code in
Figure 1.2. The importance of creating a good skeletal structure for MATLAB codes was
discussed. Sectioning and comments are very useful in this aspect to keep the code clean,
readable, and easy to debug. The importance of having clear demarcation between (1)
input/parameter definition, (2) computation, and (3) output/wrap-up sections in a code
cannot be emphasized enough. It is far too common to see programmers define their model
parameters at multiple points in a code. This was emphasized by using input-computation-output sectioning in even the smallest of projects (see Example 1.1 or Example 1.3). Example 1.2 was
used to highlight the use of a function. Unlike a project driver script file, a function gets its
inputs as input arguments, and returns output arguments. Thus, most functions primarily
have a computation section.
The choice of variable names used is also important. When the context was clear,
variable names x or i were used. However, it is best to have more descriptive names,
such as currTerm to represent “current term” and expVal to represent “value of
exponential.” Also note capitalization of the first letter of an English word in the vari-
able name to improve readability. A variable currTerm is more easily readable than
currterm.
Finally, as users will see later in this book, the use of global for sharing variables
between functions or workspaces is avoided. We pay attention to variable scoping and treat
workspaces as sacrosanct.
DEFINITION: ERRORS
The true value of a variable x is represented as x*, while the current approximation is x^(i). The absolute (true) error is defined as

E = x* − x^(i)

and the relative (true) error as

ε = (x* − x^(i)) / x*
Typically, the true solution is not known a priori. Hence, the following two approximation errors are defined, based on the difference between subsequent approximations of the solution:

e = x^(i+1) − x^(i)  and  ε = (x^(i+1) − x^(i)) / x^(i)
EXAMPLE
With these definitions, the absolute error in the calculation of e^2 using the Maclaurin series through the x^10 term is

E = 6.139 × 10⁻⁵
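The absolute error quoted above can be reproduced with a short sketch:

```matlab
x = 2; n = 10;
approxVal = 1 + sum(cumprod(x./(1:n)));   % 10-term Maclaurin series
E = abs(exp(x) - approxVal);              % absolute (true) error, ~6.14e-5
relE = E/exp(x);                          % relative (true) error
```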
Consider the fourth iteration value in Héron’s algorithm of 1.4142 (x(5) in Example 1.4,
since x(1) is the initial guess). The true and approximation errors are
The error occurs due to two reasons: (1) truncation of the infinite series to a finite number of terms and (2) finite precision representation, leading to round-off.
1.3.1 Machine Precision
A computer is a finite precision machine. While integers are represented exactly, real number representation is inexact: real numbers are stored using a finite number of bits (binary digits) in floating-point representation.
Consider two real numbers r₁, r₂ ∈ R; there are infinitely many real numbers between r₁ and r₂. In a computer's representation, however, only a finite number of real numbers exist between r₁ and r₂. To understand this concept better, let us consider an
example of decimal floating-point representation of real numbers. Consider that our “mind-
computer” represents a positive real number using six digits: five decimal digits and a single
signed decimal exponent. The five decimal digits form mantissa m, whereas the sixth digit
is the signed exponent. The floating-point representation of this number may be written as
0.xxxxx × 10^(e−5)
The mantissa lies between 1/base and 1 (base = 10 in the decimal system), that is, m ∈ [0.1, 1). The exponent digit e also takes values between 0 and 9; thus, with the bias of −5 shown above, the overall exponent ranges from 10⁻⁵ to 10⁴. The number 314.27 is represented as

0.31427 × 10^(8−5)

and the next number that can be represented is

0.31428 × 10^(8−5)
The decimal computer cannot represent any other number between the two numbers, 314.27
and 314.28. With the five-digit mantissa, the number 314.2712 cannot be distinguished from
314.27. Thus, the last two digits get chopped off as a consequence of finite precision represen-
tation of a number. In summary, the floating-point representation is of the form
m × b^(e−F)    (1.6)
where
m is the mantissa, with 1/b ≤ m < 1
e is the exponent
b is the base (2 in binary; 10 in decimal)
F is the exponent bias (fixed for a computer)
Thus, real number representation in a computer has a “least count.” This least count, when
normalized, depends on the mantissa and is known as machine precision.
MATLAB uses double precision numbers with a word length of 64 bits: 1 sign bit, 52 mantissa bits, and 11 exponent bits. The machine precision in MATLAB is obtained using the keyword eps; its value is 2⁻⁵² ≈ 2.2204 × 10⁻¹⁶. Consider the following statements typed at the command prompt:
>> x = 1+2^-52;
>> disp(x-1.0)
2.2204e-16
>> x = 1+2^-53;
>> disp(x-1.0)
0
The second result is a consequence of the “least count” in MATLAB. Example 1.5 below contains a code to recursively compute the machine precision in MATLAB. Starting with an initial value of ϵ = 1, the following procedure is adopted: the value of ϵ is halved; (1 + ϵ) is computed and compared with 1. If the two cannot be distinguished, ϵ has fallen below the machine precision; otherwise, the procedure is repeated until 1 and (1 + ϵ) become indistinguishable.
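A sketch of this recursive procedure:

```matlab
%% Recursive estimate of machine precision
epsVal = 1;
while (1 + epsVal/2) > 1
    epsVal = epsVal/2;      % halve until 1 + eps/2 is indistinguishable from 1
end
disp(epsVal);               % matches MATLAB's built-in eps (2^-52)
```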
1.3.2 Round-Off Error
The finite precision of a computer leads to round-off errors, which result from the inexact representation of real numbers. Note that a computer chops off the additional digits (thus, 314.2712 and 314.2772 are both represented as 314.27). Continuing
with the example of six-digit decimal floating-point representation, the following example
highlights the origins of round-off errors. The values of x satisfying the quadratic equation x^2 − bx + c = 0 are
x = ( b ± √(b^2 − 4c) ) / 2
p = √( (0.97e2)^2 − (0.4e1) ) = √( 0.94090e4 − 0.00040e4 )
The digits of the mantissa are shifted during subtraction so that the two numbers have the same exponent. If one of the numbers is much smaller than the other, precision is lost. In this example, p = √(0.94050e4) = 0.96979e2.
The value of p is obtained accurately to five significant digits. The two solutions therefore are
x = 0.5 (0.97000e2 ± 0.96979e2) = 0.96989 × 10²  or  0.10500 × 10⁻¹
Note the second solution: the true value is 0.01031, whereas the computed value is 0.01050. This is because the values of b and p are fairly close to each other; chopping off the trailing digits of p before the subtraction results in a rather large round-off error.
An alternative formula for the second solution is
x = 2c / ( b + √(b^2 − 4c) )
Thus, avoiding subtraction between two close numbers results in better error behavior.
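The two formulas can be compared in MATLAB using values consistent with the worked numbers above (b = 97 and c = 1 are inferred from b^2 = 0.94090e4 and 4c = 0.00040e4). In MATLAB's double precision the cancellation is much milder than in the five-digit “mind-computer,” but the alternative formula remains the safer choice:

```matlab
b = 97; c = 1;
p = sqrt(b^2 - 4*c);
xSub = (b - p)/2;         % subtraction of two close numbers
xAlt = 2*c/(b + p);       % cancellation-free alternative
disp([xSub, xAlt]);       % both near 0.01031 in double precision
```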
The above example demonstrates how machine precision gives rise to round-off errors. The computer-stored value x̂ is related to the true value x as

x̂ = x (1 + ε_mach)
As we shall see in Section 1.3.4, subtraction of two very close numbers results in errors in
numerical differentiation. Before that, we will introduce the other type of error, truncation error.
1.3.3 Truncation Error
The Taylor series expansion is used, either directly or indirectly, in the derivation of a large number of the numerical algorithms described in this text. A finite number of terms of the series is retained (e.g., 10 terms were used in computing e^2). The error resulting from truncating an infinite series in deriving a numerical approximation is known as the truncation error. The nth-order error term is
Eₙ = ( h^(n+1) / (n+1)! ) f^(n+1)(ξ),  where ξ ∈ [a, (a + h)]

which is of order O(h^(n+1)).
The last term in both equations indicates how the truncation error of the corresponding formula depends on the step-size h. Consider the numerical example below, which illustrates truncation error and order of accuracy.
∫₀ʰ dx / (1 + x^2)

The true solution is tan⁻¹(h). The aim is to compute the errors in the numerical integration.
Solution: The code for calculating the numerical integral is given below. The integral is computed for four different values of the step-size h. Although the first instinct is to use a for loop to loop over the different values of h, we will exploit the powerful array operations in MATLAB to write an efficient code (the step-size values and the Simpson's rule lines shown here are reconstructed to match the reported error ranges):

%% Input parameters
h = logspace(-3, 0, 4)';       % four step sizes spanning three decades
f1 = 1;                        % f(a) = f(0)
%% Trapezoidal Rule
trueInt = atan(h);             % True values of integral
f2 = 1./(1+h.^2);              % f(a+h)
trapInt = h/2.*(f1+f2);
trapErr = abs(trueInt-trapInt);
%% Simpson's 1/3rd Rule
fmid = 1./(1+(h/2).^2);        % f(a+h/2)
simpInt = h/6.*(f1+4*fmid+f2);
simpErr = abs(trueInt-simpInt);
%% Plotting results
loglog(h,trapErr,'-b', h,simpErr,'--r');
xlabel('step size, h'); ylabel('error');
The atan function returns a vector of the same size as vector h, wherein
each element is tan⁻¹(⋅) of the corresponding element of h. The next computation
obtains f(a + h) = f(h) for all values of h. Notice the use of element-wise division
(./) and power (.^) operators.
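The same experiment can be sketched outside MATLAB as well. The NumPy code below is an illustrative translation (the step-size range is an assumption); fitting the slopes of the log-log error curves recovers the orders of the two rules:

```python
import numpy as np

h = np.logspace(-3, 0, 4)             # Step sizes (assumed range)

true_int = np.arctan(h)               # Integral of 1/(1+x^2) over [0, h]
f1 = 1.0                              # f(0)
f2 = 1.0 / (1.0 + h**2)               # f(h)
fm = 1.0 / (1.0 + (h / 2)**2)         # f(h/2), midpoint for Simpson's rule

trap_err = np.abs(true_int - h / 2 * (f1 + f2))
simp_err = np.abs(true_int - h / 6 * (f1 + 4 * fm + f2))

# Slopes of the log-log error curves approximate the order of the
# local truncation error of each rule
trap_slope = np.polyfit(np.log10(h), np.log10(trap_err), 1)[0]
simp_slope = np.polyfit(np.log10(h), np.log10(simp_err), 1)[0]
print(trap_slope, simp_slope)   # roughly 3 and 5
```

Note that a single application of Simpson's 1/3rd rule over [0, h] uses the midpoint h/2, giving the weights (1, 4, 1) with the factor h/6.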
Figure 1.3 shows a log-log plot of the error in the numerical computation of the integral
vs. step-size h. Both curves are approximately straight lines. The error is lower
with Simpson’s 1/3rd rule and falls faster with step-size compared to the trapezoi-
dal rule. The step-size falls by three orders of magnitude. Corresponding error
using the trapezoidal rule falls by nine orders of magnitude (from ~10−1 to ~10−10).
Similarly, error in Simpson’s 1/3rd rule falls by 15 orders of magnitude (from ~10−1
to ~10−16).
Alternatively, one can plot log(E) vs. log(h) on a linear plot. The straight line for the
trapezoidal rule has a slope of approximately 3, whereas that for Simpson's 1/3rd rule is 5.
In other words, log(e_trap) ~ 3 log(h) = log(h³) and log(e_simp) ~ 5 log(h) = log(h⁵). Note that
this corresponds to the order of the error term in Equations 1.6 and 1.7.
18 ◾ Computational Techniques for Process Simulation and Analysis Using MATLAB®
FIGURE 1.3 Truncation error vs. step-size for single application of the trapezoidal (solid) and
Simpson's 1/3rd rules (dashed).
f′(x)|_(x_i) = [f(x_i + h) − f(x_i)]/h + O(h)   (1.10)

f′(x)|_(x_i) = [f(x_i + h) − f(x_i − h)]/(2h) + O(h²)   (1.11)
Equation 1.10 is known as the forward difference approximation, whereas Equation 1.11 is the
central difference. Refer to Appendix B for more details on numerical derivatives.
FIGURE 1.4 Effect of step-size on error in computing numerical derivative using forward and cen-
tral difference formulae.
Not only is the error in the central difference formula lower than that of the forward
difference, but reducing the step-size by two orders of magnitude improves the error by
four orders of magnitude: a consequence of the error being proportional to h² for central
differences.
Figure 1.4 shows the error vs. step-size plot (on a log-log scale) using forward and
central difference formulae to compute the numerical derivative of tan−1(x). The fol-
lowing code was used for this purpose:
%% Setup
x = 1;                   % Point of differentiation (assumed)
trueVal = 1/(1+x^2);     % True derivative of atan(x)
%% Forward difference
hFwd = 10.^[-9:-3];
df_fwd = (atan(x+hFwd)-atan(x)) ./ hFwd;
errFwd = abs(df_fwd-trueVal);
%% Central difference
hCtr = 10.^[-7:-2];
df_ctr = (atan(x+hCtr)-atan(x-hCtr)) ./ (2*hCtr);
errCtr = abs(df_ctr-trueVal);
%% Plotting
loglog(hFwd,errFwd,'-b',hCtr,errCtr,'--r');
xlabel('step size, h'); ylabel('error');
The above example highlights the trade-off between truncation and round-off errors.
The truncation error decreases as step-size is reduced, whereas the round-off error
increases at very low step-sizes. Thus, there is an optimal step-size at which the net error is minimized:
h_opt^fwd = ε_mach^(1/2)   and   h_opt^ctr = ε_mach^(1/3)   (1.12)
where ε_mach is the machine precision. Therefore, the optimal values of step-sizes for forward
and central difference formulae are ~10⁻⁸ and ~10⁻⁵, respectively. When the step-size is
greater than these respective values, the slopes of the e vs. h curves are 1 and 2, respectively, for the
two formulae. Note that this slope corresponds to the order of accuracy, O(h) and O(h²),
respectively.
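Equation 1.12 can be probed numerically. The NumPy sketch below (the point x = 1 is an assumption) scans step-sizes and locates the minimum total error for each formula:

```python
import numpy as np

eps_mach = np.finfo(float).eps        # ~2.2e-16 for double precision

x = 1.0
true_val = 1.0 / (1.0 + x**2)         # Exact derivative of atan(x) at x

h = np.logspace(-12, -1, 23)          # Step sizes, half a decade apart
err_fwd = np.abs((np.arctan(x + h) - np.arctan(x)) / h - true_val)
err_ctr = np.abs((np.arctan(x + h) - np.arctan(x - h)) / (2 * h) - true_val)

# Step sizes minimizing the net (truncation + round-off) error
h_opt_fwd = h[np.argmin(err_fwd)]
h_opt_ctr = h[np.argmin(err_ctr)]
print(h_opt_fwd, h_opt_ctr)   # near eps_mach**(1/2) and eps_mach**(1/3)
```

The located optima should sit close to √ε_mach ≈ 10⁻⁸ for the forward formula and ε_mach^(1/3) ≈ 10⁻⁵ for the central formula, consistent with Equation 1.12.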
1.4 ERROR ANALYSIS
A basic understanding of some key numerical issues is important to make an appropriate
choice of numerical method used to solve process simulation problems. Some of these con-
cepts are introduced presently. Numerical methods broadly work in three ways. First is the
direct way of computation, such as approximation of e2 from truncated Taylor’s series or
numerical derivatives shown above. Second is an iterative way, such as Héron's method
for computing √2. Third is a step-wise recursive method, such as multiple applications of
Newton-Cotes integration formulae (see Appendix E for details). Solutions of differential
equations (see Chapter 3, for example), optimization, etc., also fall under this category.
Error analysis introduced in this section is applicable to iterative and recursive methods.
A general introduction to some key concepts will be presented. As defined before, x(i) is the
current numerical solution, and x ∗ is the true solution. We can write x(i) = x ∗ + E(i), where E(i)
is the error. As we have seen in the examples before, the numerical solution changes with
the step-size and/or number of recursions. The next value x(i + 1) is computed as some func-
tion of x(i):
x^(i+1) = f(x^(i)) = f(x* + E^(i))
Thus, it is possible to write
E^(i+1) = F(E^(i); x*, x^(i))
indicating that the error depends on the previous error (as also on the true solution and
numerical solution). Issues regarding behavior of the numerical solution x(i) and error E(i)
are discussed.
A numerical method is said to be convergent if the numerical
solution x(i) approaches the true solution x∗. Stability is a related property. The error E(i) is
a result of incorrect initial conditions and round-off errors. A method is said to be stable if
the effect of these errors does not grow with the application of the numerical method. In other
words, E(i) should decrease (or at least not increase in unbounded manner) as i increases.
An important consideration in choosing a numerical method is its convergence behav-
ior. All the examples considered so far were stable and convergent. The next consideration
is “how fast does the numerical method converge.” The order of convergence of an iterative
numerical method is said to be n if
E^(i+1) = κ [E^(i)]ⁿ
Consider the errors in Example 1.4. Subsequent iterations of Héron’s algorithm result in the
following errors:
e^(i) = {5 × 10⁻¹, 8.33 × 10⁻², 2.5 × 10⁻³, 2.12 × 10⁻⁶, 1.6 × 10⁻¹², 0}
It can be inferred that the algorithm has a quadratic order of convergence, that is, n = 2.
Besides, the constant κ is also important. For example, when n = 1 (linearly convergent
method), the numerical method is convergent if |κ| < 1.
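Héron's iteration for √2, x_(k+1) = (x_k + 2/x_k)/2, reproduces this behavior; the Python sketch below (starting guess x0 = 2 is an assumption) checks that E^(i+1)/[E^(i)]² stays roughly constant:

```python
import math

x = 2.0                         # Initial guess (assumed)
root = math.sqrt(2.0)

errors = []
for _ in range(6):
    errors.append(abs(x - root))
    x = 0.5 * (x + 2.0 / x)     # Heron's iteration for sqrt(2)

# For quadratic convergence, E(i+1) ~ k * [E(i)]^2, so the ratio
# E(i+1)/E(i)^2 should remain roughly constant (here k ~ 1/(2*sqrt(2)))
ratios = [errors[i + 1] / errors[i]**2 for i in range(4)]
print(errors)
print(ratios)
```

The near-constant ratio confirms n = 2, with κ ≈ 1/(2√2) ≈ 0.35 for this particular iteration.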
The terms stability and convergence have different meanings. Stability refers to whether
or not errors grow with iterations or recursions, whereas convergence means whether the
numerical solution reaches the true solution.
Trapezoidal rule:  ∫ₐᵇ f(x) dx = (h/2)[f1 + 2f2 + 2f3 + ⋯ + 2fn + f(n+1)]   (1.13)

Simpson's 1/3rd rule:  ∫ₐᵇ f(x) dx = (h/3)[f1 + 4f2 + 2f3 + 4f4 + ⋯ + 4fn + f(n+1)]   (1.14)
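Equations 1.13 and 1.14 translate directly into code. The Python functions below are an illustrative sketch (function names are my own), with n intervals of width h:

```python
import numpy as np

def trapezoid(f, a, b, n):
    """Composite trapezoidal rule, Equation 1.13, with n intervals."""
    x = np.linspace(a, b, n + 1)
    fx = f(x)
    h = (b - a) / n
    return h / 2 * (fx[0] + 2 * fx[1:-1].sum() + fx[-1])

def simpson(f, a, b, n):
    """Composite Simpson's 1/3rd rule, Equation 1.14; n must be even."""
    if n % 2:
        raise ValueError("n must be even for Simpson's rule")
    x = np.linspace(a, b, n + 1)
    fx = f(x)
    h = (b - a) / n
    return h / 3 * (fx[0] + 4 * fx[1:-1:2].sum() + 2 * fx[2:-1:2].sum() + fx[-1])

# Example: integral of 1/(1+x^2) over [0, 1] equals atan(1)
err = abs(simpson(lambda x: 1 / (1 + x**2), 0.0, 1.0, 100) - np.pi / 4)
print(err < 1e-8)   # True: truncation error is tiny for h = 0.01
```

Note that Simpson's rule requires an even number of intervals, since each application spans a pair of adjacent intervals.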
V = [F0/(kC0^m)] ∫₀ᵃ dX/r(X)   (1.15)
where X is the conversion, F0 is the inlet flowrate, and the reaction rate is given by kC0^m r(X).
The design problem is to find the volume V of the PFR for the desired conversion, a.
V = 0.5 ∫₀^0.8 dX/(1 − X)^1.25
%% Display results (tail of the script; trueVal, V_trap, V_simp, b, and N
%% are computed in the elided portion above)
trapErr = abs(trueVal-V_trap);
simpErr = abs(trueVal-V_simp);
loglog(b./N,[trapErr;simpErr]);
Figure 1.5 shows the effect of step-size on error for the two methods. The step-size h is
decreased by two orders of magnitude. The error using trapezoidal rule reduces from
10−2 to 10−6, that is, by four orders of magnitude, whereas the error using Simpson’s
rule decreases by eight orders of magnitude.
FIGURE 1.5 Error analysis for the trapezoidal and Simpson's 1/3rd rules.
Example 1.9 shows the application of the numerical integration technique to solve a PFR
design problem. The solution obtained is accurate even with a reasonably small number
of intervals used. The main intention of this example was, however, to demonstrate error
analysis for the integration formulae. According to Equation 1.8, the truncation errors in
the trapezoidal rule are O(h³). However, Figure 1.5 shows that the error vs. step-size curve
has a slope of 2. This is because the global truncation error incurred in multiple applications
of the trapezoidal rule is of the order h².
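The global-order claim can be checked numerically for the PFR integral above. The Python sketch below (an illustrative trapezoid implementation; the analytical value follows from the antiderivative of (1 − X)^−1.25) confirms the O(h²) behavior:

```python
import numpy as np

def trapezoid(f, a, b, n):
    """Composite trapezoidal rule with n intervals."""
    x = np.linspace(a, b, n + 1)
    fx = f(x)
    h = (b - a) / n
    return h / 2 * (fx[0] + 2 * fx[1:-1].sum() + fx[-1])

# V = 0.5 * integral of (1-X)^-1.25 from 0 to 0.8; the antiderivative
# of (1-X)^-1.25 is 4*(1-X)^-0.25, giving the analytical volume
true_V = 0.5 * 4.0 * ((1 - 0.8)**-0.25 - 1.0)

Ns = [10, 100, 1000]
errs = [abs(0.5 * trapezoid(lambda X: (1 - X)**-1.25, 0.0, 0.8, n) - true_V)
        for n in Ns]

# Global O(h^2) behavior: shrinking h tenfold cuts the error ~100-fold
print(errs[0] / errs[1], errs[1] / errs[2])
```

Both ratios come out close to 100, i.e., a slope of 2 on the log-log plot, matching Figure 1.5.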
Equations 1.8 and 1.9 showed the local truncation error, that is, the error resulting from a single
application of the numerical technique. When multiple applications of the numerical tech-
nique are used, we encounter the local error, as well as an error resulting from the accumu-
lation of errors until the current point. The net effect of this is to typically reduce the order
of accuracy of the numerical technique. In case of the two Newton-Cotes formulae, global
truncation errors are given by
E_trap^gte ∝ h² f″(ξ),   E_simp^gte ∝ h⁴ f⁗(ξ)
1.5 OUTLOOK
Example 1.9 was our first glimpse into Process Simulation using MATLAB, where we
used numerical integration to compute the volume of the PFR. Section 1.4.2 started with
a discussion on an integration problem (Equations 1.13 and 1.14), followed by a practical
example (design of PFR), and ending with Example 1.9 showing the solution and results in
MATLAB. We will follow a similar structure in most of the chapters: First, the numerical
techniques of importance will be discussed, followed by a concise description of the prob-
lem to be simulated or analyzed, and ending with the application of those techniques to
solve the problem.
The chapters are organized based on the approach to solving the problem. In this context,
the treatment in this book differs from introductory textbooks. For example, Chapter 3
covers ODE-IVPs of nonstiff systems. On the other hand, ODE-BVPs are covered in
Chapter 7 along with elliptic PDEs, since both involve systems that are diffusive in one
or more dimensions. Solutions of stiff ODEs and DAEs, which require a similar set of
tools for simulation and analysis, are covered in Chapter 8. Numerical techniques
that are more generic in nature, such as numerical differentiation, integration, and solving
linear equations, are covered in the Appendices.
PART I
Dynamic Simulations and Linear Analysis
CHAPTER 2
Linear Algebra
2.1 INTRODUCTION
Linear algebra is typically introduced in an introductory engineering mathematics course
in the context of solving a set of n linear equations in n unknowns. Consider the following
example:
x + 2y = 1
x − y = 4   (2.1)

which can be written in the matrix form Ax = b:

[1 2; 1 −1] [x; y] = [1; 4]   (2.2)
The geometric interpretation that is commonly discussed in introductory courses is that the
solution (3, − 1) is the point of intersection of the two lines. This is a useful way of introduc-
ing the problem of solving linear equations. However, as explained by Strang in his Linear
Algebra text, a more powerful and versatile visualization is through vector spaces.
Before moving to introduce vectors and vector spaces, I will very briefly review some
results in the solution of a system of linear equations.
27
x = [x1; x2; ⋯; xn]   (2.4)
There may be n such equations in n unknowns. All such equations are then written together
and converted into the standard form Ax = b.
Here, the boldface small letters are used to represent vectors. Vectors can be both row and
column vectors. However, for consistency of notation, we will write vectors as n × 1 column
vectors (unless specified otherwise). The vectors x and b in the above equation are com-
posed of n elements. Each element, represented as nonbold characters with a subscript, is a
real scalar. We will define vectors and vector spaces in the next section.
The view in Equation 2.5 is a row-wise view of linear equations. The necessary and
sufficient condition for the set of linear equations (2.6) to have a unique solution is that the
determinant of A should be nonzero. An equivalent statement is that a unique solution to
Equation 2.6 exists if and only if rank(A) = n. If this condition is met, the inverse of A exists and
the solution is given by
x = A⁻¹b   (2.7)
Numerical methods to solve linear equations are not discussed in this chapter. Instead, the
Gauss Elimination and LU decomposition methods are provided in Appendix C for ready
reference. Relevant MATLAB® commands are summarized in Section 2.5 at the end of this
chapter.
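As an illustration, the 2 × 2 system of Equation 2.2 can be solved numerically. The NumPy sketch below mirrors Equation 2.7, though in practice one solves the system directly rather than forming A⁻¹ explicitly:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [1.0, -1.0]])
b = np.array([1.0, 4.0])

# A unique solution exists iff det(A) != 0, i.e., rank(A) = n
assert np.linalg.matrix_rank(A) == 2

# Prefer a linear solver over computing the inverse explicitly
x = np.linalg.solve(A, b)
print(x)   # the intersection point (3, -1)
```

In MATLAB, the same solution is obtained with the backslash operator, x = A\b.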
2.1.2 Overview
While solving linear equations is one application, in general linear algebra finds wide-
spread use in engineering (as also in science, economics, and other fields). Theories are
often well developed for linear systems, and detailed analysis can provide insights into sys-
tem behavior. It is common to linearize a nonlinear system using Taylor’s series expansion;
the properties of the linearized system give a good qualitative picture of the behavior of the
original nonlinear system. For example, insights into stability, controllability, and dynam-
ics can be obtained by linear system analysis. Nonlinear equations and optimization prob-
lems may be solved by successively linearizing the system. Linear algebra is also useful for
parameter estimation using least squares approach.
The discussion in this chapter will be cast in one of the following three ways. All these are
equivalent but are typically seen in different types of problems. First, the problem is cast in
terms of m linear correlations between n elements of a vector:

Ax = b   (2.8)
When m = n, this is the same as Equation 2.6 encountered in solving linear equations. In least
squares parameter estimation problems, the number of rows is greater, that is, m > n. In
either case, one intends to solve the problem to obtain x given the right-hand side vector b.
The second case is where a linear correlation exists between an input vector, x, and an out-
put vector, y. Such a linear mapping is mathematically represented as matrix multiplication:

y = Ax   (2.9)
The structure of matrix multiplication, Ax, is the same in both Equations 2.8 and 2.9.
However, in this problem, we are interested in analyzing (and perhaps predicting) how the
output responds to changes in the input vector x.
The third case is seen as solving a linear system of differential or difference equations:

dy/dt = Ay   or   y^(k+1) = Ay^(k)   (2.10)
Unlike the second case, here the n-dimensional y-space maps onto itself.
The three problems are somewhat equivalent numerically and require a similar reper-
toire of tools for analysis. However, I will bring out different practical significance for the
three problems so that these tools may be interpreted in relation to engineering problems
of interest.
We need to take a column-wise view of the linear equations (2.8), (2.9), or (2.10). This
calls for definition of vector spaces, which will be covered in Section 2.2. Solution of lin-
ear equations will be interpreted in that context. Thereafter, Sections 2.3 and 2.4 introduce
singular value and eigenvalue decompositions. We will thus build some background in this
chapter, which will then be used in some of the subsequent discussions in this book.
In summary, linear algebra is a versatile and practically relevant field. It is therefore apt
to start this book with a brief discussion on linear algebra. My treatment in this chapter is
inspired heavily by the online course and textbook by Prof. Gilbert Strang.
2.2 VECTOR SPACES
2.2.1 Definition and Properties
Let us first define a vector. A vector may be defined as an ordered list of scalars. A vector
x ∈ R n is a column vector with n elements (real numbers). Geometrically, the vector x ∈ R n
is a point in n-dimensional Euclidean space.
We visualize a vector in the form of a line starting from the origin toward this point in
space, which we often associate with a certain magnitude and direction. The term “magni-
tude” may be generalized in the form of the vector norm: A norm is a positive scalar that is indica-
tive of the size of the vector. For example, the two-norm of a vector is what we associate as
the Euclidean distance:

‖x‖2 = √(x1² + x2² + ⋯ + xn²)   (2.11)
A vector space consists of a set of vectors, along with rules of addition, and scalar
multiplication:
x + y = [x1 + y1; x2 + y2; ⋯; xn + yn],   cx = [cx1; cx2; ⋯; cxn]
Based on the above definition of vector space, several rules apply since a vector space is a
linear space. These rules are well known to the reader as commutative and associative laws
for addition
x + y = y + x,   (x + y) + z = x + (y + z)   (2.12)
and the distributive laws for scalar multiplication

c(x + y) = cx + cy,   (c1 + c2)x = c1x + c2x   (2.13)
and that there exists a unique zero vector in the vector space, such that
x + 0 = x,   x + (−x) = 0   (2.14)
With these definitions of vectors and vector spaces in place, let us revisit Equation 2.2. An
alternative interpretation uses the concept of vector spaces just defined:
v1 x + v2 y = b   (2.15)

where v1 = [1; 1] and v2 = [2; −1] are the column vectors that constitute matrix A. In other
words, vi is the ith column of the matrix. Thus, the aim of solving linear equations can now
be expressed as finding the values of “coefficients” x and y such that the vector b on the
right-hand side is a linear combination of the vectors v1 and v2. As shown in Figure 2.1a, this
is like “completing a parallelogram” that we studied in high school.
Consider, instead, if the second equation changed so that the two linear equations are
x + 2y = 1
2x + 4y = 4
then the system of equations does not have a solution. The two column vectors in this
case are

w1 = [1; 2]   and   w2 = [2; 4]
These two vectors lie on the same line from the origin. In fact, w2 = 2w1. As shown in
Figure 2.1b, since the vector b does not lie along the same line, there is no solution. If instead,
the second equation was changed so that
x + 2y = 1
2x + 4y = 2
FIGURE 2.1 Demonstration of case with (a) unique solution, (b) no solution, and (c) infinite solu-
tions. The two thick lines represent the two column vectors of matrix A and the symbol represents
the right-hand side vector, b or d.
there will be infinitely many solutions. This is because for this case, the three vectors
w1 = [1; 2],   w2 = [2; 4],   d = [1; 2]
lie along the same line from the origin. This is shown in Figure 2.1c. It is easy to see that two
solutions are (1, 0) and (0, 0.5). Likewise, infinitely many solutions for the two equations
can be obtained.
As will be discussed presently, the three geometric views in Figure 2.1 are generalizable
and powerful ways to introduce key linear algebra concepts. The utility of linear equation
solving in this chapter was limited to introducing geometrically the concept of vector spaces.
I will not discuss linear equation solving further, but use this visualization to understand
some key concepts.
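The three cases in Figure 2.1 can also be told apart numerically by comparing rank(A) with the rank of the augmented matrix [A b]. The sketch below (the classify helper is hypothetical) applies this test to the three systems above:

```python
import numpy as np

def classify(A, b):
    """Unique / no / infinite solutions of Ax = b via rank tests."""
    rA = np.linalg.matrix_rank(A)
    rAb = np.linalg.matrix_rank(np.column_stack([A, b]))
    n = A.shape[1]
    if rA == rAb == n:
        return "unique"
    if rA < rAb:
        return "none"          # b not in the column space of A
    return "infinite"          # b in column space, columns dependent

A1 = np.array([[1., 2.], [1., -1.]]);  b1 = np.array([1., 4.])
A2 = np.array([[1., 2.], [2., 4.]]);   b2 = np.array([1., 4.])
A3 = np.array([[1., 2.], [2., 4.]]);   d  = np.array([1., 2.])
print(classify(A1, b1), classify(A2, b2), classify(A3, d))
# unique none infinite
```

This is the algebraic counterpart of the geometric picture: the rank of A counts the independent column vectors, and the augmented rank tests whether b lies in their span.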
A set of vectors {v1, v2, …} is linearly independent if the only solution of

c1v1 + c2v2 + ⋯ = 0   (2.17)

is c1 = c2 = ⋯ = 0. In other words, a set of vectors is linearly dependent if one (or more) of the vectors can be
represented as a linear combination of the remaining vectors. Thus, the original vectors
v1 = [1; 1]   and   v2 = [2; −1]
from Figure 2.1a are linearly independent because neither of them can be expressed as a
linear combination of the other. Considering Equation 2.17, the only way they can sum to
0 is if both c1 and c2 are zero. On the other hand, the vectors

w1 = [1; 2]   and   w2 = [2; 4]

are linearly dependent, since w2 = 2w1.
Now we are equipped to understand span a bit better. In the second example,
span{w1, w2} is a line, whereas span{v1, v2} is the entire 2D space. The difference between
Figure 2.1b and c is that the vector b cannot be expressed as a linear combination of
w1 and w2, but there exists at least one set of scalars (in fact, there are infinitely many such
scalars) such that
d = c1 w 1 + c2 w 2
Another way of stating this is that vector d lies in span{w1, w2}, that is

d ∈ span{w1, w2}
which is said to be a subspace of the R 2 space. A subspace is not merely a subset of a vector
space; it is also a vector space itself. Thus, a line that does not pass through the origin is not
a subspace. This is because it violates the rules in (2.14)* since the zero vector, 0, does not
lie on this line.
Although the discussion above used two vectors in R 2 space, the same arguments are
applicable to higher dimensional vector spaces as well. For example, vectors
v1 = [1; 1; 1; 1]   and   v2 = [2; −1; 0; 1]
are two linearly independent vectors in R 4 space, whereas span{v1, v2} will be a 2D subspace
of R 4 space.
Consider another case where
v1 = [1; 2],   v2 = [2; −1],   v3 = [0; 1]
We had seen earlier that vectors v1 and v2 span the entire R 2 space. Hence, v3 can be
expressed as a linear combination of v1 and v2 (one can verify that v3 = 0.4v1 − 0.2v2). Thus,
span{v1, v2, v3} is still the 2D R 2 space.
Finally, I end this subsection with the definition of dimension:
Dimension of a subspace is defined as the number of linearly independent vectors that
span the subspace.
* It violates some other rules also, but it is obvious that (2.14) is violated since 0 is not on the line.
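These statements are easy to verify numerically. The sketch below checks the rank of the stacked vectors and the stated combination v3 = 0.4v1 − 0.2v2:

```python
import numpy as np

v1 = np.array([1.0, 2.0])
v2 = np.array([2.0, -1.0])
v3 = np.array([0.0, 1.0])

# v1, v2 are linearly independent: rank of [v1 v2] is 2
assert np.linalg.matrix_rank(np.column_stack([v1, v2])) == 2

# Adding v3 does not increase the rank, so span{v1, v2, v3} is still R^2
assert np.linalg.matrix_rank(np.column_stack([v1, v2, v3])) == 2

# Verify the stated linear combination
print(np.allclose(v3, 0.4 * v1 - 0.2 * v2))   # True
```

The rank of the matrix formed by stacking the vectors as columns thus gives the dimension of their span directly.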
Any vector x ∈ Rⁿ can be expressed as a linear combination of n linearly independent vectors {v1, v2, …, vn}:

x = a1v1 + a2v2 + ⋯ + anvn   (2.18)
The proof for this statement is left as an exercise. Therefore, the vectors {v1, v2, …, vn}
together can form the basis of R n. The matrix formulation
x = [v1 v2 ⋯ vn][a1; a2; ⋯; an]   (2.19)
is equivalent to Equation 2.18. This works exactly in the same manner in which Equation 2.15
was obtained from Equation 2.1. The proof for the above is also left as an exercise.
Basis vectors are therefore a set of linearly independent vectors that span the vector space.
The set of coordinate vectors
e1 = [1; 0; ⋯; 0],  e2 = [0; 1; ⋯; 0],  …,  en = [0; 0; ⋯; 1]   (2.20)

forms the natural basis of Rⁿ.
2.2.3.1 Change of Basis
Let vector x ∈ R n be represented as
x = [x1; x2; ⋯; xn]
in the natural basis. Let us say we want to change the basis to the vectors {v1, v2, …, vn}. This
is possible, since these are linearly independent vectors that span the R n space. The vector x
is represented by Equation 2.19. Since the vector in the natural basis and {v1, v2, …, vn} basis
is the same point, the following equality holds:
[e1 e2 ⋯ en][x1; x2; ⋯; xn] = [v1 v2 ⋯ vn][a1; a2; ⋯; an]   (2.21)

where the matrix on the right-hand side is denoted T.
The first matrix on the left-hand side is the identity matrix. Thus, coordinate transformation
to the new basis is given by
[a1; a2; ⋯; an] = T⁻¹[x1; x2; ⋯; xn]   (2.22)
where the coordinate transformation matrix T is an n × n matrix whose n columns are the
n basis vectors. Change of basis is an important operation that we will use in the analysis
of linear systems and stiff ordinary differential equations (ODEs) (in Chapters 5 and 8,
respectively).
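Equation 2.22 can be sketched numerically as follows (the basis vectors v1 and v2 from the earlier example are reused for illustration):

```python
import numpy as np

# New basis vectors (columns of the transformation matrix T)
v1 = np.array([1.0, 1.0])
v2 = np.array([2.0, -1.0])
T = np.column_stack([v1, v2])

x = np.array([1.0, 4.0])          # Vector in the natural basis

# Coordinates in the new basis: a = T^{-1} x  (Equation 2.22)
a = np.linalg.solve(T, x)

# Reconstruction: x = a1*v1 + a2*v2  (Equation 2.19)
print(np.allclose(a[0] * v1 + a[1] * v2, x))   # True
```

Solving with T rather than inverting it is the standard numerical practice; the reconstruction check confirms that the coordinates in the two bases describe the same point.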
y = f ( x ) (2.23)
The above is a nonlinear mapping between the inputs and outputs of the system. A linear
mapping between the inputs and outputs will be given by*
y = Ax (2.24)
* The mapping y = Ax + b is known as an affine mapping, not a linear mapping: if y1 = Ax1 + b and y2 = Ax2 + b, then

A(c1x1 + c2x2) + b ≠ c1y1 + c2y2

Although affine functions are not linear, it is easy to convert them to linear form by shifting the variables: y − b2 = A(x + b1) defines a linear mapping between the shifted variables ȳ = y − b2 and x̄ = x + b1.
Consider, for example, a steady-state blending operation in which three streams, with mass flowrates F1, F2, F3 and mass fractions ω1, ω2, ω3 of the primary component, are mixed. The overall and component mass balances are

F = F1 + F2 + F3   (2.25)
Fω = F1ω1 + F2ω2 + F3ω3   (2.26)
The input vector

x = [F1; F2; F3]
consists of the three mass flowrates. If the desired output is in terms of total mass flow
and composition of the primary component, the output vector consists of
y = [F; ω]
The first model equation (2.25) is linear, whereas Equation 2.26 is nonlinear:
ω = (F1ω1 + F2ω2 + F3ω3)/F
It is possible, however, to define the output vector as total mass flow and mass flow of
the primary component Fprim at the outlet. Thus
y = [F; F_prim]
and the linear operator that models the steady state blending operation is
y = [1 1 1; ω1 ω2 ω3] x   (2.27)

where the 2 × 3 matrix is denoted C.
The matrix C is a linear mapping between the three inlet flowrates and the net produc-
tion rates from the blending system. It is represented as
C : X → Y
Although I have stated this before, it bears repeating. The linear equation, Ax = b, was intro-
duced to motivate the concept of vectors and vector spaces. Thereafter, Section 2.2.2 intro-
duced the concept of span and linear independence; Section 2.2.3 introduced the concept
of basis; and this subsection introduced the concept of matrix as a linear operator. It would
help to “upgrade our thinking” in terms of input vector x, output vector y, and linear trans-
formation represented by the matrix A.
Null space or Kernel of a matrix is defined as the complete set of vectors x ÎX such that
their linear transformation under matrix A maps to the zero vector. Mathematically
ker(A) = {x : Ax = 0}   (2.28)
In previous subsections, we have defined the column vectors of the matrix as the m vectors
that form individual columns of the matrix. Consider the case when the two column vec-
tors, v1 and v2, were linearly independent. In such a case
[1 2; 1 −1] x = 0
has a unique solution, x = 0. Thus, when the column vectors of A are linearly independent,
the null space of the matrix is a single point, the origin.
38 ◾ Computational Techniques for Process Simulation and Analysis Using MATLAB®
The second case is when the two column vectors, w1 and w2, are linearly dependent.
The set of x that satisfies the equation
[1 2; 2 4] x = 0

is any vector of the form x = [−2a; a] for a scalar a.
The physical consequence (or interpretation) of null space is as follows: Any input set in the
null space will lose its information when acted upon by the linear transformation A.
Note that the null space or kernel is a subspace of the input space.
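Numerically, the null space can be extracted from the singular value decomposition (introduced in Section 2.3), which is essentially what MATLAB's null command does; a NumPy sketch:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [2.0, 4.0]])

# Right-singular vectors whose singular values are (numerically) zero
# span the null space of A
U, s, Vt = np.linalg.svd(A)
null_basis = Vt[s < 1e-12 * s.max()].T

print(null_basis.ravel())               # proportional to [-2, 1]/sqrt(5)
print(np.allclose(A @ null_basis, 0))   # True: these inputs map to 0
```

Any input along this direction loses its information under the transformation A, which is the physical interpretation of the null space given above.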
Image space of a matrix is the set of output vectors y Î Y that are obtained by the trans-
formation of all x ÎX through the matrix A. Stated in a different way, the image space of
matrix A is all y that satisfy
im(A) = {y | y = Ax, x ∈ X}   (2.29)
y = [1 2; 1 −1] x
y = [1 2; 2 4] x
y = x1 [1; 2] + x2 [2; 4]
Thus, the image space of this matrix is the 1D subspace along the vector w1 = [1 2]T.
The column space of a matrix is the span of its column vectors; for the two matrices above, the
column spaces are the entire R² space and the line along [1; 2], respectively. Thus, the condition
that the equation Ax = b has a solution is as follows: The vector b is in the column space of A.
Now, look at the definition of image space: It is the set of all output vectors y that can be written as y = x1a1 + x2a2, where a1 and a2 are the columns of the matrix.
Notice that the two definitions are similar since (x1, x2) or (c1, c2) are real numbers. So, what
is the difference between image space and column space?
One way to look at them is to realize that image and column space are numerically the
same: They are both the linear combinations of the column vectors of a matrix.
Another way to look at this is the physical interpretation. The term “column space” was
invoked when we discussed solving a linear equation: One or more solutions exist when
the vector on the right-hand side of Equation 2.2 lies in the column space of A. Figure 2.1
illustrates the statement “the equation Ax = b has solution(s) if the vector b lies in the column
space of matrix A.”
On the other hand, “image space” does not talk about solving linear equations; instead it
is introduced in the context of linear transformation. When the input vector x is acted upon
by a linear operation, the corresponding output vector lies in the image space. In the top
panels of Figure 2.2, the image of any arbitrary vector in X may lie anywhere in the 2D
space. So, the image space is the entire R² space.
In the bottom panels, the image space of the matrix A was im(A) = span{[1 2]ᵀ}. Note that all
the dots lie on this line. Furthermore, ker(A) = span{[−2 1]ᵀ}, implying all the open circles on the
left-bottom panel map to the origin in the Y -space on the right-bottom panel. The example
also illustrates the fact that the image space is a subspace of the output space, Y , whereas
the null space is a subspace of the input space, X .
>> disp(null(A))
-0.8944
0.4472
FIGURE 2.2 Linear transformation from X → Y for the two different A matrices. (a) The column
vectors of A matrix are linearly independent; (b) the column vectors are linearly dependent. Dots/
circles on the X-space on the left map to dots/circles on the Y-space on the right.
(where a = 1/√5). The image space can be obtained using the command orth. The image
space is also called the range of the matrix.
>> disp(orth(A))
-0.4472
-0.8944
Case 1: For the blending example of Equation 2.27 with compositions ω1 = 0, ω2 = 0.5, and ω3 = 1, the matrix is

C = [1 1 1; 0 0.5 1]
The image space, im(C), is the entire R 2 space, since it is possible to get any net and
primary component flowrates through linear combinations of F1, F2, and F3. What is
the null space, ker(C)? It is the combination of the three flowrates such that both the
net flowrate, F, and primary component flowrate, Fprim, are zero. It is easy to see that for
the primary component flow to be zero, F3 = a, F2 = −2a. If the flowrate F1 = a, we can
see that the net flowrate F = 0. Thus, the null space is given by x = [a; −2a; a].
The unit vector in that direction is therefore
>> disp(null(C))
0.4082
-0.8165
0.4082
Case 2: For this case, let us consider that all the three streams have the same compo-
sition of the primary component, that is, ω1 = ω2 = ω3 = q. It should be clear that mixing
of these three streams will lead to the outlet stream with the same composition, q.
Thus, the net flowrate will be, F = F1 + F2 + F3, and the net flowrate of the primary com-
ponent, F_prim = qF. Thus, the image space will be a 1D subspace of Y, such that

im(C) = [1; q] F
While it is possible to obtain any value of primary component flowrate, Fprim, the ratio
of this to the net flowrate (i.e., mass fraction) will always be q.
It can be verified using orth command in MATLAB that the image space is given by
im(C) = span{[1/√(1 + q²); q/√(1 + q²)]}
2.3.1 Orthonormal Vectors
Let x1 ∈ R n and x2 ∈ R n be two vectors in n-dimensional space. The dot product or the inner
product of the two vectors is
x 1, x 2 = x 1T x 2 = x T2 x 1 = x 2, x 1 (2.30)
A vector is a unit vector if its inner product with itself equals one. In the last example, the
basis for the image space was defined as

[1/√(1 + q²); q/√(1 + q²)]

which is a unit vector. A set of vectors {xi} is orthonormal if

xiᵀxj = 1 if i = j, and 0 otherwise   (2.31)
An orthogonal matrix is a square matrix whose column vectors are orthonormal to each
other. Thus, since each element of AᵀA is an inner product of two columns of A,

AAᵀ = AᵀA = I   (2.33)

Aᵀ = A⁻¹   (2.34)
The three properties, (2.32), (2.33), and (2.34), are key properties of orthogonal matrices.
Orthogonal Subspaces: Two subspaces are said to be orthogonal if any vector on one sub-
space is orthogonal to any other vector on the other subspace.
Remark: Since any vector in a subspace can be represented as a linear combination of its
basis vectors, it is sufficient to ensure that basis vectors of the two subspaces are orthogonal
to each other.
We restrict the transformation to an orthonormal basis set. The orthonormal basis set that spans the
input space forms the n columns of the input transformation matrix:
Tx = [v1 v2 ⋯ vn]   (2.35)
and the basis set that spans the output space forms the m columns of the output transforma-
tion matrix:
Ty = [u1 u2 ⋯ um]   (2.36)
T⁻¹ = Tᵀ
Recall that the relationship between the original vector x and the transformed vector x̄ (in the
new basis set in (2.35)) is

x̄ = Txᵀ x   (2.37)

and

ȳ = Tyᵀ y   (2.38)
The above expression is valid for any appropriate set of orthonormal basis vectors. The
coordinate transformation is most effective for a “special” set of basis vectors.
Consider
A = [1 2; 2 4]
for which the geometric interpretation of null and image spaces was shown in Figure 2.2b.
The image space for this matrix was im(A) = span{[1 2]ᵀ}. Let us choose this vector as
one of the basis vectors, u1, for the output space. Since we want to choose an orthonormal basis vector,

u1 = [1/√5; 2/√5]   (2.40)
u2 = [−2/√5; 1/√5]   (2.41)
é -2a/ 5a 2 ù é -2/ 5 ù
v2 = ê ú=ê ú (2.42)
ê a/ 5a 2 ú êë 1/ 5 úû
ë û
é1/ 5 ù
v1 = ê ú (2.43)
êë2/ 5 úû
é1/ 5 -2/ 5 ù
Tx = ê ú (2.44)
êë2/ 5 1/ 5 úû
é5 0ù
= TyT ATx = ê
A ú (2.45)
ë0 0û
é y ù é5x 1 ù
ê 1ú = ê ú (2.46)
êë y2 úû êë 0 úû
The above equation makes the following rules quite clear: (i) any point in the X space
gets mapped along \bar{y}_1 in the output space; (ii) \bar{y}_2 = 0 implies that no point in the X space
is mapped to this subspace; (iii) only the information in the \bar{x}_1 subspace is retained; whereas
(iv) the information along \bar{x}_2 is lost. Thus, u_1 spans the image space and u_2 the subspace
orthogonal to it; v_2 spans the null space and v_1 the subspace orthogonal to the null
space.
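The transformation can be verified numerically. The following sketch (in NumPy, standing in for the book's MATLAB) builds T_x and T_y from the basis vectors above and checks Equation 2.45:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [2.0, 4.0]])
s5 = np.sqrt(5.0)
Tx = np.array([[1/s5, -2/s5],
               [2/s5,  1/s5]])          # columns v1, v2
Ty = Tx.copy()                          # here u1 = v1 and u2 = v2

Abar = Ty.T @ A @ Tx                    # transformed matrix, Equation 2.45
print(np.allclose(Abar, [[5.0, 0.0],
                         [0.0, 0.0]]))  # True
```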
Linear Algebra ◾ 45
We are now ready to generalize the above representation to any m × n matrix A through
the concept of SVD, which states that any matrix A can be represented as
A = U\Sigma V^T   (2.47)
where
U = \begin{bmatrix} u_1 & u_2 & \cdots & u_m \end{bmatrix}   (2.48)
and
V = \begin{bmatrix} v_1 & v_2 & \cdots & v_n \end{bmatrix}   (2.49)
are orthogonal matrices. The matrix Σ is an m × n matrix such that its diagonal elements
are nonnegative (i.e., positive or zero) and all the other elements are zero. The diagonal
elements of matrix Σ are called singular values. The singular values are ordered, that is,
σ1 ≥ σ2 ≥ ⋯. The number of singular values is exactly equal to m or n, whichever is lower.
Some of these singular values can be zero.
The MATLAB command for singular value decomposition is svd.
When A is a square matrix, the SVD may be written as
A = \underbrace{\begin{bmatrix} u_1 & \cdots & u_n \end{bmatrix}}_{U} \begin{bmatrix} \sigma_1 & & \\ & \ddots & \\ & & \sigma_n \end{bmatrix} \underbrace{\begin{bmatrix} v_1 & \cdots & v_n \end{bmatrix}}_{V}{}^T   (2.50)

When m > n, the matrix \Sigma contains (m - n) additional rows of zeros:

A = \underbrace{\begin{bmatrix} u_1 & \cdots & u_m \end{bmatrix}}_{U} \underbrace{\begin{bmatrix} \sigma_1 & & \\ & \ddots & \\ & & \sigma_n \\ 0 & \cdots & 0 \\ 0 & \cdots & 0 \end{bmatrix}}_{\Sigma} \underbrace{\begin{bmatrix} v_1 & \cdots & v_n \end{bmatrix}}_{V}{}^T   (2.51)

Conversely, when m < n, \Sigma contains (n - m) additional columns of zeros:

\Sigma = \begin{bmatrix} \sigma_1 & & & 0 & \cdots & 0 \\ & \ddots & & \vdots & & \vdots \\ & & \sigma_m & 0 & \cdots & 0 \end{bmatrix}   (2.52)
Consider the matrices from Example 2.1,

C = \begin{bmatrix} 1 & 1 & 1 \\ 0 & 0.5 & 1 \end{bmatrix}, \quad C' = \begin{bmatrix} 1 & 1 & 1 \\ 0.5 & 0.5 & 0.5 \end{bmatrix}

where the null space of C was spanned by vectors of the form

x = \begin{bmatrix} a \\ -2a \\ a \end{bmatrix}
Since both the singular values of C are nonzero, the image space is the complete R^2 space.
Since there can be at most two positive singular values, the null space will be non-
trivial. As can be seen from the matrix V returned by svd, the last column spans the null
space that we obtained earlier.
A similar exercise can be done on the other matrix,

C' = \begin{bmatrix} 1 & 1 & 1 \\ 0.5 & 0.5 & 0.5 \end{bmatrix}

as well. The reader can verify that the following results agree with the discussion
regarding the image and null space of this matrix.
>> [U,S,V]=svd(A)
U =
-0.8944 -0.4472
-0.4472 0.8944
S =
1.9365 0 0
0 0 0
V =
-0.5774 -0.8165 -0.0000
-0.5774 0.4082 -0.7071
-0.5774 0.4082 0.7071
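The same results can be reproduced outside MATLAB; the sketch below (NumPy, with `Cp` standing in for C′) confirms that C′ has a single nonzero singular value and a two-dimensional null space:

```python
import numpy as np

Cp = np.array([[1.0, 1.0, 1.0],
               [0.5, 0.5, 0.5]])       # the matrix C' from the example
U, s, Vt = np.linalg.svd(Cp)

print(abs(s[0] - 1.9365) < 1e-4)       # sigma_1 = sqrt(3.75) = 1.9365
print(s[1] < 1e-12)                    # sigma_2 = 0, so rank(C') = 1
# the last two rows of Vt (i.e., the last two columns of V) span the null space:
print(np.allclose(Cp @ Vt[1:].T, 0.0, atol=1e-12))
```

Note that NumPy returns V transposed (`Vt`), whereas MATLAB's svd returns V itself; the sign conventions of the singular vectors may also differ.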
2.3.3 Condition Number
2.3.3.1 Singular Values, Rank, and Condition Number
Let us summarize the results of the discussion on SVD. Let A := R n → R m be an m × n
matrix with ℓ singular values, {σ1, …, σℓ}. Here, ℓ ≤ min(m, n). The two orthogonal matrices
may be written as
U = \begin{bmatrix} u_1 & \cdots & u_\ell & | & u_{\ell+1} & \cdots & u_m \end{bmatrix}, \quad V = \begin{bmatrix} v_1 & \cdots & v_\ell & | & v_{\ell+1} & \cdots & v_n \end{bmatrix}   (2.53)
The first \ell column vectors of U span the image space and the remaining (m − \ell) column vectors
form the subspace orthogonal to the image space. The latter subspace is "unreachable" by
linear operation through A. Similarly, the last (n − \ell) column vectors of V span the null space.
The information along the null space is lost when data is operated upon by linear operator A.
The first ℓ column vectors of V span a subspace that is orthogonal to the null space of A.
Rank of a matrix is defined as the number of linearly independent rows of a matrix.
The concept of rank is introduced in introductory linear algebra courses while discussing
whether the linear equation Ax = b has a unique solution. The concept of rank is useful for a
broader range of problems. In fact, row rank and column rank of a matrix are the number of
linearly independent rows and columns of a matrix, respectively. Thus, the column rank of
a matrix equals the number of linearly independent column vectors, which in turn, equals
the dimension of the image space of the matrix. It can be proved that row rank and column
rank of a matrix are equal (the proof is left as a student exercise).
Since im(A) = span{u1, …, uℓ}, rank is also equal to the number of nonzero singular val-
ues of A. Furthermore, if ℓ = m, the image space is the entire m-dimensional vector space of
outputs; else, it is an ℓ-dimensional subspace of the R m output space. Equivalently, if ℓ = n,
the null space of A is just the origin.
While rank is a useful criterion, it does not tell us much about the relative values of
σ1, … , σℓ. The condition number is the ratio of the largest and smallest singular values, and is
an important parameter for analyzing a linear system. Formally, the condition number is defined
as the product of the two-norms of A and A^{-1}; since the two-norm of a matrix equals its
maximum singular value, this means that

\chi(A) = \sigma_{max}(A)\,\sigma_{max}(A^{-1}) = \frac{\sigma_{max}(A)}{\sigma_{min}(A)}   (2.54)
Condition number is often invoked in the context of solving a set of linear equations.
It indicates how sensitive the solution will be to small errors in data. It is useful in linear
least squares problems (this will be dealt with in Chapter 10). It also indicates directionality
of the system. The implications of condition number in these scenarios will be discussed in
the remainder of this subsection.
A = \begin{bmatrix} 1 & 2 \\ 2 & 4.001 \end{bmatrix}
This is a classic example that is used to demonstrate the importance of condition num-
ber. The condition number can be computed in MATLAB using
>> cond(A)
ans =
2.5008e+04
which indicates a highly "ill-conditioned" matrix. The reason for this would be clear
to the reader: if the element A(2, 2) were 4, the rank of the matrix would be
1, the matrix would not be invertible, and a unique solution to the linear equation would
not exist. However, with the A matrix given in this problem, the linear equation has a
unique solution for any value of b.
When b = [1 2]T, the solution of the linear equation is
>> x = inv(A)*b
x =
1
0
However, consider a small perturbation in the right-hand side:

b = \begin{bmatrix} 1.01 \\ 2 \end{bmatrix}, \quad x = \begin{bmatrix} 41.01 \\ -20 \end{bmatrix}
The solution has changed significantly with a rather small change in b. Thus, the condition
number is an indicator of the sensitivity of the solution of linear equations to
errors in data. When faced with a highly ill-conditioned matrix, one should reformulate
the problem to avoid this issue, instead of searching for "improved" solution algorithms:
the condition number is a property of the system and not of the solution technique.
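This sensitivity is easy to reproduce. The sketch below (NumPy standing in for the book's MATLAB commands) solves the same system for b and for a slightly perturbed b:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [2.0, 4.001]])
print(np.linalg.cond(A))                 # about 2.5e+04: ill-conditioned

x  = np.linalg.solve(A, [1.0,  2.0])     # x = [1, 0]
xp = np.linalg.solve(A, [1.01, 2.0])     # x = [41.01, -20]: a 1% change in b
print(np.round(x, 3), np.round(xp, 3))
```

A 1% perturbation in one element of b changes x1 by a factor of about 40, exactly as in the example above.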
There is another part to the discussion of ill-conditioned matrices. The SVD of matrix A
from Example 2.2 gives
A = \begin{bmatrix} -0.4471 & -0.8945 \\ -0.8945 & 0.4471 \end{bmatrix} \begin{bmatrix} 5.0008 & 0 \\ 0 & 0.0002 \end{bmatrix} \begin{bmatrix} -0.4471 & -0.8945 \\ -0.8945 & 0.4471 \end{bmatrix}^T   (2.55)
The value of the solution was highly sensitive to errors when b = [1 \; 2]^T, that is,
when b was along or "close to" the vector u_1. If instead

b = \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \quad x = \begin{bmatrix} 4001.0 \\ -2000.0 \end{bmatrix}
the solution x has a significantly larger magnitude than the right-hand side b. As one
moves the value of b2 from 2.0 to 0.0, the solution has moved from x1 = 1.0 to x1 = 4001. The
smallest singular value of A becomes the largest singular value of A−1. Hence, the resulting
solution becomes very large.
A large value of solution vector is not the hallmark of an ill-conditioned matrix. The hall-
mark of an ill-conditioned matrix is high sensitivity to errors. Consider the following matrix:
A = \begin{bmatrix} 1000 & 2000 \\ 2000 & 4001 \end{bmatrix}
Note that if σi are singular values of A, (ασi) are singular values of [αA]. Thus, the condition
number of the above matrix remains the same as that in Example 2.2. The solution with this
matrix for different values of b (when b is along the direction of u1) is as follows:
b = \begin{bmatrix} 1 \\ 2 \end{bmatrix}, \quad x = \begin{bmatrix} 0.001 \\ 0 \end{bmatrix}

b = \begin{bmatrix} 1.01 \\ 2 \end{bmatrix}, \quad x = \begin{bmatrix} 0.041 \\ -0.020 \end{bmatrix}
The sensitivity of the solution to errors is the key factor. Note that again the value of x1 has
changed by a factor of 40, even though the values of x are very small.
We can use SVD to analyze this behavior. Transforming the coordinate systems to the
basis set represented by column vectors of V for the vector x and U for the vector b, we get
the following:
(U\Sigma V^T)x = b \;\Rightarrow\; \underbrace{(V^T x)}_{\bar{x}} = \Sigma^{-1}\underbrace{(U^T b)}_{\bar{b}}   (2.56)

so that the components decouple:

\bar{x}_1 = \sigma_1^{-1}\bar{b}_1, \quad \bar{x}_2 = \sigma_2^{-1}\bar{b}_2   (2.57)
For this example, \sigma_1^{-1} = 0.2 and \sigma_2^{-1} = 5000.8. When

b = \begin{bmatrix} 1 \\ 2.0004 \end{bmatrix} \;\Rightarrow\; \bar{b} = U^T b = \begin{bmatrix} -2.236 \\ 0 \end{bmatrix}

and the value of \bar{x}_1 = \sigma_1^{-1}\bar{b}_1 = -0.4471, with \bar{x}_2 = 0. In the original coordinate system, the solution is therefore

x = V\bar{x} = \begin{bmatrix} 0.2 \\ 0.4 \end{bmatrix}   (2.58)
If the transformed right-hand side is instead perturbed to

\bar{b} = \begin{bmatrix} -2.236 \\ \varepsilon \end{bmatrix}

we obtain \bar{x}_1 = -0.4471 and \bar{x}_2 = 5000.8\varepsilon. For example, when b = [1 \; 2]^T as in Example 2.2,
\varepsilon = -1.789 \times 10^{-4}, which yields \bar{x}_2 = -0.8945. Thus

\bar{x} = \begin{bmatrix} -0.4471 \\ -0.8945 \end{bmatrix} \;\Rightarrow\; x = \begin{bmatrix} 1 \\ 0 \end{bmatrix}
To summarize the two cases:

b = \begin{bmatrix} 1 \\ 2.0004 \end{bmatrix} \;\Rightarrow\; x = A^{-1}b = \begin{bmatrix} 0.2 \\ 0.4 \end{bmatrix}

b = \begin{bmatrix} 1 \\ 2 \end{bmatrix} \;\Rightarrow\; x = A^{-1}b = \begin{bmatrix} 1 \\ 0 \end{bmatrix}

In the transformed coordinates, the first case gives

\bar{b} = U^T b = \begin{bmatrix} -2.236 \\ 0 \end{bmatrix} \;\Rightarrow\; \bar{x} = \Sigma^{-1}\bar{b} = \begin{bmatrix} -0.4471 \\ 0 \end{bmatrix}

whereas the second gives

\bar{b} = U^T b = \begin{bmatrix} -2.236 \\ -0.0002 \end{bmatrix} \;\Rightarrow\; \bar{x} = \Sigma^{-1}\bar{b} = \begin{bmatrix} -0.4471 \\ -0.8945 \end{bmatrix}
The last step above is the reason for the sensitivity of the solution to errors. A very
small change in b resulted in a significant change in \bar{x}_2 from 0 to -0.8945 (which is
double the magnitude of \bar{x}_1). Therefore, the final solution changes significantly:

x = V\bar{x} = \begin{bmatrix} 0.2 \\ 0.4 \end{bmatrix} \quad \text{versus} \quad x = V\bar{x} = \begin{bmatrix} 1 \\ 0 \end{bmatrix}
The above discussion would also make it clear that varying b along the u1 direction will
result in only a proportional change in the solution. For example
b = \begin{bmatrix} 1.1 \\ 2.2004 \end{bmatrix} \;\Rightarrow\; x = A^{-1}b = \begin{bmatrix} 0.22 \\ 0.44 \end{bmatrix}
2.3.4 Directionality
The above example indicates some sort of directionality in a linear operator,
A : X \to Y, when the singular values are significantly different from each other. For a linear
operator that relates input and output spaces, Equation 2.56 may be rewritten as

y = (U\Sigma V^T)x \;\Rightarrow\; \underbrace{(U^T y)}_{\bar{y}} = \Sigma\,\underbrace{(V^T x)}_{\bar{x}}   (2.59)
so that
\bar{y}_i = \sigma_i \bar{x}_i   (2.60)
When the basis vectors of the input and output spaces are changed to the singular vectors,
this results in simply stretching or shrinking along these vectors. We used Example 2.1 to
discuss image and null spaces of a matrix. The following example takes this concept further,
expanding on the discussion in Section 2.3.3 for linear operator.
y = \begin{bmatrix} -0.25 & 1.5 \\ -2.25 & 4.25 \end{bmatrix} x   (2.61)
The condition number for the above matrix is approximately 11, which is quite good. We choose this
matrix because it is easier to visually demonstrate the significance of singular values
for a linear operation.
An alternative, geometric interpretation of condition number is as follows. Let the
vector x be a unit vector. Let us choose x randomly in the X space. If we do this mul-
tiple times, the locus of x will form a unit circle in the X space. For each of the values
of x, we compute y using the above relation (2.61). The locus of y will trace an ellipse in
the Y-space. The ratio of major and minor axes of the ellipse is the condition number.
The physical picture is shown in Figure 2.3. The circles denote conditions for major
axis of the ellipse, while the diamonds denote conditions for minor axis.
FIGURE 2.3 Interpretation of condition number. (a) A circle in the X-space maps to an ellipse in the
Y-space. (b) Condition number is interpreted as the ratio of major and minor axes of the ellipse.
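The book's MATLAB listing for this figure is not reproduced here; code along the following lines (a NumPy sketch, with the plotting calls omitted) performs the same construction and relates the ellipse to the condition number:

```python
import numpy as np

A = np.array([[-0.25, 1.50],
              [-2.25, 4.25]])

theta = np.linspace(0.0, 2.0*np.pi, 101)
X = np.vstack([np.cos(theta), np.sin(theta)])   # 101 unit vectors: a circle in the X-space
Y = A @ X                                       # their images: an ellipse in the Y-space

r = np.linalg.norm(Y, axis=0)                   # distances of the image points from the origin
print(r.max() / r.min())                        # close to cond(A), about 11
print(np.linalg.cond(A))
# plotting the columns of X and Y (e.g., with matplotlib) reproduces panels (a) and (b)
```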
The two circles on the figure correspond to points (−0.44, 0.898) and (0.44, −0.898).
Using svd command, the direction of vector v1 is found to be
v_1 = \begin{bmatrix} -0.443 \\ 0.896 \end{bmatrix}
Recalling Equation 2.59,

\underbrace{(U^T y)}_{\bar{y}} = \Sigma\,\underbrace{(V^T x)}_{\bar{x}}   (2.59)
The first point therefore corresponds to \bar{x} = [1 \; 0]^T. Consequently, in the Y-space, the
corresponding output point is \bar{y}_1 = \sigma_1. When translated back to the natural basis, this
point is plotted as a circle at (1.457, 4.807). Likewise, the other circle on Figure 2.3 is
the point (-1.457, -4.807). Note that these points in the modified basis correspond to
\bar{y} \approx \begin{bmatrix} \sigma_1 \\ 0 \end{bmatrix} \quad \text{and} \quad \bar{y} \approx \begin{bmatrix} -\sigma_1 \\ 0 \end{bmatrix}
respectively.
Generalizing the observations from Example 2.3, let us say that σℓ ≫ σℓ + 1. Then, span{v1, …, vℓ}
corresponds to the direction in the X -space where the result will be stretched after linear trans-
formation through matrix A. On the other hand, the vectors that lie in the span{vℓ + 1, …, vn}
get shrunk with this linear transformation through matrix A. A further discussion of the
effect of this directionality on system behavior will be presented in Section 5.2.3.
\frac{d}{dt}y = Ay   (2.62)

Recall that the scalar ODE

\frac{dy}{dt} = ay   (2.63)

has the solution

y = c_0 e^{at}   (2.64)
where the right-hand side is linear and differentiation is a linear operator. By analogy, since
the superposition principle holds, a solution to Equation 2.62 is
y = e^{\lambda t} c   (2.65)
where
c is an n-dimensional vector
λ is a scalar that is yet to be determined
Substituting this solution into Equation 2.62, the right-hand side becomes

\frac{d}{dt}y = Ae^{\lambda t}c   (2.66)

while differentiating the assumed solution gives

\frac{d}{dt}y = \lambda e^{\lambda t}c

Equating the two,

\lambda e^{\lambda t}c = Ae^{\lambda t}c   (2.67)
The scalar e^{\lambda t} is always nonzero and may be eliminated. This yields the
following relationship:
Ac = \lambda c   (2.68)

(A - \lambda I)c = 0   (2.69)
where λ is known as eigenvalue and c is known as eigenvector. The scalar λ is chosen such
that the matrix (A − λI) becomes rank deficient (i.e., rank(A − λI) < n). This condition gives
us the so-called characteristic equation:
det ( A - lI ) = 0 (2.70)
The eigenvector c is in the null space of (A − λI). Let us end this discussion with a summary of
key points.
Definition: For a matrix A : R^n \to R^n, there exist a scalar \lambda \in C and a nonzero vector c \in C^n such that

Ac = \lambda c
2.4.3 Eigenvalue Decomposition
Eigenvalues and eigenvectors were introduced for solving linear ODEs, y′ = Ay. The term
is reported to have origins in the German word "Eigenwert," which translates roughly to
"intrinsic value." The term eigenvalue or eigenvector may therefore imply
“an intrinsic value/vector” that characterizes the matrix. The most important application of
eigenvalues and eigenvectors is in the analysis of situations where we need to map a linear
transformation of a vector space on itself.
In Section 2.3, physical interpretation of SVD was provided for a linear transformation
y = Ax. The mapping was A: R n → R m. The input and output vector spaces could be of
different dimensions (as seen in Example 2.1) or they could be of the same dimension (when
m = n, such as Example 2.3). Naturally, SVD gave two different coordinate transformations:
the orthogonal matrix V as the input basis, the orthogonal matrix U as the output basis,
and the diagonal matrix \Sigma that maps x \in X to y \in Y.
In contrast, eigenvalues are of interest when mapping the R n vector space on itself. For
the ease of this discussion, consider the case where the matrix A has n distinct eigenvalues
(this is the more common case). It can be proved that if the eigenvalues are distinct, the
eigenvectors will be linearly independent (proof is left as an exercise). Thus, the eigenvec-
tors can form a basis for coordinate transformation*:
C = \begin{bmatrix} c_1 & c_2 & \cdots & c_n \end{bmatrix}   (2.71)

\bar{y} = C^{-1}y   (2.72)
* Standard linear algebra texts often use Ax = \lambda x to define eigenvalues and eigenvectors. Hence, they represent the transformation
matrix for eigenvectors as the basis set as X = [x_1 \; x_2 \; \cdots \; x_n]. Note that this representation and Equation 2.71 are
equivalent. Since I have used x \in X as input vectors in Section 2.3, I chose to use the notation c for eigenvectors.
† Recall that the eigenvectors may not be orthogonal.
or equivalently
y = C\bar{y}   (2.73)
Writing y as a linear combination of the eigenvectors, y = \bar{y}_1 c_1 + \bar{y}_2 c_2 + \cdots + \bar{y}_n c_n, we get

Ay = A\left(\bar{y}_1 c_1 + \bar{y}_2 c_2 + \cdots + \bar{y}_n c_n\right)   (2.75)

Since Ac_i = \lambda_i c_i for each eigenvector,

Ay = \begin{bmatrix} \lambda_1 c_1 & \lambda_2 c_2 & \cdots & \lambda_n c_n \end{bmatrix} \begin{bmatrix} \bar{y}_1 \\ \bar{y}_2 \\ \vdots \\ \bar{y}_n \end{bmatrix}   (2.77)

which may be factored as

Ay = \underbrace{\begin{bmatrix} c_1 & c_2 & \cdots & c_n \end{bmatrix}}_{C} \underbrace{\begin{bmatrix} \lambda_1 & & & \\ & \lambda_2 & & \\ & & \ddots & \\ & & & \lambda_n \end{bmatrix}}_{\Lambda} \underbrace{\begin{bmatrix} \bar{y}_1 \\ \bar{y}_2 \\ \vdots \\ \bar{y}_n \end{bmatrix}}_{\bar{y}}
Note that the order of matrices C and Λ is important. The product ΛC implies that each
element λi multiplies the ith row of matrix C, whereas the product CΛ implies that each
element λi multiplies the ith column of matrix C. Thus, the equation cannot be written as
Ay = \Lambda C\bar{y}.
Substituting (2.72), we get the following important result:
Ay = C\Lambda C^{-1}y

A = C\Lambda C^{-1}   (2.78)
Consider again the matrix

A = \begin{bmatrix} 1 & 2 \\ 2 & 4 \end{bmatrix}
which we had discussed as an example of rank-deficient matrix. The two eigenvalues are
λ1 = 0 and λ2 = 5. The corresponding eigenvectors are
c_1 = \begin{bmatrix} -2 \\ 1 \end{bmatrix}, \quad c_2 = \begin{bmatrix} 1 \\ 2 \end{bmatrix}
where the first eigenvector, c1, which corresponds to λ1 = 0, is in the null space of A.
The above information can be obtained using the eig command. The readers can verify that
the diagonal form

\begin{bmatrix} 1 & 2 \\ 2 & 4 \end{bmatrix} = \begin{bmatrix} -2 & 1 \\ 1 & 2 \end{bmatrix} \begin{bmatrix} 0 & 0 \\ 0 & 5 \end{bmatrix} \begin{bmatrix} -2 & 1 \\ 1 & 2 \end{bmatrix}^{-1}

is indeed valid.
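An equivalent check in NumPy (MATLAB's eig counterpart; note that NumPy normalizes the eigenvectors to unit length, so they are scaled versions of [-2; 1] and [1; 2]):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [2.0, 4.0]])
lam, C = np.linalg.eig(A)                 # MATLAB: [C, L] = eig(A)

print(np.round(np.sort(lam), 6))          # eigenvalues 0 and 5
# reconstruct A = C * Lambda * C^{-1}:
print(np.allclose(C @ np.diag(lam) @ np.linalg.inv(C), A))   # True
```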
The above results are also valid when eigenvalues exist in complex conjugate pairs.
2.4.4 Applications
The eigenvalue decomposition is the cornerstone of the analysis of linear dynamical sys-
tems. It is used in both differential equations,

y' = Ay   (2.62)

and difference equations,

y_{k+1} = Ay_k   (2.79)
In both examples, the matrix A maps the R n vector space onto itself.*
First, let us perform a geometric interpretation of eigenvalue decomposition, like that in
Example 2.3, before moving to the applications.
* Let me repeat the contrast of this case with the examples in Section 2.3. The n × n square linear operator matrix, for the linear
operation y = Ax, mapped an n-dimensional input space to another n-dimensional output space. In the case of a differential equa-
tion, y′ = Ay, the matrix A maps the n-dimensional space to itself.
Consider the matrix

A = \begin{bmatrix} -0.25 & 1.5 \\ -2.25 & 4.25 \end{bmatrix}
from Example 2.3. The eigenvalues and eigenvectors for this matrix can be found
using the command eig as
\lambda_1 = 3.299, \quad \lambda_2 = 0.701

c_1 = \begin{bmatrix} 0.3893 \\ 0.9211 \end{bmatrix}, \quad c_2 = \begin{bmatrix} 0.8446 \\ 0.5354 \end{bmatrix}
Figure 2.4 shows the geometric interpretation of eigenvalues and eigenvectors. The
dots, which form a unit circle, represent various values of vector y (as in Example 2.3,
each dot represents each of the 101 unit vectors). The plus signs represent corre-
sponding vectors Ay. Notice that the two circles are numerically the same as that in
Figure 2.3; they are plotted on the same axis because they lie on the same vector space.
However, the main geometric significance of eigenvalues and eigenvectors is shown as
the lines on the plot. The first solid line corresponds to the first eigenvector c1. When
operated by the matrix A, the result is that the vector gets simply stretched in the same
direction by a factor equal to λ1 = 3.299. This is shown by the dashed line in the figure.
Likewise, the second thick line is the vector c2, which gets shrunk in the same direc-
tion by a factor λ2 = 0.701 (this dashed line may not be visible because it is shorter and
thinner than c2).
FIGURE 2.4 Geometric interpretation of eigenvalues and eigenvectors. The dots represent y and
the plus symbols represent corresponding Ay. The thick solid lines are the eigenvectors c1 and c2,
whereas the dashed lines are Ac1 and Ac2, respectively.
FIGURE 2.5 A screenshot of the MATLAB eigshow window ("Make A*x parallel to x").
Thus, any vector in the direction of an eigenvector of a matrix gets stretched or
shrunk by a factor given by the eigenvalue. If an eigenvalue is negative, the result-
ing vector Ac_i points in the opposite direction, scaled by the magnitude of the
eigenvalue.
Thus, the physical interpretation is stretching and shrinking of the vector along the
direction of an eigenvector by a factor equal to the corresponding eigenvalue. There is an
excellent interactive tool in MATLAB called eigshow that can be used to exactly under-
stand these concepts. The direction of x can be changed using the mouse, and the vector Ax
responds to these changes. Figure 2.5 shows a screenshot of eigshow window. By interac-
tively moving the vector x and tracking how the vector Ax moves, one can understand the
concepts discussed in Examples 2.3 and 2.4.
While the above is useful in understanding eigenvalues and eigenvectors, the next exam-
ple focuses on concepts that readers will find useful in practice.
Consider the vector

y = \begin{bmatrix} -0.6 \\ -0.8 \end{bmatrix} \;\Rightarrow\; Ay = \begin{bmatrix} -1.05 \\ -2.05 \end{bmatrix}
These two vectors are represented by the solid and dashed lines in Figure 2.6, respec-
tively. As shown in the previous section, the two eigenvectors c1 and c2 are chosen as
FIGURE 2.6 Significance of eigenvalue decomposition. The thick lines represent eigenvectors c1
and c2. The solid line represents the vector y and the dotted lines "complete the parallelogram." The
dashed line represents the vector Ay.
basis vectors. The two dotted lines emanating from the point y complete the paral-
lelogram (e.g., Figure 2.1a); the components along (and opposite to) the directions of the
eigenvectors represent the projections of y onto the eigenvectors. Since these projections lie
along the eigenvectors, multiplying by the matrix A simply expands the coordinate \bar{y}_1
and shrinks the coordinate \bar{y}_2. The results are represented by the two thick dashed lines
in the figure: the component of the solution along the first eigenvector and the component
along the second eigenvector. To get the final solution Ay in the original basis, the thick
dotted lines complete the parallelogram to yield the solution Ay, displayed in the form of
the thin dashed line.
This shows the practical application of eigenvalue decomposition. Since any vec-
tor that lies along an eigenvector stays along the same eigenvector, when acted upon
by the matrix A, the response of a linear system along eigenvectors can be analyzed
independently.
To summarize, the steps in computing Ay included the following:
Change of basis to eigenvectors (taken in the order returned by eig, so that the first
coordinate corresponds to the eigenvalue 0.701 and the second to 3.299) to obtain

\bar{y} = C^{-1}y = \begin{bmatrix} -0.4236 \\ -0.6223 \end{bmatrix}

Scaling of each coordinate by the corresponding eigenvalue:

\begin{bmatrix} 0.701 \times (-0.4236) \\ 3.299 \times (-0.6223) \end{bmatrix} = \begin{bmatrix} -0.2969 \\ -2.053 \end{bmatrix}

where the first element (-0.2969) is the distance along the eigenvector with eigenvalue 0.701
and the second (-2.053) is the distance along the eigenvector with eigenvalue 3.299. These are
represented as the two thick dashed lines in the figure.
Change of basis back to the natural basis:

C\begin{bmatrix} -0.2969 \\ -2.053 \end{bmatrix} = \begin{bmatrix} -1.05 \\ -2.05 \end{bmatrix}

gives us the solution. Geometrically, the resulting vector, Ay, is obtained by "complet-
ing the parallelogram," which is indicated by thick dotted lines.
The above example showed the case of distinct, real eigenvalues. However, these results
are applicable for other situations as well.
2.4.4.1 Similarity Transform
The transformation along the eigenvectors is an example of similarity transform. If S is any
invertible matrix, then the matrix B obtained as
B = S^{-1}AS   (2.80)
is said to be similar to A and the transformation is known as similarity transform. The col-
umn vectors of matrix S form the new basis set. The above equation may be written as
A = SBS^{-1}

\lambda c = SBS^{-1}c   (2.81)

\lambda\bar{c} = B\bar{c}, \quad \bar{c} = S^{-1}c
The eigenvalues of similar matrices are equal, whereas eigenvectors are changed from c to
S−1c. The transformation along eigenvectors as basis set is a special type of similarity trans-
form, since it leads to diagonal (or nearly diagonal) matrix Λ.
2.4.4.1.1 Change of Basis In systems where the matrix multiplication by A maps the vector space
on itself, change of basis is represented by similarity transform. This was seen in Example 2.5.
Change of basis therefore does not change the eigenvalues of the system.
Change of basis along the eigenvectors is an important practical method for analyzing
linear systems.
2.4.4.1.2 Power and Exponent of Similar Matrices Two similar matrices, A and B, are related as
A = SBS−1. We can therefore write A2 as
A^2 = (SBS^{-1})(SBS^{-1}) = SB^2S^{-1}
We can multiply the above with matrix A to get higher powers of A. Generalizing,

A^i = SB^iS^{-1}   (2.82)

and functions defined through power series, such as the matrix exponential e^A = Se^BS^{-1},
follow in a similar manner.
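A quick numerical sanity check of Equation 2.82 (NumPy, with randomly generated matrices; these are illustrative choices, not from the book):

```python
import numpy as np

rng = np.random.default_rng(0)
S = rng.standard_normal((3, 3))            # almost surely invertible
B = rng.standard_normal((3, 3))
A = S @ B @ np.linalg.inv(S)               # A is similar to B

# A^i = S B^i S^{-1}, checked here for i = 3:
lhs = np.linalg.matrix_power(A, 3)
rhs = S @ np.linalg.matrix_power(B, 3) @ np.linalg.inv(S)
print(np.allclose(lhs, rhs))               # True
```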
2.4.4.1.3 Jordan Canonical Form Jordan canonical form is a special type of similarity transform
closely related to eigenvalue decomposition. The idea behind Jordan canonical form is that any
matrix A can be written in the following form:
A = C LC -1 (2.83)
In case of n distinct eigenvalues, the matrix Λ is diagonal, as seen in Equation 2.78. When
there are repeated eigenvalues, the eigenvectors corresponding to the repeating eigenval-
ues may not be linearly independent. If that happens, full diagonalization is not possible.
However, it is still possible to decompose the matrix A in the above Jordan canonical form.
For example, consider a 3 × 3 matrix whose eigenvalues are {λ1, λ2, λ2}. It is possible to
decompose the matrix A in the following form:
A = \begin{bmatrix} c_1 & c_2 & c_3 \end{bmatrix} \begin{bmatrix} \lambda_1 & 0 & 0 \\ 0 & \lambda_2 & 1 \\ 0 & 0 & \lambda_2 \end{bmatrix} \begin{bmatrix} c_1 & c_2 & c_3 \end{bmatrix}^{-1}
The above decomposition is nearly diagonal. The scalar “1” appears in the off-diagonal
element in each block corresponding to the repeated eigenvalue. The vector c3 is the “gen-
eralized eigenvector” of matrix A.
Substituting the eigenvalue decomposition into the ODE y' = Ay:

y' = C\Lambda C^{-1}y \;\Rightarrow\; (C^{-1}y)' = \Lambda(C^{-1}y)   (2.84)

Defining \bar{y} = C^{-1}y, the system decouples into

\bar{y}' = \Lambda\bar{y}, \quad \text{that is,} \quad \bar{y}_i' = \lambda_i\bar{y}_i   (2.85)

Each decoupled equation has the scalar solution \bar{y}_i = e^{\lambda_i t}\bar{y}_i(0). Transforming back to the
original coordinates,

y = \underbrace{Ce^{\Lambda t}C^{-1}}_{e^{At}}\,y(0)   (2.87)
The analytical solution to the ODE (2.62) is given by the above equation. Standard text-
books on linear algebra discuss methods to compute matrix exponent. It is easy to compute
for small-size matrices. The MATLAB command expm may be used to compute the matrix
exponent numerically. Thus, the evolution of y at any time t from initial condition y0 may
be calculated in MATLAB as
>> yi = expm(A*ti)*y0;
y_{k+1} = Ay_k   (2.79)

may be written as

y_{k+1} = C\Lambda C^{-1}y_k   (2.88)
Just as in the case of differential equations, the diagonalization from Equation 2.90 also helps us
qualitatively discuss the stability of linear difference equations. The difference equation (2.79)
results in asymptotically stable behavior if all the eigenvalues are within the unit circle,
that is, |\lambda_i| < 1, \forall i. If one or more of the eigenvalues are outside the unit circle, the system is
unstable, whereas eigenvalues on the unit circle imply that the system is marginally stable.
In chemical engineering, difference equations are often used in computer-based process
control and logistic maps. We will not discuss difference equations for the rest of this book.
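The stability criterion is easy to demonstrate (a NumPy sketch; the matrix is an illustrative choice with eigenvalues 0.6 and 0.3, both inside the unit circle):

```python
import numpy as np

A = np.array([[0.5, 0.2],
              [0.1, 0.4]])
print(np.max(np.abs(np.linalg.eigvals(A))) < 1.0)   # spectral radius < 1: stable

y = np.array([1.0, 1.0])
for _ in range(100):
    y = A @ y                                       # iterate y_{k+1} = A y_k
print(np.linalg.norm(y) < 1e-10)                    # the iterates decay to the origin
```

Both checks print True; with any eigenvalue outside the unit circle, the iterates would instead grow without bound.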
2.5 EPILOGUE
Some of the key concepts in linear algebra were introduced in this chapter. The three types
of problems in linear systems of interest to engineers include the following.
The first problem involves solving a set of linear equations of the type Ax = b.
The aim is to find the vector x given the right-hand side, b. The most common example is to
solve n linear equations in n unknowns, where the matrix A is an n × n matrix. However, there
are several examples where the number of equations exceeds the number of unknowns. Such
examples will be seen in linear least squares problems for parameter estimation (Chapter 10).
The MATLAB commands relevant to this type of problem are listed below.
The linear equations problem was cast as a linear combination of column vectors of matrix A:
x_1v_1 + x_2v_2 + \cdots + x_nv_n = b   (2.91)

where v_1, v_2, \ldots, v_n are the column vectors of A (see Section 2.2). The various properties of vectors, matrices, and asso-
ciated vector spaces were discussed. The relevant MATLAB commands are summarized
below.
Example 2.1 used a simple model of blending process to highlight the concepts of null and
image spaces, as also to discuss the solution of linear equations. Example 2.2 showed the
sensitivity of solution to errors in the data. Condition number was introduced to quantify
this sensitivity.
The discussion on null and image spaces led us to SVD (Section 2.3). SVD was intro-
duced in the context of analyzing the second type of problems in linear algebra, the linear
transformation y = Ax.
EXERCISES
Problem 2.1 (a) Show that linearly independent vectors v1, v2, … , vn ∈ R n span the R n
space.
(b) Show that Equations 2.18 and 2.19 are equivalent.
Hint: Define the vector
v_i = \begin{bmatrix} v_{1,i} \\ v_{2,i} \\ \vdots \\ v_{n,i} \end{bmatrix}
Problem 2.4 Repeat the steps of Section 2.3.3 (Example 2.2) for solving Ax = b, with
A = \begin{bmatrix} 1 & 3 \\ 3 & 6.001 \end{bmatrix} \quad \text{and} \quad A = \begin{bmatrix} 1.37 & 1 \\ 0.5 & 0.37 \end{bmatrix}
y = Ax, \quad y = Bx, \quad y = Cx
Randomly generate 250 data points in domain space, X, using randn. Plot
the corresponding output in Y-space. Justify the difference in observed
response.
Relate the qualitative results to the singular values of the three matrices.
Problem 2.7 For the same three matrices in Problem 2.6, choose y0=[0.7071;
0.7071]. The solution of the ODE
y' = Ay, \quad y(0) = y_0
Ordinary Differential
Equations
Explicit Methods
3.1 GENERAL SETUP
A general ordinary differential equation (ODE), introduced in Chapter 1, is of the form
\frac{dy}{dt} = f(t, y), \quad y(t_0) = y_0   (3.1)
The initial condition at time t = t0 is specified as y0. In general, y ∈ R n is an n-dimensional
solution variable of interest, and f(t, y) : R n → R n is a vector-valued function. In this chapter,
I first cover numerical techniques to solve an ODE initial value problem (IVP), followed by
the application of ODE solution techniques to Process Engineering problems. As discussed
in Chapter 2, the convention followed throughout this text is that y is an n × 1 column vector
and f(⋅) is an n × 1 function vector. The independent variable, t, is a scalar.
3.1.1 Some Examples
Examples of systems described by ODEs abound in engineering. For example, the reaction
of a species along the length of a plug flow reactor (PFR) is given by an ODE:
u\frac{dC_A}{dz} = -kC_A^n   (3.2)
Since this is a first-order ODE, a single initial condition is required, C_A(z = 0) = C_{A,in}. Change
of temperature of liquid in a well-stirred vessel that is electrically heated is also given by an
ODE of the form
V\rho c_p \frac{dT}{dt} = Q\rho c_p(T_{in} - T) + Q_h A_h   (3.3)
where
Qh is the heat flux from the electric heater
Ah is the heater surface area
The temperature at the start, T(t = 0) = T0 forms the initial condition for this system.
More commonly, we encounter situations where multiple ODEs have to be solved, for
example, when there are multiple species in a reactor or a nonisothermal reactor, where
energy balance equation models the temperature variation in the reactor. Likewise, there
may be multiple heated tanks that need to be modeled, or the height of liquid in the tank
may vary.
Equation 3.2 is written in the standard form by dividing throughout by u. If there are
multiple species, the mass balance is written for each individual species to obtain
\frac{dC_A}{dz} = f_1(C_A, C_B, \ldots)

\frac{dC_B}{dz} = f_2(C_A, C_B, \ldots)   (3.4)

\vdots

\frac{dT}{dz} = f_{n+1}(C_A, C_B, \ldots)   (3.5)

Equations 3.4 and 3.5 may be combined to obtain the set of ODEs:

\frac{d}{dz}\underbrace{\begin{bmatrix} C_A \\ C_B \\ \vdots \\ T \end{bmatrix}}_{y} = \underbrace{\begin{bmatrix} f_1(C_A, C_B, \ldots) \\ f_2(C_A, C_B, \ldots) \\ \vdots \\ f_{n+1}(C_A, C_B, \ldots) \end{bmatrix}}_{f(y)}   (3.6)
which is in the standard form (3.1). Since the concentrations and temperature are all speci-
fied at the inlet, y(z = 0) = yin, we have a set of (n + 1) ODE-IVP.
Another type of example involves conversion of higher-order ODEs into a set of first-
order ODEs. The classical example of this is the mass-spring-damper system. The motion
of a body with mass m attached to a spring is given by
mx'' = -cx' - kx   (3.7)

where

x'' is the acceleration
v = x' is the velocity
The initial condition involves displacing the mass by a certain distance and releasing it. This
is an example of “damped oscillator.” The second-order ODE may be written as
x' = v

v' = -\frac{c}{m}v - \frac{k}{m}x   (3.8)

Defining the vector y = [x \; v]^T, the above ODE may be written in the standard form*:

y' = \begin{bmatrix} v \\ -(c/m)v - (k/m)x \end{bmatrix}, \quad y(0) = \begin{bmatrix} x_0 \\ 0 \end{bmatrix}   (3.9)
The solutions of problems of the type (3.6) and (3.9), which give rise to ODE-IVP, are dis-
cussed in this chapter.
ODE-BVPs (boundary value problems): These form another type of ODEs, where the
conditions for various state variables are specified at different points in the domain. A clas-
sic example is that of heat conduction in a rod:
\frac{d^2T}{dz^2} = -\beta(T - T_a)   (3.10)
with T_a as the temperature of the ambient. Since this is a second-order ODE, two condi-
tions are required to solve it. If both the conditions are specified at the same location (e.g.,
T(0) = T_0 and T'(0) = \vartheta_0), we can convert the second-order ODE into the standard ODE-IVP
form (3.1). However, more commonly, the two conditions are specified at either end of the
domain, such as

T(0) = T_0, \quad T'(L) = \vartheta   (3.11)
* In fact, this is an example of a linear system of ODEs: y′ = Ay, with A = [[0, 1], [−k, −c]].
72 ◾ Computational Techniques for Process Simulation and Analysis Using MATLAB®
This leads to an ODE-BVP. ODE-BVPs cannot be solved directly using the methods described in this chapter; instead, they are converted using finite difference approximations into a set of linear or nonlinear equations, which are then solved. This procedure results in a special structure of the
problem. Methods for solving ODE-BVP (as well as other problems that result in similar
numerical structure) will be discussed in Chapter 7.
Thus, this chapter focuses on one family of numerical methods to solve ODE-IVP. An
introduction to numerical solution of ODE-IVP and a comparison with numerical integra-
tion will be first presented in the remainder of this section. Thereafter, an important family
of numerical methods, called Runge-Kutta (RK) methods, will be discussed. A majority of
discussions in this section will focus on a single-variable problem; extension to multivari-
able case will be considered thereafter.
3.1.2 Geometric Interpretation
Before proceeding further, I will draw parallels with numerical integration for a problem
in a single variable, that is, y is a scalar and f is a scalar function. Numerical integration is
covered in Appendix E. If the function f(⋅) is a function of t only and is independent of y,
then numerical integration may be used to “solve” Equation 3.1:
∫_{y(a)}^{y(b)} dy = ∫_a^b f(t) dt    (3.12)

If the integral on the right-hand side is represented by I = ∫_a^b f(t) dt, then it is clear that

I = y(b) − y(a)    (3.13)
Thus, y(b) may be considered as simply the integral I, if the initial condition y(a) = 0.
In contrast, if f(t, y) is a function of the dependent variable y as well, an ODE-IVP is
solved.
The geometric interpretation of integration, discussed in Appendix E, is typically well
known through the first course in high-school calculus. Integral, if we recall, is the area
under a curve: When the function f(t) is plotted against t, the area under the curve between
t = a and t = b is the integral I. It bears repeating that in understanding the meaning of inte-
gral, the function f(t) is plotted on the Y-axis.
In contrast, when solving ODE-IVP, the solution variable y is plotted on the Y-axis. The
initial condition, (a, ya), forms the starting point and f(a) is the slope at this point. The infor-
mation about the slope is used to determine the next point, y(a + h). Thus, the geometric
interpretation of solving ODE-IVP is to use the information about the slope f(t) to deter-
mine how the solution will propagate along the y–t plot.
The next question is, what if the function in Equation 3.1 is a function of both t and y? Typically, this is when the phrase "solving an ODE-IVP" is invoked. One way of looking at this is through an example:

dy/dt = −t^2 y,   y(0) = 1    (3.14)
The above equation can be solved analytically, and the initial condition used to obtain
y = e^{−t^3/3}    (3.15)
Note that depending on the initial condition, one gets a family of curves when one plots y(t)
vs. t. In contrast to integration, ODE-IVP is best understood when the solution, y(t) (and
not f(⋅)) is plotted on the Y-axis, as shown in Figure 3.1.
The solution for the above ODE is plotted in Figure 3.1. Unlike integration, we are inter-
ested in “tracking” how the solution y(t) evolves with time t, starting at the point (t0, y0). The
final solution y(t) depends on the initial condition, y_0. For different initial conditions y_0, a family of curves y = c e^{−t^3/3} is obtained in the y vs. t space. The solution in the figure is for the initial condition of Equation 3.14.
At t = 0.5, the dependent solution variable is y(0.5) = 0.9592. The slope of the curve y(t)
equals the function value −t2y, that is, −0.240. Thus, the right-hand side of the ODE (3.14)
is nothing but the slope of the curve y(t) at any point on the curve. Solution technique for
ODE therefore involves retracing the solid line in Figure 3.1, given the slope f(t, y) at any
point in the t – y space. The family of numerical methods discussed in this chapter attempts
to use the information of the slope f(t, y) and the projected slopes to predict how y(t) will
behave in the future. Another family of numerical methods use information from the past
to predict the future y(t).
FIGURE 3.1 Solution to the ODE (3.14). The arrow indicates the slope (−0.240) of the tangent at t = 0.5.
Euler's explicit method is obtained by approximating the derivative in Equation 3.1 with a forward difference over a step of size h:

(y_{i+1} − y_i)/h = f(t_i, y_i)    (3.16)

y_{i+1} = y_i + h f(t_i, y_i)    (3.17)
In the above, h is the step-size in the independent variable t. The step-size is chosen by users
or internally by the algorithm to ensure stability and accuracy of the numerical technique.
Starting with the initial values, (t0, y0), the above expression (3.17) is used recursively to
obtain new values yi + 1 and step forward in time. The code below shows implementation of
Euler’s explicit method for the ODE-IVP given in (3.14).
%% Problem Setup
t0=0; tN=0.5;
y0=1;
h=0.01; N=(tN-t0)/h;
%% Initialize and Solve
t=[0:h:tN]';   % Initialize time vector
y=zeros(N+1,1);
y(1)=y0;       % Initialize solution
for i=1:N
    y(i+1)=y(i)+h*(-t(i)^2*y(i));
end
%% Output and Error
plot(t,y); hold on;
yTrue=exp(-tN^3/3);
err=abs(yTrue-y(end));
FIGURE 3.2 Comparison of the true solution to ODE-IVP of Equation 3.14 with numerical solutions using Euler's method with h = 0.1 (diamonds), h = 0.05 (squares), and h = 0.01 (circles).
Euler’s explicit method can be easily derived from Taylor’s series expansion of y(t).
Specifically, the expansion of y(ti + h) around y(ti) is given by
y(t_{i+1}) = y(t_i) + h y′(t_i) + (h^2/2!) y″(t_i) + (h^3/3!) y‴(t_i) + ⋯    (3.18)

Substituting y′ = f(t, y) and writing the truncated terms in remainder form,

y(t_{i+1}) = y(t_i) + h f(t_i, y_i) + (h^2/2!) f′(ξ, y(ξ))    (3.19)
where ξ ∈ [ti, ti + 1]. The first two terms form Euler’s explicit method (3.17), whereas the last
term gives the local truncation error (LTE) of Euler’s explicit method:
E_lte = (h^2/2) f′(ξ, y(ξ))    (3.20)
The derivation of GTE of Euler’s method will be skipped* for brevity. It can be shown that
the GTE for Euler’s method is
E ≤ (M/2K) (e^{Kt} − 1) · h    (3.21)

where

K = max |∂f(t, y)/∂y|,   M = max |df(t, y)/dt|
The practical significance of Equation 3.21 is that Euler's explicit method has accuracy of the order h to the power 1, represented in shorthand as E ~ O(h^1). This was observed in Example 3.1: The error reduced by half when the step-size was halved (from ε = 0.011 for h = 0.1 to 0.0057 for h = 0.05), while the error fell by one order of magnitude (from 0.011 to 0.0012) when the step-size was reduced by one order of magnitude.

The GTE of Euler's method is ~O(h^1), while its LTE is ~O(h^2). In fact, this is a common trend that will be observed for all ODE solution techniques discussed in this book. If the LTE of the ODE solving method is ~O(h^{n+1}), its GTE will be ~O(h^n). We will call such a method, with GTE ~O(h^n), an nth-order method.
Euler's implicit method instead uses the slope at the new point:

(y_i − y_{i−1})/h = f(t_i, y_i)    (3.22)

y_i − y_{i−1} − h f(t_i, y_i) = 0    (3.23)

This equation needs to be solved at each time t_i to obtain the value of y_i, since the solution y_i is only implicitly known through the (nonlinear) equation (3.23). The next example will demonstrate the use of Euler's implicit method.
Substituting f(t, y) = −t^2 y in Equation 3.23 gives

y_i − y_{i−1} + h t_i^2 y_i = 0
* Refer to any standard numerical ODE techniques book for the derivation. The derivation of GTE for integration, which is
pedagogically simpler, is presented in Appendix E for an interested reader.
y_i = y_{i−1} / (1 + h t_i^2)    (3.24)
Knowing the value yi − 1 and time ti at which the solution value yi is desired, the above
expression can be used to solve the ODE-IVP using the following code.
%% Problem Setup
t0=0; tN=0.5;
y0=1;
h=0.1; N=(tN-t0)/h;
%% Initialize and Solve
y=zeros(N+1,1);
t=[0:h:tN]';   % Initialize time vector
y(1)=y0;       % Initialize solution
for i=2:N+1
    y(i)=y(i-1)/(1+h*t(i)^2);
end
%% Error calculation
yTrue=exp(-tN^3/3);
err=abs(yTrue-y(end));
The error in Euler’s implicit method follows the same pattern as Euler’s explicit
method. The figure using Euler’s implicit method is similar to that of Figure 3.2
and is skipped. The errors in computing y(0.5) are e0.1 = 0.012, e0.05 = 0.006, and
e0.01 = 0.0012.
Like the explicit method, Euler's implicit method also has LTE ~O(h^{n+1}) and GTE ~O(h^n).
Applying Euler's explicit method (3.17) to this problem gives

y_1 = (1 − h t_0^2) y_0

Further,

y_2 = (1 − h t_1^2) y_1 = (1 − h t_1^2)(1 − h t_0^2) y_0

so that

y_n = ∏_{i=0}^{n−1} (1 − h t_i^2) y_0

This product remains bounded only if |1 − h t^2| ≤ 1, that is,

h ≤ 2/t^2    (3.25)
With the end-time in Example 3.1 chosen as 10 s, run the code again. With h = 0.1, the value of y(10) is obtained as ~10^17 because Euler's method is unstable. On the other hand, Euler's method correctly gives the value y(10) ≈ 0 with a step-size of h = 0.01.
* Since all the numerical methods for solving ODEs have their errors ~O(h^n), n ≥ 1, these numerical techniques will be convergent if they are stable. Hence, there is a concept of zero-stability, which implies stability of the numerical technique when h → 0.
The stability analysis can be formalized when f(⋅) is an explicit function of y only. As
seen in case studies in Section 3.5 and other chapters, this is true of a majority of chemical
engineering problems of interest. Before proceeding to the general case, consider the case
when f(y) is linear, that is, the ODE-IVP is
y′ = −λ y    (3.26)

The analytical solution for this ODE is y(t) = y_0 e^{−λt}, which is stable because y(t) is bounded as t → ∞. Substituting f(y) = −λy in Equation 3.17 gives

y_i = y_{i−1} − hλ y_{i−1} = (1 − hλ) y_{i−1}    (3.27)

Using the above equation, the relation for y_{i−1} can be substituted as

y_i = (1 − hλ) [(1 − hλ) y_{i−2}] = (1 − hλ)^2 y_{i−2}    (3.28)

y_i = (1 − hλ)^i y_0    (3.29)

This sequence converges if the term (1 − hλ) lies between −1 and 1. In other words, Euler's method is stable if

|1 − λh| ≤ 1    (3.30)

h ≤ 2/λ    (3.31)
Consider the stability of Euler's implicit scheme. Since both h and t_i are positive, the term (1 + h t_i^2)^{−1} from Equation 3.24 is positive and less than 1. Therefore, Euler's implicit method is stable for any choice of h. Indeed, when Example 3.2 is run for h = 0.1 and tN = 10, the implicit method is still stable, and it remains stable even when the step-size is increased to h = 1. The same stability result is obtained for the linear ODE case, y′ = −λy, as well: any positive value of the step-size h results in a stable solution.
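The two stability results can be checked numerically. The sketch below is an illustrative Python translation of the chapter's MATLAB-style snippets (the function names and the choice λ = 10 are assumptions for illustration): explicit Euler decays only when h ≤ 2/λ, while implicit Euler decays for any positive h.

```python
# Sketch: stability of explicit vs. implicit Euler on y' = -lam*y, y(0) = 1.
# Explicit Euler amplifies by (1 - h*lam) per step; implicit by 1/(1 + h*lam).
# Illustrative values: lam = 10, so the explicit bound is h <= 2/lam = 0.2.

def euler_explicit(lam, h, t_end, y0=1.0):
    y = y0
    for _ in range(int(round(t_end / h))):
        y = y + h * (-lam * y)        # y_{i+1} = (1 - h*lam) * y_i
    return y

def euler_implicit(lam, h, t_end, y0=1.0):
    y = y0
    for _ in range(int(round(t_end / h))):
        y = y / (1.0 + h * lam)       # solve y_{i+1} = y_i + h*(-lam*y_{i+1})
    return y

print(euler_explicit(10.0, 0.05, 5.0))  # h < 0.2: decays toward 0 (stable)
print(euler_explicit(10.0, 0.25, 5.0))  # h > 0.2: |1 - h*lam| = 1.5, blows up
print(euler_implicit(10.0, 0.25, 5.0))  # implicit: decays even for this h
```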
In contrast to the explicit methods, implicit methods are often “globally stable” for any
range of values of step-size h. Even when the numerical methods are not globally stable,
their stability margins are significantly broader than the corresponding explicit methods of
equivalent accuracy. This shall be discussed further in Section II of this book. Specifically,
Chapter 8 is devoted to implicit ODE solving techniques for a variety of problems.
In addition to determining the step-size, stability analysis has significant practical rami-
fications on the choice of numerical method for solving ODEs. This will be demonstrated
toward the end of this chapter, in Section 3.5.4 (please see Example 3.15 as well). In the case of ODEs in multiple variables, the eigenvalues λ_i determine the step-size. The "speed" of evolution of the ODE is governed by the smallest eigenvalue (slowest mode), whereas the step-size is limited by the largest eigenvalue. If the ratio of the two is very large, an explicit ODE
solver may not converge. For such systems, an implicit ODE solver is required to ensure
good speed of convergence.
3.1.6 Multivariable ODE
Next, a multivariable case is considered, with y ∈ R n as an n-dimensional vector. The
reader is reminded that for the entire book, an n-dimensional vector will be a column vec-
tor (unless specified otherwise), that is, it will be an n × 1 vector. Recall the discussion in
Chapter 2, f : R n → R n, because it maps the Y -space onto itself. Thus, f(t, y) is an n × 1 vector
of nonlinear functions of t and y. With these definitions, extension to a multivariate case is
straightforward:
y_{i+1} = y_i + h f(t_i, y_i)    (3.32)

For stability analysis, consider the linear system

y′ = −A y    (3.33)
Hereafter, the boldface will be dropped for convenience.* Euler’s method is written as
yi = yi -1 - hAyi -1 = ( I - Ah ) yi -1 (3.34)
* With some notable exceptions such as interpolation and integration, most of the methods discussed in this book are, in
general, applicable to multivariable cases. The derivations will be done for single-variable case. Extension to multivariable
cases will be clear from the context.
where I is an n × n identity matrix. Note that in a multivariable case, the order of multiplica-
tion is important. Continuing on the same lines
y_i = (I − Ah)^i y_0

From Chapter 2, we know that the above equation is stable if the eigenvalues of the matrix (I − Ah) lie in the unit circle. If the eigenvalues of A are {λ_1, ..., λ_n}, then the eigenvalues of (I − Ah) are {(1 − λ_1 h), ..., (1 − λ_n h)}. Thus, following the same arguments as before, the choice of h should be such that all eigenvalues of Ah are positive and less than 2, which is equivalent to saying that the step-size h should be chosen such that

0 ≤ h ≤ 2/λ_i,   ∀i    (3.35)
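Condition (3.35) can be verified with a short sketch. The 2×2 matrix below (eigenvalues 3 and 50) and the step sizes are illustrative choices of mine, not from the text; the fast mode forces h below 2/50 even though the slow mode would allow a much larger step.

```python
# Sketch: explicit Euler for the linear system y' = -A y (Eq. 3.34).
# Stability requires h < 2/lambda_i for every eigenvalue of A; here the
# eigenvalues are 3 and 50, so h must be below 2/50 = 0.04.

def euler_system(A, h, n_steps, y):
    n = len(y)
    for _ in range(n_steps):
        # y_{i+1} = (I - A h) y_i
        y = [y[r] - h * sum(A[r][c] * y[c] for c in range(n)) for r in range(n)]
    return y

A = [[3.0, 0.0],
     [0.0, 50.0]]

print(euler_system(A, 0.01, 1000, [1.0, 1.0]))  # h < 0.04: both modes decay
print(euler_system(A, 0.05, 1000, [1.0, 1.0]))  # h > 0.04: fast mode diverges
```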
3.1.6.1 Nonlinear Case
For a nonlinear ODE, it is possible to estimate the step-size required for stability by linearizing the system at the current point. Thus, defining
λ = − (∂f/∂y)|_{(t_i, y_i)}    (3.36)
3.2.1 Some History
Euler’s method was described by Leonhard Euler in ca. 1770s. An improvement to Euler’s
method was proposed by Karl Heun in late nineteenth century. The idea draws inspira-
tion from the trapezoidal rule in integration. To compute yi + 1, Euler’s explicit method uses
the slope at the ith point, f(ti, yi); Euler’s implicit method uses the slope at (i + 1)th point,
f(ti + 1, yi + 1). The accuracy can be improved if the average of the two slopes is used. Heun
proposed that instead of using an average of the slopes at the two points, one could use the
slope at i and the “projected slope” at (i + 1). The projected point at (i + 1) is
y_i + h f(t_i, y_i)    (3.39)

and the projected slope at this point is

f(t_{i+1}, y_i + h f(t_i, y_i))    (3.40)

Averaging the slope at i with this projected slope gives the improved Euler's method:

y_{i+1} = y_i + (h/2) [ f(t_i, y_i) + f(t_{i+1}, y_i + h f(t_i, y_i)) ]    (3.41)
This Heun’s method is more accurate than Euler’s method. Specifically, its GTE is ~ O h2 , ( )
that is, it is a second-order accurate method. This shows that it is possible to use more
information to improve the accuracy of numerical ODE solution methods. Due to the
importance of differential equations in engineering and science, ODE solution techniques
have received a lot of attention in the twentieth century. Various families of methods have
been developed. This chapter will focus on explicit RK family of methods. RK methods
use multiple points between i and (i + 1) for projecting the slopes and use an average of
slopes at these projected points. These methods are of varying degrees of accuracy and
are abbreviated as RK-n, where n represents the order of GTE ~ O hn . Second-order RK ( )
( )
methods (i.e., methods with GTE ~ O h2 ) are considered in this section for pedagogical
purposes. Thereafter, extension to higher-order methods is discussed in Section 3.3.3.
Analysis of the improved Euler’s method and interpretation in context of RK-2 method
will be discussed in Section 3.2.2, whereas interpreting the improved Euler’s method in the
context of other methods is presented in Section 3.2.5.
y_{i+1} = y_i + h (w_1 k_1 + w_2 k_2)    (3.42)

where

k_1 = f(t_i, y_i)
k_2 = f(t_i + ph, y_i + qh·k_1)    (3.43)
Notice that the slope f(ti, yi) from Equation 3.17 is replaced by a weighted sum, S = w1k1 + w2k2,
where the values k1 and k2 may be interpreted as slopes calculated at two different points.
RK-2 methods involve function evaluations with (at least) two intermediate points. These
function evaluations are represented as k1 and k2 in Equation 3.42.
Comparing with Equation 3.41, one can see that the improved Euler’s method uses
p = q = 1. As we will see presently, RK-2 is not a single method but a family of methods. The
derivation of the formula and error analysis is discussed next.
y_{i+1} = y_i + h (dy/dt)|_i + (h^2/2!) (d^2y/dt^2)|_i + (h^3/3!) (d^3y/dt^3)|_i + ⋯    (3.44)

Substituting

dy/dt = f(t, y)   and   d^2y/dt^2 = df/dt = ∂f/∂t + (∂f/∂y)(dy/dt)

into the expansion gives

y_{i+1} = y_i + h f^{(i)} + (h^2/2) (f_t^{(i)} + f_y^{(i)} f^{(i)}) + (h^3/6) (d^3y/dt^3) + ⋯    (3.45)

where the shorthand notation is

f^{(i)} ≡ f(t_i, y_i),   f_t^{(i)} ≡ (∂/∂t) f(t_i, y_i),   f_y^{(i)} ≡ (∂/∂y) f(t_i, y_i)

Similarly, Taylor expansion of k_2 in Equation 3.43 gives

k_2 = f(t_i, y_i) + ph (∂/∂t) f(t_i, y_i) + qh·k_1 (∂/∂y) f(t_i, y_i) + h^2 T    (3.46)
In the above expression, we have represented the terms of order h^2 and higher as h^2 T. Rearranging the above and substituting in Equation 3.42, we get
w_1 + w_2 = 1
w_2 p = w_2 q = 1/2    (3.49)

ỹ_{i+1} − y_{i+1} = h^3 [ (1/6) y‴^{(i)} − w_2 T ]    (3.50)
Equation 3.49 gives the relationship for choosing p, q, w_1, and w_2, whereas (3.50) shows that the LTE of the RK-2 method is ~O(h^3). Specifically, using the mean value theorem, it can be shown that the error in RK-2 methods is

e_RK2 = −(h^3/12) y‴(ξ, y(ξ)),   ξ ∈ [t_i, t_{i+1}]    (3.51)
3.2.2.2 Heun’s Method
There are four unknowns and three equations in (3.49). Thus, one of the parameters can be chosen independently. As discussed earlier, RK-2 Heun's method uses p = 1. This gives q = 1, w_1 = 0.5, w_2 = 0.5. Thus, Heun's method may be written as

y_{i+1} = y_i + (h/2) (k_1 + k_2)

k_1 = f(t_i, y_i)    (3.52)

k_2 = f(t_i + h, y_i + h k_1)

The GTE of an RK-2 method, such as Heun's, is O(h^2). Thus, when the step-size h is reduced by a factor of 10, the error reduces approximately by a factor of 100. Consequently, RK-2 methods are more accurate than Euler's explicit method, as demonstrated in the next example.
Each step of Heun’s method requires function f(t, y) to be evaluated twice. Thus, with
h = 0.1, there are 10 function evaluations required for obtaining y(0.5). On the other hand,
Euler’s explicit method requires one function evaluation per step, so the same number
of function evaluations are required when step-size of h = 0.05 is used in Euler’s method.
The corresponding error computed in Example 3.1 was 5.7 × 10−3. Comparing with the
FIGURE 3.3 Comparison of Heun's method using h = 0.5 (circles) and h = 0.1 (squares) with the true solution (solid line). Smaller value of h results in higher accuracy.
error in Heun’s method for h = 0.1 (Example 3.2) shows the power of using a higher-order
method: The error is nearly one order of magnitude lower for the same number of function
evaluations.
The geometric interpretation of Heun's method is as follows. The slope f(t_i, y_i) of Euler's method is replaced by the average of two slopes, (k_1 + k_2)/2. The former, k_1, is nothing but the slope at (t_i, y_i); this slope is used to project a point at t_{i+1}, and k_2 is the slope calculated at this projected point. Thus, Heun's method progresses using an average of the slope at (t_i, y_i) and the slope at the point projected at t_{i+1} using k_1.
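As a cross-check of the second-order behavior, the sketch below applies Heun's method (3.52) to the test problem (3.14). It is an illustrative Python version (the book's examples use MATLAB), and the step sizes are my choices; halving h should cut the error by roughly a factor of 4.

```python
import math

# Sketch: Heun's method (Eq. 3.52) on y' = -t^2*y, y(0) = 1, whose true
# solution is y = exp(-t^3/3).  GTE ~ O(h^2): halving h quarters the error.

def f(t, y):
    return -t**2 * y

def heun(t0, y0, h, t_end):
    t, y = t0, y0
    while t < t_end - 1e-12:
        k1 = f(t, y)                  # slope at (t_i, y_i)
        k2 = f(t + h, y + h * k1)     # slope at the projected point
        y += 0.5 * h * (k1 + k2)      # advance with the average slope
        t += h
    return y

y_true = math.exp(-0.5**3 / 3)
err1 = abs(y_true - heun(0.0, 1.0, 0.1, 0.5))
err2 = abs(y_true - heun(0.0, 1.0, 0.05, 0.5))
print(err1, err2, err1 / err2)        # ratio close to 4
```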
The RK-2 midpoint method is obtained by instead choosing w_1 = 0 and w_2 = 1 (hence p = q = 1/2 from Equation 3.49):

y_{i+1} = y^{(i)} + h k_2

k_1 = f(t^{(i)}, y^{(i)})    (3.53)

k_2 = f(t^{(i)} + h/2, y^{(i)} + (h/2) k_1)

Here, k_2 is the slope at the projected "midpoint" between t^{(i)} and t^{(i+1)}. The projected value of y at that point is y^{(i)} + (h/2) k_1.
Likewise, the so-called Ralston's method is derived to minimize the truncation error. In this method

p = q = 2/3,   w_1 = 1/4,   w_2 = 3/4    (3.54)
John Butcher designed a convenient way to represent the coefficients of an RK method, now
called the Butcher tableau. A simplified tableau for RK-2 is represented as shown below:
p | q
  | w_1   w_2

The bottom-most row contains the weights of the (two) slopes, whereas the left-most column contains the parameter p. The Butcher tableau will be revisited later in this section for general RK methods.
y′ = f(t, y),   y(t_i) = y_i    (3.55)

starting at the point (t_i, y_i). LTE implies the error in numerically computing y_{i+1}, starting from this point, without accounting for the error accumulated in computing y_i itself. As before, if ỹ_{i+1} represents the true solution of the ODE-IVP (3.55), then

ỹ_{i+1} = y_{i+1} + c h^3    (3.56)

where c ~ −f‴(ξ)/12 is the factor representing the error as per Equation 3.51. The error, ch^3, can be estimated if there is an alternate way to compute y_{i+1}. The first way to do so is to compute y_{i+1} with a smaller step-size. Alternatively, y_{i+1} may be computed using a method with a different level of accuracy (such as an RK-3 method, which has LTE of O(h^4)).
Consider the case where y_{i+1} is also calculated in two steps with step-size h/2. The same RK-2 method is employed. The first application of the RK-2 method yields

ỹ_{i+1/2} = y_{i+1/2} + c_1 (h/2)^3    (3.57)

In the second application, ỹ_{i+1/2} is not known. Thus, the net error includes the LTE in the second application, as well as the effect of the first error:

ỹ_{i+1} ≈ y_{i+1} + c_1 (h/2)^3 + c_2 (h/2)^3    (3.58)

Note the use of the approximation sign above. If the function f(t, y) in Equation 3.55 is independent of y, then the above relationship is exact. Subtracting Equation 3.56 from Equation 3.58 and representing the difference between the solutions using two steps and a single step as Δ̂_{i+1}:

0 = Δ̂_{i+1} + h^3 (c_1/2^3 + c_2/2^3 − c),   Δ̂_{i+1} = y_{i+1}[h/2] − y_{i+1}[h]    (3.59)

where c_1 ~ −f‴(ξ_1)/12 (with ξ_1 ∈ [t_i, t_{i+1/2}]) and c_2 ~ −f‴(ξ_2)/12 (with ξ_2 ∈ [t_{i+1/2}, t_{i+1}]). If f‴ does not vary significantly over the interval, then c_1 ≈ c_2 ≈ c, and

0 ≈ Δ̂_{i+1} − c h^3 (1 − 1/2^2)    (3.60)

E_{i+1} ~ c h^3 ≈ Δ̂_{i+1} · 2^2/(2^2 − 1)    (3.61)

This result can be generalized to any numerical scheme. Although the LTE was used in the above derivation for ease of explanation, the result is more appropriately defined in terms of GTE. If the GTE of a numerical scheme is ~O(h^n), then the error may be estimated as

E_i ~ Δ̂_i · 2^n/(2^n − 1),   Δ̂_i = y_i[h/2] − y_i[h]    (3.62)
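The estimate (3.62) can be reproduced in a few lines. The sketch below (an illustrative Python version; the helper names are mine) takes one Heun step of size h = 0.1 and two steps of size h/2 from (0, 1) for the test problem (3.14), and compares the estimated error Δ̂·2^2/(2^2 − 1) with the actual single-step error.

```python
import math

# Sketch: step-doubling error estimate (Eq. 3.62) for Heun's method (n = 2)
# on y' = -t^2*y, y(0) = 1, with h = 0.1.

def f(t, y):
    return -t**2 * y

def heun_step(t, y, h):
    k1 = f(t, y)
    k2 = f(t + h, y + h * k1)
    return y + 0.5 * h * (k1 + k2)

h = 0.1
y_one = heun_step(0.0, 1.0, h)              # single step of size h
y_mid = heun_step(0.0, 1.0, h / 2)          # two steps of size h/2
y_two = heun_step(h / 2, y_mid, h / 2)
delta = y_two - y_one                       # difference Delta
est = delta * 2**2 / (2**2 - 1)             # estimated error, Eq. 3.62
actual = math.exp(-h**3 / 3) - y_one        # true error of the single step
print(est, actual)                          # both ~1.667e-04
```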
First, the code is executed with the values h=0.1; tN=0.1. This gives the solution y_0.1[h]. The result is stored in the variable yOneStep. The code is executed again with the value h=0.05, keeping tN the same. The solution obtained at t = 0.1 using this method is the value y_0.1[h/2] ≡ yTwoStep. Next, the difference between the values calculated using two steps and a single step gives Δ̂_0.1:

>> Delta=yTwoStep-yOneStep;
>> errEstimate=4*Delta/3
errEstimate =
   1.6673e-04
Recall that the true value of y(t) = e−t3/3. Thus, the actual error in computing y0.1[h] is
Notice that the error estimated in the above example is close to the true error. It should be remembered that the difference Δ̂_i, as per Equation 3.62, estimates the error in the single-step implementation of the RK-2 method. The next example demonstrates how the value of Δ̂_i can be used to obtain a more accurate approximation of the ODE solution.
3.2.4 Richardson’s Extrapolation
Let me use the following example to demonstrate Richardson’s extrapolation method to
improve the accuracy of a numerical technique.
y_0.1 ≈ y_0.1[h] + (4/3) Δ̂_0.1
Continuing where we left off in Example 3.4, an improved solution can be obtained as
>> yImproved=yOneStep+4*Delta/3
yImproved =
0.9997
>> errNew=abs(yTrue-yImproved)
errNew =
1.2156e-08
Note that the error in calculating y0.1 decreased significantly. This method of com-
bining results from different step-sizes to obtain an improved numerical method is
known as Richardson’s extrapolation.
So, does using Richardson's extrapolation mean that there is no error in the improved numerical solution? As one might guess, the answer to that question is no. It is important to notice that in Equation 3.61, we wrote that the error was approximately proportional to Δ̂_{i+1}. Richardson's extrapolation serves to improve the order of accuracy, not to eliminate the error.
Consider a numerical technique that has GTE ~O(h^n). As earlier, y_t[h] will be used to represent the numerical solution obtained using step-size h at time t. Note that in addition to the leading error term, there are higher-order terms that have to be considered as well:

y_t = y_t[h] + c h^n + d h^{n+1} + ⋯    (3.63)

Likewise, a similar expression can be written with step-size h/2:

y_t = y_t[h/2] + c (h/2)^n + d (h/2)^{n+1} + ⋯    (3.64)

Multiplying Equation 3.64 by 2^n and subtracting Equation 3.63 eliminates the leading error term:

(2^n − 1) y_t = 2^n y_t[h/2] − y_t[h] − (d/2) h^{n+1} − ⋯    (3.65)

y_t = (2^n y_t[h/2] − y_t[h]) / (2^n − 1) − (d / (2(2^n − 1))) h^{n+1} − ⋯    (3.66)

The first term on the right-hand side is the improved estimate, y_improved, whose error is E ~O(h^{n+1}).

Consider the application of the above formula to Heun's method. Since n = 2 for Heun's method,

y_improved = (4 y_t[h/2] − y_t[h]) / 3
           = (4/3) (y_t[h/2] − y_t[h]) + y_t[h]
           = y_t[h] + (4/3) Δ̂_t
This is the same formula that was used in Example 3.5. Richardson’s extrapolation improves
the accuracy of the numerical method to the next remaining term in the truncated series.
Richardson’s extrapolation is not just limited to ODE solution techniques. It may be used
for any numerical method. In Problem 3.2, you'll use Richardson's extrapolation for the trapezoidal rule in numerical integration, and for numerical differentiation in Problem 3.3. Using a numerical method with accuracy ~O(h^n), let χ_h be the numerical solution using step-size h and χ_{h/2} be the numerical solution using step-size h/2. The Richardson's extrapolation formula gives

χ_true = (2^n χ_{h/2} − χ_h) / (2^n − 1) − (d / (2(2^n − 1))) h^{n+1} − ⋯    (3.67)

where the first term on the right-hand side is χ_improved, whose error is E ~O(h^{n+1}).
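As a preview of Problem 3.2, the sketch below applies the extrapolation formula (3.67) to the composite trapezoidal rule (n = 2). The integrand ∫_0^π sin t dt = 2 and the step counts are illustrative choices of mine.

```python
import math

# Sketch: Richardson's extrapolation (Eq. 3.67) for the trapezoidal rule,
# whose GTE is ~O(h^2) (n = 2).  Combining step-sizes h and h/2 yields a
# higher-order estimate of the integral of sin(t) over [0, pi] (= 2).

def trapz(f, a, b, n):
    h = (b - a) / n
    interior = sum(f(a + i * h) for i in range(1, n))
    return h * (0.5 * (f(a) + f(b)) + interior)

chi_h = trapz(math.sin, 0.0, math.pi, 8)          # step h
chi_h2 = trapz(math.sin, 0.0, math.pi, 16)        # step h/2
chi_imp = (2**2 * chi_h2 - chi_h) / (2**2 - 1)    # extrapolated value
print(abs(2.0 - chi_h), abs(2.0 - chi_imp))       # error drops sharply
```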
A final note is that if the numerical technique has error terms given by
y_{i+1} = y_i + h (a f_{i+1} + (b_1 f_i + b_2 f_{i−1} + ⋯ + b_n f_{i−n+1}))
Since the right-hand side depends on y_{i+1}, this is an implicit family of methods (the Adams-Moulton, AM, family). The corresponding explicit family, Adams-Bashforth (AB), uses past slopes only:

y_{i+1} = y_i + h (b_1 f_i + b_2 f_{i−1} + ⋯ + b_n f_{i−n+1})

For example, the two-step AB2 method is

y_{i+1} = y_i + (3h/2) f_i − (h/2) f_{i−1}    (3.69)
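AB2 needs two starting values, so a common bootstrap is to take the first step with a one-step method. The sketch below (illustrative Python; the Heun bootstrap and step sizes are my choices, not from the text) applies Equation 3.69 to the test problem (3.14).

```python
import math

# Sketch: two-step Adams-Bashforth (Eq. 3.69) on y' = -t^2*y, y(0) = 1.
# The first step is taken with Heun's method, since AB2 needs y_0 and y_1.

def f(t, y):
    return -t**2 * y

def ab2(t0, y0, h, n):
    t = [t0 + i * h for i in range(n + 1)]
    y = [y0] * (n + 1)
    k1 = f(t[0], y[0])                        # bootstrap: one Heun step
    k2 = f(t[1], y[0] + h * k1)
    y[1] = y[0] + 0.5 * h * (k1 + k2)
    for i in range(1, n):
        y[i + 1] = y[i] + h * (1.5 * f(t[i], y[i]) - 0.5 * f(t[i - 1], y[i - 1]))
    return y[-1]

y_true = math.exp(-0.5**3 / 3)
err_a = abs(y_true - ab2(0.0, 1.0, 0.05, 10))   # y(0.5) with h = 0.05
err_b = abs(y_true - ab2(0.0, 1.0, 0.025, 20))  # y(0.5) with h = 0.025
print(err_a, err_b)                             # second-order decay
```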
3.2.5.3 Predictor-Corrector Methods
The improved Euler’s method was originally proposed by Wilhelm Heun as a “predictor-
corrector” method. The explicit Euler’s method
+1 = yi + hf ( t i ,yi )
yipred
yi +1 = yi +
h
2
( (
f ( t i ,yi ) + f t i +1 ,yipred
+1 ))
is used to improve this prediction. In general, ABn−1 is used as a predictor and AMn is used as a corrector. This is done not merely to improve the accuracy of the numerical method (ABn could instead be used if accuracy were the only concern); the predictor-corrector methods have better stability properties than explicit AB methods but are still explicit, unlike AM methods.
MATLAB solver ode113 uses Adams predictor-corrector methods of varying orders, from O(h^1) to O(h^13). This is an explicit solver.
The backward differentiation formula (BDF) methods form another implicit family; for example, the second-order BDF method is

y_{i+1} = (4/3) y_i − (1/3) y_{i−1} + (2h/3) f(t_{i+1}, y_{i+1})    (3.70)
MATLAB solver ode15s uses first- to fifth-order BDF method. It is the second most pop-
ular solver in MATLAB and is useful for solving stiff ODE-IVP problems, as well as some
types of DAE problems. The need for a stiff solver like ode15s is introduced in a case study
in Section 3.5.4 and the implicit methods are discussed in more detail in Chapter 8.
y_{i+1} = y_i + h · S(t_i, y_i)    (3.71)

where S(·) is a term that can be explicitly calculated from the known quantities t_i and y_i. Higher-order RK methods include more terms based on the projection of f(t, y). A generic RK method that involves m terms may be written as

y_{i+1} = y_i + h (w_1 k_1 + w_2 k_2 + ⋯ + w_m k_m)    (3.72)

where

k_1 = f(t_i, y_i)
k_2 = f(t_i + p_2 h, y_i + q_21 h k_1)    (3.73)
k_3 = f(t_i + p_3 h, y_i + q_31 h k_1 + q_32 h k_2)
⋮
The first term, k1, is just the slope at ith point; the second term, k2, involves coefficients
p2 and q21. Thus, k2 is a slope computed at some projected point between i and (i + 1). The
terms in subsequent stages use the slopes, kj, computed so far. A compact way of represent-
ing an RK method with m terms is
y_{i+1} = y_i + h Σ_{j=1}^{m} w_j k_j    (3.74)

k_j = f( t_i + p_j h,  y_i + h Σ_{ℓ=1}^{j−1} q_{jℓ} k_ℓ )    (3.75)
The Butcher’s tableau is a convenient way of representing all the weights for RK and similar
family of methods. The general Butcher’s tableau is written as
p_1 | q_11
p_2 | q_21  q_22
 ⋮  |  ⋮     ⋮   ⋱
p_m | q_m1  q_m2  ...  q_mm
    | w_1   w_2   ...  w_m
Note that in RK family of methods, the first row is zero, since the first term k1 is simply
the slope f(ti, yi). For explicit methods, all the diagonal elements and superdiagonal ele-
ments in Q are zero. This is because, as seen in Equation 3.75, the summation is only up to
(j − 1). Thus, k2 depends on h, ti, yi, k1 but not on itself and subsequent terms; k3 depends on
h, ti, yi, k1, k2 but not on itself and subsequent terms; and so on.
The Butcher’s Tableau for explicit RK methods using m terms is given by
0   |
p_2 | q_21
p_3 | q_31  q_32
 ⋮  |  ⋮     ⋮   ⋱
p_m | q_m1  q_m2  ...  q_m,m−1
    | w_1   w_2   ...  w_{m−1}  w_m
Note that all the diagonal elements, q11, q22, … are zero, making these explicit ODE methods.
The explicit RK methods with m stages are of the form of Equations 3.74 and 3.75. The number
of stages m is nothing but the number of terms ki that are used in solving the ODE-IVP. The
Butcher’s tableau for explicit RK methods is such that its first row is zero (p1 = 0, q1ℓ = 0) and
that Q matrix is strictly lower triangular (i.e., all the diagonal elements q and superdiagonal
elements are zero). On the other hand, the order of an RK method is the order of its GTE; thus, the nth-order RK method has GTE ~O(h^n). One interesting result for all explicit RK methods is that the order of the RK method does not exceed the number of stages, that is, n ≤ m. This means that an RK method with four terms can at most be O(h^4) accurate. There is an upper limit on the order of the RK method with m terms. However, it is possible to choose the weights p, Q, and w such that the order of accuracy is lower. The standard RK-3 method is written as
y_{i+1} = y_i + (h/6) (k_1 + 4 k_2 + k_3)    (3.76)

where

k_1 = f(t_i, y_i),   k_2 = f(t_i + h/2, y_i + (h/2) k_1),   k_3 = f(t_i + h, y_i − h k_1 + 2 h k_2)
The Butcher tableau for this RK-3 method is
0    |
1/2  | 1/2
1    | −1    2
     | 1/6   4/6   1/6
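A generic explicit RK step driven by a Butcher tableau follows directly from Equations 3.74 and 3.75. The sketch below (illustrative Python; the helper names are mine) encodes the standard RK-3 tableau above and reproduces its third-order accuracy on the test problem (3.14).

```python
import math

# Sketch: one explicit RK step driven by a Butcher tableau (Eqs. 3.74-3.75).
# Q is strictly lower triangular, so each k_j uses only k_1..k_{j-1}.

def rk_step(f, t, y, h, p, Q, w):
    k = []
    for j in range(len(w)):
        yj = y + h * sum(Q[j][l] * k[l] for l in range(j))
        k.append(f(t + p[j] * h, yj))
    return y + h * sum(wj * kj for wj, kj in zip(w, k))

# Standard RK-3 tableau from the text
p = [0.0, 0.5, 1.0]
Q = [[], [0.5], [-1.0, 2.0]]
w = [1/6, 4/6, 1/6]

def f(t, y):
    return -t**2 * y

y, t, h = 1.0, 0.0, 0.05
for _ in range(10):                   # integrate y' = -t^2*y to t = 0.5
    y = rk_step(f, t, y, h, p, Q, w)
    t += h
err = abs(math.exp(-0.5**3 / 3) - y)
print(err)                            # small: third-order accurate
```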
This method has an LTE of ~O(h^4) and a GTE of ~O(h^3). Just as we saw in RK-2 methods, it is possible to choose different values of the weights p, Q, and w to obtain various RK-3 methods involving three stages. The following example presents a MATLAB code for using the standard RK-3 method for solving an ODE-IVP.
The above code uses RK-3 method to solve the ODE-IVP. The following results are
obtained when the code is run with various step-sizes.
h       err
0.1     6.6908e-06
0.05    8.3830e-07
0.01    6.6685e-09
When the step-size is halved from 0.1 to 0.05, the error reduces by a factor of 8, whereas when the step-size is reduced by one order of magnitude, the error reduces by three orders of magnitude. This is because RK-3 is an O(h^3) method.
It should be noted that the actual error will also depend on f ⁗(ξ) terms. So, the
actual change in error may not be exactly (h1/h2)n; this is just an indicator of how the
numerical method behaves in a qualitative (and semiquantitative) way.
Similarly, higher-order RK methods have also been derived. Fourth-order RK methods will
be discussed in a separate Section 3.3.3 due to their popularity in ODE solving.
Suppose the RK-2 midpoint method

y^RK2_{i+1} = y_i + h k_2    (3.77)

k_1 = f(t_i, y_i),   k_2 = f(t_i + h/2, y_i + (h/2) k_1)
is used to solve the ODE-IVP. The solution is related to the true value as
yi +1 = yiRK 2
+1 é
3
ëh ùû + ch (3.56)
( )
since the LTE of RK-2 method is ~ O h3 . Third-order RK-3 methods have a greater
accuracy than RK-2 methods. The solution using RK-3 method (employing three function
evaluations*) was represented as
+1 = yi + h ( w1k1 + w2 k2 + w3 k3 )
yiRK 3
and the numerical solution of the RK-3 method is related to the true solution as
yi +1 = yiRK 3
+1 é
4
ëh ùû + dh (3.78)
Note that unlike Examples 3.3 and 3.6, we are considering LTE in the above equations. The
difference between the two is given by
0 = yiRK 3
+1 é ùû - yiRK
ëh
2
+1 é h ùû + dh 4 - ch3
ë
Di +1
* An RK-3 method requires at least three function evaluations. It is possible to derive RK-3 methods with more than three
function evaluations as well. I have kept this discussion simple with three function evaluations only, though the arguments
generalize to any higher-order method.
Thus, noting that the leading error term is ch³, the above equation is approximated as

ch³ ≈ Δ_{i+1} ≡ E_{i+1}   (3.79)

Thus, the difference between a single application of the RK-3 and RK-2 methods provides an
estimate of the error of the RK-2 method. This is demonstrated in the next example, where
the error in using the RK-2 midpoint method will be estimated. We will use the standard RK-3
method of Example 3.6:

y^{RK3}_{i+1} = y_i + (h/6)(k_1 + 4k_2 + k_3)   (3.76)

where

k_1 = f(t_i, y_i),   k_2 = f(t_i + h/2, y_i + (h/2)k_1),   k_3 = f(t_i + h, y_i − hk_1 + 2hk_2)
The following example uses the RK-2 and RK-3 methods together to estimate the LTE
in RK-2.
z = y0 + h/6*(k1+4*k2+k3);
del = abs(z-yMP);
err = abs(exp(-h^3/3) - yMP);
The above code is executed for various values of h and the results are tabulated below:
h del err
0.02 6.667e-7 6.667e-7
0.1 8.325e-5 8.328e-5
0.5 9.115e-3 9.561e-3
It is clear from the above table that Δ₁ is a reasonable approximation of the local error E₁.
Thus, using RK methods of different orders to estimate the numerical error is the first key
idea. Notice an important factor in the above example: the terms kᵢ used in the RK-2 and
RK-3 methods are the same. In fact, we could write the RK-2 method within the same Butcher
tableau as the RK-3 method, with a second row of weights:

  0   |
  0.5 | 0.5
  1   | −1    2
  ----+----------------
      | 0     1     0     (RK-2 weights)
      | 1/6   4/6   1/6   (RK-3 weights)

In general, such an extended (embedded) tableau has the form

  p | Q
  --+-----------
    | w_RKn
    | w_RK(n+1)
E_new/E = (h_new/h)^{n+1}   (3.80)
where the LTE of the method is O(h^{n+1}). Recall that in Equation 3.79, the RK-2 method has
LTE ~ O(h³), the RK-3 method has LTE ~ O(h⁴), and their difference is Δ ∼ ch³. This differ-
ence, Δ, between the solutions from two different RK methods is approximately equal to the
error E of the lower-order RK method, as seen in Equation 3.79.
Let us say that the desired accuracy of the numerical method was E_new = ε_tol. Then, from
Equation 3.80

h_new ~ h (ε_tol/Δ)^{1/(n+1)}   (3.81)

is designed to give the error tolerance desired from the ODE solver.

h_new = 0.5 (10⁻⁵ / 9.115 × 10⁻³)^{1/3}
which yields h_new = 0.0516. Let us now run the RK-2 midpoint method with this step-
size. To do so, the code in Example 3.7 was executed again with h=0.0516. With no
other change in the code:

err =
1.1448e-05

This indicates that Equation 3.81 may be used to determine the step-size that gives
approximately the desired error.
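The whole procedure — take one step with the RK-2/RK-3 pair, estimate Δ, and rescale h via Equation 3.81 — can be sketched in a few lines of Python (the test ODE y′ = −t²y with y(0) = 1 is inferred from Example 3.7, so treat it as an assumption):

```python
def f(t, y):
    return -t**2 * y   # test ODE (assumption, inferred from Example 3.7)

def rk2_rk3_pair(t, y, h):
    # Shared stages: the midpoint RK-2 and standard RK-3 reuse k1 and k2
    k1 = f(t, y)
    k2 = f(t + h/2, y + h*k1/2)
    k3 = f(t + h, y - h*k1 + 2*h*k2)
    y_rk2 = y + h*k2
    y_rk3 = y + h/6 * (k1 + 4*k2 + k3)
    return y_rk2, y_rk3

h, eps_tol = 0.5, 1e-5
y_rk2, y_rk3 = rk2_rk3_pair(0.0, 1.0, h)
delta = abs(y_rk3 - y_rk2)             # error estimate, Equation 3.79
h_new = h * (eps_tol / delta)**(1/3)   # Equation 3.81: exponent 1/(n+1) with n = 2
print(round(h_new, 4))   # → 0.0516
```

The stage values are computed once and shared by both formulas, which is what makes the error estimate nearly free.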
It is an explicit RK solver with four stages. The Butcher tableau for this Bogacki-Shampine
method is

  0    |
  0.5  | 0.5
  0.75 | 0     0.75
  1    | 2/9   1/3   4/9   (3.82)

y_{i+1} = y_i + h(w_1k_1 + w_2k_2 + w_3k_3 + w_4k_4)   (3.83)
where

k_1 = f(t_i, y_i),   k_m = f( t_i + p_m h,  y_i + h Σ_{j=1}^{m−1} q_{mj} k_j )   (3.84)
For all explicit RK methods, the order of the RK method does not exceed the number of
stages. The highest order of accuracy possible for m = 2 stages is n = 2; for m = 3 stages,
it is n = 3; and for m = 4 stages, it is n = 4. This means that an RK method with four terms
can at most be O(h⁴) accurate; it is not possible to have O(h⁵) or higher accuracy with four
stages. However, the interesting fact is that the highest order of accuracy possible for any
m = 5 stage RK method is still O(h⁴); the minimum number of stages required for an O(h⁵)
RK-5 method is m = 6. This result also happens to be the reason for the popularity of RK-4
methods: including the fifth term k₅ does not improve the order of accuracy of the RK
method compared to the best four-stage RK method.
y_{i+1} = y_i + (h/6)(k_1 + 2k_2 + 2k_3 + k_4) + O(h⁵)   (3.85)

where

k_1 = f(t_i, y_i),   k_2 = f(t_i + h/2, y_i + (h/2)k_1),   k_3 = f(t_i + h/2, y_i + (h/2)k_2),   k_4 = f(t_i + h, y_i + hk_3)

The corresponding Butcher tableau is

  0   |
  1/2 | 1/2
  1/2 | 0     1/2
  1   | 0     0     1
  ----+----------------------
      | 1/6   1/3   1/3   1/6
The classical RK-4 method uses four intermediate stages and yields a method that has a
global accuracy of O(h⁴).
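As a cross-check of Equation 3.85, here is a minimal Python sketch of one classical RK-4 step, applied to the hypothetical test problem y′ = −y, y(0) = 1 (this test problem is an illustration, not from the text):

```python
import math

def rk4_step(f, t, y, h):
    # One step of the classical RK-4 method, Equation 3.85
    k1 = f(t, y)
    k2 = f(t + h/2, y + h*k1/2)
    k3 = f(t + h/2, y + h*k2/2)
    k4 = f(t + h, y + h*k3)
    return y + h/6 * (k1 + 2*k2 + 2*k3 + k4)

# Integrate y' = -y from t = 0 to t = 1 with h = 0.1 and compare to exp(-1)
t, y, h = 0.0, 1.0, 0.1
for _ in range(10):
    y = rk4_step(lambda t, y: -y, t, y, h)
    t += h
err = abs(y - math.exp(-1.0))
print(err)   # global error ~ O(h^4); on the order of 1e-7 here
```

With only ten steps the solution already agrees with the exact value to several significant figures, which is why RK-4 became the workhorse explicit method.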
y_{i+1} = y_i + (h/8)(k_1 + 3k_2 + 3k_3 + k_4) + O(h⁵)   (3.86)

  0   |
  1/3 | 1/3
  2/3 | −1/3  1
  1   | 1     −1    1
  ----+----------------------
      | 1/8   3/8   3/8   1/8
Since the previous method is called the classical RK-4 method, this RK-4 method is often
called the 3/8th rule RK-4 method.
The classical RK-4 method was described originally in the seminal work of Carl Runge
and Martin Kutta. The method is rather elegant and easy to implement. When numerical
computing started becoming popular in the first half of the twentieth century, RK-4
became an ODE-IVP method of choice. An additional term did not give any advantage,
because the best accuracy from a five-stage RK method still remains O(h⁴). Later, when
computational speeds improved, the O(h⁴) method with adaptive step-sizing provided suf-
ficiently high accuracy that RK-4 remained the most popular explicit ODE-IVP technique.
The fourth-order method is used for error control. The extended Butcher tableau for this
method is given by

  0    |
  1/5  | 1/5
  3/10 | 3/40        9/40
  4/5  | 44/45       −56/15       32/9
  8/9  | 19372/6561  −25360/2187  64448/6561    −212/729
  1    | 9017/3168   −355/33      46732/5247    49/176       −5103/18656
  1    | 35/384      0            500/1113      125/192      −2187/6784     11/84
  -----+-------------------------------------------------------------------------------------
       | 35/384      0            500/1113      125/192      −2187/6784     11/84     0
       | 5179/57600  0            7571/16695    393/640      −92097/339200  187/2100  1/40
The first set of weights gives the fifth-order method. Thus, the next value of y_{i+1} is given by

y_{i+1} = y_i + h( (35/384)k_1 + (500/1113)k_3 + (125/192)k_4 − (2187/6784)k_5 + (11/84)k_6 )   (3.87)
Since this is an embedded RK method, the same function evaluations are used in the calcula-
tion of the fourth-order solution. Additionally, since k₇ is not used in computing y_{i+1} as per
Equation 3.87, it is computed as follows:

k_7 = f(t_{i+1}, z^{(7)}_{i+1})   (3.88)

The difference

Δ = y_{i+1} − y^{RK4}_{i+1}   (3.90)

provides the error estimate.
Additionally, the RK-DP method is a "first same as last" (FSAL) method. Notice that the
coefficients for computing k₇ on the last row of q weights are the same as the coefficients
used in Equation 3.87. In other words, since the coefficients used in Equation 3.87 to com-
pute y_{i+1} are the same as the coefficients used for computing k₇ as per Equation 3.88, the
function value k₇ computed in the current step is the same as the first function evaluation
k₁ of the next step. One function evaluation is thus avoided due to the FSAL property:
because k₁ is nothing but k₇ of the previous step, only six function evaluations (to compute
k₂, … , k₇) are required for each step of the RK-DP method.
The case study in Section 3.5 will extensively use the ode45 solver for solving ODE-IVPs.
Before that, we end this section with the following brief example that shows the application
of ode45. All the ODE solvers in MATLAB follow the same format. They require a func-
tion that takes a scalar-valued t and a vector y as input arguments and returns f(t, y) as
the output. The output f should be of the same dimension as the vector y. The syntax for
ode45 is

[tSol, ySol] = ode45(odeFun, tSpan, y0);

where odeFun is a handle to the derivative function, tSol is the time vector at which the
solutions are obtained, ySol are the solutions y(t), tSpan gives the time span for which
the solution is desired, and y0 is the initial condition at time tSpan(1). The matrix ySol
has as many rows as the number of time-points in tSol and as many columns as the
dimension of the vector y.
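For comparison, SciPy's solve_ivp follows the same calling pattern as ode45, and its default RK45 method is also a Dormand-Prince 4(5) pair. A sketch with an illustrative right-hand side (the ODE itself is an assumption, not from the text):

```python
import math
from scipy.integrate import solve_ivp

def dydt(t, y):
    # f(t, y) must return an array of the same dimension as y,
    # just like the function handle passed to ode45
    return [-0.5 * yi for yi in y]   # illustrative ODE y' = -0.5*y (assumption)

tSpan = (0.0, 5.0)
y0 = [1.0]                           # initial condition at tSpan[0]
sol = solve_ivp(dydt, tSpan, y0, method="RK45", rtol=1e-6, atol=1e-9)

# sol.t plays the role of tSol; sol.y that of ySol
# (note: sol.y is transposed relative to MATLAB -- one ROW per state variable)
print(sol.t.shape, sol.y.shape)
```

The transposition of the solution array is the one interface difference worth remembering when moving between the two environments.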
The next example demonstrates the use of ode45 for a 1D problem.
Note that since the solution is desired in the range [t0, tN], this forms the vector
tSpan. Thus, the remainder of the code consists of a single line (and a comment).
This is all that is required to solve the ODE-IVP using the ode45 solver. The solver
takes a step-size of approximately 0.0125 to meet the desired tolerance.
A → Products
The inlet consists of the reactant introduced at a concentration C_A0. If the inlet volumetric
flowrate is Q, the velocity is given by u = Q/A_cs, where A_cs is the cross-sectional area of the
PFR. The molar flowrate is F_A = QC_A. The model for a PFR at steady state is given by

d(uC_A)/dz = R_A   (3.91)
Consider the case where the reaction rate is given by the Langmuir-Hinshelwood model.
If there is no change in volume due to reaction, u is constant and the overall model for the
system is written as

u dC_A/dz = − kC_A / √(1 + K_r C_A²),   C_A0 = 1 mol/m³   (3.92)

k = 2 s⁻¹,   K_r = 1 m⁶/mol²
First, this problem will be solved using ode45 solver. Thereafter, the problem will be recast
as a design problem for the PFR and solved using numerical integration. This example
will thus also be used to illustrate the relationship between integration and ODE-IVP. In
Chapter 5, extensions of the PFR will be undertaken as case studies.
X_A = (F_A0 − F_A)/F_A0 = Q(C_A0 − C_A)/(QC_A0) = (C_A0 − C_A)/C_A0
function dC=pfrModelFun(z,Ca,modelParam)
% To compute dC/dz for a PFR with a
% single Langmuir-Hinshelwood rate
% Model Parameters
k=modelParam.k;
Kr=modelParam.Kr;
u=modelParam.u0;
% Calculation of PFR model equation
rxnRate=k*Ca/sqrt(1+Kr*Ca^2);
dC=-1/u*rxnRate;
Figure 3.4 is a plot of conversion vs. length of the PFR. The conversion increases with the
length of the PFR. A conversion of 75% is attained at a PFR length of 0.32 m. Equivalently,
a PFR volume of 0.08 m³ is required for attaining 75% conversion.
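The same calculation can be reproduced with a short Python sketch of Equation 3.92; the velocity u = 0.4 m/s is an assumption (the inlet flowrate is not stated in this excerpt), chosen so that the result matches the 0.32 m quoted above:

```python
import math

# Parameters from Equation 3.92; u = 0.4 m/s is an assumed velocity
k, Kr, u, CA0 = 2.0, 1.0, 0.4, 1.0

def dCAdz(z, CA):
    # Langmuir-Hinshelwood rate divided by velocity (Equation 3.92)
    return -(k / u) * CA / math.sqrt(1.0 + Kr * CA**2)

def rk4_step(f, z, y, h):
    k1 = f(z, y); k2 = f(z + h/2, y + h*k1/2)
    k3 = f(z + h/2, y + h*k2/2); k4 = f(z + h, y + h*k3)
    return y + h/6 * (k1 + 2*k2 + 2*k3 + k4)

z, CA, h = 0.0, CA0, 0.001
while CA > 0.25 * CA0:          # march down the reactor until X_A = 0.75
    CA = rk4_step(dCAdz, z, CA, h)
    z += h
print(round(z, 2))   # ≈ 0.32 m, matching the length quoted in the text
```

A fixed-step RK-4 march is used here in place of ode45 simply to keep the sketch dependency-free.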
d((Q/A_cs)C_A)/dz ≡ dF_A/dV = − kC_A / √(1 + K_r C_A²)   (3.93)
FIGURE 3.4 Plot of conversion vs. length (m) of the PFR.
F_A = F_A0(1 − X_A),   C_A = C_A0(1 − X_A)

F_A0 (−dX_A/dV) = − kC_A0(1 − X_A) / √(1 + K_r C_A0²(1 − X_A)²)

V = ∫₀^{X_pfr} (F_A0/k′) ( √(1 + K_r′(1 − X_A)²) / (1 − X_A) ) dX,   k′ = kC_A0,   K_r′ = K_r C_A0²   (3.94)
This integration can be performed using the trapezoidal rule (MATLAB command trapz), as
described in Appendix E.
%% Design Parameters
kPrime=modelParam.k*CA0;
KrPrime=modelParam.Kr*CA0^2;
FA0=(u0*Acs)*CA0;
%% Design of PFR
Xpfr=0.75;
XALL=[0:0.01:Xpfr];
RALL=kPrime*(1-XALL)./sqrt(1+KrPrime*(1-XALL).^2);
fval=FA0./RALL;
% PFR Volume
V=trapz(XALL,fval);
V =
    0.0798
Numerical integration requires splitting the integration domain into several intervals.
In this example, I chose the interval width as h = 0.01. The function value fval can be
calculated as F_A0 · (1/R(X_A)). This can be conveniently done using the powerful vector opera-
tions of MATLAB (see Appendix A for a primer). The function trapz uses the trapezoidal
rule (also see Appendix E) to compute the PFR volume. The result of 0.0798 m³ is close to
the approximate value of 0.08 m³ obtained in the previous section.
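The design integral of Equation 3.94 can be checked with a few lines of Python; F_A0 = 0.1 mol/s is an assumption (e.g., Q = 0.1 m³/s with C_A0 = 1 mol/m³, which is not stated in this excerpt), chosen to be consistent with the quoted volume:

```python
import numpy as np

# Design integral of Equation 3.94; FA0 = 0.1 mol/s is an assumed inlet molar flow
k, Kr, CA0, FA0 = 2.0, 1.0, 1.0, 0.1
kPrime, KrPrime = k * CA0, Kr * CA0**2

X = np.arange(0.0, 0.75 + 1e-12, 0.01)                    # conversion grid, h = 0.01
R = kPrime * (1 - X) / np.sqrt(1 + KrPrime * (1 - X)**2)  # rate in conversion form
fval = FA0 / R
# Trapezoidal rule (what MATLAB's trapz does): interval-average times width
V = float(np.sum((fval[1:] + fval[:-1]) / 2 * np.diff(X)))
print(round(V, 4))   # ≈ 0.0798
```

The trapezoid sum is written out explicitly here so the sketch does not depend on a particular NumPy version's `trapz`/`trapezoid` naming.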
FIGURE 3.5 Plot of fval vs. conversion required for design of a PFR volume.
dC_A/dt = (F/V)(C_A0 − C_A) − r_A

Vρc_p dT/dt = Fρc_p(T_0 − T) + (−ΔH)r_A V − UA(T − T_j)   (3.95)

V_j ρ_j c_j dT_j/dt = F_j ρ_j c_j(T_j0 − T_j) + UA(T − T_j)

with the reaction rate given by r_A = k_0 exp(−E/RT) C_A. The values of the various parameters
are given in Table 3.1.
This problem is solved in the following example.
TABLE 3.1 Model Parameters and Operating Conditions for Jacketed CSTR

  CSTR Inlet             Cooling Jacket          Other Parameters
  F  = 25 L/s            Fj  = 5 L/s             CA0 = 4 mol/L
  V  = 250 L             Vj  = 40 L              k0 = 800 s−1
  ρ  = 1000 kg/m3        ρj  = 800 kg/m3         E/R = 4500 K
  cp = 2500 J/kg·K       cj  = 5000 J/kg·K       (−ΔH) = 250 kJ/mol
  T0 = 350 K             Tj0 = 25°C              UA = 20 kW/K
y = [C_A  T  T_j]ᵀ
The model equations are written in the standard form y′ = f(t, y), where both y and
dy are column vectors* of size 3 × 1. All the parameters need to be converted to SI
units.
The core part of the code consists of the following model description:
dy(1,1)=F/V*(C0-Ca)-rxnRate;
dy(2,1)=F/V*(T0-T)+DH*rxnRate/(rhoCp)-hXfer/(V*rhoCp);
dy(3,1)=Fj/Vj*(Tj0-Tj)+hXfer/(VRhoC_j);
Prior to this, the following terms need to be defined: rxnRate (rate of reaction),
hXfer (heat transfer term UA(T − Tj)), rhoCp (ρcp), and VRhoC_j (Vjρjcj). This
example continues after a brief interlude.
First, I will describe a “traditional” way of writing the MATLAB function that
is passed to the ode45 solver. In earlier versions of MATLAB, if one needed to
pass a large number of parameters (such as the parameters in Table 3.1), doing so
through function arguments was inconvenient and error-prone. Hence, the model
parameters required in a specific function were defined within the same func-
tion, whereas the parameters required in multiple functions were shared using
global variables. The function, cstrFunOld.m exemplifies this traditional
method.
function dy = cstrFunOld(t,y)
global C0 T0 Tj0
%% Parameters
F=25/1000; % m^3/s
V=0.25; % m^3
<other parameters also defined here>
* Keeping with the convention followed throughout this book, unless otherwise stated, all vectors are defined as n × 1 column
vectors.
%% Key variables
Ca=y(1); T=y(2); Tj=y(3);
rxnRate=k0*exp(-E/T)*Ca;
hXfer=UA*(T-Tj);
rhoCp=rho*Cp;
VRhoC_j=Vj*rhoj*cj;
%% Model equations
dy(1,1)=F/V*(C0-Ca)-rxnRate;
dy(2,1)=F/V*(T0-T)+DH*rxnRate/(rhoCp)-hXfer/(V*rhoCp);
dy(3,1)=Fj/Vj*(Tj0-Tj)+hXfer/(VRhoC_j);
The above code maintains a separation between parameter definition and model execution:
the main part of the code starts at the second section (demarcated by %% Key
variables), prior to which all the parameters are defined. The inlet conditions are
shared with the driver function using global variables. However, some years ago (perhaps
around 2010), anonymous functions and structure arrays were introduced in MATLAB.
My experience working with cross-functional teams in the industry has led me to embrace
these as powerful tools for writing better MATLAB programs. I personally evangelize this
"modern" method of programming, as shown below.
rxnRate=par.k0 * exp(-par.E/T)*Ca;
Here, par is a structure that is passed to the function, cstrFun, as an argument. The
fields of the variable, par.k0 and par.E represent the preexponential factor and activa-
tion energy of the reaction, respectively. Similarly, the next line is rewritten as
hXfer=par.UA*(T-Tj);
%% Key Variables
rxnRate=par.k0 * exp(-par.E/T)*Ca;
hXfer=par.UA*(T-Tj);
rhoCp=par.rho*par.cp;
VRhoC_j=par.Vj*par.rhoj*par.cj;
tau=par.V/par.F; % Residence time
tauJ=par.Vj/par.Fj;
%% Model Equations
dy(1,1)=(C0-Ca)/tau-rxnRate;
dy(2,1)=(T0-T)/tau+par.DH*rxnRate/rhoCp-hXfer/(par.V*rhoCp);
dy(3,1)=(Tj0-Tj)/tauJ+hXfer/(VRhoC_j);
Comparing with the earlier code (cstrFunOld), the parameters are not defined
within the function. Instead, they are defined in the driver script (cstrDriver.m)
and passed to the function through the par structure.
FIGURE 3.7 Transient simulation of the jacketed CSTR: C_A (mol/m³), and T and T_j (K), vs. time (s).
The transient simulation results are shown in Figure 3.7. This example illustrates the
simulation of a jacketed CSTR. At the end of 50 s, the three modeled variables have
the values C_A = 461.0 mol/m³, T = 647.6 K, and T_j = 472.6 K.
FIGURE 3.8 Transient response after the step change: C_A (mol/m³), and T and T_j (K), vs. time (s).
Figure 3.8 shows the time evolution of the modeled variables for 150 s after the step-
down in the inlet temperature. The outlet concentration increases and the tempera-
tures drop rapidly after the step change. The outlet concentration is now close to the
inlet concentration of 4 mol/L, and almost no reaction is taking place in the system.
At the end of 200 s, the values of the corresponding variables are C_A = 3991 mol/m³,
T = 298.8 K, and T_j = 298.4 K.
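The right-hand side of Equation 3.95 can be sketched in Python as well; the parameters below are Table 3.1 converted to SI units, while the evaluation point is merely illustrative (the example's initial state is not given in this excerpt):

```python
import math

# Right-hand side of Equation 3.95 in SI units, parameters from Table 3.1
par = dict(F=0.025, V=0.25,          # m^3/s, m^3
           Fj=0.005, Vj=0.040,       # m^3/s, m^3
           CA0=4000.0,               # mol/m^3 (= 4 mol/L)
           k0=800.0, ER=4500.0,      # 1/s, K
           rho=1000.0, cp=2500.0,    # kg/m^3, J/kg-K
           rhoj=800.0, cj=5000.0,
           dH=250e3,                 # (-dH), J/mol
           UA=20e3,                  # W/K
           T0=350.0, Tj0=298.15)     # K (25 C)

def cstr_rhs(t, y, par):
    Ca, T, Tj = y
    rate = par["k0"] * math.exp(-par["ER"] / T) * Ca
    hXfer = par["UA"] * (T - Tj)
    rhoCp = par["rho"] * par["cp"]
    VRhoCj = par["Vj"] * par["rhoj"] * par["cj"]
    dCa = par["F"] / par["V"] * (par["CA0"] - Ca) - rate
    dT = (par["F"] / par["V"] * (par["T0"] - T)
          + par["dH"] * rate / rhoCp - hXfer / (par["V"] * rhoCp))
    dTj = par["Fj"] / par["Vj"] * (par["Tj0"] - Tj) + hXfer / VRhoCj
    return [dCa, dT, dTj]

# Evaluate at the inlet conditions (illustrative state, not the book's IC)
print(cstr_rhs(0.0, [4000.0, 350.0, 298.15], par))
```

Passing the parameter dictionary as an argument mirrors the "modern" par-structure style advocated above.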
This example shows steady state multiplicity. When the inlet temperature was T0 = 350 K,
the system was in a steady state with high conversion. When the inlet temperature was
reduced to T0 = 298 K, the system transitioned to another steady state with low conver-
sion. This issue of steady state multiplicity will be discussed in more detail in Section II of
this book.
A_1 dh_1/dt = F_0 − F_1t − F_1b

A_2 dh_2/dt = F_1t + F_1b − F_2   (3.96)

dT_1/dt = (F_0/(A_1h_1))(T_0 − T_1) + Q_1/(A_1h_1ρc_p)

dT_2/dt = ((F_1t + F_1b)/(A_2h_2))(T_1 − T_2) + Q_2/(A_2h_2ρc_p)
The first two equations are material balances on the two tanks. The last two equations come
from energy balances. These equations are obtained after rearranging the original energy bal-
ance equation, where the term on the left-hand side is (A_i h_i ρc_p T_i)′ = ρc_p(A_i h_i T_i′ + T_i A_i h_i′).
The last term is simplified using the appropriate material balance to obtain the equations
given above.
The flowrates in the above equations are given by

F_1b = c_1 √(h_1 − h_2),   F_2 = c_2 √h_2   (3.97)

and

F_1t = 0                  for h_1 ≤ H
F_1t = c_1 √(h_1 − H)     for h_1 > H, h_2 ≤ H     (3.98)
F_1t = c_1 √(h_1 − h_2)   for h_2 > H
The condition for the calculation of F_1t can be handled in a couple of ways. One way is to use
if-then statements. Another, more elegant method is to note that the first expression in
Equation 3.98 (i.e., when h_1 ≤ H) can be written as F_1t = c_1 √(H − H):

F_1t = c_1 √(H − H)       for h_1 ≤ H
F_1t = c_1 √(h_1 − H)     for h_1 > H, h_2 ≤ H
F_1t = c_1 √(h_1 − h_2)   for h_2 > H

When h_2 > H, the second value inside the square root is h_2; else it is H. Likewise, the first
value inside the square root is H when h_1 ≤ H and equal to h_1 otherwise. Thus, the above
equation may be written as

F_1t = c_1 √( max(H, h_1) − max(H, h_2) )
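The equivalence between the if-then form and the max() form is easy to verify in a few lines of Python (c₁ = 1 and H = 0.5 are illustrative values, not from the text):

```python
import math

c1, H = 1.0, 0.5   # illustrative values (assumptions)

def F1t_piecewise(h1, h2):
    # Equation 3.98 written out with if-then branches
    if h1 <= H:
        return 0.0
    if h2 <= H:
        return c1 * math.sqrt(h1 - H)
    return c1 * math.sqrt(h1 - h2)

def F1t_max(h1, h2):
    # Branch-free form: del = max(H, h1) - max(H, h2), F1t = c1*sqrt(del)
    return c1 * math.sqrt(max(H, h1) - max(H, h2))

# One test case per branch of Equation 3.98
for h1, h2 in [(0.3, 0.2), (0.7, 0.3), (0.8, 0.6)]:
    assert abs(F1t_piecewise(h1, h2) - F1t_max(h1, h2)) < 1e-12
print("branch-free form agrees with the piecewise definition")
```

Beyond elegance, the branch-free form avoids the discontinuous switching logic inside the ODE right-hand side, which adaptive solvers handle more gracefully.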
The forthcoming example illustrates the use of this conditional statement to evaluate the
flowrate and hence solve the model of the two-tank heater system. Unlike the previous
examples that used SI units, I will use minutes as the unit for time and kJ for energy, for
convenience.
del=max(H,h1)-max(H,h2);
F1t=par.c1*sqrt(del);
The first line defines del as the term within the square root, which is then used in
computing F1t in the next line. The MATLAB function heated2TankFun.m is
function dy=heated2TankFun(t,y,par)
% Model for hybrid two-tank system with heater
Ordinary Differential Equations ◾ 119
The results are shown in Figure 3.10. Since the two liquid heights are fairly close to each
other initially, the net flow out of the second tank is greater than the flow in. Hence, the level
first falls, before increasing again.
FIGURE 3.10 Transient results in terms of height and temperature of water in the two tanks.
The next figure (Figure 3.11) shows the behavior of the system at higher flowrate of
0.48 m3/min. At this higher flowrate, water levels in both the tanks increase above the inter-
mediate pipe. Note that at steady state, F0 = (F1b + F1t) = F2 so that the volume of water in the
two tanks does not change.
r_g = ( μ_max [S] / (K_S + [S]) ) [X]   (3.100)
FIGURE 3.11 Transient results for a higher inlet flowrate of 480 L/min.
where [S] and [X] are substrate and biomass concentrations, respectively. The rate con-
stant μmax is the maximum growth rate and KS is the saturation constant. At large substrate
concentration, when [S] ≫ KS, the growth rate is zero-order in substrate concentration;
whereas when [S] falls substantially, the growth rate becomes first-order in substrate
concentration.
The model equations are given by the following ODEs:

dS/dt = D(S_f − S) − r_g

dX/dt = −DX + Y_xs r_g   (3.101)

dP/dt = −DP + Y_ps r_g

The kinetic parameters are given as μ_max = 0.5, K_S = 0.25, Y_xs = 0.75, and Y_ps = 0.65.
The next example demonstrates the simulation of a chemostat.
The second-last line uses ode45 to calculate the solutions tS and YS. The function,
funMonod, passed to the ODE solver is given below:
function dy=funMonod(t,y,monodParam)
mu=monodParam.mu; K=monodParam.K;
Yxs=monodParam.Yxs; Yps=monodParam.Yps;
D=monodParam.D; Sf=monodParam.Sf;
%% Variables to solve for
S=y(1); X=y(2); P=y(3);
%% Model equations
rg= mu*S/(K+S)*X;
dy(1,1)=D*(Sf-S) - rg;
dy(2,1)=-D*X + Yxs*rg;
dy(3,1)=-D*P + Yps*rg;
end
The solver takes more than a hundred steps to reach the solution. On MATLAB 2016a
version, tS is a 109-length vector, whereas the size of YS is 109 × 3. Figure 3.12 shows
the transient response of the chemostat for this problem.
This example also follows the same pattern as the previous ones. It is a good programming
practice to declare all the model parameters at one location. As the code grows more com-
plex in large projects, this way of modularizing the code proves very useful. As a good
coding practice in the industry, we often went a step further: the parameters related to the
microbial growth kinetics would be located in one structure (say monodParam), which
includes μ_max, K_S, Y_xs, and Y_ps. On the other hand, the chemostat operating parameters—that
is, D and S_f—would be included in another structure (say reactorParam). While this demar-
cation would be overkill for this problem, it becomes useful in large projects where
modularity is key.
FIGURE 3.12 Transient response of the chemostat: concentrations vs. time.
For the parameter values of Example 3.15, the chemostat response is smooth. Initially, at
low biomass concentrations, the substrate concentration falls gradually. After around 10 h,
there is a significant reduction in substrate concentration as the higher biomass concentra-
tion results in faster consumption of the substrate. As substrate concentration falls, the
system behavior would transition from an initial zero-order response in [S] to eventually
first-order in [S]. Finally, the system reaches steady state.
Now consider the case when the saturation constant, KS, in Equation 3.100 is reduced to
a value of 0.005 g/mL. The fourth line in driverMonod.m is changed, and the code is run
again. For this value of saturation constant, the ode45 solver becomes unstable. The results
are shown in Figure 3.13. The initial “zero-order response” is similar to that seen earlier, in
Figure 3.12. However, when the substrate concentration drops, the reaction rate becomes
first-order in [S], resulting in a rapid decrease in reaction rate that the explicit ode45
solver is unable to capture. This instability of the explicit solver is not very different from the
FIGURE 3.13 Transient chemostat response for KS = 0.005 g/mL using ode45. The solver becomes
unstable due to the stiff nature of the ODE.
FIGURE 3.14 Transient chemostat response for KS = 0.005 g/mL using ode15s. A stable solution is
obtained as ode15s is a stiff solver.
one discussed in Section 3.1.5, where the stability of Euler's explicit and implicit methods
was compared.
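That explicit-vs-implicit stability comparison is easy to reproduce on a simple stiff test ODE (y′ = −1000y is an illustrative problem, not from the text); the explicit Euler iterate is multiplied by (1 − λh) each step, while the implicit iterate is divided by (1 + λh):

```python
# Explicit vs. implicit Euler on the stiff test ODE y' = -1000*y, y(0) = 1
lam, h, n = 1000.0, 0.01, 50

y_exp = 1.0
for _ in range(n):
    y_exp = y_exp + h * (-lam * y_exp)   # multiplies by (1 - lam*h) = -9 each step

y_imp = 1.0
for _ in range(n):
    y_imp = y_imp / (1.0 + lam * h)      # divides by (1 + lam*h) = 11 each step

print(abs(y_exp), abs(y_imp))  # explicit blows up (~9^50); implicit decays toward 0
```

The exact solution decays essentially instantly, yet the explicit iterate diverges at this step-size; this is the same mechanism behind the ode45 failure in Figure 3.13.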
STIFF SYSTEM
A stiff system refers to a system of ODEs where one or more components of the system have
a significantly different time scale of response compared to other components.
In the case of the chemostat, at low values of substrate concentration, the growth rate is
significantly lower when the reaction is first-order in [S] compared to the reaction rate
being independent of [S] (i.e., zero-order in [S]). The transition from zero- to first-order
happens when the substrate concentration is of the same order of magnitude as KS. When
the value of KS is very low, the system undergoes a very rapid reduction in the growth rate,
rg. Consequently, the rate at which substrate concentration decays falls precipitously. This
can be seen in Figure 3.14.
Figure 3.14 is obtained using ode15s, which is a stiff ODE solver provided in MATLAB.
The solver ode15s uses an implicit backward difference formula to solve the ODE. As can
be seen from this example, ode15s is the solver of choice if the system of ODEs is stiff, or
when ode45 does not work.
Comparing Figures 3.12 and 3.14, one can see that ode45 results in smoother curves
for [S], [X], and [P] than ode15s. This is because ode45 is a higher-order solver that
gives an excellent trade-off between accuracy and speed of computation. As described in
Section 3.4, it uses an embedded RK-4(5) method. As a rule of thumb, which is suggested
by MATLAB and corroborated by the experience of several expert users, ode45 should be
the first solver of choice for a generic ODE problem. If the problem is stiff and ode45
does not work, ode15s usually provides the next best option.
A further discussion on ode15s will be provided in Chapter 8 when discussing implicit
ODE solver and solution of DAEs. With this note, we end the discussion in this chapter.
3.6 EPILOGUE
The problem of solving ODE-IVP was introduced in Section 3.1. Several types of example
problems can be solved as ODE-IVP. Higher-order ODEs can be converted to a set of first-
order ODEs, or the problem to solve may consist of multiple simultaneous ODEs. For ODE-
IVP, the initial conditions are specified at a single point in the solution domain. All the
components are written together as a solution vector, y. The strategies to obtain ODE-IVP
in the standard form
dy/dt = f(t, y),   y(t_0) = y_0   (3.1)
were discussed briefly in Section 3.1.1 and case studies provided in Section 3.5.
Second-order ODE-IVP methods were discussed in Section 3.2. This chapter focused
on the explicit RK family of methods, which were discussed at length. The stability problems
with explicit RK methods were discussed, and Section 3.5.4 demonstrated the practical
implications on a simulation problem. Section 3.2.5 provided a brief peek into some other
important ODE-IVP methods. These classes of methods are available in the ODE suite of
MATLAB.
Section 3.3 extended the discussion to higher-order RK methods. Adaptive step-size
methods were then discussed. All the methods in MATLAB ODE suite are adaptive step-
size methods. These methods were referred to at various points in the chapter, most specifi-
cally in Sections 3.2.5 and 3.4.
The practical aspect of solving ODE-IVPs in MATLAB is summarized now. For many
problems of interest to chemical engineers, explicit ODE-IVP methods will prove sufficient.
The MATLAB solver ode45 is the primary go-to solver for these cases. While other
explicit solvers (such as ode23 and ode113) are available, ode45 provides very high
accuracy at good computational speed.
There are several other ODE-IVP problems that are difficult to solve. Such stiff ODEs
will be discussed in Chapter 8. If ode45 fails to converge, the stiff solver ode15s would
be the next choice. This variable-order solver is typically less accurate and is not usually
used as the first solver in MATLAB. The MATLAB help documentation is an excellent
resource on implementing an ODE solver. However, choosing a suitable ODE solver
requires some understanding of how ODE-IVPs are solved, as discussed in this book.
If your problem requires a solver other than ode45 or ode15s, reading Chapter 8
after this chapter will provide a better background to make an informed choice.
EXERCISES
Problem 3.1 Solve Example 3.1 using RK-2 midpoint method from Equation 3.53 and
RK-2 Ralston’s method from Equation 3.54.
Problem 3.2 Use Richardson’s extrapolation for numerical integration.
Partial Differential Equations in Time

4.1 GENERAL SETUP
Chapter 3 covered systems defined by ordinary differential equations, where the dependent
variable varied either in time or in a single spatial dimension. There are several examples
where the dependent variables vary in time and space, or in multiple spatial dimensions.
Such models are represented by partial differential equations (PDEs). A general first-order
PDE in two independent variables is given by

F(ϕ, ∂ϕ/∂x_1, ∂ϕ/∂x_2, x_1, x_2) = 0   (4.1)
where
ϕ is the dependent variable
x1 and x2 are the two independent variables (e.g., x1 could be time t, and x2 could be axial
coordinate z; or x1 and x2 could be axial and radial coordinates z and r)
If the above function is nonlinear in the dependent variable or its derivatives, then the
overall PDE is nonlinear.
A generic second-order PDE contains one or more of the second-order terms ∂²ϕ/∂x_1²,
∂²ϕ/∂x_2², and/or ∂²ϕ/∂x_1∂x_2 as well.
We usually encounter either first-order or second-order PDEs in (chemical) engineer-
ing. Higher-order PDEs are not common. These PDEs are further of a specific type called
quasilinear PDEs. These may be written in general as

A ∂²ϕ/∂x_1² + B ∂²ϕ/∂x_1∂x_2 + C ∂²ϕ/∂x_2² + D ∂ϕ/∂x_1 + E ∂ϕ/∂x_2 + F = 0   (4.2)
The coefficients A to F may themselves be functions of x_1, x_2, and/or ϕ. It is clear
from the description above that the PDE (4.2) is a second-order PDE if one or more of A, B,
and C are nonzero. If A, B, and C are all zero, it is a first-order PDE. Thus, the order of a PDE
is the highest order of differentiation of ϕ in the overall equation.
The above PDE needs initial or boundary conditions. The number of boundary
conditions required in a particular dimension depends on the order of the PDE in that
dimension. For example, if A is nonzero, two boundary conditions are required in x_1; if A is
zero, then a single initial condition is sufficient in x_1.
4.1.1 Classification of PDEs
Homogeneous PDE: The PDE is said to be homogeneous if F = 0.
Linear vs. Nonlinear PDE: The above PDE is linear if A to E are all functions of indepen-
dent variables only, whereas F is a linear function of ϕ (i.e., F = F0 + F1ϕ). The coefficients A
to F1 should not depend on ϕ. For example, the PDE
∂ϕ/∂t + t² ∂ϕ/∂x = sin(t)   (4.3)
is linear, because the coefficient E = t2 and the source term F = sin(t) are not a function of the
dependent variable, ϕ. On the other hand
∂ϕ/∂t + ϕ ∂ϕ/∂x = sin(t)   (4.4)
is nonlinear, because the coefficient E = ϕ depends on the dependent variable, ϕ. Additionally,
the following PDE
∂ϕ/∂t + (∂ϕ/∂x)² = G(t, x)   (4.5)
is also nonlinear. It should be noted that Equation 4.5 is a first-order PDE since the highest
order of differentiation with respect to t and x is first order. While Equations 4.3 and 4.4 are
quasilinear PDEs, Equation 4.5 is not.
ϕ_tt = v²ϕ_xx
to model vibrating strings.* A decade later, Euler generalized this to the 2D and 3D wave
equations. The Laplace equation

∇²ϕ = ϕ_xx + ϕ_yy = 0

was introduced to model Newton's gravitational potential. Siméon Poisson modified the
Laplace equation to model Gauss's law for gravity in differential form. This equation is now
well known as the Poisson equation and is written in a general form as ∇²ϕ = ρ. Finally, in
1822, Fourier introduced the transient heat equation:

ϕ_t = kϕ_xx
D = B² − 4AC   (4.6)

The PDE is said to be elliptic if D < 0, hyperbolic if D > 0, and parabolic if D = 0. More impor-
tant than the definition is the physical significance of these terms. The names are motivated
by the equivalence with the corresponding conic sections: ellipse, hyperbola, and parabola.
The implication of these characteristics for solving the PDEs is discussed here.
4.1.3.1 Elliptic PDE
Elliptic PDEs describe systems that have reached a steady state and where diffusion is a
dominant phenomenon (or at least where diffusion significantly affects the independent
variable). Just as ellipse is a smooth curve, elliptic PDEs describe dependent variables
that vary smoothly over space. If the conditions at the boundaries undergo an abrupt step
change, the diffusive terms will tend to smoothen out its effect in the domain. For example,
the following elliptic PDE describes temperature distribution at steady state with heat con-
duction as the dominant phenomenon:
0 = kÑ2T + S (T ) (4.7)
* Here I have used the shorthand notation ϕtt to represent ∂²ϕ/∂t², and so on.
130 ◾ Computational Techniques for Process Simulation and Analysis Using MATLAB®
In two dimensions, Equation 4.7 becomes
0 = k(∂²T/∂x² + ∂²T/∂y²) + S(T)    (4.8)
In the above equation, B = 0, A = C = k; thus D < 0 and the PDE is classified as elliptic. Thus,
an elliptic PDE in two dimensions has the diffusive term dominant in both the dimensions.
Since the differential equations are second order, two boundary conditions are needed in
both X- and Y-coordinates. For example, if Equation 4.8 governs temperature in a rectan-
gular plate, boundary conditions are specified for each of the four edges of the plate. The key
feature of elliptic PDEs is that the temperature at any point in the domain will be governed
by the conditions at all the four boundaries of the plate.
Elliptic PDEs will not be covered in this chapter. The reader may refer to Chapter 7 for
discussion on solving elliptic PDEs.
4.1.3.2 Hyperbolic PDE
Hyperbolic PDEs are used to describe propagation of waves or convection of material or
energy in transport problem. In conics, a hyperbola is an open curve with two disconnected
branches. By analogy, hyperbolic PDEs can be used to model those systems that are char-
acterized by propagation of an abrupt or step change at boundary along the domain; they
also model systems where this propagation or convection results in discontinuities, such as
in shock waves.
The 1D wave equation introduced earlier is an example of hyperbolic PDE widely stud-
ied in introductory engineering math courses:
ϕtt = v²ϕxx    (4.9)
Writing this equation in standard form shows D > 0, classifying it as a hyperbolic PDE.
Let us introduce the following coordinate transformation:
ξ = x + vt,  η = x − vt    (4.10)
With this transformation, one can verify that Equation 4.9 reduces to
∂²ϕ/∂ξ∂η = 0    (4.11)
The general solution of Equation 4.11 is
ϕ = f1(ξ) + f2(η)    (4.12)
which, in the original coordinates, is
ϕ(x, t) = f1(x + vt) + f2(x − vt)    (4.13)
The curves x + vt = c1 and x − vt = c2 are called characteristic curves. What does this physi-
cally mean? Consider a curve x − vt = 1. The second term in Equation 4.13 becomes f2(1)
and thus is a constant. Thus, the value of f2 is constant along the second characteristic.
Likewise, the value of f1(⋅) is constant along the first characteristic.
The key behavior of a hyperbolic PDE relevant to numerical solution techniques is that
the information at the boundary is propagated along the two characteristic curves.
A similar qualitative behavior is observed in some types of first-order PDEs, as discussed next.
Let us define
θ = ∂ϕ/∂t,  ψ = v ∂ϕ/∂x    (4.14)
Differentiating ψ with respect to time gives ψt = v ϕxt = v θx, that is
∂ψ/∂t = v ∂θ/∂x    (4.15)
Likewise, the partial derivative θt = ϕtt, which using Equation 4.9 can be written as
∂θ/∂t = v ∂ψ/∂x    (4.16)
Thus, the wave equation (4.9) can be converted into the following matrix form:
∂/∂t [θ  ψ]ᵀ − [0 v; v 0] ∂/∂x [θ  ψ]ᵀ = 0    (4.17)
This equation is in the form Φt − AΦx = 0. Using the concepts of eigenvalues discussed in
Chapter 2, it is easy to diagonalize the above equation. Noting that the eigenvalues of the
above matrix are ±v and the eigenvectors are v1 = [1/√2  1/√2]ᵀ and v2 = [−1/√2  1/√2]ᵀ, we can change the coordinate axes to (v1, v2). Indeed, it is easy to do this coordinate transformation. Recall from Chapter 2 that the physical interpretation of transforming to (v1, v2) implies that we define the following:
ϑ = (θ + ψ)/√2,  ν = (θ − ψ)/√2    (4.18)
It is easy to verify that Equations 4.15 and 4.16 get converted to the following decoupled set
of first-order PDEs:
∂ϑ/∂t − v ∂ϑ/∂x = 0    (4.19)
∂ν/∂t + v ∂ν/∂x = 0    (4.20)
The second-order PDE is converted into a set of two first-order PDEs. One would recognize that the first term is the time derivative term, whereas the second is a convection term. Thus, balance equations involving convection form a set of hyperbolic PDEs.
The curves (x + vt) and (x − vt) form the characteristic curves for PDEs (4.19) and (4.20), respectively. The key numerical feature of hyperbolic PDEs is that the convection or propagation along the characteristics influences the solution at any point in the domain.
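This statement can be checked numerically: the exact solution of Equation 4.20 is ν(x, t) = ν0(x − vt), so the value of ν never changes along a characteristic x − vt = const. A short Python sketch (illustrative, with an assumed Gaussian initial profile ν0 and an assumed speed v):

```python
import math

v = 2.0  # propagation speed (assumed value for the demonstration)

def nu0(x):
    """Assumed initial profile."""
    return math.exp(-x * x)

def nu(x, t):
    """Exact solution of Equation 4.20: the initial profile simply convects."""
    return nu0(x - v * t)

# Follow the characteristic x = x0 + v*t: the solution value stays fixed.
x0 = 0.3
for t in [0.0, 0.5, 1.0, 2.0]:
    assert abs(nu(x0 + v * t, t) - nu0(x0)) < 1e-12
print("nu is constant along the characteristic through x0 =", x0)
```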
4.1.3.4 Parabolic PDE
Parabolic PDEs are used to solve initial value problems (IVPs) where the system is diffusion
dominated in one spatial dimension. Parabola is a smooth open curve, which represents
transition between an ellipse (closed smooth curve) and hyperbola (disconnected open
curves). In the same manner, a parabolic PDE is used to describe systems that propagate in
one direction, while the dependent variable has a smooth behavior due to a diffusive com-
ponent along the other dimension. A transient heat equation
∂T/∂t = α ∂²T/∂z² + S(T)    (4.21)
is a prototypical example of parabolic PDE. This is an IVP in one dimension (t) and a
boundary value problem in the other dimension (z).
The key numerical feature of parabolic PDEs is the propagation of the solution along one
direction, whereas the solution at any point in the domain is influenced by both boundaries
in the other (z) direction.
Dirichlet condition: The value of the dependent variable itself is specified at one boundary:
ϕ(t, z = z0) = ϕ0    (4.22)
For example, temperature at one end or concentration entering a reactor may be specified.
Neumann condition: Here, the derivative of the dependent variable is specified at one boundary:
∂ϕ/∂z |t,z1 = β    (4.23)
For example, heat flux is zero at insulated boundary or the mass flux may be specified.
Mixed condition: Here, the boundary condition includes both the value of the variable
and its derivative at a boundary:
g(ϕ(t, z1), ∂ϕ/∂z |t,z1) = 0    (4.24)
Danckwerts boundary condition is an example of mixed boundary condition. Another
example of mixed boundary condition is when the heat flux at boundary equals the rate
of heat exchange with the surroundings by convection. The mixed boundary condition
often reduces to the form
k ∂ϕ/∂z |t,z1 = g(ϕ(t, z1))    (4.25)
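In a discretized problem, a condition of the form of Equation 4.25 is typically imposed by writing the derivative with a one-sided difference and solving for the boundary value. A minimal Python sketch for the convective heat-exchange case, k(ϕn − ϕn−1)/h = γ(ϕ∞ − ϕn), where the symbols γ and ϕ∞ (and all numerical values) are illustrative assumptions:

```python
def boundary_value(phi_nm1, k, gamma, phi_inf, h):
    """Impose the mixed (Robin) condition k*(phi_n - phi_nm1)/h = gamma*(phi_inf - phi_n)
    with a one-sided difference, solved explicitly for the boundary value phi_n."""
    return (k * phi_nm1 / h + gamma * phi_inf) / (k / h + gamma)

# gamma -> 0 recovers the zero-flux Neumann condition: phi_n = phi_nm1
print(round(boundary_value(350.0, 1.0, 0.0, 300.0, 0.01), 6))   # 350.0
# Very large gamma pins the boundary to the surroundings: phi_n -> phi_inf
print(round(boundary_value(350.0, 1.0, 1e9, 300.0, 0.01), 3))   # 300.0
```

The two limiting cases confirm that the single formula interpolates between a pure Neumann and a pure Dirichlet condition.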
4.2.1 Finite Difference
The first class of methods for solving PDEs is finite difference techniques. The domain is
divided into multiple intervals. The differentiation formulae (see Appendix B) are used to
approximate numerical derivatives. Thus, the spatial derivative at any location, i, is written
using the backward difference formula as
u ∂C/∂z|i = ui (Ci − Ci−1)/h
or using the central difference formula as
u ∂C/∂z|i = ui (Ci+1 − Ci−1)/(2h)
The solutions are obtained at the individual discrete points in the domain. In a transient
PFR, the concentration will also vary with time. The time derivative may be written as
∂C/∂t = (Cp,i − Cp−1,i)/Δt
using a backward difference, or as
∂C/∂t = (Cp+1,i − Cp,i)/Δt
using a forward difference.
4.2.2 Method of Lines
Method of lines (MoL) is similar to the method described above, except that finite difference is
used only for the spatial domain. The finite difference in space
u ∂C/∂z|i = ui (Ci − Ci−1)/h,  or  u ∂C/∂z|i = ui (Ci+1 − Ci−1)/(2h)
converts the original PDE into a set of ODEs. The solution vector then consists of the prob-
lem variable at discrete points in the spatial domain:
ϕ = [C1 C2 … Cn+1]ᵀ
and the ODEs are written in the standard form ϕ′ = f(ϕ). These may then be solved using
ODE solution techniques mentioned in Chapter 3.
The finite difference and MoL techniques will be discussed in detail in this book. This chap-
ter will use MoL to solve parabolic and hyperbolic PDEs by converting them into a set of
ODEs. Hence, the title of this chapter indicates the solution of transient PDEs.
4.2.3 Finite Volume Method
The finite difference approach above is built on the differential form of the balance, where the convective term arises as the limit
∂(uC)/∂z = lim(δz→0) [(uC)out − (uC)in]/δz
The idea of FVM is to model instead the finite volume itself; the flux terms give inlet and
outlet from the boundaries of the volume and the transient and reaction source terms per-
tain to the reacting material in the volume itself.
Thus, in principle, FVM applied to a PFR may be considered as multiple CSTRs in series.
FVM are commonly used in computational fluid dynamics (CFD) software. FVM will not
be discussed further in the rest of this book.
4.2.4 Finite Element Method
In the finite element method (FEM), the solution is expanded as
C = Σ ci ψi
where
ψi denote basis functions
ci denote the coefficients
One of the common methods to choose the basis functions is that the functions ψi have
value 1 at the ith node and zero at all other nodes. The solution C is the linear combination
of the basis functions over the entire domain. Once the basis functions are chosen and dis-
cretization is fixed, solution of a PDE using FEM mainly involves finding the coefficients ci.
FEM became popular in solid mechanics. However, applications of FEM have now been
found in a wide variety of problems. FEM continue to be popular in solid mechanics and
structural problems; they are also finding growing use in multiphysics problems where vari-
ous types of physics are involved in solving a single problem.
Again, FEM are beyond the scope of this book. They are mentioned briefly only to contrast them with our method of choice: finite difference in space.
∂ϕ/∂t + ∂fϕ/∂x = S(ϕ)    (4.26)
The above equation is a first-order hyperbolic PDE when the term fϕ is the convective flux. The convective or advective component is given by
fϕ = uϕ    (4.27)
where u is the velocity. For example, in the continuity equation, which is an overall mass balance, ϕ = ρ, the average density, and S(ϕ) = 0 (except for nuclear reactions). Furthermore,
for the sake of initial discussion, we will assume that u is constant. Thus, we are initially
interested in hyperbolic PDEs of the type
∂ϕ/∂t + u ∂ϕ/∂x = S(ϕ)    (4.28)
The transient species balance for a plug flow reactor (PFR) has the same form:
∂CA/∂t + ∂(uCA)/∂x = rA(C)    (4.29)
In several cases, the velocity u is constant* (or nearly constant). In such cases, the PFR is
governed by the following PDE:
∂CA/∂t + u ∂CA/∂x = rA(C)    (4.30)
We will now discuss various numerical methods for solving hyperbolic PDE (4.28).
Using the backward difference in space,
[∂ϕ/∂x]p,i = (ϕp,i − ϕp,i−1)/Δx    (4.31)
* This is true for liquid-phase reactions and isothermal gas-phase reactions. The velocity changes in nonisothermal gas-phase
reactions, and/or in gas-phase reactions with significant change in the number of moles due to reaction.
† The methods discussed in this section are useful for understanding the numerical basis behind finite difference techniques for solving hyperbolic PDEs. A reader interested in numerical methods may continue reading this section, whereas a reader interested in process simulation may skip this section without loss of continuity.
where p is the time index such that t = pΔt and i is the spatial index such that x = iΔx. The source term is calculated at time pΔt and location iΔx and is represented as Sp,i ≡ S(pΔt, iΔx). Forward difference in time yields the following O(Δt¹) accurate approximation:
∂ϕ/∂t = (ϕp+1,i − ϕp,i)/Δt    (4.32)
Substituting Equations 4.31 and 4.32 into Equation 4.28 gives
ϕp+1,i = ϕp,i − (uΔt/Δx)[ϕp,i − ϕp,i−1] + Δt Sp,i    (4.33)
which is an explicit expression to calculate ϕp+1, i and march forward in time.
The local stability of the above equation can be analyzed using the so-called von Neumann stability analysis. It is local because it ignores the effect of boundary conditions and is derived for the homogeneous equation (i.e., without the source term). The derivation of stability conditions is relatively straightforward, but will be skipped since it is beyond the scope of this book. The backward difference formula is stable if velocity u is positive and the following Courant condition is satisfied:
lowing Courant condition is satisfied:
u Δt/Δx ≤ 1    (4.34)
When the flow is from right to left, that is, when the velocity is negative, backward differ-
ence formula is unstable and the forward difference formula should be used.
Thus, the upwind difference formula is given by
ϕp+1,i = ϕp,i − (uΔt/Δx)[ϕp,i − ϕp,i−1] + Δt Sp,i,  if u > 0
ϕp+1,i = ϕp,i − (uΔt/Δx)[ϕp,i+1 − ϕp,i] + Δt Sp,i,  if u < 0    (4.35)
Figure 4.1 shows the stencil of dependence for the upwind difference scheme. The square
represents the point (p + 1, i) in the T–X space where the solution needs to be found. The
solid line and arrow connect points used to compute the spatial and temporal derivatives,
respectively. The filled circles denote points where the values are known. The stencil is
shown for all methods discussed in this section.
Let us consider the physical interpretation of the Courant condition. When u > 0, the mate-
rial flows from left to right. The conditions at any spatial location are therefore influenced
by the node that is upstream of itself. Thus, using a backward difference formula retains this
physical condition that ϕi is influenced by the preceding node ϕi−1. The same holds for the opposite flow direction. Furthermore, the Courant condition can be written as
u ≤ Δx/Δt
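The upwind update of Equation 4.35 is easy to exercise directly. In the sketch below (Python, for illustration; the book's own examples use MATLAB), a step profile is advected with u > 0 at the limiting Courant number uΔt/Δx = 1; in that special case the update reduces to ϕp+1,i = ϕp,i−1, so the step shifts exactly one cell per time step:

```python
def upwind_step(phi, courant, inlet):
    """One forward-in-time, backward-in-space update (u > 0, no source term)."""
    new = phi[:]
    new[0] = phi[0] - courant * (phi[0] - inlet)
    for i in range(1, len(phi)):
        new[i] = phi[i] - courant * (phi[i] - phi[i - 1])
    return new

n = 20
phi = [1.0] * 5 + [0.0] * (n - 5)   # step profile
for _ in range(3):
    phi = upwind_step(phi, courant=1.0, inlet=1.0)
# At u*dt/dx = 1 the step moves exactly one cell per update: 3 cells total.
print(phi[:10])   # [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0]
```

At Courant numbers below 1 the same loop is stable but smears the step, a point revisited in Section 4.3.4.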
[Figure 4.1: four stencil panels drawn on t–z axes, including panels labeled Upwind and Crank-Nicolson.]
FIGURE 4.1 Schematic diagram of various finite difference schemes. Thin lines represent the com-
puting grid, with the thin dashed line indicating points in the future. Solid circles are known points
at current time and empty diamonds represent unknown points to be calculated. The thick solid
lines link points connected by finite difference in space; arrows link points connected by finite dif-
ference in time.
Since u is the physical speed of propagation, the numerical propagation speed unum = Δx/Δt should be greater than u to stably capture the effects of convection. Thus, the spatial differencing should maintain the direction of information propagation, and the time step Δt should be small enough that the grid does not advance information faster than it physically convects.
The forward in time, first-order upwind differencing method discussed here is not accu-
rate enough to be useful for practical examples. The discussion here, however, will serve as
a useful reminder for future analysis.
Forward in time, central in space (FTCS): The central difference formula in space is
[∂ϕ/∂x]p,i = (ϕp,i+1 − ϕp,i−1)/(2Δx)    (4.36)
which, with forward differencing in time, gives the FTCS update equation
ϕp+1,i = ϕp,i − (uΔt/(2Δx))[ϕp,i+1 − ϕp,i−1] + Δt Sp,i    (4.37)
The above is an explicit relation. Unfortunately, the FTCS method is unstable for any choice
of Δt. The proof can be arrived at using the von Neumann stability analysis. This implies that
any error in the numerical solution of ϕp,i will grow unbounded and the FTCS method may
not be used for first-order hyperbolic PDEs.
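The instability can be seen directly from the von Neumann amplification factor. Substituting a Fourier mode ϕp,i = gᵖ e^(jkiΔx) into Equation 4.37 (with S = 0) gives g = 1 − jC sin(kΔx), where C = uΔt/Δx, so |g|² = 1 + C² sin²(kΔx) ≥ 1, with equality only when sin(kΔx) = 0. A quick Python check of this factor (a sketch of the standard analysis, not code from the text):

```python
import math

def ftcs_amplification(courant, k_dx):
    """von Neumann amplification factor of FTCS applied to phi_t + u*phi_x = 0."""
    return 1 - 1j * courant * math.sin(k_dx)

# |g| exceeds 1 for every mode with sin(k*dx) != 0, at any Courant number.
for courant in [0.1, 0.5, 1.0]:
    print(courant, abs(ftcs_amplification(courant, math.pi / 4)) > 1)   # all True
```

Every Fourier mode is amplified each step, which is why no choice of Δt rescues the scheme.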
4.3.1.3 Lax-Friedrichs Scheme
Lax and Friedrichs proposed a simple way to make the FTCS method stable. They replaced
the term ϕp,i on the right-hand side of Equation 4.37 with an average of the values at its
neighboring nodes: ϕp,i → (ϕp,i +1 + ϕp,i −1)/2. Thus, the updated equation using Lax-Friedrichs
method is
ϕp+1,i = ½(ϕp,i+1 + ϕp,i−1) − (uΔt/(2Δx))[ϕp,i+1 − ϕp,i−1] + Δt Sp,i    (4.38)
The von Neumann stability analysis results in the same Courant stability condition:
u Δt/Δx ≤ 1    (4.34)
Subtracting ϕp,i from both sides, Equation 4.38 can be rearranged as
ϕp+1,i − ϕp,i = ½(ϕp,i+1 − 2ϕp,i + ϕp,i−1) − (uΔt/(2Δx))[ϕp,i+1 − ϕp,i−1] + Δt Sp,i    (4.39)
Dividing by Δt,
(ϕp+1,i − ϕp,i)/Δt = (Δx²/(2Δt)) (ϕp,i+1 − 2ϕp,i + ϕp,i−1)/Δx² − u(ϕp,i+1 − ϕp,i−1)/(2Δx) + Sp,i    (4.40)
Clearly, the first term on the right-hand side is an additional term, which is a finite difference approximation of
(Δx²/(2Δt)) ∂²ϕ/∂x²    (4.41)
Thus, the Lax-Friedrichs scheme makes the FTCS method stable by introducing a numerical diffusion term, described in Equation 4.41.
4.3.1.4 Higher-Order Methods
Accuracy in time can be improved using higher-order differencing in time.
The leapfrogging method uses the central difference formula to approximate the time derivative as well. Recall that the central difference formula is ϕ′p = (ϕp+1 − ϕp−1)/(2Δt). The update equation can be derived in a straightforward manner and is left as an exercise. The leapfrogging method
is stable for a range of Δt values given by the Courant condition. However, the problem is that
the time derivative does not include values at time instance p. Thus, the alternate mesh points
are decoupled, due to which the errors at even- and odd-numbered nodes evolve indepen-
dently of each other. Consequently, this method often does not give acceptable results.
Lax-Wendroff method, which is a combination of Lax-Friedrichs and leapfrogging methods, is another alternative that is O(Δt²) accurate in time. It uses the Lax-Friedrichs scheme
with a half step to determine values ϕp + 0.5 , i ± 0.5. These two values are then used in leapfrog-
ging method, which is again implemented with half-steps. Consequently, this is a two-step
explicit formula. While the method is good in theory, it faces several issues when handling stiff systems or systems with a highly nonlinear source term, S(ϕ).
Higher-order upwind method: While the above two methods provided higher-order dif-
ferencing in time, it sometimes also becomes necessary to use higher-order differencing in
space. For example, three-point upwind difference equation can be used to improve spatial
accuracy of upwind method. This will be considered in Section 4.3.3 for “method of lines.”
The derivation of forward-in-time three-point upwind method is also left as an exercise.
In summary, I have discussed several methods for using finite differences in time and
space. I have personally found explicit-in-time finite differences inadequate (poor accuracy
and stability). Crank-Nicolson (Section 4.3.2) and MoL (Section 4.3.3) are my preferred
numerical methods for solving practical problems involving PDEs.
4.3.2 Crank-Nicolson Method
The first-order implicit scheme evaluates the right-hand side at the new time level:
(ϕp+1,i − ϕp,i)/Δt = −u[∂ϕ/∂x]p+1,i + S(ϕp+1,i)    (4.42)
Note that both ∂ϕ/∂x and S(ϕ) are computed at time location (p + 1). A central difference
formula may be used for ∂ϕ/∂x and the first-order fully implicit formula may be derived.
The derivation is left as an exercise.
The above first-order implicit method is not very useful due to its low accuracy. Crank and
Nicolson, in the 1940s, derived a second-order implicit formula. Although they derived it for
a general parabolic PDE, it is applicable to hyperbolic PDEs as well. The time differencing is
based on the trapezoidal rule (see Appendix D for numerical integration and Chapter 8 for
ODEs). The idea is similar to the one discussed with Heun’s method in Chapter 3: A higher-
order accurate method is obtained by taking the average of function values at t = pΔt and
t = (p + 1)Δt. Specifically, the right-hand side of Equation 4.42 is replaced by an average at the
two points. This results in the following implicit formula for Crank-Nicolson method:
(ϕp+1,i − ϕp,i)/Δt = −(u/2)([∂ϕ/∂x]p,i + [∂ϕ/∂x]p+1,i) + ½(Sp,i + Sp+1,i)    (4.43)
The above is a nonlinear equation, which is written for each spatial location, 1 ≤ i ≤ n. Using a central difference for the two spatial derivatives, the equation at the ith node becomes
(ϕp+1,i − ϕp,i)/Δt + (u/2)[(ϕp,i+1 − ϕp,i−1)/(2Δx) + (ϕp+1,i+1 − ϕp+1,i−1)/(2Δx)] − ½(Sp,i + Sp+1,i) = 0    (4.44)
All the quantities at time-coordinate p are known and the quantities at p + 1 are to be computed. The above is a set of nonlinear equations. I have not yet discussed methods for solving nonlinear equations. Hence, I will defer the numerical solution of the Crank-Nicolson method until later.
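The trapezoidal-rule basis of Equation 4.43 can, however, be verified immediately on a scalar test problem (an illustrative check, not an example from the text). For dϕ/dt = −kϕ, the Crank-Nicolson update is ϕp+1 = ϕp(1 − kΔt/2)/(1 + kΔt/2), and halving Δt should reduce the error against the exact solution e^(−kt) by roughly a factor of 4:

```python
import math

def cn_decay(k, dt, t_end, phi0=1.0):
    """Integrate d(phi)/dt = -k*phi with the trapezoidal (Crank-Nicolson) rule."""
    factor = (1 - k * dt / 2) / (1 + k * dt / 2)
    phi = phi0
    for _ in range(int(round(t_end / dt))):
        phi *= factor
    return phi

k, t_end = 1.0, 1.0
exact = math.exp(-k * t_end)
e1 = abs(cn_decay(k, 0.10, t_end) - exact)
e2 = abs(cn_decay(k, 0.05, t_end) - exact)
print(e1 / e2)   # close to 4: second-order accuracy in time
```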
Define the vector of unknowns
Φp+1 = [ϕp+1,1 ϕp+1,2 … ϕp+1,n]ᵀ    (4.45)
with the convention followed in this book of using column vectors. The above set of equa-
tions (4.44) is written in standard vector form as
G(Φp+1) = 0    (4.46)
where G is a vector of functions, with Equation 4.44 representing the ith row. Various meth-
ods for nonlinear algebraic equations are discussed in Chapter 6. An example demonstrating
the use of Crank-Nicolson method is deferred to Chapter 8 in Section II of this book.
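When the source term is linear (or absent), the system (4.46) is linear in Φp+1 and can be solved directly at each step. The Python sketch below (illustrative; boundary handling is an assumption modeled on the PFR examples later in this chapter — Dirichlet inflow on the left, backward difference at the last node) assembles one Crank-Nicolson step for ϕt + uϕx = 0 as a tridiagonal system and solves it with the Thomas algorithm:

```python
def thomas(a, b, c, d):
    """Solve a tridiagonal system; a: sub-diagonal, b: diagonal, c: super-diagonal.
    Entries a[0] and c[-1] are unused."""
    n = len(d)
    b, d = b[:], d[:]
    for i in range(1, n):
        m = a[i] / b[i - 1]
        b[i] -= m * c[i - 1]
        d[i] -= m * d[i - 1]
    x = [0.0] * n
    x[-1] = d[-1] / b[-1]
    for i in range(n - 2, -1, -1):
        x[i] = (d[i] - c[i] * x[i + 1]) / b[i]
    return x

def cn_advection_step(phi, phi_in, courant):
    """One Crank-Nicolson step for phi_t + u*phi_x = 0 (zero source term).
    Central difference at interior nodes, backward difference at the last
    node, Dirichlet inflow phi_in (held constant) at the left boundary."""
    n = len(phi)
    r = courant / 4.0   # u*dt/(4*dx): central-difference coefficient
    s = courant / 2.0   # u*dt/(2*dx): one-sided coefficient at the last node
    a, b, c, d = [0.0] * n, [1.0] * n, [0.0] * n, [0.0] * n
    c[0] = r
    d[0] = phi[0] - r * (phi[1] - phi_in) + r * phi_in
    for i in range(1, n - 1):
        a[i], c[i] = -r, r
        d[i] = phi[i] - r * (phi[i + 1] - phi[i - 1])
    a[-1], b[-1] = -s, 1.0 + s
    d[-1] = (1.0 - s) * phi[-1] + s * phi[-2]
    return thomas(a, b, c, d)

# With u = 0 (zero Courant number) the step is an identity, as it must be.
phi0 = [0.2 * i for i in range(10)]
print(cn_advection_step(phi0, 0.0, 0.0) == phi0)   # True
# Unlike explicit FTCS, the implicit step stays bounded at a Courant number of 5.
state = [1.0] * 5 + [0.0] * 15
for _ in range(50):
    state = cn_advection_step(state, 1.0, 5.0)
print(max(abs(v) for v in state))
```

The last experiment illustrates the main payoff of the implicit formulation: the step size is no longer restricted by the Courant condition, only by accuracy.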
4.3.3 Method of Lines
Using a central difference in space, the PDE (4.28) is converted into an ODE at each node i:
dϕi/dt = −u (ϕi+1 − ϕi−1)/(2Δx) + S(ϕi)    (4.47)
* I use h to represent step-size in discretization. In the previous section, since discretization in both space and time was discussed, I had used Δx instead. I have used the symbols h and Δx interchangeably to represent step-size. The notation will be clear from the context in which it is used.
Here, we have retained only the index for space and dropped the time index, since the equation is now converted to an ODE in time. If we define the vector
Φ = [ϕ1 ϕ2 … ϕn]ᵀ    (4.45)
then we will get a set of n ODE-IVP in the standard form. As before, we are not limited to
central difference in space; any appropriate finite difference approximation (see Appendix B
for details on numerical differentiation) may be used as well.
In the remainder of this section, we will use the example of an isothermal PFR to com-
pare central difference, first-order upwind and second-order upwind methods.
For a first-order reaction in an isothermal PFR, the species balance is
∂CA/∂t + u ∂CA/∂z = −kCA    (4.48)
The numerical solution of the PDE needs to be benchmarked with a known solution. For
this, we use the steady state model for the PFR:
dCA/dz = −(k/u) CA    (4.49)
whose solution, with inlet concentration CA,in, is
CA = CA,in exp(−kz/u)    (4.50)
The numerical solution will involve discretizing Equation 4.48 in space, followed by solv-
ing the resulting system of ODEs using an appropriate ODE solver (such as ode45 or
ode15s). The next example shows the use of central difference formula in space for the
PFR example.
Solution: Equation 4.48 is discretized into n equal intervals. Since inlet conditions are specified, we do not need to solve for the inlet node using an ODE solver. Hence, for the sake of convenience, the inlet node is denoted as “0,” the internal nodes go from 1 to (n − 1), and the end node is denoted as the nth node. The equations for the internal nodes after discretization are
dCi/dt = −u (Ci+1 − Ci−1)/(2h) − kCi,  i = 2 to (n − 1)    (4.51)
dCi/dt = −u (Ci+1 − Cin)/(2h) − kCi,  i = 1
Since this is an IVP in the spatial domain as well, a boundary condition is not specified at the end node. The domain equation itself is used at the end node, such as by using a backward difference formula:
dCi/dt = −u (Ci − Ci−1)/h − kCi,  i = n    (4.52)
The vector of dependent variables that is to be solved for in the ODE solver is
Y = [C1 C2 … Cn]ᵀ
Function file for ODE solver: The first step is to write the function file, pfrMOLLinFun.m. As described in Chapter 3, this function takes time and Y as input arguments and returns dY/dt as the output, with variables Y and dY being n × 1 column vectors. The core of this code involves the calculation of the reaction term (r) and convection term (convec):
In the above, h=L/n. The function also requires the following variables to be defined:
k, u0, C0, L, n
I have followed a consistent and modular style of coding, wherein all the parameters
related to reactor operation are defined once in the main driver script and passed on
as function arguments. Recall that the function information is provided to the solver
ode45 using the following anonymous function definition:
@(t,Y) pfrMOLLinFun(t,Y,modelParam)
where the parameters mentioned above are provided to the function using
modelParam structure.
The complete function file pfrMOLLinFun.m is given below:
function dY=pfrMOLLinFun(t,Y,par)
% Function file for transient PFR problem
% to be solved using method of lines

%% Get the parameters
k=par.k;
C0=par.C0;
u=par.u0;
L=par.L;
n=par.n;
h=L/n;

%% Model equations
dY=zeros(n,1);
for i=1:n
    r=k*Y(i);
    if i==1
        convec=u/(2*h) * (Y(i+1)-C0);
    elseif (i==n)
        convec=u/h * (Y(i)-Y(i-1));
    else
        convec=u/(2*h) * (Y(i+1)-Y(i-1));
    end
    dY(i)=-convec-r;
end
Driver script to solve the problem: As described in Chapter 1, the driver script con-
sists of three parts: defining problem parameters (the list was given above), solving
the ODE, and result output. The ODEs will be solved using ode45 in MATLAB®,
with the prescribed initial conditions. For results, the final steady state profile of CA
vs. z is plotted as figure(1) and the transients in outlet concentration plotted as
figure(2).
Y0=ones(n,1)*C0;
[T,CA]=ode45(@(t,Y) pfrMOLLinFun(t,Y,modelParam), ...
[0 10],Y0);
%% Plotting results
% Steady state plot
C_ss=[C0, CA(end,:)]; % Results: steady state
Z=[0:h:h*n]; % Location
CModel=C0*exp(-modelParam.k*Z/u0);
figure(1)
plot(Z,C_ss,'-',Z,CModel,'.');
xlabel('Length (m)'); ylabel('C_A (mol/L)');
% Exit concentration vs. time
figure(2);
plot(T,CA(:,end));
xlabel('time (min)'); ylabel('C_A (mol/m^3)');
The results at steady state are plotted in Figure 4.2. Symbols represent the analytical
solution for comparison, whereas the solid line represents the numerical solution.
There is clearly instability observed in the numerical solution. The numerical solution
oscillates around the true solution.
This example demonstrates the unsuitability of the central difference formula in a MoL solution with an explicit transient ODE solver for convection-dominated systems.
[Plot: CA (mol/L) versus length (m).]
FIGURE 4.2 Numerical results with MoL employing central differences in space (solid line)
compared with true solution at steady state (symbols).
Using upwind differencing in space will provide better stability behavior when coupled with ode45. When the flow is in the positive z-direction, upwind implies a backward difference formula. Thus
dCi/dt = −u (Ci − Ci−1)/h − kCi,  ∀i, with C0 ← CA,in    (4.53)
The two-point backward difference formula is only O(h¹) accurate, but is easy to implement. If greater accuracy is desired, a higher-order formula is needed for the spatial discretization. A three-point backward difference formula
dCi/dt = −u (3Ci − 4Ci−1 + Ci−2)/(2h) − kCi,  i = 2 to n, with C0 ← CA,in    (4.54)
can be used, with an O(h²) accuracy.
The most convenient option for the first internal node (i = 1) is to use a standard two-
point backward difference formula:
dCi/dt = −u (Ci − CA,in)/h − kCi,  i = 1    (4.55)
While this is simple to use and does not adversely affect the stability of the upwind method, its disadvantage is that it is only O(h) accurate. Since the leading error is governed by the least accurate method, using a two-point formula makes the overall scheme less accurate.
The next example presents MoL with upwind differencing in space, for the conditions
in Example 4.1. First, the standard two-point backward difference formula will be
used. A comparison between central difference (Example 4.1) and upwind difference
(Example 4.2) will demonstrate the suitability of the latter for hyperbolic PDEs. Thereafter,
the PFR problem will be solved for a greater residence time (i.e., with lower inlet velocity)
to compare the two- and three-point upwind differences.
%% Model equations
dY=zeros(n,1);
for i=1:n
    r=k*Y(i);
    if i==1
        convec=u/h*(Y(i)-C0);
    else
        convec=u/h*(Y(i)-Y(i-1));
    end
    dY(i)=-convec-r;
end
The rest of the code remains the same. The simulations are run for a time span that
is long enough for the system to reach steady state. The concentration CA (obtained
at the last time instance) is plotted against spatial location. Figure 4.3 shows that the
numerical technique is stable and the numerical solution closely matches the true
solution at steady state.
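The calculation is easy to replicate outside MATLAB as a cross-check. The Python sketch below is an illustrative translation of the upwind MoL scheme (Equation 4.53), with assumed representative parameter values (k = 0.05 min⁻¹, u = 0.1 m/min, L = 0.5 m, n = 100, Cin = 1 mol/L) and simple explicit Euler time stepping standing in for ode45:

```python
import math

# Assumed representative parameters (illustrative values, not from the text)
k, u, L, n = 0.05, 0.1, 0.5, 100
h = L / n
C_in = 1.0

def rhs(C):
    """MoL right-hand side with two-point upwind in space (Equation 4.53)."""
    dC = [0.0] * n
    for i in range(n):
        left = C_in if i == 0 else C[i - 1]
        dC[i] = -u * (C[i] - left) / h - k * C[i]
    return dC

# March with explicit Euler (u*dt/h = 0.2 satisfies the Courant condition)
# to t = 15 min, well past the residence time L/u = 5 min.
C, dt = [C_in] * n, 0.01
for _ in range(1500):
    dC = rhs(C)
    C = [C[i] + dt * dC[i] for i in range(n)]

# Compare with the analytical steady state CA = C_in*exp(-k*z/u), Equation 4.50.
err = max(abs(C[i] - C_in * math.exp(-k * (i + 1) * h / u)) for i in range(n))
print("max deviation from Equation 4.50:", err)   # well under 1%, O(h)
```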
Comparison with three-point formula: The two-point upwind method gives good
performance for the conditions above. The conversion from the PFR is less than 25%.
Conversion can be increased by increasing the residence time (increase the length or
decrease the inlet velocity). The operating condition for the PFR is changed to inlet
[Plot: CA (mol/L) versus length (m).]
FIGURE 4.3 MoL with backward differences in space (solid line) compared with true solution at
steady state (symbols). The line may not be clearly visible since the symbols overlap closely.
[Plot: CA (mol/L) versus length (m).]
FIGURE 4.4 Comparison of two-point (solid line) and three-point (dashed line) backward differ-
ence formula solutions at steady state.
velocity of u0 = 0.05 and the number of axial nodes is reduced to n=25. The simula-
tion time span is increased to [0, 25], which is long enough to reach steady state.
Steady state results are plotted in Figure 4.4 as a solid line. Clearly, the numerical solu-
tion differs from the analytical one by ~2%–3%.
The accuracy can be improved by using the three-point upwind method. The code
for three-point upwind method is left as an exercise. The dashed line in Figure 4.4
shows that the three-point upwind difference is more accurate, as can be expected from the fact that it is O(h²) accurate. Transient evolution of the concentration at the
PFR exit is plotted in Figure 4.5 as solid (two-point upwind) and dashed (three-point
upwind) lines. The two methods show some difference as we approach steady state.
Based on comparison with steady state solution in Figure 4.4, the three-point results
are more accurate.
[Plot: CA (mol/m³) versus time (min).]
FIGURE 4.5 Comparison of transient results for two- and three-point backward difference formulae.
4.3.4 Numerical Diffusion
Another key issue in the finite difference methods is the so-called numerical diffusion. This
was discussed for the Lax-Friedrichs scheme, where the modification to improve stability
of central difference was shown to be equivalent to introducing diffusion. The reality is that
numerical diffusion is introduced by all the finite difference methods.
I will demonstrate this using the following example of a tracer injection. Let us consider
the flow of material through a pipe. In the absence of axial diffusion (i.e., for an ideal plug
flow), any packet of fluid spends its residence time τ = L/u within the pipe. Initially, pure
water flows through the pipe (initial condition is C(0) = 0). At a certain time, t = 0+, a dye is injected into the inlet at an inlet concentration of 1 mol/L. The flow of this tracer through
the pipe can be modeled as a plug flow system with no reaction:
¶C ¶C
+u = 0, C ( t ,z = 0 ) = Cin , C ( t = 0 ) = 0 (4.56)
¶t ¶z
Let the inlet velocity be 0.1 m/min and the PFR length be 0.5 m. Based on the physics of the system, we expect that 1 min after the start of injection, the tracer (at 1 mol/L) covers the first 0.1 m, whereas the remainder contains “original” dye-free fluid packets that have not yet been pushed out of the PFR. This tracer moves as a step through the PFR as time progresses until t = 0.5/0.1 = 5 min, after which the entire PFR contains water with dye.
The above problem can be solved using the code from Example 4.2, with k=0. Simulation results obtained using the upwind method with a very large number of nodes, n=10000, are shown by solid lines in Figure 4.6. This is close to the true solution. The dotted lines represent n=1000, whereas dashed lines represent the original number of nodes n=100. All
these cases do not show perfectly vertical lines. Clearly, we observe that the tracer has “dif-
fused” into adjoining packets, leading to a smooth transition from 1 to 0 mol/L. The amount
of diffusion is negligible with n=10000 (solid lines), the effect is obvious with n=1000
[Plot: CA (mol/L) versus length (m), with curves labeled 1 min through 5 min.]
FIGURE 4.6 Simulation of tracer injected in a pipe at various times using 10,000 nodes (solid line),
100 nodes (dashed lines), and 1,000 nodes (dotted lines).
(dotted lines) and prominent with n=100 (dashed lines). The cause of this observed behav-
ior is numerical diffusion.
Consider the upwind difference in space. Although we wrote ϕ′i = (ϕi − ϕi−1)/h, we have neglected the higher-order terms. In fact, the approximation using Taylor's series is
CA,(i−1) = CA,i − h ∂CA/∂z|i + (h²/2!) ∂²CA/∂z²|i − (h³/3!) ∂³CA/∂z³|i + …    (4.57)
Therefore
dCi/dt = −u [∂CA/∂z|i − (h/2) ∂²CA/∂z²|i + (h²/6) ∂³CA/∂z³|i − …]    (4.59)
h ¶ 2C A
ui (4.60)
2 ¶z 2
xÎéëi -1,i ùû
The numerical diffusion decreases with decreasing step-size, h, as was evident in Figure 4.6.
In a transient PFR, the reaction source term predominates and masks the effects of numeri-
cal diffusion. This is also true for the majority of problems of interest to chemical engineers.
Thus, first- or second-order upwind methods are sufficiently accurate with a reasonable step-
size, h. However, some problems involving tracking of a moving front require high-resolution
numerical methods; a discussion of such high-resolution methods for hyperbolic PDEs is
beyond the scope of this text.
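The smearing seen in Figure 4.6 is easy to reproduce. The following Python/NumPy sketch (my own illustration, not one of the book's MATLAB listings) advects the step tracer of Equation 4.56 with the first-order upwind scheme and explicit Euler time stepping; comparing a coarse and a fine grid shows the front sharpening as h decreases:

```python
import numpy as np

def upwind_advect(n, L=0.5, u=0.1, t_end=2.0, cfl=0.5):
    """Advect a step tracer (Eq. 4.56) with the first-order upwind scheme."""
    h = L / n
    dt = cfl * h / u                    # explicit time step from a CFL condition
    c = np.zeros(n)                     # initially dye-free fluid
    for _ in range(int(round(t_end / dt))):
        c_new = np.empty_like(c)
        c_new[0] = c[0] - u * dt / h * (c[0] - 1.0)    # inlet value C_in = 1 mol/L
        c_new[1:] = c[1:] - u * dt / h * (c[1:] - c[:-1])
        c = c_new
    return np.arange(1, n + 1) * h, c

# After 2 min the front sits near z = u*t = 0.2 m
z_c, c_c = upwind_advect(100)     # coarse grid: visible numerical diffusion
z_f, c_f = upwind_advect(2000)    # fine grid: front close to a vertical step
```

The width of the smeared front shrinks with h, consistent with a numerical diffusion coefficient of order u·h/2 as in Equation 4.60.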
A parabolic PDE in a single spatial dimension may appear in the purely diffusive form

    ∂φ/∂t = ∂/∂z (D ∂φ/∂z) + S(φ)   (4.62)

or in the convection-diffusion (conservative) form

    ∂φ/∂t + ∂/∂z (uφ) = ∂/∂z (D ∂φ/∂z) + S(φ)   (4.63)

For constant u and D, the latter simplifies to

    ∂φ/∂t + u ∂φ/∂z = D ∂²φ/∂z² + S(φ)   (4.64)
Packed-bed reactor with axial mixing: The PFR model of the previous section assumed fluid
elements to move as "plugs," and axial back-mixing was ignored. A packed bed reactor with
axial mixing will be used as an example of a parabolic PDE in this section:

    ∂C_A/∂t + u ∂C_A/∂z = D_e ∂²C_A/∂z² − kC_A   (4.65)
The initial condition remains the same as in the previous (PFR) case:

    C_A(t=0, z) = C_0   (4.66)

Two boundary conditions are required. The inlet boundary condition may be the concentration
specified at the inlet, whereas a no-flux condition may be used at the outlet:

    C_A(t, z=0) = C_in,  (∂C_A/∂z)|_{t, z=L} = 0   (4.67)
The above two represent Dirichlet and Neumann boundary conditions, respectively.
The previous example was solved in dimensional quantities. For this example, let us use
a nondimensional formulation. Define φ = C_A/C_in as the dimensionless concentration
and ξ = z/L as the dimensionless length. Using the residence time as the key time
scale in the reactor, we define τ = t/(L/u). This yields

    (uC_in/L) ∂φ/∂τ + (uC_in/L) ∂φ/∂ξ = (D_e C_in/L²) ∂²φ/∂ξ² − kC_in φ   (4.68)
Rearranging,

    ∂φ/∂τ + ∂φ/∂ξ = (1/Pe) ∂²φ/∂ξ² − Da·φ,  where Pe = uL/D_e, Da = kL/u   (4.69)
are the Péclet and Damköhler numbers, respectively. The Péclet number is the ratio of convective to
diffusive transport rates, and the Damköhler number* is the ratio of the reaction rate to the convective
transport rate. The higher the Péclet number, the greater is the contribution of convec-
tive transport compared to axial diffusion. For an nth-order reaction, the inverse of the reaction
time scale is

    1/t_rxn = r(C_A)/C_A = k_n C_A^(n−1)   (4.70)
In summary, Equation 4.69 is the parabolic PDE of interest:

    ∂φ/∂τ + ∂φ/∂ξ = (1/Pe) ∂²φ/∂ξ² − Da·φ   (4.69)

subject to the initial and boundary conditions

    φ(τ=0, ξ) = φ_0   (4.71)

    φ|_{ξ=0} = 1,  (∂φ/∂ξ)|_{ξ=1} = 0   (4.72)
The next subsections discuss numerical methods to solve parabolic PDEs.
* More precisely, this is known as the Damköhler number of the first kind, Da_I. The Damköhler number of the second kind is the
ratio of the reaction rate to the transverse mass transport rate; mass transfer in this example is axial.
Partial Differential Equations in Time ◾ 153
Recall that applying FTCS to a hyperbolic PDE was unstable because the central difference
introduces a negative numerical diffusion term. The presence of physical diffusion in a parabolic
PDE makes the original FTCS method conditionally stable.
As before, let us choose Δt as the step-size in the t-direction, Δz as the step-size in the
z-direction, and n = L/Δz as the number of spatial divisions. The central difference formula
for the diffusive term is

    [∂²φ/∂z²]_{p,i} = (φ_{p,i+1} − 2φ_{p,i} + φ_{p,i−1}) / Δz²   (4.73)
The forward difference in time and the central difference for the convective term remain the same
as before. Substituting the finite differences in Equation 4.64 and rearranging yields the
FTCS update:

    φ_{p+1,i} = φ_{p,i} − (uΔt/2Δz)[φ_{p,i+1} − φ_{p,i−1}] + (DΔt/Δz²)[φ_{p,i+1} − 2φ_{p,i} + φ_{p,i−1}] + Δt S_{p,i}   (4.74)
Comparing the above equation with Equation 4.37, we see that the diffusive term is additional.
The diffusive term, (DΔt/Δz²)[φ_{p,i+1} − 2φ_{p,i} + φ_{p,i−1}], makes FTCS stable for parabolic PDEs.
The condition for stability of FTCS is

    DΔt/Δz² < 1/2,  u²Δt/(2D) < 1   (4.75)
The first condition implies that Δt should be chosen small enough for the grid to capture
the physical diffusive effects governed by the coefficient D. The second condition ensures
that the diffusive component at the local grid is dominant over the convective component.
Note that for a purely diffusive system, the second condition is trivially satisfied since u = 0.
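In practice, the two inequalities in Equation 4.75 translate into an upper bound on the usable time step. A small helper function (my own illustration, in Python) makes the bound explicit:

```python
def ftcs_max_dt(D, u, dz):
    """Largest time step satisfying both FTCS stability conditions (Eq. 4.75).

    D  : diffusion coefficient
    u  : convective velocity
    dz : spatial step-size
    """
    dt_diff = dz ** 2 / (2.0 * D)                        # from D*dt/dz^2 < 1/2
    dt_conv = float("inf") if u == 0 else 2.0 * D / u ** 2   # from u^2*dt/(2D) < 1
    return min(dt_diff, dt_conv)

# Example: for D = 1e-3, u = 0.1, dz = 0.01 the diffusive limit governs
print(ftcs_max_dt(D=1e-3, u=0.1, dz=0.01))   # → 0.05
```

Note that the diffusive limit tightens quadratically as the grid is refined, which is what makes explicit methods expensive for diffusion-dominated problems.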
Leapfrogging and Lax-Wendroff methods, which are both O(Δt²) accurate, can be adapted
for parabolic PDEs as well. They follow the same principles as elaborated before.
4.4.2 Crank-Nicolson Method
The implicit Crank-Nicolson method, described in Section 4.3.2, is applicable to para-
bolic PDEs as well. The Crank-Nicolson method is unconditionally stable* for both hyperbolic
and parabolic PDEs; in other words, there is no limit on the step-size Δt to ensure stability.
The reader can verify that the Crank-Nicolson method for Equation 4.64 is
    (φ_{p+1,i} − φ_{p,i})/Δt + (u/2)[(φ_{p,i+1} − φ_{p,i−1})/(2Δz) + (φ_{p+1,i+1} − φ_{p+1,i−1})/(2Δz)] − (1/2)(S_{p,i} + S_{p+1,i})
        − (D/2)[(φ_{p,i+1} − 2φ_{p,i} + φ_{p,i−1})/Δz² + (φ_{p+1,i+1} − 2φ_{p+1,i} + φ_{p+1,i−1})/Δz²] = 0   (4.76)
* It should be noted that these stability conditions are derived with a constant source term. In practice, the stability conditions
may be violated due to strongly nonlinear source terms. Also, these conditions are derived locally, implying that the effect of
boundary conditions or strongly varying coefficients (u or D) is not covered in these derivations.
The above is a nonlinear equation, written for each spatial location, 1 ≤ i ≤ n. The
quantities at time p + 1 should be computed simultaneously using an appropriate nonlin-
ear equation solver. Chapters 6 and 7 will discuss techniques that will equip us to solve
Equation 4.76, and the solution will be discussed in Chapter 8.
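For a linear problem with no source term, Equation 4.76 reduces to one tridiagonal linear solve per time step rather than a nonlinear system. The following Python sketch (my own illustration under simplifying assumptions: pure diffusion, u = 0, S = 0, Dirichlet ends) shows that a time step far beyond the FTCS diffusive limit remains stable:

```python
import numpy as np

def crank_nicolson_diffusion(n=50, D=1.0, L=1.0, dt=0.1, steps=20):
    """Crank-Nicolson for phi_t = D*phi_zz with phi(0) = phi(L) = 0."""
    h = L / (n + 1)                      # interior nodes only
    lam = D * dt / (2.0 * h ** 2)
    # Second-difference matrix for the interior nodes
    A = (np.diag(-2.0 * np.ones(n))
         + np.diag(np.ones(n - 1), 1)
         + np.diag(np.ones(n - 1), -1))
    M_new = np.eye(n) - lam * A          # implicit (p+1) side
    M_old = np.eye(n) + lam * A          # explicit (p) side
    z = np.linspace(h, L - h, n)
    phi = np.sin(np.pi * z)              # smooth initial profile
    for _ in range(steps):
        phi = np.linalg.solve(M_new, M_old @ phi)
    return z, phi

# dt = 0.1 exceeds the FTCS diffusive limit h^2/(2D) ~ 1.9e-4 by roughly 500x,
# yet the solution decays smoothly toward zero instead of blowing up.
z, phi = crank_nicolson_diffusion()
```

In production code the tridiagonal system would be solved with a banded solver (Thomas algorithm) rather than a dense `solve`; the dense version is used here only to keep the sketch short.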
As before, the spatial domain is discretized into n intervals, with step-size h = 1/n (the
dimensionless domain has unit length). The discretized equations for the internal nodes
(i = 2 to n − 1) are

    dφ_i/dτ = −(φ_{i+1} − φ_{i−1})/(2h) + (φ_{i+1} − 2φ_i + φ_{i−1})/(Pe h²) − Da φ_i   (4.78)
The nondimensional inlet boundary condition (φ = 1 at ξ = 0) yields the equation for the first node (i = 1):

    dφ_i/dτ = −(φ_{i+1} − 1)/(2h) + (φ_{i+1} − 2φ_i + 1)/(Pe h²) − Da φ_i,  i = 1   (4.79)
The domain equation for the end node (i = n) introduces a variable at a new point,
φ_{n+1}. This point is not within the domain being solved, but it can be eliminated using
the exit boundary condition. The Neumann condition at the exit, discretized with a central
difference, yields

    (φ_{n+1} − φ_{n−1})/(2h) = 0,  that is,  φ_{n+1} = φ_{n−1}   (4.80)

Substituting this value of φ_{n+1} into the domain equation, the following ODE is obtained for i = n:
    dφ_i/dτ = (−2φ_i + 2φ_{i−1})/(Pe h²) − Da φ_i,  i = n   (4.81)
As an example, we will modify the PFR conditions from Example 4.1. The residence
time is t_res = 10 min and the Damköhler number is Da = 2. Considering an effective diffusivity
of D_e = 5 × 10⁻³ m²/min, which is at the higher end of the spectrum for small molecules, the
value of Pe = 1. For this condition, the model at steady state is

    d²φ/dξ² − dφ/dξ − 2φ = 0   (4.82)

The corresponding auxiliary equation is

    (D² − D − 2)φ = 0   (4.83)

Since the roots of the auxiliary equation are (2, −1), the steady state solution for the system
can be obtained as

    φ = (e^(2ξ) + 2e^(3−ξ)) / (1 + 2e³)   (4.84)
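It is straightforward to verify that Equation 4.84 satisfies the steady-state ODE and both boundary conditions; a quick numerical check (my own, in Python, not part of the book's listings) is:

```python
import numpy as np

def phi(z):
    """Steady-state solution of Eq. 4.84 for Pe = 1, Da = 2."""
    return (np.exp(2 * z) + 2 * np.exp(3 - z)) / (1 + 2 * np.exp(3))

def dphi(z):
    """First derivative of Eq. 4.84."""
    return (2 * np.exp(2 * z) - 2 * np.exp(3 - z)) / (1 + 2 * np.exp(3))

def d2phi(z):
    """Second derivative of Eq. 4.84."""
    return (4 * np.exp(2 * z) + 2 * np.exp(3 - z)) / (1 + 2 * np.exp(3))

z = np.linspace(0.0, 1.0, 11)
residual = d2phi(z) - dphi(z) - 2 * phi(z)   # Eq. 4.82; should vanish everywhere
```

The residual vanishes identically, and one can also confirm φ(0) = 1 (Dirichlet inlet) and φ′(1) = 0 (Neumann exit).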
The next example shows a numerical solution of the axial dispersion model using MoL for
these conditions.
function dY=axialDiffLinFun(t,Y,par)
% Function for transient reactor w/ axial mixing
% to be solved using method of lines
%% Get the parameters
n=par.n;
Pe=par.Pe;
Da=par.Da;
%% Model equations
h=1/n;
dY=zeros(n,1);
for i=1:n
    r=Da*Y(i);
    if i==1
        convec=(Y(i+1)-1)/(2*h);
        diffu =(Y(i+1)-2*Y(i)+1)/(Pe*h^2);
    elseif (i==n)
        convec=0;
        diffu =(-2*Y(i)+2*Y(i-1))/(Pe*h^2);
    else
        convec=(Y(i+1)-Y(i-1))/(2*h);
        diffu =(Y(i+1)-2*Y(i)+Y(i-1))/(Pe*h^2);
    end
    dY(i)=-convec+diffu-r;
end
The driver script is like the ones we have used before, and hence is skipped for
brevity. The steady state analytical solution is computed as

CModel=(exp(2*Z)+2*exp(3-Z))/(1+2*exp(3));

and plotted as symbols for comparison. The numerical solution at various times
(τ = [0.02, 0.1, 0.2], which correspond to 0.2, 1, and 2 min) is plotted in Figure 4.7.
The dashed line is the steady state numerical solution and the symbols are the analyti-
cal solution. Clearly, the numerical solution closely matches the analytical solution at
steady state, even with n=25 axial nodes.
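For readers working outside MATLAB, the same MoL equations (4.78 through 4.81) can be transcribed into Python/NumPy. The sketch below is my own transcription (marching to steady state with a simple explicit Euler loop instead of ode45) and reproduces the close agreement with Equation 4.84 reported above:

```python
import numpy as np

def axial_rhs(y, n, Pe, Da):
    """Right-hand sides of Eqs. 4.78-4.81; inlet value fixed at phi = 1."""
    h = 1.0 / n
    # pad with the inlet value and the reflected ghost node phi_{n+1} = phi_{n-1}
    yp = np.concatenate(([1.0], y, [y[-2]]))
    convec = (yp[2:] - yp[:-2]) / (2 * h)
    diffu = (yp[2:] - 2 * yp[1:-1] + yp[:-2]) / (Pe * h ** 2)
    return -convec + diffu - Da * y

n, Pe, Da = 25, 1.0, 2.0
y = np.zeros(n)                       # reactor initially at phi = 0
dt = 2e-4                             # small explicit step (stability-limited)
for _ in range(int(10.0 / dt)):       # march to steady state (tau = 10)
    y = y + dt * axial_rhs(y, n, Pe, Da)

z = np.arange(1, n + 1) / n
phi_exact = (np.exp(2 * z) + 2 * np.exp(3 - z)) / (1 + 2 * np.exp(3))
```

Even with n = 25 nodes, the converged profile agrees with the analytical steady state to within a few parts per thousand.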
As we did in the previous section, we can run the simulations with the initial concentration
φ = 0 and the no-reaction condition Da = 0. This simulates a tracer experiment, where a dye
is injected into a reactor and its fate is tracked. Two different conditions are simulated:
Pe = 5 (solid lines) and Pe = 50 (dashed lines). For a system that is convection dominant,
Pe → ∞. As axial mixing increases, D increases and, consequently, the Péclet number decreases.
Thus, the former case (Pe = 5) is further away from the PFR than the latter (Pe = 50).
Figure 4.8 shows the transient evolution of step tracer in a reactor with axial diffusion.
Although at steady state, both cases show uniform concentration of ϕ(ξ) = 1, the transients
FIGURE 4.7 Concentration profile within a packed bed reactor with axial dispersion and first-order
reaction at various times. Symbols represent analytical solution at steady state.
FIGURE 4.8 Simulation of tracer injected in a reactor with axial dispersion at various times. Solid
lines represent Pe = 5, and dashed lines represent Pe = 50.
are significantly different. For the larger Pe value, a distinct (though somewhat diffuse) tracer
front can be seen traveling through the reactor, whereas the tracer is highly diffused at the
lower Pe value.
In this case study, we will build upon the isothermal PFR model to consider a noniso-
thermal liquid-phase PFR with complex reaction kinetics. The velocity may be assumed
constant for the liquid-phase system. The PFR material balance remains the same as before:
    ∂C_A/∂t + u ∂C_A/∂z = r_A(C_A)   (4.30)

where the reaction rate follows the kinetics

    r_A = −kC_A/(K_r + C_A),  k = k_0 exp(−E/RT),  K_r = K_0 exp(−ΔE_r/RT)   (4.85)
The energy balance for the adiabatic PFR is

    ∂T/∂t + u ∂T/∂z = (−ΔH_r) r / (ρc_p)   (4.86)
The model for nonisothermal PFR consists of two hyperbolic PDEs, (4.30) and (4.86),
subject to Dirichlet boundary condition at the inlet (inlet concentration is specified). The
initial conditions will be assumed to be the same as at the inlet. The parameters are given
in Example 4.4 itself.
Now, we are solving for two variables, CA and T. The above two PDEs are discretized
using the upwind method. The spatial domain is discretized into n intervals and the two
variables are specified at each of the nodes. The variable to be solved for will therefore be
    Φ = [C_A1, T_1, C_A2, T_2, …, C_An, T_n]ᵀ
Note the arrangement of variables in the column vector Φ. We first list out both the vari-
ables at node-1, followed by both variables at node-2, and so on. The above structure of Φ is
a personal preference, due to convenience of this organization of the dependent variables.
Note that when Equations 4.30 and 4.86 are discretized, the concentration and temperature
depend on the values at the same ith node and the values at its neighboring nodes. Thus,
I find the above structuring more convenient to use.
An alternative view of the same data is a 2 × n matrix, with the two variables at node i forming the ith column:

    Φ̄ = [ C_A1  C_A2  ⋯  C_An
           T_1   T_2   ⋯  T_n ]   (4.87)

MATLAB's reshape command converts the column vector into this matrix:
YBar=reshape(Y,2,n);
The column vector Y is reshaped into a 2 × n matrix, YBar. Reshape can also be used to
convert the matrix back into the original form:

Y=reshape(YBar,2*n,1);

One needs to be careful in using this command: MATLAB reads (and fills) elements column
by column, that is, the elements of the first column are read in sequence, followed by the
second column, and so on.
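One caveat when porting such code: MATLAB's reshape is column-major, whereas NumPy defaults to row-major, so a direct transcription needs order='F'. A minimal Python illustration (with made-up values, my own):

```python
import numpy as np

# MATLAB: Y = [CA1; T1; CA2; T2; CA3; T3]; YBar = reshape(Y, 2, 3)
Y = np.array([1.0, 300.0, 2.0, 310.0, 3.0, 320.0])

# order='F' reproduces MATLAB's column-by-column reading
YBar = Y.reshape(2, 3, order="F")
CA = YBar[0, :]    # → [1., 2., 3.]
T = YBar[1, :]     # → [300., 310., 320.]

# Round trip back to the interleaved vector
Y_back = YBar.reshape(6, order="F")
```

Omitting order="F" would silently interleave the variables incorrectly, which is exactly the kind of mistake the caution above is meant to prevent.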
The transient adiabatic PFR problem is solved in the example below.
    k = k_1 exp[−(E/R)(1/T − 1/T_1)],  K_r = K_r1 exp[−(ΔE_r/R)(1/T − 1/T_1)]
The heat of reaction is (−ΔH) = 100 kJ/mol, and the product of the fluid density and
specific heat capacity is ρc_p = 800 J/(mol K).
Solve the PFR problem for L = 2 m and u_0 = 0.4 m/min.
Solution: I will reuse the function file from Example 4.2 as a starting point.
1. Reaction rate calculation: As a first step, we will make the function more modular
by using a separate function to calculate the reaction rate. Since the reaction rate
depends only on the local temperature and concentration, the function to
calculate the reaction rate is given below:
function r=rxnRate(T,C,par)
% Function computes local rate of reaction
The advantage of a modular function is that the overall PFR solver can be reused
for a variety of different conditions. For example, if simulating a PFR for a dif-
ferent process, the core of PFR solver does not change; only the reaction rate and
process parameters change. The former is captured in the reaction rate function,
whereas the latter are provided in the driver script.
2. Function for PFR balances: Next, let us write the function for the PFR balance equations.
The function arguments will be time t and the solution vector Y. The first step is
to extract concentration and temperature from Y. Indeed, this can be done by
using CA=Y(1:2:end); and T=Y(2:2:end);.
However, I will show a more convenient and flexible way to do so using
reshape:
YBar=reshape(Y,2,n);
CA=YBar(1,:);
T=YBar(end,:);
At this stage, I should mention that using the colon-index notation is also easy,
but I find the reshape method more elegant.
The discretized equation will contain two terms: the discretized convective
term and the reaction term. We will use the upwind method for the former, while
the latter is computed by invoking the function rxnRate that we created earlier.
This is computed for each of the n nodes in a for loop. This forms the
core "%% Model equations" section of the function file, where we will
compute both dC and dT. The vector dY will be assembled from these two using
reshape again.
The function file is given below:
function dY=pfrAdiabfun(t,Y,par)
% Function file for transient adiabatic PFR
% solved using method of lines
%% Get the parameters
n=par.n;      % Number of nodes
L=par.L;
h=L/n;
u=par.u0;     % Velocity remains constant
C0=par.Cin;   % Inlet concentration
T0=par.Tin;   % Inlet temperature
rCp=par.rCp;
DH=par.DH;
%% Variables
Y=reshape(Y,2,n);
CA=Y(1,:);
T=Y(2,:);
dC=zeros(n,1);
dT=zeros(n,1);
%% Model equations
for i=1:n
    % Reaction rate
    r=rxnRate(T(i),CA(i),par);
    % Convection term
    if i==1
        convecC=u/h*(CA(i)-C0);
        convecT=u/h*(T(i)-T0);
    else
        convecC=u/h*(CA(i)-CA(i-1));
        convecT=u/h*(T(i)-T(i-1));
    end
    % Model equations
    dC(i)=-convecC-r;
    dT(i)=-convecT+(-DH)*r/rCp;
end
%% Return dY as vector
dY=reshape([dC';dT'],2*n,1);
Recall the caution I had expressed in using reshape. Note the structure of the
matrix Φ̄ from Equation 4.87: the first row contains the concentrations and the second
row contains the temperatures, so dC and dT must be interleaved in the same order,
which is why dY is assembled as reshape([dC';dT'],2*n,1).
%% Displaying results
% Extract values for plotting
CA=Ysol(:,1:2:end);
T=Ysol(:,2:2:end);
XA=(modelParam.Cin-CA)/modelParam.Cin;
h=modelParam.L/n;
Z=[0:n]*h;
* In fact, in this example, it may even be more convenient to define dC and dT as row vectors. However, I have found that
beginners often find it confusing when different conventions are used for different examples. Hence, it is more for the
pedagogical reason that I have stuck to the convention of defining vectors as column vectors, except when there is a good
reason to do otherwise.
The results are shown in Figures 4.9 and 4.10, which will be discussed presently.
Grid independence study is an important step that must be performed to ensure that the
numerical procedure gives an acceptable solution. In examples until this point, we had the
analytical solution to compare. However, the PFR model in this example cannot be solved
analytically. As we have seen multiple times before, decreasing the step-size improves the
accuracy. Grid independence study verifies the adequacy of the chosen grid by comparing
the results with two different grid sizes. A rule of thumb that we employ is that the solution
is taken as grid-independent if halving the step-size (i.e., doubling the number of nodes)
causes no further change in the solution. Figure 4.9 shows an example of grid-independence
study. For this example, we may consider the solutions from n = 100 to be grid independent
(thick line in the figure), because doubling the number of nodes to 200 did not cause a sig-
nificant change in the axial profile of conversion at steady state.
Figure 4.10 shows the temporal evolution of concentration and temperature in the PFR.
Initially, the temperature and concentration are uniform. As reaction takes place, eventually
FIGURE 4.9 Conversion vs. axial location at steady state for different spatial discretization.
FIGURE 4.10 Temperature (a) and axial profiles of concentration (b) at various times from start
(0.5, 1.0, 1.5, 2 min, and steady state). The arrow marks increasing time, t.
the steady state is reached. The arrow shows increasing time. As time progresses, the con-
version and temperature both increase smoothly to reach the steady state value.
The reactions and their rates are

    A → P + Q,  r_1 = 0.05 C_A
    A + P → R,  r_2 = 0.005 C_A C_P   (4.88)
There are five species in the system: A, P, Q, R, and inerts. The model for a packed bed
reactor is given by

    ∂C_k/∂t + u ∂C_k/∂z = D_eff ∂²C_k/∂z² + Σ_j ν_kj r_j   (4.89)
where k represents the five species and j sums over the two reactions. The value of D_eff can
be computed from Taylor's correlation as D_eff = 3.57 u_0 d_f. The boundary condition at the
inlet is the Dirichlet condition; the inlet concentrations of all the species are known. The reactor
inlet consists of the reactant A and inerts. A boundary condition is not implemented at the
outlet; instead, the backward difference formula is used at the boundary node and the diffusive
term is neglected. The initial condition is that the reactor contains only inerts. Thus, the
initial and inlet boundary conditions are given by
    C(0, z) = [0  0  0  0  C_I0]ᵀ
    C(t, z=0) = [C_A,in  0  0  0  C_I,in]ᵀ   (4.90)
We will use MoL to solve the PDE. The spatial domain is discretized into n intervals, with
step-size h = L/n. The central difference formula is used for the convective as well as diffusive
terms: (C_{k,i+1} − C_{k,i−1})/(2h) for convection and (C_{k,i+1} − 2C_{k,i} + C_{k,i−1})/h² for diffusion.
The solution vector consists of the five concentrations at each of the n locations:

    Ȳ = [ C_A1  C_A2  ⋯  C_An
          ⋮     ⋮          ⋮
          C_I1  C_I2  ⋯  C_In ]   (4.91)
It is convenient to express Ȳ in the above form. However, the ODE solver, ode45, requires
Y as a column vector.* The interconversion between the two can easily be done as

Y = reshape(Y, nSpecies, n);

The reaction rate vector is

    r = [k_1 C_A ; k_2 C_A C_P]   (4.92)
and the axial dispersion coefficient is computed as D_eff = 0.013 m²/s. The next example
demonstrates this problem.
* MATLAB experts know that this is not strictly true. However, if you were one, you wouldn’t be reading this book. So, let us
assume this statement is true!
function r=reactionRate(C,par)
%% Calculate reaction rate
r(1,1)=par.k(1)*C(1);
r(2,1)=par.k(2)*C(1)*C(2);
end
Writing the main function file: The main function file consists of code to compute
[dC_k/dt]_i. As per Equation 4.89, this will consist of three parts: the convective term, conv-
Term; the axial diffusive term, diffTerm; and the reaction term, rxnTerm. The stoichio-
metric matrix for this example is defined as

    ν = [ −1  −1
           1  −1
           1   0
           0   1
           0   0 ]
The powerful matrix operations of MATLAB can be used to compute the reaction
term for all five species at any location simultaneously, because

    Σ_j ν_kj r_j = ν r,  with ν the 5 × 2 stoichiometric matrix above and r = [k_1 C_A ; k_2 C_A C_P]   (4.93)
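Equation 4.93 is a single matrix-vector product, which can be checked quickly outside MATLAB as well. The NumPy snippet below (my own, with illustrative concentrations) computes the net production rate of each species:

```python
import numpy as np

# Species order: A, P, Q, R, inert; columns are the two reactions
nu = np.array([[-1, -1],
               [ 1, -1],
               [ 1,  0],
               [ 0,  1],
               [ 0,  0]], dtype=float)

k1, k2 = 0.05, 0.005           # rate constants from the example
CA, CP = 4.0, 2.0              # illustrative local concentrations
r = np.array([k1 * CA, k2 * CA * CP])   # rates of the two reactions

source = nu @ r                # per-species net production, Eq. 4.93
```

With these values r = [0.2, 0.04], so species A is consumed by both reactions while P is produced by the first and consumed by the second.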
Equipped with the above information, the function for computing the overall model
can be written as follows:
function dY=packBedFun(t,Y,par)
%% Get parameters and variables
nsp=par.nSpecies;
n=par.n; h=par.h;
Y=reshape(Y,nsp,n);
dY=zeros(nsp,n);
CIn=par.CIn;
nuStoic=par.stoicCoef;
%% Model equations
for i=1:n
    C=Y(:,i);
    rxnTerm=nuStoic*reactionRate(C,par);
    if (i==1)
        diffTerm=par.DCoef*(Y(:,i+1)-2*C+CIn)/h^2;
        convTerm=par.u0*(Y(:,i+1)-CIn)/(2*h);
    elseif (i==n)
        diffTerm=0;
        convTerm=par.u0*(C-Y(:,i-1))/h;
    else
        diffTerm=par.DCoef*(Y(:,i+1)-2*C+Y(:,i-1))/h^2;
        convTerm=par.u0*(Y(:,i+1)-Y(:,i-1))/(2*h);
    end
    dY(:,i)=-convTerm + diffTerm + rxnTerm;
end
dY=reshape(dY,n*nsp,1);
The use of the reshape command to interconvert Y ⇔ Ȳ is highlighted in the code above.
Let us investigate the term convTerm (for internal nodes). The spatial derivative is

    ∂/∂z [C_A ; C_P ; … ; C_I] ≈ (1/2h) ( [C_A,i+1 ; C_P,i+1 ; … ; C_I,i+1] − [C_A,i−1 ; C_P,i−1 ; … ; C_I,i−1] )

Based on the definition of Ȳ as per Equation 4.91, the first term in the brackets is
Y(:,i+1) and the second term is Y(:,i-1). Hence, the convection term was
written as

convTerm=par.u0*(Y(:,i+1)-Y(:,i-1))/(2*h);
Likewise, diffTerm follows the same arguments. Both these yield a 5 × 1 column
vector. The reaction source term, ∑jνkjrj, as per Equation 4.93, is computed for all five
species in rxnTerm, which also yields a 5 × 1 column vector.*
* This demonstrates how the power of MATLAB matrix operations is utilized fully to write highly efficient and readable
codes. Recall my insistence on defining all vectors as column vectors, unless the structure of a problem or storage of
“historical” data suggests otherwise.
Driver script: The driver script is straightforward and follows the same pattern as
before. The parameters, inlet boundary values, and initial conditions are specified. The
MATLAB command repmat is used to obtain the entire initial condition vector Y0.
The ODE is solved using ode45 for a time span of [0 100]. The grid independence
was verified by running the code for n = 25 nodes and comparing the results with
n = 50. The closeness of the two suggests that n = 25 gives a solution with sufficient
accuracy. The code with this value of n is given below:
% Driver script for simulation of a
% packed bed reactor with multiple species
%% Model parameters
Q=0.01;
d=0.25;
Acs=pi/4*d^2;
u0=Q/Acs; modelParam.u0=u0;
L=10; modelParam.L=L;
DCoef=0.013; modelParam.DCoef=DCoef;
% Reaction conditions
modelParam.nSpecies=5;
modelParam.k=[0.05; 0.005];
modelParam.stoicCoef=[-1 -1; 1 -1; 1 0; 0 1; 0 0];
% Inlet conditions
CaIn=5.0; CiIn=45.0;
CIn=[CaIn; 0; 0; 0; CiIn];
modelParam.CIn=CIn;
%% Initialization and solution
n=25; modelParam.n=n;
h=L/n; modelParam.h=h;
C0=[0;0;0;0;CiIn];
Y0=repmat(C0,n,1);
[tSol,YSol]=ode45(@(t,Y) packBedFun(t,Y,modelParam),...
[0 100], Y0);
%% Plotting results
figure(1)
Z=0:h:L;
C_All=[CIn', YSol(end,:)];
C_All=reshape(C_All,5,n+1);
plot(Z,C_All(1:4,:));
Figure 4.11 shows the axial concentration profiles for the four species at steady state.
The profiles show expected behavior.
The code above will be used to further analyze the behavior of a packed bed reactor. Figure
4.12 shows the temporal variations in the concentrations of the four species at the exit.
The residence time in the reactor is 49 s. The concentrations stay at their initial condition
FIGURE 4.11 Axial concentration profiles of the four species at steady state.
FIGURE 4.12 Transient profiles of the four species (except the inert) at the exit. The inlet conditions
start affecting the exit only after some time, which depends on the residence time in the reactor.
for the initial 40 s. The effect of the inlet conditions starts being observed at the exit only after
sufficient time has elapsed. The results indicate that convection is predominant
under the operating conditions considered in this example. In the absence of the diffusive
term (and numerical diffusion), the exit concentrations would be affected only after 49 s.
However, the diffusive term makes the overall response smoother.
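The 49 s quoted above follows directly from the parameters in the driver script; a quick check (in Python, for illustration):

```python
import math

Q = 0.01        # volumetric flow rate, m^3/s (from the driver script)
d = 0.25        # bed diameter, m
L = 10.0        # reactor length, m

Acs = math.pi / 4 * d ** 2     # cross-sectional area, m^2
u0 = Q / Acs                   # superficial velocity, ~0.204 m/s
t_res = L / u0                 # residence time, s
print(round(t_res, 1))         # → 49.1
```

This is why the exit response in Figure 4.12 stays essentially flat for the first ~40 s: the inlet disturbance simply has not reached the outlet yet.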
The importance of the diffusive term is further verified by comparing the results of
the packed bed reactor with a PFR (dashed line) in Figure 4.13. At the nominal value of
Deff = 0.013, the two curves are fairly close to each other. Since PFR represents the case with
no axial diffusion, one could conclude that the convection is dominant and diffusion plays
only a minor role in this packed bed reactor example. The dash-dot line in Figure 4.13 rep-
resents an order of magnitude greater value of Deff . Clearly, this increase in axial diffusion
results in a decrease in conversion.
FIGURE 4.13 Axial profile of CA for packed bed reactor with Deff = 0.013 (solid line) and results
from steady state PFR simulations (dashed line) for comparison. The profile for a much higher value
of Deff = 0.13 (dash-dot line) is also shown.
Graetz problem: Consider steady heat transfer to a fluid in laminar flow through a circular
tube. The governing equation is

    u ∂T/∂z = (α/r) ∂/∂r (r ∂T/∂r)   (4.94)
where u is the velocity and α is the thermal diffusivity.
The inlet temperature at z = 0 is specified. The symmetry boundary condition is applicable at
the center, whereas either the temperature (Dirichlet) or the heat flux (Neumann) is specified
at the wall. We will consider the problem where the boundary temperature is specified:

    T(z=0, r) = T_0
    [∂T/∂r]_{r=0} = 0,  T(r=R) = T_b   (4.95)

The fully developed laminar velocity profile in the tube is

    u(r) = u_max (1 − (r/R)²)   (4.96)
FIGURE 4.14 Velocity profile and radial grid used in Graetz problem.
It needs to be emphasized (for readers without a heat transfer background) that for a tube,
R is a given value (the radius), while r is a variable that changes from 0 at the center to R at
the wall.
We have simulated transient parabolic PDEs before in this chapter. The PDE (4.94) is
exactly of the same form, the only difference being that both independent variables are
spatial. The same tools employing MoL can be applied to this example as well. We will dis-
cretize using finite differences in the radial direction. The velocity profile and computational
grid are schematically shown in Figure 4.14. Node 1 is located at the axis of symmetry (r = 0).
Since the temperature at the final node (at the wall) is specified, we do not need to solve for it.
Discretizing the PDE for radial nodes 1 to n yields the following set of ODEs in the axial
direction:
    dT_j/dz = (α/u_j) [ (T_{j+1} − 2T_j + T_{j−1})/h² + (1/r_j)(T_{j+1} − T_{j−1})/(2h) ],  j = 2 to (n − 1)

    dT_j/dz = (α/u_j) [ (T_b − 2T_j + T_{j−1})/h² + (1/r_j)(T_b − T_{j−1})/(2h) ],  j = n   (4.97)
where T_b is the specified boundary temperature. The symmetry boundary condition at the
first node results in the condition T_2 − T_0 = 0, yielding the following ODE:

    dT_j/dz = (α/u_j) (2T_{j+1} − 2T_j)/h²,  j = 1   (4.98)
Note that since ∂T/∂r is zero at the axis of symmetry, the (1/r)(∂T/∂r) term drops out of the
above equation. The following example shows the simulation of fluid flowing in a tube with
hot walls. This is the Graetz problem with constant boundary temperature.
Solution: This is a standard parabolic PDE, where we use MoL. Discretizing in radial
direction leads to a set of ODEs given by Equations 4.97 and 4.98. The following func-
tion file, which follows the same pattern as previous examples, is used:
function dTdz=graetzFun(z,T,par)
% Function to compute dT/dz for
% heat transfer in a heated tube
r=par.r;         % Radial coordinate
u=par.u;         % Axial velocity profile u(r)
n=par.n;         % Nbr of radial nodes
h=par.h;         % Radial step-size
alpha=par.alpha;
Tb=par.Tb;
% Model equations
dTdz=zeros(n,1);
for j=1:n
    if (j==1)
        term1=(2*T(j+1)-2*T(j))/h^2;
        term2=0;
    elseif (j==n)
        term1=(Tb-2*T(j)+T(j-1))/h^2;
        term2=(Tb-T(j-1))/(2*h*r(j));
    else
        term1=(T(j+1)-2*T(j)+T(j-1))/h^2;
        term2=(T(j+1)-T(j-1))/(2*h*r(j));
    end
    dTdz(j)=alpha/u(j)*(term1+term2);
end
Although obvious to intermediate and expert users, it bears repeating that the overall
methodology and structure of the file remain exactly as before. The following driver
script is used to solve the Graetz problem:
modelPar.L=L; modelPar.R=R;
modelPar.n=n; modelPar.h=h;
modelPar.r=r;
modelPar.u=u;
%% Set up and solve the PDE
T0=ones(n,1)*modelPar.Tin;
opts=odeset('relTol',1e-12,'absTol',1e-10);
[Zsol,Tsol]=ode15s(@(z,T) graetzFun(z,T,modelPar),...
[0, L], T0, opts);
%% Plot results and post-process
% Include wall temperature in Tsol
nAxial=length(Zsol);
Twall=modelPar.Tb*ones(nAxial,1);
Tsol=[Tsol,Twall];
% Temperature vs. location
figure(1);
plot(Zsol,Tsol(:,[1,26,51,76,86]));
xlabel('Axial location, z (m)');
ylabel('Temperature (K)');
% Temperature contours
figure(2)
subplot(2,1,1)
contourf(Zsol,r,Tsol');
colorbar
xlabel('Axial location, z (m)');
Figure 4.15 shows the temperature profile vs. axial distance for four different radial loca-
tions in the tube. The final temperature is higher and the temperature increases more rap-
idly as one goes closer to the hot wall.
FIGURE 4.15 Axial temperature profiles at various locations: r = 1.25, 2.50, 3.75, and 4.25 cm from
the central symmetry axis. The arrow indicates increasing radial coordinate.
FIGURE 4.16 Contours of temperature for parabolic (a) velocity profile and plug flow (b).
u=uMax/2*ones(n+1,1); % Plug-flow

Recall that the average velocity in a tube is u_avg = u_max/2. Comparing the contour maps,
the warm region "diffuses" into the tube faster with plug flow than with the parabolic velocity
profile. This is expected, since the asymptotic Nusselt number (Nu) for the two cases is 5.78
(plug) and 3.66 (parabolic).
The Nusselt number is defined as

    Nu = hD/k   (4.99)

where h is the heat transfer coefficient, D = 2R is the tube diameter, and k is the thermal
conductivity of the fluid.
The conductive heat flux at the wall is

    k [∂T/∂r]_{r=R}

This is the rate at which energy enters the tube through a unit area of the wall. At any axial
location, if ⟨T⟩ is the average temperature, the convective term for heat transfer to the fluid
is given by

    h (T_b − ⟨T⟩)

Since at steady state these two are equal at any axial location, we get

    Nu = hD/k = 2R [∂T/∂r]_{r=R} / (T_b − ⟨T⟩),  D = 2R   (4.100)
Note that this definition is valid for each axial location.
The term [∂T/∂r]_{r=R} can be obtained using the three-point backward difference formula:

    [∂T/∂r]_{r=R} ≈ (3T_b − 4T_n + T_{n−1})/(2h)   (4.101)

The average temperature ⟨T⟩ is the mixing-cup (velocity-weighted) temperature:

    ⟨T⟩_z = ∫₀ᴿ T(z,r) r u(r) dr / ∫₀ᴿ r u(r) dr   (4.102)
Since we have already precomputed vectors r and u, the denominator is calculated using
trapz. Likewise, we will use trapz to calculate the numerator at each axial location. The
code for calculating Nu profile is given below.
% Calculate mixing-cup T
Tavg=zeros(nAxial,1);
FIGURE 4.17 Nu profile for parabolic velocity profile for two values of umax.
denom=trapz(r,u.*r);
for i=1:nAxial
    Tradial=Tsol(i,:);
    numer=trapz(r,u.*r.*Tradial');
    Tavg(i)=numer/denom;
end
% Calculate dT/dr at wall
dTdr=zeros(nAxial,1);
for i=1:nAxial
    dTdr(i)=(3*Tb-4*Tsol(i,n)+Tsol(i,n-1))/(2*h);
end
% Calculate and plot Nusselt Nbr
Nu=2*R* dTdr ./ (modelPar.Tb-Tavg);
figure(3)
semilogy(Zsol,Nu)
Figure 4.17 shows the Nu profile for fully developed velocity in a tube. Note the logarithmic scale on the Y-axis. The value of Nu settles at its asymptotic value once the temperature profile is fully developed. Hence, at the lower velocity of 0.1 m/s, the temperature profile develops sooner and the asymptotic value of Nu is reached more rapidly.
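The recipe above can be sanity-checked on a profile whose Nu is known in closed form. The sketch below is a hypothetical check written in Python/NumPy rather than the chapter's MATLAB: for an assumed quadratic profile T(r) = Tb − ΔT(1 − (r/R)²) with parabolic u(r), the mixing-cup temperature is Tb − (2/3)ΔT and the wall gradient is 2ΔT/R, so Equation 4.100 gives Nu = 6 exactly.

```python
import numpy as np

def trapezoid(y, x):
    """Composite trapezoidal rule, mirroring MATLAB's trapz(x, y)."""
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0)

# Assumed test profile (not the PDE solution from the text):
# T(r) = Tb - dT*(1 - (r/R)^2) with parabolic u(r) gives Nu = 6 exactly
R, Tb, dT, umax = 0.05, 400.0, 50.0, 0.5
r = np.linspace(0.0, R, 201)
u = umax * (1.0 - (r / R) ** 2)
T = Tb - dT * (1.0 - (r / R) ** 2)

# Mixing-cup temperature (Equation 4.102)
Tavg = trapezoid(T * u * r, r) / trapezoid(u * r, r)

# Wall gradient via the three-point backward difference; the last grid
# point sits on the wall and carries Tb
dr = r[1] - r[0]
dTdr_wall = (3.0 * T[-1] - 4.0 * T[-2] + T[-3]) / (2.0 * dr)

# Local Nusselt number (Equation 4.100)
Nu = 2.0 * R * dTdr_wall / (Tb - Tavg)
print(Nu)  # analytic value for this profile is 6
```

The same check can be run against the MATLAB loop in the text by substituting this synthetic profile for one row of Tsol.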
The other variant of the Graetz problem has the flux specified at the boundary. This is left as an exercise for the reader.
4.6 EPILOGUE
Numerical techniques for solving transient PDEs were the focus of this chapter. An intro-
duction to PDEs was presented in Section 4.1. Most of the distributed parameter prob-
lems of interest to chemical engineers are expressed in the form of quasilinear PDEs of the
first and second order. The historic classification of these PDEs as elliptic, hyperbolic, and
parabolic was discussed. Elliptic PDEs arise in multidimensional steady state balances of
systems where diffusion terms play an important or predominant role. Hyperbolic PDEs
describe convection-dominated systems. The solution propagates without damping along
the characteristic curves. Parabolic PDEs, on the other hand, lie somewhere between ellip-
tic and hyperbolic; like hyperbolic systems, the solution propagates along the first indepen-
dent variable (usually time), but like elliptic systems, the solution at any point in the domain
is influenced by both the boundaries.
Numerical methods based on finite difference approximation and MoL were discussed
in Section 4.3 for hyperbolic PDEs and in Section 4.4 for parabolic PDEs. Methods that
are explicit in time are attractive because they are computationally less intensive than fully
implicit methods. MoL was the predominant approach discussed in this chapter to solve the
hyperbolic and parabolic PDEs. It involves discretizing the PDEs in space to convert them
into a series of ODEs in time. ODE solution techniques from the previous chapter may then
be used for solving the resulting problem.
The physical nature of the hyperbolic and parabolic PDEs determines the choice of
numerical method used to solve the problem. I showed that the central difference formula for discretizing the convection term can result in unstable behavior of the PDE solution. Thus, the upwind method (either first or higher order) was suggested for spatial discretization in hyperbolic PDEs.
Section 4.3.4 also discussed the issue of numerical diffusion in hyperbolic PDEs.
The diffusion term in parabolic PDEs results in a stabilizing effect, allowing one to use
the central difference formula for discretizing in the spatial domain.
The Crank-Nicolson method was introduced as an implicit-in-time numerical solution technique based on finite differences (Sections 4.3.2 and 4.4.2). However, this method was not discussed further; the actual solution using the Crank-Nicolson method will be discussed in Chapter 8, in Section II of this book.
In summary, MoL is the method of choice for solving hyperbolic and parabolic PDEs.
MoL allows us to use higher-order formulae for integrating in time. The nature of the two
systems dictates the use of upwind method and central difference for discretization in
hyperbolic and parabolic PDEs, respectively. The MATLAB ode45 and ode15s are the
preferred solvers for solving the discretized equations.
EXERCISES
Problem 4.1 (a) Derive the leapfrogging scheme for first-order hyperbolic PDE (4.28)
using a central difference in time and central difference in space.
(b) Derive forward-in-time second-order upwind method for first-order
hyperbolic PDE (4.28) with u > 0. Use forward difference approximation
for ϕt and a three-point backward difference formula for ϕx.
(c) How does the upwind method derived in (b) above change for u < 0?
(d) Derive the fully implicit formula for hyperbolic PDE using first-order
approximation in time and central difference in space. Extend it to para-
bolic PDEs as well.
Problem 4.2 Solve the PFR with first-order reaction (see Example 4.1) using various
explicit forward difference in time and upwind in space discretization from
Section 4.3.1.
Problem 4.3 Complete the problem in Example 4.2 by writing the code for MoL with
three-point upwind method. Verify results shown in Figures 4.4 and 4.5.
Problem 4.4 Convert the general parabolic PDE (4.64) into a set of ODEs using MoL.
Use central difference formula for the diffusive term and upwind method
for the convective term.
Problem 4.5 The adiabatic PFR problem in Section 4.5.1 was solved ignoring the axial diffusion. If axial diffusion were relevant, the result would be a parabolic PDE.
(a) If D is the diffusivity and λ is the thermal conductivity, then write down
the mass and energy balance equations for a reactor with axial diffusion.
(b) Solve the above problem using MoL. The Péclet numbers for heat and mass transfer are PeH = uL/α = 1 and PeM = uL/D = 2.5, respectively. Note that α = λ/ρcp is the thermal diffusivity. No-flux boundary conditions may be chosen at the exit, that is, ∂C/∂z|z=L = 0 and ∂T/∂z|z=L = 0.
Problem 4.6 Solve the Graetz problem with flux specified at the boundary. The model
equation and boundary conditions are given by
u ∂T/∂z = (α/r) ∂/∂r ( r ∂T/∂r )    (4.94)

T(z = 0, r) = T0

[∂T/∂r]r=0 = 0,  [∂T/∂r]r=R = β    (4.103)

u(r) = umax ( 1 − (r/R)² )    (4.96)
Problem 4.7 Compute the Nu profiles for the above problem and compare with the
constant wall temperature case.
CHAPTER 5
Section Wrap-Up
Simulation and Analysis
Several examples were covered as case studies in this part of the book. Chapter 2 laid out the basics of linear systems, which we will use in this chapter to bring together the contents of this part. The two chapters that followed focused on process simulation examples, where the models were intended to capture how the solutions evolve in time.
Chapter 3 covered examples that can be cast into an ordinary differential equation–initial
value problem (ODE-IVP) of the type
dy/dt = f(t, y)    (5.1)

y(t0) = y0    (5.2)
These included transient simulations, where y ∈ R n is a solution vector that varied in time,
as well as cases where the solution variable y varied along a single spatial direction. The fol-
lowing case studies were discussed:
* A hybrid system is governed by continuous dynamics and binary (or integer) switching variables. In the two-tank heater case study, the flow between the two tanks was turned on or off based on the heights of the liquid in the two tanks. In systems engineering parlance, such systems are known as autonomous hybrid systems. In contrast, the other type of hybrid system involves external binary decisions, such as an inlet that may be switched on or off.
Chapter 4 extended our analysis to systems that vary in more than one dimension (space and time, or two spatial dimensions). Such systems* are governed by partial differential
equations (PDEs). The chapter focused on hyperbolic and parabolic PDEs. Such systems
arise due to material, energy, or momentum balance equations and are represented as
∂φ/∂t + ∂(uφ)/∂z = ∂/∂z ( Γ ∂φ/∂z ) + S(φ)    (5.3)
The first term on the right-hand side is the diffusive term. The PDE is said to be parabolic if
this term is included in the model, whereas it is hyperbolic if it can be ignored. The method
of lines approach used finite difference approximation to discretize the z-dimension and
convert the PDE (5.3) into a set of ordinary differential equations (ODEs) of the form
Equation 5.1. With such a conversion
y = [φ1, φ2, …, φn]ᵀ    (5.4)
where ϕi represents the variable ϕ at the ith spatial node. A good balance of case studies was
provided in Chapter 4 to cover the most important systems:
This chapter builds upon the understanding gained in the previous two chapters for simu-
lation and linear analysis of process systems. The examples are meant to highlight a rather
broad spectrum of problems. Often, textbooks or courses deal with examples that can be bucketed into "ODE-IVP," "parabolic PDE," or "hyperbolic PDE" molds. In simulating practical systems, however, such distinctions are not so clearly defined. The case studies in this chapter are intended to break these molds; the intention is to show that the concepts of the previous chapters remain very much applicable once we free our minds of these categories.
* Process variables in such systems vary with both time and space. Hence, systems governed by PDEs are referred to as distributed parameter systems. On the other hand, systems defined by ODEs have their properties evolve in one direction (time for transient systems, or one spatial dimension for examples such as the PFR), and are therefore termed lumped parameter systems. This distinction is not particularly useful for simulating the processes, but is sometimes useful in analysis and control.
The core concepts that I will build on in this chapter include eigenvalues, eigenvectors,
and decomposition (Chapter 2); explicit and stiff ODE solving using ode45 and ode15s
(Chapter 3); and finite difference discretization using central or upwind differencing
(Chapter 4).*
5.1.1 Model Description
The mole fraction of the lighter component will be the process state variable.
Figure 5.1 also shows the representation of a single equilibrium stage in the column. The liquid and vapor on a stage are in equilibrium, and they flow from one stage to the next; the process is thus modeled as a sequence of equilibrium stages. For the ith stage shown in the figure, the mole fractions of the lighter component in the liquid and gas phases are represented as xi and yi, respectively. These are assumed to be in equilibrium. The molar flowrates of the liquid and vapor streams exiting the stage are Li and Vi, respectively.
FIGURE 5.1 Schematic of (a) a distillation column and (b) a single equilibrium stage in the column.
* In each chapter, we spent some time discussing theory, and then solving problems that used these concepts to provide
efficient solutions. I would like to emphasize that the definitions such as stiff vs. nonstiff or hyperbolic vs. parabolic are
not critical. What matters more is an understanding about how these categories affect our choice of numerical techniques.
The distillate is drawn from the condenser at the top of the column. The distillate flow-
rate is D and the mole fraction is xD. The bottoms is drawn from the reboiler at a flowrate of
B and mole fraction xB. The feed enters the feed tray as a saturated liquid, with a flowrate F
and mole fraction z.
The liquid holdup on each stage is assumed constant. Thus, the overall material balance
requires the following condition to be satisfied:
F = D + B
A couple more assumptions are made to simplify the model. The first one is that of constant
relative volatility, which is defined as
α = [ y/(1 − y) ] / [ x/(1 − x) ]    (5.5)
The larger the value of α, the easier it is to separate the two components. Constant relative volatility is a reasonable assumption for mixtures that are close to ideal. Thus, the liquid- and vapor-phase compositions are related to each other as
yi = α xi / ( 1 + (α − 1) xi )    (5.6)
The second assumption is that of a constant molar flowrate. In other words, Li and Vi are con-
stant in the stripping (below the feed stage) and rectification (above the feed stage) sections.
The molar balance on the light component (see schematic in Figure 5.1) on each tray
yields
Mi dxi/dt = Vi+1 yi+1 + Li−1 xi−1 − Vi yi − Li xi    (5.7)
The model for the feed tray (see Equation 5.12) will include a feed term as well.
One of the assumptions was that the molar flowrates remain constant. Thus, the vapor-phase flowrate remains constant at V within the entire column. Likewise, within the rectification section (i < ifeed), the liquid molar flowrate is constant at L. However, the feed is added as liquid on the feed tray; so, from the feed tray down through the stripping section (i ≥ ifeed), Li = L + F.
The liquid stream that enters the reboiler is split into two streams, the bottoms and a boil-up stream. Thus

L + F = V + B

Likewise, the vapor that enters the condenser is split into the distillate and a reflux stream:

V = L + D
Rectification section: Li = L, Vi = V    (5.8)

Stripping section: Li = Ls, Vi = V, where Ls = L + F    (5.9)

Condenser: MC dx1/dt = V y2 − D x1 − L x1    (5.10)

Rectification stages: M dxi/dt = V(yi+1 − yi) + L(xi−1 − xi)    (5.11)

Feed stage: M dxi/dt = V(yi+1 − yi) + L xi−1 + F zf − Ls xi    (5.12)

Stripping stages: M dxi/dt = V(yi+1 − yi) + Ls(xi−1 − xi)    (5.13)

Reboiler: MR dxN/dt = Ls xN−1 − V yN − B xN    (5.14)
The following example simulates the binary distillation column for an easy separation
process with α = 4. The distillation column has five stages, one each in the rectification and
stripping sections.
Solution: We will simulate the system in the so-called "D-V mode," in which the distillate and vapor flowrates are specified. Since there are only two degrees of freedom, the other flowrates are computed from these two. Other modes include the L-V mode, or one in which the reflux ratio (RR = L/D) is specified along with one bottoms specification.
The readers who have followed the previous chapters will realize that the binary distil-
lation function file is written in the same manner as before:
V=par.V; L=par.L;
D=par.D; B=par.B;
F=par.F; xf=par.xf;
Ls = L+F;
The driver code for the distillation column is given below. Again, the driver script is
self-explanatory. The above code as well as the driver script are hard-coded for five
stages. Expanding it to a larger number of stages is left as an exercise. Since α = 4, this
is a rather easy separation.
F=25;
D=12.5; % D-V mode: D is given
V=30.0; % D-V mode: V is given
L=V-D;
B=F-D;
modelPar.F=F;
modelPar.B=B; modelPar.D=D;
modelPar.V=V; modelPar.L=L;
modelPar.alpha=4;
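For readers who want a self-contained version before writing the MATLAB files, the sketch below assembles Equations 5.10 through 5.14 for the five-stage column and marches them to steady state. It is an illustrative Python/NumPy translation, not the book's code: the feed composition z = 0.5 and the stage holdups are assumed values (they come from the data table, which is not reproduced here), and a small-step explicit Euler march stands in for ode45.

```python
import numpy as np

# Flowrates from the driver script (D-V mode)
F, D, V = 25.0, 12.5, 30.0
L = V - D            # condenser balance, V = L + D
B = F - D            # overall balance, F = D + B
Ls = L + F           # stripping-section liquid, Ls = L + F
alpha = 4.0
z = 0.5              # assumed feed mole fraction
M = np.array([5.0, 1.0, 1.0, 1.0, 5.0])   # assumed holdups MC, M, M, M, MR

def equil(x):
    """Vapor in equilibrium with liquid (Equation 5.6)."""
    return alpha * x / (1.0 + (alpha - 1.0) * x)

def rhs(x):
    """Stage balances: condenser, rectification, feed, stripping, reboiler."""
    y = equil(x)
    dx = np.empty(5)
    dx[0] = V * y[1] - D * x[0] - L * x[0]                    # Eq. 5.10
    dx[1] = V * (y[2] - y[1]) + L * (x[0] - x[1])             # Eq. 5.11
    dx[2] = V * (y[3] - y[2]) + L * x[1] + F * z - Ls * x[2]  # Eq. 5.12
    dx[3] = V * (y[4] - y[3]) + Ls * (x[2] - x[3])            # Eq. 5.13
    dx[4] = Ls * x[3] - V * y[4] - B * x[4]                   # Eq. 5.14
    return dx / M

# Explicit Euler march to steady state
x = np.full(5, z)
dt = 0.002
for _ in range(200_000):
    x = x + dt * rhs(x)

# At steady state the light component must close: F*z = D*x1 + B*xN
print(x)
print(F * z - D * x[0] - B * x[4])
```

Whatever values are assumed for z and the holdups, the steady state must close the component balance F·z = D·x1 + B·xN, which is a useful check on any implementation.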
FIGURE 5.2 Transient results for the data given in the table.
%% Plotting results
figure(1)
plot(tSol,xSol);
figure(2)
plot(xSol(end,:)); hold on;
R=L/D;
disp(['Reflux ratio: ',num2str(R), ...
'=> xd: ',num2str(xSol(end,1))]);
The transient simulation results are shown in Figure 5.2. Starting at the initial conditions, the compositions settle at their steady state values in about 10 min. At the simulation conditions, the distillate at 12.5 mol/s contains 89.2% by moles of the lighter component. The reflux ratio is 1.4.
FIGURE 5.3 Steady state profile of compositions at various trays for three different conditions (RR = 1.4, 2, and 8).
FIGURE 5.4 Steady state composition profiles at various trays for three values of relative volatility (α = 4, 3, and 2).
At a higher reflux ratio, the internal flowrates (L, V, and Ls) are significantly higher to achieve the same separation. This will require a larger distillation column.
Alternatively, a distillation column with a larger number of trays may be used to achieve
a higher amount of separation. This is left as an exercise.
Relative volatility is an indicator of how easy or difficult separation through distillation
will be. Two close-boiling liquids will have relative volatility closer to 1 and are difficult to
separate. On the other hand, mixtures with large α are easier to separate. Figure 5.4 shows
the performance of a distillation column for three different values of relative volatility. As
relative volatility decreases, the purity of the two components also decreases. This indicates
that separation of liquids with small values of α would require a larger number of trays or
higher reflux or both.
An important tool for analyzing systems in multiple dimensions is linear stability analysis. In this section, the need for linear stability analysis will be first motivated using the chemostat example. Qualitatively different types of dynamical behavior of linear systems will then be reviewed in the subsequent subsections.
As I shall argue in this section, the analysis of linear systems is itself an important field. The analysis in this section will build upon linear algebra concepts discussed in Chapter 2, specifically eigenvalues, eigenvectors, and vector subspaces. It will be useful to review those concepts to get a better understanding of linear system analysis.
Even to an undergraduate student, this topic is not entirely new. We have been intro-
duced to the topic of stability analysis in the Process Dynamics and Control course. One
may recall the discussion that a stable system has all its poles in the left-half plane. The same
requirement for closed-loop systems gave rise to the Routh-Hurwitz criterion for stability.
These criteria were introduced for analysis in the Laplace domain; similar criteria will be
generalized and discussed in the time domain, applied to a (multidimensional) system* of
ODEs y′ = Ay.
Since S and X do not depend on P, we will ignore P from further discussion. Recall from
Chapter 3 that starting at an arbitrary initial condition, the system attained steady state
and reached the above values. This indicates that the above steady state is a stable steady
state.
The stability of a system can be analyzed in the vicinity of the steady state by linearizing
the nonlinear model at that steady state. A function f(y) can be linearized around steady
state ySS using multivariate Taylor’s series expansion:
f ( y ) » f ( y SS ) + Ñf ( y SS ) ( y - y SS ) (5.16)
where J = ∇f(ySS) is the Jacobian. Only the first-order term is retained for linearization. By
definition, at the steady state solution, f(ySS) = 0. Thus, the linearized model is given by
dȳ/dt ≈ J ȳ    (5.17)
* The discussion in control texts is for a controlled system y′ = Ay + Bu. Here, we are only analyzing the “autonomous” part of
the system, without the input manipulation. The stability concepts, though, are general.
where ȳ = y − ySS represents the deviation from the steady state value. In the case of the chemostat, the Jacobian is calculated as
J = [ ∂f1/∂S, ∂f1/∂X ; ∂f2/∂S, ∂f2/∂X ]    (5.18)

where, from the model equations,

f1 = D(Sf − S) − μSX/(K + S),  f2 = −DX + Yxs μSX/(K + S)    (5.19)

J = [ −D − μXK/(K + S)², −μS/(K + S) ; Yxs μXK/(K + S)², −D + Yxs μS/(K + S) ]    (5.20)

Substituting the steady state values,

J = [ -4.06, -0.1333 ; 2.97, 0 ]
The eigenvalues and eigenvectors of the system are computed using the eig command.
The eigenvalues are λ1 = − 3.96 and λ2 = − 0.1, and the corresponding eigenvectors are
v1 = [ -0.8 ; 0.6 ],  v2 = [ 0.0337 ; -0.9994 ]
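The same computation can be cross-checked outside MATLAB; numpy.linalg.eig plays the role of eig. A sketch (the matrix entries are the rounded values quoted above):

```python
import numpy as np

# Jacobian of the chemostat at the nontrivial steady state (rounded values)
J = np.array([[-4.06, -0.1333],
              [ 2.97,  0.0   ]])

lam, V = np.linalg.eig(J)

# Order the modes so the fast eigenvalue (about -3.96) comes first
order = np.argsort(lam.real)
lam, V = lam[order], V[:, order]

print(lam)       # approximately [-3.96, -0.1]
print(V)         # columns are eigenvectors (sign and scale are arbitrary)
```

Note that eigenvectors are defined only up to sign and scale, so a computed column may be the negative of the one printed by MATLAB.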
Eigenvalue decomposition was discussed in Chapter 2. Any matrix with distinct eigenvalues can be written in the form J = VΛV−1, where V is a matrix whose n columns are composed of the n eigenvectors of the matrix. Coordinate transformation was also discussed in Chapter 2; the state is transformed as
y = [v1 v2] ȳ    (5.21)
where ȳ is the same vector expressed in the transformed coordinate system. Therefore
V ȳ′ = J V ȳ,  i.e.,  ȳ′ = V⁻¹ J V ȳ = Λ ȳ    (5.22)

Since Λ is diagonal, the transformed ODEs are decoupled:

dȳi/dt = λi ȳi  ⇒  ȳi(t) = ȳi(0) e^(λi t)    (5.23)
The above is stable iff the real parts of the eigenvalues satisfy re(λi) < 0. In vector form, ȳ(t) = e^(Λt) ȳ(0), so that y(t) = V e^(Λt) V⁻¹ y(0).
Stability condition for y' = Jy: Thus, the condition for stability of a linear ODE is that the
real part of all eigenvalues of the matrix J should be less than zero. When the eigenvalues
of the Jacobian are plotted on a complex plane, all the eigenvalues should lie in the left-half
plane for the system to be linearly stable.
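The decomposition also yields the solution itself, y(t) = V e^(Λt) V⁻¹ y(0), which can be checked against direct numerical integration. A sketch (Python/NumPy; a fixed-step classical RK4 loop stands in for ode45, and the initial condition is arbitrary):

```python
import numpy as np

J = np.array([[-4.06, -0.1333],
              [ 2.97,  0.0   ]])
y0 = np.array([1.0, -0.5])    # arbitrary initial deviation from steady state
t_end = 2.0

# Analytic solution via the eigenvalue decomposition: y(t) = V exp(Lt) V^-1 y0
lam, V = np.linalg.eig(J)
y_exact = np.real(V @ np.diag(np.exp(lam * t_end)) @ np.linalg.inv(V) @ y0)

# Reference: classical fixed-step RK4 integration of y' = J y
n = 2000
h = t_end / n
y = y0.copy()
for _ in range(n):
    k1 = J @ y
    k2 = J @ (y + 0.5 * h * k1)
    k3 = J @ (y + 0.5 * h * k2)
    k4 = J @ (y + h * k3)
    y = y + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)

print(y_exact, y)   # the two solutions agree to many digits
```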
One can check that the linear system is indeed stable, as evidenced by xSol reaching the origin at the end time. The values of the Jacobian and its eigenvalues and eigenvectors are

J = [ -4.06, -0.1333 ; 2.97, 0 ],  λi = {−3.96, −0.1}    (5.24)

v1 = [ -0.8 ; 0.6 ],  v2 = [ 0.0337 ; -0.9994 ]    (5.25)
S′ = D(Sf − S) − μSX/(K + S)    (5.26)
FIGURE 5.5 Phase-plane plot for linear bioreactor from Example 5.2. The two dashed lines repre-
sent the two eigenvectors.
X′ = −DX + Yxs μSX/(K + S)    (5.27)

Setting X′ = 0 at steady state,

0 = X ( −D + Yxs μS/(K + S) )    (5.28)
has two solutions. The first one was mentioned in (5.15) above, whereas X = 0 (and hence S = Sf) is the second solution. This second solution may be called the trivial solution.
Starting from a condition X = 0, we always reach the trivial solution. However, if the sim-
ulations are started with a very small positive value (say X = 10−3), the system still reaches
the steady state denoted in (5.15). This behavior can be further understood by calculating
the Jacobian at the trivial steady state. Specifically
J = [ -0.1, -0.476 ; 0, 0.257 ]
the two eigenvalues are λ1 = − 0.1 and λ2 = 0.257, and the corresponding eigenvectors are
v1 = [ 1 ; 0 ],  v2 = [ -0.8 ; 0.6 ]
With a change of coordinate axes to the eigenvectors, the first eigenvalue represents a stable response. The first eigenvector spans the stable subspace: starting at any point along v1, the system asymptotically approaches the steady state. Since the first eigenvector is the S-axis, an initial condition [S0 0]ᵀ, where S0 is any arbitrary value, will result in a stable
response. This is exactly what we observed with a nonlinear system in Chapter 3, where
X(0) = 0 led to the trivial solution. The system linearized at the trivial steady state can be
analyzed in a manner similar to Example 5.2, with steady state values, S=5 and X=0.
In summary, the eigenvalues of the Jacobian matrix determine the local stability of a
system. For an n × n Jacobian matrix, if all n eigenvalues have negative real parts, then the
system is stable in the vicinity of that steady state. If only m eigenvalues have negative real
parts (with m < n), the m-dimensional subspace spanned by the corresponding eigenvectors
is the stable subspace, whereas the remaining (n–m)-dimensional subspace is the unstable
subspace of the system. The stability of a linear system is further analyzed in this section.
These ideas carry over to a general n-dimensional system. The preceding discussion not only motivates the need to better understand linear dynamics but also provides a template to do so.
The case of real and distinct eigenvalues was discussed in the preceding section. Consider
the ODE
y′ = Ay    (5.29)
ȳ′ = V⁻¹ A V ȳ = Λ ȳ    (5.22)
converted the original ODE into a set of decoupled ODEs. Based on this diagonalization, we argued that systems whose eigenvalues lie in the left-half plane (i.e., have negative real parts) are stable. This result is not limited to distinct eigenvalues but is applicable to all linear
dynamical systems. The proof of the above result is skipped since this is beyond the focus
of this book.
Nonetheless, let us take the discussion further using 2D examples in the rest of this sec-
tion. We will first take the case of real and distinct eigenvalues, and then extend it to other
general cases.
ȳ = [ c1 e^(λ1 t) ; c2 e^(λ2 t) ]    (5.30)
The solution, y, is obtained by changing the coordinate system back to the original coor-
dinates. From Equation 5.21
y = [v1 v2] [ c1 e^(λ1 t) ; c2 e^(λ2 t) ]  =  c1 v1 e^(λ1 t) + c2 v2 e^(λ2 t)    (5.31)
which is a familiar result from an introductory calculus course. The constants of integration
are obtained by substituting the initial condition:
y0 = c1 v1 + c2 v2 = [v1 v2] [ c1 ; c2 ]    (5.32)
The implication of the above result is interesting. If the initial condition y0 lies on the first
eigenvector, y0 = c1v1 and c2 = 0.
The expression (5.32) also clarifies the geometric interpretation of results discussed in
previous subsection, which is summarized below:
1. If both λ1 and λ2 are negative, the system is stable, starting at any point in the state
space.
2. If both λ1 and λ2 are positive, the system is unstable at any point in the state space.
3. If λ1 is negative and λ2 is positive, then the system is a saddle node, with the first eigenvector as the stable subspace and the second eigenvector as the unstable subspace. In other words, if the initial condition is y0 = αv1, then the system is stable: as per (5.32), the solution is y(t) = α v1 e^(λ1 t), and since λ1 < 0, it decays to the origin.
The dynamics for this case are interesting. If y0 is not in the stable subspace, the solution is

y(t) = c1 v1 e^(λ1 t) + c2 v2 e^(λ2 t)

The dynamics along v1 are convergent and die out after sufficient time, whereas the dynamics in the direction of v2 are divergent.
4. The final case is when λ1 is negative and λ2 = 0. The system is stable along the first
eigenvector, whereas it remains unchanged along v2.
As a first example, consider
y′ = A y,  A = [ -4, 1 ; 2, -3 ]    (5.33)
The two eigenvalues are λ = {−5, −2}, and the two eigenvectors are
v1 = [ 1 ; -1 ],  v2 = [ 1 ; 2 ]
Figure 5.6 shows the transient response of the ODE for two different initial conditions.
The first is starting from the first eigenvector. Specifically, y0 = v1. The system will evolve
along this eigenvector, with its response governed by the first eigenvalue; that is
y(t) = [ e^(-5t) ; -e^(-5t) ]
FIGURE 5.6 Transient response and phase portrait for linear ODE with A=[-4, 1; 2, -3].
The two eigenvectors are v1=[1; -1] and v2=[1; 2].
Similarly, if the system starts along the second eigenvector, with y0 = 0.5 v2, the response is governed by the second eigenvalue:

y(t) = [ 0.5 e^(-2t) ; e^(-2t) ]
These two responses are shown in the left panels of Figure 5.6. Clearly, the system responds faster if the initial condition is along the eigenvector v1, and slower if it is along v2. Thus, the eigenvector with the more negative eigenvalue is the fast mode of the system, and the one with the less negative eigenvalue is the slow mode.
The right-hand panel of Figure 5.6 shows a phase-plane plot (also known as a phase portrait). This is a 2D plot of the state variables y1 vs. y2; it shows how the system responds starting from various points in the state space. Specifically, for various initial conditions, ode45 is used to find the variation of y with time; the locus of points (y1, y2) gives the overall phase portrait of the system, which charts out the dynamic response in the relevant regions of the state space. The dashed lines on the phase portrait represent the system starting along either of the two eigenvectors. Since v1 = [1 -1]ᵀ is the fast eigenmode, the system response along v1 is faster than the response along v2. The trajectories therefore approach v2 and then move along it to reach the origin.
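The collapse onto the slow eigenvector can be quantified with Equation 5.31: the component along v1 decays as e^(-5t) and that along v2 as e^(-2t), so after a short time the state is essentially parallel to v2. A sketch (Python/NumPy; the initial condition is an arbitrary assumed point):

```python
import numpy as np

v1 = np.array([1.0, -1.0])    # fast eigenvector (lambda1 = -5)
v2 = np.array([1.0,  2.0])    # slow eigenvector (lambda2 = -2)

# Decompose an arbitrary initial condition: y0 = c1*v1 + c2*v2
y0 = np.array([3.0, 0.5])
c1, c2 = np.linalg.solve(np.column_stack([v1, v2]), y0)

def y(t):
    """Exact solution, Equation 5.31."""
    return c1 * v1 * np.exp(-5.0 * t) + c2 * v2 * np.exp(-2.0 * t)

# By t = 2 the fast component has decayed by a further e^-6 relative to
# the slow one, so y(t) is nearly parallel to v2
yt = y(2.0)
cos_angle = yt @ v2 / (np.linalg.norm(yt) * np.linalg.norm(v2))
print(cos_angle)   # very close to 1: the trajectory has collapsed onto v2
```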
Figures 5.7 and 5.8 show the phase portraits of the linear system when the eigenvalues are
real and distinct. The eigenvalues are mentioned in the figure captions, and the eigenvec-
tors are indicated as dashed thick lines. When the two eigenvalues are negative and close
to each other (Figure 5.7a), the trajectories move uniformly in the state space. When the
two eigenvalues are an order of magnitude different (as in panels b and c), the trajectories
move rapidly along the fast eigenvector to reach the slow eigenvector; after this, the trajectories move along the slow eigenvector. When one eigenvalue is increased to 0 (Figure 5.7d), the trajectories move only along the first eigenvector and remain unchanged along the second.
FIGURE 5.7 Phase portrait for various 2D linear systems with two distinct real eigenvalues: (a) a
stable node with (λ1 = − 5, λ2 = − 4), (b) a stable node with (λ1 = − 5, λ2 = − 0.5), (c) a stable node with
(λ1 = − 0.5, λ2 = − 5), and (d) a case when one eigenvalue is 0. The two eigenvectors are v1=[1; -1]
and v2=[1; 2].
FIGURE 5.8 Phase portrait for various unstable 2D linear systems with two distinct real eigenval-
ues: (a) an unstable saddle node (λ1 = − 5, λ2 = 2) and (b) an unstable node (λ1 = 5, λ2 = 2). The two
eigenvectors are v1=[1; -1] and v2=[1; 2].
When one of the eigenvalues becomes positive (see Figure 5.8a), the system behaves as a
saddle node. The first eigenvector is the stable subspace, whereas the second eigenvector is unstable; the overall system is unstable. If the initial condition is along the stable eigenvector, the system asymptotically goes to the origin. Starting from other points in the state space,
When the other eigenvalue is positive as well (Figure 5.8b), the system becomes unstable in
the entire 2D space. The linear systems of ODEs for which the results in Figures 5.7 and 5.8
are generated are given below. The subscripts represent the phase portrait responses (a)
through (f):
One can easily verify using the eig command that the above matrices represent the ODE
systems y′ = A(⋅)y for Figures 5.7 and 5.8.
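Although the six matrices are not reproduced in this excerpt, any of them can be regenerated from the stated eigenpairs by running the eigenvalue decomposition in reverse, A = VΛV⁻¹. A sketch (Python/NumPy) for the eigenvalues of Figure 5.7a, (−5, −4), with the common eigenvectors v1 = [1; −1] and v2 = [1; 2]; the result is one valid reconstruction, not necessarily the matrix printed in the book:

```python
import numpy as np

V = np.array([[1.0, 1.0],
              [-1.0, 2.0]])       # columns: v1 = [1, -1], v2 = [1, 2]
lam = np.array([-5.0, -4.0])      # eigenvalues for Figure 5.7a

# Eigenvalue decomposition in reverse: A = V L V^-1
A = V @ np.diag(lam) @ np.linalg.inv(V)

# Verify with eig, as the text suggests
print(A)
print(np.sort(np.linalg.eigvals(A).real))   # recovers -5 and -4
```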
The phase portrait when eigenvalues are complex is shown in Figure 5.9. Complex eigen-
values always occur in complex conjugate pairs. Since
e^((a+ib)t) = e^(at) ( cos(bt) + i sin(bt) )
the real part of the eigenvalues determines stability and the imaginary part results in oscillatory behavior. Thus, if the real part re(λi) is negative, the system exhibits a stable spiral in the phase portrait; if re(λi) is positive, the system exhibits an unstable spiral; and if the eigenvalues are purely imaginary, the phase portrait consists of concentric closed cyclic trajectories. This is known as a center. This behavior is discussed in the next example.

FIGURE 5.9 Various examples when eigenvalues are a complex conjugate pair: (a) a stable spiral when re(λi) < 0, (b) a center when λi are imaginary, and (c) an unstable spiral when re(λi) > 0.
5.2.2.2 An Example
A classical example is a mass-spring system. The motion of a mass suspended from an ideal
spring that is displaced from its natural position is given by
m x″ + k x = 0    (5.34)
Consider displacement (x) and velocity (v) as the two state variables:
d/dt [ x ; v ] = [ 0, 1 ; -k/m, 0 ] [ x ; v ]    (5.35)
The eigenvalues of the above matrix are λi = ±i√(k/m). Thus, the mass-spring system is qualitatively represented by the phase portrait in Figure 5.9b. The mass at rest in its equilibrium
position, x = 0, continues to remain at rest at that position. If the mass is displaced from its
equilibrium position, the spring makes the mass return to its original position. However,
in the absence of damping, the mass will keep oscillating around its equilibrium position.
If the original displacement of the mass was greater, the mass will keep oscillating with a larger amplitude. The phase portrait would therefore be one of various concentric closed cycles, depending on the original displacement of the mass from its equilibrium position.
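A quick numerical check of the center (Python/NumPy; the mass and spring constant are arbitrary assumed values): the eigenvalues of the state matrix are purely imaginary, and along the analytic orbit x(t) = x0 cos(ωt), v(t) = −x0 ω sin(ωt) the quantity (k/m)x² + v², proportional to the total energy, stays constant, so each trajectory is a closed cycle.

```python
import numpy as np

m, k = 2.0, 8.0                  # assumed mass and spring constant
w = np.sqrt(k / m)               # natural frequency, omega = sqrt(k/m) = 2

A = np.array([[0.0, 1.0],
              [-k / m, 0.0]])
lam = np.linalg.eigvals(A)
print(lam)                       # purely imaginary pair, +/- i*omega

# Analytic orbit for an initial displacement x0 released from rest
x0 = 0.3
t = np.linspace(0.0, 10.0, 500)
x = x0 * np.cos(w * t)
v = -x0 * w * np.sin(w * t)

# Energy-like invariant: constant along the orbit => closed cycle
invariant = (k / m) * x**2 + v**2
print(invariant.max() - invariant.min())   # essentially zero
```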
5.2.2.3 Summary
Figure 5.10 summarizes the qualitative responses of the system for various eigenvalues
denoted in the central figure on a complex plane. Eigenvalues on the real axis result in stable or unstable nodes, whereas complex conjugate eigenvalues result in spirals or a center.
The discussion of the linear system y′ = Ay has so far focused on the effect of eigenvalues on the system dynamic behavior. The eigenvalues determine the asymptotic or long-time behavior of the system. Eigenvalues with
negative real parts indicate stable response. When the two eigenvalues are close to each
other, the system does not “favor” any direction (see Figure 5.7a). However, when the
ratio of eigenvalues increases, the system response shows “faster” and “slower” modes (see
Figure 5.7b and c). The larger the magnitude of a negative eigenvalue, the faster is the system response along the corresponding eigenvector. This contrasting behavior along the fast and slow modes becomes especially significant as the ratio of the eigenvalues increases.
While the previous discussion focused on eigenvalues, the importance of eigenvectors will be highlighted in this subsection. A normal matrix is one that commutes with its transpose:

AAᵀ = AᵀA (5.36)
An important property that follows from this definition is that the eigenvectors of a nor-
mal matrix are orthogonal to each other. Conversely, a nonnormal matrix is one where
the eigenvectors are not orthogonal. The geometric interpretation is that a highly nonnormal matrix has its eigenvectors far from orthogonal (i.e., at a small angle to each other).
There are several ways to quantify nonnormality. The one definition that is aligned to
the discussion in this section is adapted from the one presented by Ruhe.* When a matrix
is normal, the eigenvalues and singular values are equal. Thus, if |λi| and σi represent an
ordered set of eigenvalues (based on modulus) and singular values, respectively, then
maxᵢ |σi − |λi|| (5.37)
* Ruhe, A., On the closeness of eigenvalues and singular values for almost normal matrices, Linear Algebra and Its Applications, 11 (1975), 87–94.
Section Wrap-Up ◾ 199
is one measure of nonnormality. Another measure is based on condition number. The condi-
tion number of a normal matrix is equal to the ratio of its largest and smallest (by modulus)
eigenvalues. Thus, the relative values of condition number and the ratio of eigenvalues, that is
μ = σ1/σn and μλ = |λmax|/|λmin| (5.38)

indicate the degree of nonnormality. Recall that the phase portrait of the system y′ = Ay with

A = [−4 1; 2 −3] (5.33)
was shown in Figure 5.6. The matrix is nonnormal; however, the two eigenvectors are close
to being orthogonal. Hence, the condition number
>> cond(A)
ans =
2.6180
is close to the ratio of its eigenvalues, 2.5. The system is therefore only slightly nonnormal: starting at any initial condition, the magnitudes of the dependent state variables y1, y2 fall monotonically toward the origin.
Now consider another matrix
A1 = [4 −9; 6 −11]
which has the same eigenvalues (λ = {−5, −2}) as the original matrix, A. The eigenvectors
of A1 are given as
w1 = [1; 1], w2 = [1.5; 1]
200 ◾ Computational Techniques for Process Simulation and Analysis Using MATLAB®
The eigenvectors of matrix A1 make a small angle (w1 and w2 are 11.3° apart). The condition
number of this matrix
>> cond(A1)
ans =
25.3606
is an order of magnitude greater than the ratio of its eigenvalues. The condition number
is a metric that indicates how sensitive a system is to a change in the input parameters.
The asymptotic (long-term) response of the system is stable, and no eigenvector is signifi-
cantly faster or slower than the other. However, the closeness of the eigenvectors affects the
short-term transients of the system.
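The quantities quoted above can be reproduced in a few lines. A sketch in Python with NumPy (the matrices are the ones used in this section):

```python
import numpy as np

A  = np.array([[-4.0, 1.0], [2.0, -3.0]])   # Equation 5.33: only slightly nonnormal
A1 = np.array([[4.0, -9.0], [6.0, -11.0]])  # highly nonnormal, same eigenvalues {-5, -2}

for M in (A, A1):
    lam = np.sort(np.abs(np.linalg.eigvals(M)))[::-1]  # |eigenvalues|, descending
    sig = np.linalg.svd(M, compute_uv=False)           # singular values, descending
    print(np.max(np.abs(sig - lam)),  # Ruhe's measure, Equation 5.37
          sig[0] / sig[-1],           # condition number mu, Equation 5.38
          lam[0] / lam[-1])           # ratio of eigenvalue moduli

# Angle between the eigenvectors of A1 (the text quotes 11.3 degrees)
_, W = np.linalg.eig(A1)
print(np.degrees(np.arccos(abs(W[:, 0] @ W[:, 1]))))
```

For A the condition number (2.618) is close to the eigenvalue ratio (2.5); for A1 it is an order of magnitude larger (25.36), even though the eigenvalues are identical.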
The transient response of the system for various initial conditions is shown in Figure 5.11.
The two panels on the left show the response of the system vs. time starting at four different
initial conditions:
y(0) = [0.8; 0.6], y(0) = [0.8; −0.6], y(0) = [0.2; 0.98], y(0) = [0.2; −0.98]
FIGURE 5.11 Demonstration of transient growth in a nonnormal system: (a) energy E(t) vs. time,
(b) transient response of y1(t) vs. time, and (c) phase portrait. The four thick lines in the phase
portrait correspond to the four lines in the two left transient panels.
Figure 5.11a shows the change in energy of the system with time. For the purpose of this
example, energy E(t) is defined as
E(t) = yᵀy / y0ᵀy0 = (y1²(t) + y2²(t)) / (y1²(0) + y2²(0)) (5.39)
The first initial condition, y0 = [0.8 0.6]T, starts between the two eigenvectors (see phase
portrait in Figure 5.11c, where the dashed lines represent eigenvectors). The value of E(t) as
well as y1(t) and y2(t) decrease monotonically. This behavior seems qualitatively similar to
that observed in nearly normal systems.
The transient response starting at the second initial condition is shown as dash-dot lines
in all the three panels. A fairly large surge in energy is observed, with the maximum value
of E(t) exceeding 3. Likewise, the maximum value of y1(t) attained is 1.575. Some initial
conditions show large transient growth, whereas some other initial conditions do not. This
is an important feature of transient response of nonnormal systems: For certain initial con-
ditions, a large surge in short-term transients of the system is observed before the system
settles down asymptotically to its equilibrium point (origin).
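The surge can be reproduced from the closed-form modal solution y(t) = Σᵢ cᵢ e^(λᵢt) wᵢ, without calling an ODE solver. A Python sketch for the second initial condition:

```python
import numpy as np

A1 = np.array([[4.0, -9.0], [6.0, -11.0]])
y0 = np.array([0.8, -0.6])               # the second initial condition

lam, W = np.linalg.eig(A1)               # eigenvalues and eigenvectors
lam, W = lam.real, W.real                # eigenvalues are real here
c = np.linalg.solve(W, y0)               # modal coefficients: y0 = W c

t = np.linspace(0.0, 5.0, 2001)
Y = (W * c) @ np.exp(np.outer(lam, t))   # y(t) = sum_i c_i exp(lam_i t) w_i
E = (Y ** 2).sum(axis=0) / (y0 @ y0)     # energy, Equation 5.39

print(E.max(), Y[0].max())               # surge: max E(t) > 3, max y1(t) ~ 1.575
```

Repeating this for the first initial condition, [0.8; 0.6], gives a monotonically decreasing E(t), confirming that only some initial conditions show transient growth.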
The phase portrait of the nonnormal system (Figure 5.11c) is exactly as expected. The
initial response of the system is along the fast eigenvector, w1, and eventually approaches
and moves parallel to the slower eigenvector, w2 = [1.5 1]T. Since the two eigenvectors
form an acute angle, any trajectory starting from outside this region gets stretched along
w1, before returning to the origin along w2.
The transient growth will be even higher if either the two eigenvectors are brought closer
to each other or if the ratio of eigenvalues is increased. For example, with
A2 = [8.5 −13.5; 9 −14]
the transient growth in E(t) is significantly higher. Note that the eigenvectors of A2 are the
same (w1, w2) as those of A1, whereas the eigenvalues are λ = {−5, − 0.5}. Since the ratio of
eigenvalues is much greater than in the previous case, a larger transient growth is observed.
Note that the condition number for this matrix is μ = 212. Phase-plane analysis and transient growth for this system are left as an exercise.
Although the rules for phase-plane analysis remain the same for highly nonnormal sys-
tems, the consequence of skewed eigenvectors manifests as a large transient growth.
Polymer curing is a process in which heating a polymer mixture results in cross-linking of the polymeric chains. The mixture solidifies and gets its properties based
on the degree of cross-linking that happens during the process. Indeed, the actual process
involves tens or hundreds of reactions and is complicated. Moreover, geometries required
of these materials (such as tyre) are also highly complex. What we will do in this case study
is to define a simplified problem that is motivated by this important process. We will retain
just a couple of reactions and simulate a 1D slab geometry as a representative system.
Consider a 2 cm long slab, which is originally at room temperature. The two ends of the
slab are heated using a constant temperature source, which results in a rise in slab tempera-
ture due to conduction. At elevated temperature, the following reactions take place:
P --k1--> C --k2--> D
where
P is the original polymer
C is the cross-linked polymer
D is dead-ended polymer
A specific range of C and D are needed for the product to have desired properties.
The simplified geometry is a 1D slab with its ends kept at 180°C. Since the material is a
semisolid material in a mold, the temperature effects are modeled by a 1D transient heat
equation:
∂T/∂t = α ∂²T/∂z² + S(T, C) (5.40)
where S (T,C ) is the source term due to reactions. We will neglect reaction heat effects. The
two reactions take place at each location in the slab. Since there is no transport of material
within the solid slab, the equations governing the concentration are
∂CP/∂t = −r1, r1 = k10 exp(−E1/RT) CP (5.41)

∂CC/∂t = r1 − r2, r2 = k20 exp(−E2/RT) CC (5.42)

∂CD/∂t = r2 (5.43)
Dirichlet boundary conditions are imposed at the two ends of the slab since the temperature is specified. The next example shows the methodology.
k1 = 5 × 10⁸ min⁻¹, E1/R = 9500 K, k2 = 10¹¹ min⁻¹, E2/R = 13,500 K
Compute the temperature profile and how the cross-link and dead-end concentra-
tions vary within the slab with time. The slab is heated for 75 min with the two ends
held at 180°C and allowed to cool under ambient conditions.
Solution: We will discretize the parabolic PDE (5.40):
dCX,i/dt = RX,i (5.45)
where X ≡ P , C , D represents the three species. Since the boundary temperatures are
known, we will write the above equations for only the (n − 1) internal nodes. We define
the solution variable as
Φ = [T1 CP,1 CC,1 CD,1 ⋯ Tn−1 CP,n−1 CC,n−1 CD,n−1]ᵀ (5.46)
function dPhi=curingFun(t,Phi,par)
% Model for simplified polymer curing example
% Combination of parabolic PDE for energy
% and ODE for material balances
%% Model parameters
alpha=par.alpha;
Tb1=par.Tb1;
Tb2=par.Tb2;
n=par.n;
h=par.h;
diffPar=alpha/h^2;
%% Model variables
Phi=reshape(Phi,4,n-1);
T=Phi(1,:);
C=Phi(2:end,:);
%% Model equations
dT=zeros(1,n-1);
dC=zeros(3,n-1);
for i=1:n-1
    r=rxnRate(T(i),C(:,i),par);
    % Energy balance
    if (i==1)
        diffCoef=T(i+1)-2*T(i)+Tb1;
    elseif (i==n-1)
        diffCoef=Tb2-2*T(i)+T(i-1);
    else
        diffCoef=T(i+1)-2*T(i)+T(i-1);
    end
    dT(i)=diffPar*diffCoef;
    % Material balance
    dC(:,i)=par.nuStoich * r;
end
dPhi=reshape([dT;dC],(n-1)*4,1);
end
Calculation of the reaction rate involves two reactions. The reaction term for each
species is given by
RX = Σj νX,j rj (5.47)
where νX , j is the stoichiometric coefficient for species X in jth reaction. The matrix ν
is a 3 × 2 matrix, whereas the reaction rate vector is a 2 × 1 vector. The stoichiometric
matrix is
νX,j = [−1 0; 1 −1; 0 1] (5.48)
function r=rxnRate(T,C,par)
% Reaction rates for the two curing reactions. The function header and
% Arrhenius line are reconstructed; par.k0=[k10;k20] and
% par.ER=[E1/R;E2/R] are assumed field names.
k=par.k0.*exp(-par.ER/T);
% First reaction
r(1,1)=k(1)*C(1); % k1*C_P
% Second reaction
r(2,1)=k(2)*C(2); % k2*C_C
end
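Equation 5.47 is simply the matrix-vector product νr. A quick Python check (the rate values r1 = 0.4 and r2 = 0.1 are arbitrary illustrative numbers):

```python
import numpy as np

nu = np.array([[-1, 0],
               [ 1, -1],
               [ 0,  1]])   # stoichiometric matrix, Equation 5.48 (rows: P, C, D)

r = np.array([0.4, 0.1])    # illustrative rates r1, r2 (arbitrary values)
dC = nu @ r                 # R_X = sum_j nu_{X,j} r_j, Equation 5.47
print(dC)                   # dC = [-r1, r1 - r2, r2]
```

P is consumed at r1, C accumulates at r1 − r2, and D forms at r2, matching Equations 5.41 through 5.43.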
Note that the main challenge in this problem was converting the given set of PDE and
ODEs into a standard form after discretization. Once this was achieved as Equations
5.40 through 5.43, the rest followed the same procedure as before. Along the same lines, the driver script also follows the same structure as earlier problems.
Since the solution vector is in the form of Equation 5.46, the initial conditions need to
be handled appropriately.
We will split the driver solution into two parts. The first part is slab heating, and
the second part is cooling down. The stiff ODE solver, ode15s, is used for a time span of [0:75]. We use colon notation in the time span because the solution is desired at every 1 min interval. When the solution is obtained, the last row of the solution matrix contains the values at time 75 min.
Thereafter, to simulate cooling in the atmosphere, the solver is called again with a time span of [75:120] and with the initial condition on the second call being the value at the final time of the first call. The driver script is given below:
% Part-1: Heating up
tSpan=0:75;
[tSol,YSol]=ode15s(@(t,y) curingFun(t,y,slabParam),...
tSpan,Y0);
Y0=YSol(end,:); % Solution at t=75
% Part-2: Cool-down
tSpan=75:120;
slabParam.Tb1=25+273;
slabParam.Tb2=25+273;
[tSol2,YSol2]=ode15s(@(t,y) curingFun(t,y,slabParam),...
tSpan,Y0);
tSol=[tSol;tSol2(2:end)];
YSol=[YSol;YSol2(2:end,:)];
%% Plotting results
figure(1)
Z=[0:n]*h;
T1=YSol([6,11,21,41,81],1:4:end);
T1=[Tb1*ones(5,1), T1, Tb2*ones(5,1)];
plot(Z,T1);
MidT=YSol(:,101);
MidC=YSol(:,102:104);
figure(2);
subplot(2,1,1); plot(tSol,MidT);
subplot(2,1,2); plot(tSol,MidC(:,2)./MidC(:,3));
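The two-part solve-and-restart pattern is worth noting on its own: solve to t = 75, change the boundary parameter, and continue from the stored state. A minimal Python sketch of the same pattern, with Newton cooling of a single temperature standing in for the full slab model (the rate constant kc = 0.05 min⁻¹ is an arbitrary illustrative value):

```python
import math

def integrate(T, Tb, t0, t1, kc=0.05, dt=0.001):
    """Explicit-Euler march of dT/dt = -kc*(T - Tb) from t0 to t1 (minutes)."""
    for _ in range(int(round((t1 - t0) / dt))):
        T += dt * (-kc * (T - Tb))
    return T

# Part 1: heat with the boundary at 180 C (453 K), starting from 25 C (298 K)
T75 = integrate(298.0, 453.0, 0.0, 75.0)
# Part 2: cool down, restarting from the stored part-1 state with Tb = 298 K
T120 = integrate(T75, 298.0, 75.0, 120.0)
print(T75, T120)
```

The key point is that part 2 starts from the final state of part 1, exactly as the driver script passes YSol(end,:) as the new Y0.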
Figure 5.12 shows the temperature profile along the length of the slab at various times (5, 10, 20, 40, and 80 min). The last line occurs 5 min after the start of the cool-down period and is shown as a dashed line. Due to low thermal diffusivity, significant temperature gradients are present during the process.

FIGURE 5.12 Evolution of spatial temperature profile in the 1D slab at various times. Solid lines represent heating phase and dashed line 5 min into the cooling phase.

FIGURE 5.13 Concentration of the three species at the center of the slab during the 2 h process.

FIGURE 5.14 Variation of (a) temperature and (b) ratio of crosslink and dead-end concentrations (CC/CD) at the center of the slab as a function of time.
Figures 5.13 and 5.14 show temporal variations in several key variables at the center of the slab. Figure 5.13 indicates that the cross-link concentration, CC, goes through a maximum. Moreover, although significant temperature gradients exist in the slab, the reactions stop nearly 6 min into the cooling period, even though the center temperature is still quite high (Figure 5.14, top panel). This is because the reaction rate falls sharply owing to the large activation energies. Finally, the bottom panel of Figure 5.14 shows that at the end of the process, the two desirable conditions of curing are met: complete conversion of the original polymer (CP ≈ 0) and a good crosslink:dead-end ratio.
A real polymer curing process is more complex, with tens or hundreds of reactions.
Moreover, heat effects due to reaction may also be important. The aim of this case study
was to introduce students to a different type of problem, where parabolic PDE and ODE-
IVP are solved simultaneously in process simulation. The intent of this exercise is to dem-
onstrate that a complex problem is easily solvable using tools in this book, once we break it
down to a standard form.
dS/dt = D(Sf − S) − rg

dX/dt = −DX + rg Yxs (5.49)

dP/dt = −DP + rg Yps

where rg = (μmax [S]/(KS + [S])) [X] (5.50)
The rate constants are given as μmax = 0.5, KS = 0.25, Yxs = 0.75, and Yps = 0.65.
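As a quick numerical check of this model, the chemostat can be marched to its steady state with an explicit-Euler loop. A Python sketch (the feed concentration Sf = 4 and dilution rate D = 0.1 are assumed values, since they are not stated in this excerpt):

```python
mu_max, Ks, Yxs, Yps = 0.5, 0.25, 0.75, 0.65
Sf, D = 4.0, 0.1            # assumed feed concentration and dilution rate

S, X, P = 3.0, 0.5, 0.0     # arbitrary initial state
dt = 0.01
for _ in range(40000):      # explicit Euler to t = 400 h, well past the transient
    rg = mu_max * S / (Ks + S) * X   # Monod growth rate, Equation 5.50
    S += dt * (D * (Sf - S) - rg)
    X += dt * (-D * X + Yxs * rg)
    P += dt * (-D * P + Yps * rg)
print(S, X, P)
```

At steady state the X balance gives μmax S/(KS + S) = D/Yxs, so S = KS(D/Yxs)/(μmax − D/Yxs) ≈ 0.091, with X = Yxs(Sf − S) and P = Yps(Sf − S); the loop converges to these values.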
D=monodParam.D;
It is easy to see that simulating a sinusoid input simply means replacing that line with a function that calculates the sinusoidal dilution rate:
function D=varyingInlet(t,y,par)
D0=par.D;
f=par.freq;
D=D0*(1+0.2*sin(2*pi*f*t));
end
function dy=monodFun(t,y,monodParam)
%% System parameters
mu=monodParam.mu;
K=monodParam.K;
Yxs=monodParam.Yxs;
Yps=monodParam.Yps;
Sf=monodParam.Sf;
D=varyingInlet(t,y,monodParam);
FIGURE 5.15 Simulation of a chemostat with ±20% sinusoid input for a frequency of 0.5 h⁻¹ (solid line) and 0.2 h⁻¹ (dashed line). The results with constant (nonsinusoidal) input are shown as dash-dot line.
dy(1)=D*(Sf-S) - rg;
dy(2)=-D*X + Yxs*rg;
dy(3)=-D*P + Yps*rg;
The last line above, D=varyingInlet(t,y,monodParam), is where we inserted the calculation of dilution rate, D. Figure 5.15
shows the transient behavior of chemostat with sinusoidal input, with the dash-dot
line representing the case with constant D for comparison. The solid line indicates a
0.5 h−1 input sinusoid, whereas the dashed line indicates a slightly lower frequency
of 0.2 h⁻¹. The output oscillates in both cases, with a greater amplitude when the frequency of the sinusoid is lower.
The above example demonstrates how a sinusoidal input can be used in simulations. The
sinusoid can be replaced by a square wave using the following command instead:
D=D0*(1+0.2*square(2*pi*f*t));
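Here square is MATLAB's square-wave generator (it ships with the Signal Processing Toolbox). The same ±20% square-wave input can be sketched in plain Python as follows (D0 = 0.1 and f = 0.2 are illustrative values):

```python
import math

def square_wave(x):
    """Unit square wave: +1 where sin(x) is nonnegative, else -1."""
    return 1.0 if math.sin(x) >= 0.0 else -1.0

def dilution(t, D0=0.1, f=0.2):
    """Square-wave dilution rate, mirroring the MATLAB one-liner above."""
    return D0 * (1.0 + 0.2 * square_wave(2.0 * math.pi * f * t))

print(dilution(0.0), dilution(3.0))   # 0.12 in the high phase, 0.08 in the low phase
```

The input now switches abruptly between 0.12 and 0.08 instead of varying smoothly.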
FIGURE 5.16 Simulation of chemostat using sinusoid and square waves at the same frequency of 0.2 h⁻¹.
function D=varyingInlet(t,y,par)
% Dilution ramped down in a series of four
% ramp-down-and-hold sequence
if (t<1)
    D=0.3;
elseif (t<2)
    D=0.3-0.05*(t-1);
elseif (t<3)
    D=0.25;
elseif (t<4)
    D=0.25-0.05*(t-3);
elseif (t<5)
    D=0.2;
elseif (t<6)
    D=0.2-0.05*(t-5);
elseif (t<7)
    D=0.15;
elseif (t<8)
    D=0.15-0.05*(t-7);
else
    D=0.1;
end
The chemostat is simulated, and the results are shown in Figure 5.17, with the top
panel showing the output concentration and bottom panel showing input dilution rate.
FIGURE 5.17 Chemostat with dilution rate ramped down in a sequence of ramp-down-and-hold. Starting at 0.3 h⁻¹, the dilution rate is held for an hour followed by ramp down at the rate of 0.05 h⁻¹ per hour for the next hour.
dy/dt = f(t, y) (5.1)
The time interval at which measurements are available in a discrete-time system is known
as sampling interval. We will represent sampling interval as Δts. Let us say that the che-
mostat was monitored at the rate of Δts = 0.2 h, then information is available at times t = 0,
0.2, 0.4, …. In discrete time, this is represented at measurement intervals k, which is an
integer. We represent these values as y(k). In other words, the values at times mentioned
above are represented as y(0), y(1), y(2), … and the equivalent discrete-time model is
given by
y(k + 1) = fD(y(k)) (5.52)
FIGURE 5.18 Block diagram of the chemostat process with digital-to-analog reconstruction at the
input and analog-to-digital sampling of the output.
Let us say that the dilution rate was a manipulated variable, then the discrete-time model
with manipulated control input is given by
y(k + 1) = fD(y(k), v(k)) (5.53)
At time kΔts, the value of state variables is y(k). Based on the ZOH reconstruction, the value
of v is kept constant in the interval. We can solve the ODE-IVP starting at t = kΔts and y = y(k).
This yields the behavior of the system at future time instants. The system is sampled at
t = (k + 1)Δts, until which time, the input is held constant. Therefore, the discrete-time model
is nothing but the ODE-IVP solution at time (k + 1)Δts, starting with y = y(k) at t = kΔts.
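For a scalar linear system y′ = ay + bv, the discrete-time map fD can be written in closed form, which makes the ZOH construction concrete. A Python sketch (the values a = −0.5, b = 1, and Δts = 0.2 are illustrative, not the chemostat model):

```python
import math

a, b, dts = -0.5, 1.0, 0.2   # illustrative system y' = a*y + b*v; sampling interval

def f_D(y, v):
    """Discrete-time map: exact solution of y' = a*y + b*v over one
    sampling interval, with the input v held constant (ZOH)."""
    ead = math.exp(a * dts)
    return ead * y + (b / a) * (ead - 1.0) * v

# March the sampled-data system with a constant input v(k) = 1
y, v = 0.0, 1.0
for k in range(25):
    y = f_D(y, v)
print(y)   # approaches the steady state -b*v/a = 2.0
```

For a nonlinear model such as the chemostat, the same map is obtained by calling an ODE solver over each interval [kΔts, (k + 1)Δts] with v(k) held fixed.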
FIGURE 5.19 Simulation of chemostat with ZOH. Top plot: sampled output variables; bottom plot:
manipulated input dilution rate.
Figure 5.19 shows the results of discrete-time simulation of the chemostat. The top
panel shows the state variables, with symbols representing the actual measured values.
Since this is a sampled-data system, there is no information between consecutive time steps. The bottom panel represents the input. Here, a stairs plot is used for ZOH.
However, in some cases (for example, a highly exothermic system), the temperature may vary
within the system.
Analysis of reactors with strong temperature effects can indeed be done by solving mass
and energy balances for the PFR simultaneously. However, at times, detailed analysis or
parameter estimation in the presence of local temperature trends calls for an approach to
decouple temperature and mass balance effects. In other cases, the boundary temperature
may act as a constraint to the overall system.*
u dCA/dz = −kCA/(Kr + CA) (5.55)
k = k0 exp(−E/RT), Kr = K0 exp(−ΔEr/RT) (5.56)
The activation energy is E = 60 kJ/mol and ΔEr = −10 kJ/mol. The rate constants are specified at the temperature T1 = 450 K as k1 = 0.2 and Kr,1 = 1. The inlet temperature was 450 K, the inlet concentration 1 mol/m³, and the velocity 0.4 m/min. The temperature profile was measured in the experiments at seven different locations:
* A simplified version of this was seen in the Graetz problem in Section 4.5.3, where the temperature at the external bound-
ary (walls) was specified. An analogous problem of relevance in this case study is when the boundary temperature profile
is specified. There are several examples of practical relevance where one needs to implement a boundary value profile as a
constraint. This case study provides a way to solve problems of similar nature.
pfrTProfile=spline(zMeas,TMeas);
T=ppval(pfrTProfile,z);
We will use this within the PFR simulations. The driver script is given below:
% Driver script for simulation of a PFR
% with specified temperature profile
%% Model Parameters
modelParam.k1=0.2;
modelParam.K1=1;
modelParam.T1=450;
modelParam.E=60*1000;
modelParam.DEr=-10*1000;
modelParam.Rgas=8.314;
% Reactor conditions
modelParam.u0=0.4;
L=2;
Cin=1;
% Measured profile
zMeas=[0, 0.25, 0.5, 1.0, 1.25, 1.5, 2.0];
TMeas=[450, 454, 460, 480, 520, 595, 605];
% Spline interpolating polynomial
modelParam.pfrTProfile=spline(zMeas,TMeas);
The key difference from the earlier PFR example is that the temperature is speci-
fied. However, since the temperature is not available as an explicit function, we fit a
spline interpolating polynomial to data. The PFR function, given below, uses this
spline polynomial to simulate the reactor with specified boundary temperature:
function dC=pfrModelFun(z,Ca,modelParam)
% To compute dC/dz for a PFR with a single reaction
% and reactor temperature specified
%% Model Parameters
u=modelParam.u0;
pfrTProfile=modelParam.pfrTProfile;
% Interpolate to obtain T(z)
T=ppval(pfrTProfile,z);
% Calculate reaction rate
r=rxnRate(T,Ca,modelParam);
% Model equation
dC=-1/u*r;
end
The following is a function for calculating reaction rate. This is same as the function
used in transient PFR example from Chapter 4:
r=k*C/sqrt(1+Kr*C^2);
end
Figure 5.20 shows the concentration profile in the PFR. Since the temperature is rather low in the initial section of the PFR, only a modest amount of reaction takes place, and the conversion is about 43% at z = 1 m from the inlet. As we move further downstream, the temperature increases, which results in a rapid increase in conversion of A, and complete conversion is observed in less than z = 1.5 m of the reactor.

FIGURE 5.20 Concentration vs. length of the PFR, simulated with a given temperature profile.
5.6 WRAP-UP
This chapter built upon the case studies from previous chapters in Section I of the book.
The previous chapters simulated cases where the system was represented by a set of ODEs
(Chapter 3) or PDEs (Chapter 4) with specified initial and boundary conditions.
The first section of this chapter focused on simulating a staged operation. Like a distrib-
uted parameter system (DPS), the modeled variables of interest vary in time and location.
However, unlike a traditional DPS, the model is not governed by PDEs; instead, the model
consists of multiple interconnected stages, with the process variables’ values specified only
at these discrete stages.
The second section of this chapter focused on using linear algebra tools for analysis and
understanding of transient systems. Specifically, we observed how rich dynamic behavior
emerges for even a simple example of linear ODE in two dimensions. We limited ourselves
to 2D since it is easy to visualize behavior in 2D pictorially. The concepts of linear system
analysis are more general, and extendible to multiple dimensions as well. A thorough non-
linear analysis will have to wait until we cover additional topics in Section II of this book.
The first two case studies in Chapter 9 will extend this linear analysis and cover nonlinear
analysis and bifurcation. An interesting concept of nonnormal system analysis is also intro-
duced in this section, as an extension of linear stability analysis. While most books on linear
systems focus on long-term asymptotic response, this section brought out the importance
of short-term transient growth.
Thereafter, we took a few examples to extend our analysis of transient systems. We con-
sidered a system where the energy balance leads to a parabolic PDE and material balance to
an ODE-IVP. The overall system of equations was solved simultaneously. A combination of
ODE and PDE occurs in practical examples but is usually not covered in numerical simula-
tion textbooks.
Finally, the next two case studies demonstrated simulations of systems where (i) a pro-
cess input or parameter varies with time and (ii) a boundary profile constraint is imposed
on a system.
This completes our discussion of transient systems in Section I of this book. The next
part will cover (nonlinear and linear) algebraic equations, ODE-BVPs (boundary value
problems), elliptic PDEs, and differential algebraic equations (DAEs, which are a combina-
tion of ODEs and algebraic equations).
EXERCISES
Problem 5.1 Repeat the simulations for a binary distillation column from Example 5.1 for
three stages each in rectification and stripping sections. Compare the results
with a five-stage column for the same operating conditions.
Problem 5.2 Repeat the previous problem for a tougher distillation, with α = 1.75. How
many trays are required to get 95% purity (if it is possible) in the distillate?
Let D = 12.5 mol/s and V = 37.5 mol/s.
Problem 5.3 A mass-spring-damper system consists of a mass suspended by a spring and
a damper that damps the oscillations. The model is given by
mx″ + cx′ + kx = 0

which can be written in the standard form

x″ + (2ζω)x′ + ω²x = 0
What is/are the conditions on ζ for the system to have oscillatory response?
Problem 5.4 Use the methodology of Section 5.2.3 for analysis of transient growth and
phase portrait of the system, y′ = A2y, where
A2 = [8.5 −13.5; 9 −14]

y′ = [−1 −cot(θ); 0 −2] y

The eigenvalues of this matrix are λ = {−1, −2}. Show that the eigenvectors of this matrix are v1 = [1 0]ᵀ and v2 = [cos(θ) sin(θ)]ᵀ.
Problem 5.7 In this problem, we will simulate a condition similar to Example 5.5 but with
variations in the substrate concentration. Consider that the chemostat is
used for effluent treatment. The substrate concentration is measured every
half hour and is found to be as follows:
Use spline to fit a smooth spline function to the above values, and hence
use the spline interpolation to simulate the effect of the above variations in Sf
on the performance of a chemostat. Choose D = 0.1 and start with an initial
condition of Y0 = [0.1; 3.6; 3.2];.
Problem 5.8 In Example 5.5, the dilution rate was ramped down in a series of steps. The
problem was simulated with a continuously varying input. Instead, a dis-
crete-time controller with a sampling time of 0.25 h is used. The trajectory
with ZOH that is implemented is shown in the figure below:
(Figure: staircase ZOH trajectory of the input v(k) vs. time (h), stepping down from 0.3 toward 0.1 h⁻¹ over 10 h.)
CHAPTER 6
Nonlinear Algebraic Equations
6.1 GENERAL SETUP
Section I of this book focused on methods to solve ordinary differential equations (ODEs).
This first chapter of Section II will deal with solving a set of algebraic equations of the type
g(x) = 0 (6.1)
where
x ∈ ℝⁿ is an n × 1 solution vector
g(⋅) is a vector-valued function
I will retain the convention followed in Section I: All vectors (unless otherwise specified)
will be column vectors. Thus, x is an n × 1 column vector and g(⋅) is an n × 1 function vector.
Prior to generalizing the discussion to an n-variable problem, the interpretation of equation solving for a single variable is first discussed. Solving an algebraic equation is essentially finding the value(s) of the variable x that satisfy Equation 6.1. The notation x⋆ will be used to denote
the solution. Thus, g(x) represents how the value of the function varies with x, whereas g(x⋆)
represents the value at a specific point x = x⋆. Since x⋆ is the solution, the function value at
this point is, by definition, 0.
The geometric interpretation of solving a single nonlinear algebraic equation in one variable is shown in Figure 6.1. The function g(x) varies with varying x and intersects the X-axis at two points, x1⋆ and x2⋆. These points are known as the roots of the function, since they satisfy Equation 6.1.
FIGURE 6.1 Schematic for a curve g(x), which intersects the X-axis at two points, x1⋆ and x2⋆.
Consider the ideal gas law:

Pv = RT
where
P is the pressure in Pa
v is the molar volume in m³/mol
R is the ideal gas constant
T is the temperature in K
The ideal gas law does not accurately capture the behavior of real gases. Equations of
state (EOS) provide a better prediction of the volumetric properties of fluids. One such
popular EOS is the Redlich-Kwong (RK) EOS:
P = RT/(v − b) − a/(√T · v(v + b)) (6.2)
The problem of finding the molar volume of a gas at a particular temperature and pres-
sure using the ideal gas EOS is straightforward since it provides volume as an expression,
v = RT/P. On the other hand, RK EOS is a cubic equation.
Both EOS can be put in the standard form of Equation 6.1 as
gideal(v) ≡ Pv − RT = 0 (6.3)

g(v) ≡ P(v − b) − RT + (a/√T) · (v − b)/(v(v + b)) = 0 (6.4)
Nonlinear Algebraic Equations ◾ 227
FIGURE 6.2 Plot of g(v) vs. v for the Redlich-Kwong EOS using Equation 6.4 for the chosen values of RK parameters at 100 bar and 340 K. The dashed arrow represents the working of the bisection method while the solid arrow represents the working of the Newton-Raphson method.
The latter equation is obtained by multiplying the RK EOS by (v − b) and rearranging.
The value, v⋆, that satisfies the above equation is the molar volume of gas at the correspond-
ing temperature and pressure.
In this chapter, the RK EOS will be used to compute the molar volume of a fluid at 340 K
temperature and 100 bar pressure and hence demonstrate solution of nonlinear equations.
The values of the two parameters in the RK EOS that I will use in this example (in SI units)
are a = 6.46 and b = 2.97 × 10⁻⁵. Figure 6.2 shows a plot of g(v) vs. v for a range of values of v
using Equation 6.4. The curve intersects the X-axis at v⋆ = 1.646 × 10−4. This is the solution
of the nonlinear equation and hence represented as v⋆.
The corresponding molar volume using the ideal gas assumption is v⋆ideal = 2.827 × 10⁻⁴. Solving Equation 6.3 signifies finding v⋆ideal, the point at which the function gideal(v) (which is a straight line) intersects the X-axis.
Numerical methods for solving a single nonlinear equation in one unknown are dis-
cussed in the next section using the above example. Thereafter, the methods are extended
to a general n-variable case. Subsequently, MATLAB® functions fzero and fsolve used
to solve algebraic equations are discussed, followed by a few exemplary case studies.
The following notations are used in this chapter. Various guesses are represented as x(i), where i is the iteration index. The approximation error between subsequent iterations is e(i) = x(i) − x(i−1), and the true approximation error is et(i) = x⋆ − x(i). The corresponding relative errors will be represented as ε(i) and εt(i). The notation Δ will be used to represent difference, which will be explained based on the context in this chapter. The term g(i) will be used as a shorthand notation for g(x(i)). By definition of the solution, g⋆ = g(x⋆) = 0.
6.2.1 Bisection Method
The bisection method is an example of a bracketing method, where two initial guesses are chosen. These are represented as x(L) and x(U) and are chosen such that they lie on either side of the desired solution. Thus, the function values for admissible initial guesses satisfy the relation
g(x(L)) g(x(U)) < 0 (6.5)
The two guesses, x(L) and x(U), shown in Figure 6.2, lie on either side of the true solution, x⋆.
The value of g(L) is negative and that of g(U) is positive, thus satisfying the condition (6.5). The
new guess using bisection method is generated by simply bisecting the line segment joining
x(L) and x(U). Thus
x(L) and x(U). Thus

x(i+1) = (x(L) + x(U))/2 (6.6)
is the new iteration value. The condition (6.5) is verified for the new iteration value x(i + 1),
and either x(L) or x(U) is retained so that the current guesses still bracket the true solution.
Thus, if x(i + 1) lies on the same side as x(L), then x(i + 1) replaces x(L) for the next iteration while
x(U) remains unchanged; else, x(L) remains unchanged and x(i + 1) replaces x(U). Numerically
if g(x(i+1)) g(x(L)) > 0, x(L) = x(i+1), else x(U) = x(i+1) (6.7)
Thereafter, the algorithm continues. At each iteration, Equation 6.6 is used to compute the
next guess, whereas the condition in Equation 6.7 is used to determine whether this iterant
replaces x(L) or x(U), based on the sign of g(i + 1). This method is continued iteratively until the
error e(i + 1) falls below a predetermined tolerance threshold:
e(i+1) ≡ |x(i+1) − x(L)| < etol (6.8)
If the initial length of the line segment is e(0) = x(U) − x(L), then after the first iteration
e(1) = e(0) / 2
Nonlinear Algebraic Equations ◾ 229
Each subsequent iteration also halves the line segment. Thus, the bisection method has a
linear rate of convergence. The box below discusses convergence order of an iterative numer-
ical method.
A numerical method has order of convergence m if the errors in successive iterations are related as

e(i+1) = k (e(i))^m

For the bisection method, the error after n iterations is

e(n) = e(0) / 2^n    (6.9)
If the error e(n) falls below the desired error tolerance threshold etol in n iterations, then etol satisfies Equation 6.9. Thus, irrespective of the form of the function g(x), the number of iterations required to meet the desired error threshold can be computed a priori as
n = log2( e(0) / etol )    (6.10)
The following example demonstrates the bisection method for finding the molar volume of
a fluid represented by RK EOS.
The core of the code involves computing the function g(v) as per Equation 6.4 for T = 340 K, P = 100 × 10⁵ Pa, and the current guess value v(i). A new guess is computed using Equation 6.6, the new guess replaces either v(L) or v(U) according to condition (6.7), and the stopping condition is verified using Equation 6.8.
%% Display result
disp(['Molar volume, v = ', num2str(v)]);
e(0) = v(U) − v(L) = 1.4 × 10⁻⁴
The desired error tolerance was etol = 10−8. Thus, according to Equation 6.10
n = log2(1.4 × 10⁴) = 13.8    (6.11)
Thus, the error falls below the tolerance threshold in 14 iterations, as seen in Example 6.1.
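For readers following along without MATLAB, the algorithm of Equations 6.6 through 6.8 can be sketched in Python. The quadratic test function and its bracket below are illustrative choices, not the RK EOS example from the text.

```python
# Illustrative bisection sketch; the test function x^2 - 2 and the
# bracket [1, 2] are not from the text.
def bisection(g, xL, xU, etol=1e-8, max_iter=100):
    """Halve the bracket [xL, xU] until its width falls below etol."""
    if g(xL) * g(xU) >= 0:
        raise ValueError("initial guesses must bracket a root")
    for _ in range(max_iter):
        xNew = 0.5 * (xL + xU)        # Equation 6.6
        if g(xL) * g(xNew) > 0:       # condition (6.7)
            xL = xNew
        else:
            xU = xNew
        if abs(xU - xL) < etol:       # stopping criterion (6.8)
            break
    return 0.5 * (xL + xU)

root = bisection(lambda x: x * x - 2.0, 1.0, 2.0)
```

With an initial bracket of width 1 and etol = 10⁻⁸, Equation 6.10 predicts ceil(log2(10⁸)) = 27 iterations, independent of the function.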
The bisection method is easy to use, and the number of iterations required for its con-
vergence is known a priori. It is also robust because the two guesses always bracket the
solution, and the new guess being midpoint of the current guesses can always be obtained
(irrespective of the nature of g(x)). There are, however, certain cases where the bisection method fails. These are shown in Figure 6.3: in Figure 6.3a, although a root exists, there are no feasible starting points for the bisection method; in Figure 6.3b, the feasible starting points fall in a narrow range and are therefore difficult to obtain; and in Figure 6.3c, the feasible starting points bracket a singularity and not a solution.
FIGURE 6.3 Three examples of functions where the bisection method “fails.”
232 ◾ Computational Techniques for Process Simulation and Analysis Using MATLAB®
I will end this section with a modular code for solving Example 6.1. Hereafter, similar
modular coding will be practiced.
Once the above function is written, it may be directly used for computing g(v) within
the main script. The main script is given below. The script has undergone only modest
changes to make it modular: (i) g(v) is computed through the above RK function and
(ii) parameters are passed on to the function as a structure. Both these will make the
code less prone to errors. The driver script is
(x − x(i)) / (y − g(i)) = (x(i) − x(i−1)) / (g(i) − g(i−1))    (6.12)
The point at which the line intersects the X-axis is given by y = 0. Substituting and rearrang-
ing, the following equation can be used to update the secant solution:
x(i+1) = x(i) − [ (x(i) − x(i−1)) / (g(i) − g(i−1)) ] · g(i)    (6.13)
The secant method starts with two initial guesses, x(0) and x(1), and uses the above expression to generate the next guess. The code for the secant method is given in the next example.
The result from the secant method is the same as before. The secant method takes only five iterations (at a value of i=6) to converge.
The locations of key differences between bisection and secant codes are underlined
above. The main difference, of course, is how the next iteration value, x(i + 1) , is calcu-
lated. This is the first line in the for loop, where Equation 6.13 is used to compute the
next iteration value and is stored in variable v. The value g(v) is calculated as before;
however, the sign of g(v) need not be compared with the sign of g(v(L)).
The rate of convergence of the secant method is said to be superlinear, since it converges faster than linear convergence. Specifically, it can be shown that for the secant method

e(i+1) = ( f″(x⋆) / (2 f′(x⋆)) )^(γ−1) (e(i))^γ,   γ = (√5 + 1) / 2 = 1.618    (6.14)

The exponent, γ, in the above equation is called the golden ratio.
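The update (6.13) can be sketched in Python (the book's own code is in MATLAB). The test function here is illustrative, not the RK EOS example from the text.

```python
# Illustrative secant sketch (Equation 6.13); test function not from the text.
def secant(g, x0, x1, etol=1e-10, max_iter=50):
    g0, g1 = g(x0), g(x1)
    for _ in range(max_iter):
        if g1 == g0:                             # flat chord; cannot proceed
            return x1
        x2 = x1 - (x1 - x0) / (g1 - g0) * g1     # Equation 6.13
        if abs(x2 - x1) < etol:
            return x2
        x0, g0 = x1, g1
        x1, g1 = x2, g(x2)
    return x1

root = secant(lambda x: x * x - 2.0, 1.0, 2.0)
```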
The first two values, v(2) and v(3) , in the list above both lie on the left of the solution 1.646 × 10−4.
This shows that the secant method is not a bracketing method since g(2) × g(3) > 0. Method
of false position, on the other hand, is a bracketing method. A new guess is generated in a
similar way as Equation 6.13, though it uses x(L) and x(U) instead:
x(i+1) = x(L) − [ (x(U) − x(L)) / (g(U) − g(L)) ] · g(L)    (6.15)
and then using condition (6.7) to determine whether x(L) or x(U) gets replaced. Thus, calculation of the new approximation is similar to the secant method, though it is a bracketing method: like the bisection method, the two retained approximations have function values of opposite signs.
6.2.2.2 Brent’s Method
Equation 6.12 used in the derivation of the secant method involves fitting a straight line to
two points, (x(i), g(i)) and (x(i − 1), g(i − 1)). A more accurate method can be derived by fitting
a quadratic curve to three points, followed by the location where the curve intersects the
X-axis. Using Lagrange interpolating polynomial (see Appendix D)
x = x(i−2) · [ (y − g(i−1))(y − g(i)) ] / [ (g(i−2) − g(i−1))(g(i−2) − g(i)) ]
  + x(i−1) · [ (y − g(i−2))(y − g(i)) ] / [ (g(i−1) − g(i−2))(g(i−1) − g(i)) ]
  + x(i) · [ (y − g(i−2))(y − g(i−1)) ] / [ (g(i) − g(i−2))(g(i) − g(i−1)) ]    (6.16)
The above quadratic curve intersects the X-axis when y = 0. Thus, the next approximation is
x(i+1) = x(i−2) g(i−1) g(i) / [ (g(i−2) − g(i−1))(g(i−2) − g(i)) ]
       + x(i−1) g(i−2) g(i) / [ (g(i−1) − g(i−2))(g(i−1) − g(i)) ]
       + x(i) g(i−2) g(i−1) / [ (g(i) − g(i−2))(g(i) − g(i−1)) ]    (6.17)
The above expression can be used directly, or may be simplified algebraically. If the function
g(x) is smooth in the region [x(L), x(U)], this quadratic method converges rapidly.
Brent's method is more popularly known as inverse quadratic interpolation, because the quadratic function of Equation 6.16 is employed to generate the next guess. It is called inverse quadratic because x is interpolated as a function of the g(i) values.
I end this section by mentioning that the four methods discussed herein (bisection, regula falsi, secant, and inverse quadratic interpolation) all work with multiple initial guesses. They are listed here in ascending order of their order of convergence. I will refer to these methods again when discussing the MATLAB algorithm in Section 6.4.1. The next two methods use a single initial guess.
x = G(x)    (6.18)

x(i+1) = G(x(i))    (6.19)
One simple way to get G(x) related to the original problem (6.1) is to write

G(x) = x + αg(x)
though more often, it is better to rearrange the nonlinear equation appropriately. For exam-
ple, the RK EOS (6.2) may be rewritten as
v = b + RT/P − (a / (P√T)) · (v − b) / (v(v + b))    (6.20)
The code of fixed point iteration using the equation above is skipped. Starting with the ini-
tial guess of v = 2.2 × 10−4, it took 14 iterations for this method to converge. The linear rate
of convergence is one disadvantage of fixed point iteration. A greater disadvantage is that it
is not a robust method, since there are stringent requirements on G(x) for stability of fixed
point iteration.
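As a sketch of the update (6.19) in Python (the book's code is in MATLAB): the classic toy rearrangement x = cos x is used here instead of the RK EOS, since it satisfies the stability requirement |G′(x)| < 1 near its solution.

```python
import math

# Illustrative fixed point iteration (Equation 6.19). The rearrangement
# x = cos(x) is a standard toy example, not the RK EOS from the text.
def fixed_point(G, x0, etol=1e-10, max_iter=200):
    x = x0
    for _ in range(max_iter):
        xNew = G(x)                  # Equation 6.19
        if abs(xNew - x) < etol:
            return xNew
        x = xNew
    return x

xstar = fixed_point(math.cos, 1.0)   # solves x = cos(x)
```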
These results can be derived by expanding the function G(x) around the current guess x(i) using Taylor's series:

G(x) = G(x(i)) + (x − x(i)) G′(x(i)) + ((x − x(i))² / 2) G″(x(i)) + ⋯    (6.21)

G(x) = x(i+1) + (x − x(i)) G′(x(i)) + ((x − x(i))² / 2) G″(x(i)) + ⋯

The above is obtained by using the expression (6.19), G(x(i)) = x(i+1), in the first term of Equation 6.21.
The solution x⋆ satisfies the condition x⋆ = G(x⋆). Substituting in the above

G(x⋆) = x(i+1) + (x⋆ − x(i)) G′(x(i)) + ((x⋆ − x(i))² / 2) G″(x(i)) + ⋯    (6.22)

x⋆ − x(i+1) = (x⋆ − x(i)) G′(ξ),   ξ ∈ [x(i), x⋆]    (6.23)

et(i+1) = G′(ξ) · et(i)    (6.24)
Thus, fixed point iteration has a linear rate of convergence, and the error converges if |G′(ξ)| < 1. If the value of G′(ξ) is negative, the error will switch sign at every iteration. If G′(ξ) is positive and less than 1, the error will decrease monotonically.
The error analysis can be performed for the Redlich-Kwong problem by analyzing G′(v⋆):

G′(v) = (a / (P√T)) · ( (v − b)² − 2b² ) / ( v²(v + b)² )    (6.25)
One must exercise caution while using the condition for stability of fixed point iteration. If the derivative G′(ξ) changes significantly in the vicinity of the solution, estimates based on Equation 6.24 can be highly incorrect. Second, the stability condition in Equation 6.24 is rather stringent. Although fixed point iteration has the advantage of simplicity, it finds limited real-world application because the method is stable for only a restricted set of problems.
y − y(i) = g′(x(i)) (x − x(i))    (6.26)
The point where the above line intersects the X-axis is obtained by substituting y = 0.
Newton-Raphson method uses this as the next iteration value for solving the nonlinear
equation
x(i+1) = x(i) − g(x(i)) / g′(x(i))    (6.27)
Note that I used the fact that the point y(i) on the curve is actually g(x(i)). I will now use the
Newton-Raphson method to solve the RK EOS problem.
g′(v) = P + (a/√T) · [ v(v + b) − (v − b)(2v + b) ] / [ v²(v + b)² ]

g′(v) = P + (a/√T) · (2vb − v² + b²) / [ v²(v + b)² ]    (6.28)
function dg = RKdg(v,par)
% Returns g'(v) at a particular T and P
% Parameters and operating conditions
a=par.a; b=par.b; R=par.R;
T=par.T; P=par.P;
% Function value
dg=P + a/sqrt(T)*(2*v*b-v^2+b^2)/v^2/(v+b)^2;
end
The main driver script for Newton-Raphson is similar to those of bisection and secant
methods. The only change is that RKdg is used to calculate g′(v) and v(i + 1) is calculated
from Equation 6.27. For the sake of brevity, only the Newton-Raphson loop is given
below:
%% Newton-Raphson iterations
X=v; % For storing v values
G=g; % For storing g(v)
for i=1:maxIter
dg=RKdg(v,param);
v=v-g/dg;
g=RK(v,param);
% Store values & check convergence
X(i+1)=v;
G(i+1)=g;
err=abs(X(i+1)-X(i));
if (err<eTol)
break
end
end
The following are the values of v(i) generated, starting from v(0) = 2.2 × 10−4:
In contrast to the bisection, secant, and fixed point iterations, Newton-Raphson con-
verged in four iterations.
A formal error analysis of Newton-Raphson method, as seen before, involves using Taylor’s
series expansion of g(x) around x(i):
g(x) = g(x(i)) + (x − x(i)) g′(x(i)) + ((x − x(i))² / 2) g″(x(i)) + ⋯    (6.29)
0 = g(x(i)) + (x⋆ − x(i)) g′(x(i)) + ((x⋆ − x(i))² / 2) g″(x(i)) + ⋯

−(x⋆ − x(i)) g′(x(i)) = g(x(i)) + ((x⋆ − x(i))² / 2) g″(x(i)) + ⋯    (6.30)

x⋆ − x(i) = −g(x(i)) / g′(x(i)) − ((x⋆ − x(i))² / 2) · g″(x(i)) / g′(x(i)) − ⋯
The above equation can be rearranged to give the Newton-Raphson method. The infinite
series can be truncated to the first-order term (i.e., the first term on the right-hand side) and
the remainder is the error. Thus
x⋆ = [ x(i) − g(x(i)) / g′(x(i)) ] + [ −((x⋆ − x(i))² / 2) · g″(x(i)) / g′(x(i)) + ⋯ ]    (6.31)
The term in the first bracket is the new iterated value x(i + 1), whereas the term in the second
bracket is the error, e(i + 1):
x(i+1) = x(i) − g(x(i)) / g′(x(i))    (6.27)

e(i+1) = −((e(i))² / 2) · g″(ξ) / g′(ξ)    (6.32)
Note that the mean value theorem is used to convert the higher order terms of the infinite
series, where ξ ∈ [x(i), x(i + 1)]. It is clear from the above expression that the Newton-Raphson
method has a quadratic rate of convergence. Hence, in the vicinity of the solution, x⋆, Newton-
Raphson converges faster than the other methods discussed in this section. Consequently,
it emerged as one of the most popular methods for solving nonlinear algebraic equations.
e(i+1) = k (e(i))^m    (6.33)
FIGURE 6.4 Plot of errors e(i + 1) vs. e(i) for fixed point iteration (diamonds), secant (squares), and Newton-Raphson (circles). The Newton-Raphson method has significantly lower errors and shows a faster drop in error.
If the error in the (i+1)th iteration is plotted against the error in the ith iteration on a log-log plot, the result is a straight line with a slope, m, equal to the order of convergence of the numerical method. Figure 6.4 shows this log-log plot for three methods. Bisection and regula falsi are skipped for ease of discussion. As is clear from the figure, all three are straight lines, with slopes of 1, 1.6, and 2, respectively. The slope can either be computed from the figure or qualitatively confirmed. For example, for fixed point iteration, e(i+1) decreases by almost four orders of magnitude when e(i) decreases by four orders of magnitude. For Newton-Raphson, on the other hand, e(i+1) decreases by 12 orders of magnitude when e(i) has decreased by six orders of magnitude (indicating a quadratic rate of convergence).
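The slope computation just described can also be done numerically: given a sequence of errors from any method, a least-squares fit of log e(i+1) against log e(i) estimates the order m of Equation 6.33. The synthetic error sequence below (exact quadratic convergence) is illustrative, not the data of Figure 6.4.

```python
import math

# Estimate the convergence order m by a least-squares fit of
# log e(i+1) = log k + m log e(i)  (Equation 6.33).
def convergence_order(errors):
    xs = [math.log(e) for e in errors[:-1]]
    ys = [math.log(e) for e in errors[1:]]
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    return (sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
            / sum((x - xbar) ** 2 for x in xs))

# synthetic sequence with e(i+1) = e(i)^2, i.e., quadratic convergence
errs = [1e-1, 1e-2, 1e-4, 1e-8, 1e-16]
m = convergence_order(errs)
```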
g(x) = 0,   x ∈ R^n,   g : R^n → R^n    (6.1)
6.3.1 Multivariate Newton-Raphson
Derivation of a general Newton-Raphson method uses multivariate Taylor’s series expan-
sion. If the n elements of the n-dimensional function g(x) are g1(x) , … , gn(x), then
0 = g1(x(i)) + ∇g1(x(i)) (x − x(i)) + ⋯
⋮
0 = gn(x(i)) + ∇gn(x(i)) (x − x(i)) + ⋯    (6.34)
∇gk(x(i)) = [ ∂gk/∂x1   ∂gk/∂x2   ⋯   ∂gk/∂xn ]    (6.35)
0 = g(x(i)) + J(x(i)) (x − x(i)) + ⋯    (6.36)
In the above, J(x(i)) is called the Jacobian, which is an n × n matrix. Unlike the single vari-
able case, the order of matrix multiplication is important. The Jacobian premultiplies the
difference (x − x(i)). It is easy to see that the above can be rearranged to obtain multivariate
Newton-Raphson formula as below:
x(i+1) = x(i) − J⁻¹ g(x(i))    (6.37)
J = [ ∂g1/∂x1   ∂g1/∂x2   ⋯   ∂g1/∂xn
      ∂g2/∂x1   ∂g2/∂x2   ⋯   ∂g2/∂xn
      ⋮
      ∂gn/∂x1   ∂gn/∂x2   ⋯   ∂gn/∂xn ],  evaluated at x = x(i)    (6.38)
In the rest of this chapter, I will drop the boldface. The examples considered hereafter will
be multivariate. It bears repeating that all the vectors, unless stated otherwise, will be col-
umn vectors; subscripted number indicates corresponding element of the vector, and the
iteration index is indicated by superscript in bracket.
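Before moving to the chemostat example, the multivariate iteration (6.37) can be sketched in Python (the book's own code is in MATLAB). The 2 × 2 test system and its analytical Jacobian below are illustrative, not from the text; Cramer's rule stands in for a general linear solver.

```python
# Sketch of multivariate Newton-Raphson (Equation 6.37) for a 2x2 system.
# The test system (x1^2 + x2^2 = 4, x1 - x2 = 0) is illustrative.
def newton2(g, jac, x, etol=1e-12, max_iter=50):
    for _ in range(max_iter):
        g1, g2 = g(x)
        (a, b), (c, d) = jac(x)          # Jacobian J(x), Equation 6.38
        det = a * d - b * c
        s1 = (g1 * d - b * g2) / det     # s = J^{-1} g via Cramer's rule
        s2 = (a * g2 - g1 * c) / det
        x = [x[0] - s1, x[1] - s2]       # Equation 6.37
        if abs(s1) + abs(s2) < etol:
            break
    return x

g = lambda x: [x[0] ** 2 + x[1] ** 2 - 4.0, x[0] - x[1]]
jac = lambda x: [[2 * x[0], 2 * x[1]], [1.0, -1.0]]
sol = newton2(g, jac, [1.0, 0.5])
```

Starting from [1.0, 0.5], the iterates converge quadratically to [√2, √2], the root of this system in the positive quadrant.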
The multivariate Newton-Raphson technique will be demonstrated using the chemostat
example from Chapter 3. The model for the growth of microorganisms, substrate consump-
tion, and product formation in a continuous stirred reactor is given by
0 = D(Sf − S) − rg
0 = −DX + rg Yxs
0 = −DP + rg Yps    (6.39)
rg = μmax S X / (Ks + S)
The model parameters and solution conditions may be used from the case study in Chapter 3.
The solution is given in the next example.
The analytical Jacobian for this system is

J = [ −D − ∂rg/∂S     −∂rg/∂X            0
      Yxs ∂rg/∂S      −D + Yxs ∂rg/∂X    0
      Yps ∂rg/∂S      Yps ∂rg/∂X        −D ]
function gVal=funMonod(C,monodParam)
mu=monodParam.mu; K=monodParam.K;
Yxs=monodParam.Yxs; Yps=monodParam.Yps;
D=monodParam.D; Sf=monodParam.Sf;
%% Variables to solve for
S=C(1); X=C(2); P=C(3);
%% Model equations
rg= mu*S/(K+S)*X;
gVal(1,1)=D*(Sf-S) - rg;
gVal(2,1)=-D*X + Yxs*rg;
gVal(3,1)=-D*P + Yps*rg;
end
The only difference in the above function is that it is a function of only C, and not of time t, since a steady state solution is sought. The function jacMonod.m for calculation of the analytical Jacobian is
function J=jacMonod(C,monodParam)
mu=monodParam.mu; K=monodParam.K;
Yxs=monodParam.Yxs; Yps=monodParam.Yps;
D=monodParam.D; Sf=monodParam.Sf;
%% Variables to solve for
S=C(1); X=C(2); P=C(3);
%% Jacobian calculation
r_s=mu*X*K/(K+S)^2; % drg/dS
r_x=mu*S/(K+S); % drg/dX
J = [-D-r_s, -r_x, 0;
r_s*Yxs, -D+r_x*Yxs, 0;
r_s*Yps, r_x*Yps, -D];
end
The function g(C) and Jacobian J(C) can then be used in Newton-Raphson solver. The
driver script, monodRun.m, is given below:
The value of the three state variables at steady state using the above code is
On the other hand, if the initial guess were changed to C0=[4; 1; 1], then
Note that the above signifies multiple coexisting steady states. This will be discussed
in further detail in Case Studies (Section 6.5).
The above example demonstrates steady state multiplicity. When there are no microorganisms in the chemostat, no conversion of substrate into products occurs, and the outlet stream contains unconverted substrate (at concentration Sf), without any biomass or product. On the other hand, for a range of values of dilution rate, D, there exists a steady state (C = [0.091 3.68 3.19]ᵀ) with a significant conversion of the substrate to the product.
Numerically, Newton-Raphson suffers from a couple of problems. First, the intermediate
solutions may violate bounds on variables. For example, with C0=[4; 1; 1], the vari-
ous iterated values in the above Newton-Raphson example were
X =
4.0000 5.0281 5.0000 5.0000 5.0000
1.0000 -0.0211 -0.0000 -0.0000 -0.0000
1.0000 -0.0183 -0.0000 -0.0000 -0.0000
Note that the values of X and P at the first iteration are negative. This does not affect conver-
gence in Example 6.5. However, if the function g(C) included square root or logarithm of
C(2) or C(3), then Newton-Raphson would give complex values or fail. Another problem is
that while Newton-Raphson converges rapidly in the vicinity of the solution, x⋆, its global
convergence behavior is poor. That means starting from arbitrary initial guess, as would
be the case when the solution x⋆ is unknown, it often does not converge. It also fails if the
guess values or iterated values are close to local maxima or minima (where g′(x) is zero or
Jacobian is noninvertible). While this may not be a major issue for small problems (single to
a few variables), it can be a significant problem for Newton-Raphson in multiple variables.
Finally, the Jacobian may not be explicitly known. Some of these issues are discussed in the
rest of this section.
J(i)k,l = [ gk({x}(i)l + h) − gk(x(i)) ] / h,   h = x(i)l × 10⁻⁸    (6.40)

Here, the notation {x}(i)l represents that only the lth element of the vector x(i) is changed, and all other elements are kept at their original values. We have used the forward difference formula for numerical differentiation. While the central difference formula is more accurate, it involves an additional function evaluation, g({x}(i)l − h). The following example shows the use of Newton-Raphson with a numerical Jacobian.
function J=monodJacNum(x,g,par)
% This function takes the current value x(i) and g(i)
% and returns the numerical Jacobian
for l=1:3 % Cycle over all x(l)
xbracket=x;
h=abs(xbracket(l))*1e-8;
xbracket(l)=x(l)+h;
g1=monodFun(xbracket,par);
J(:,l)=(g1-g)/h;
end
end
The driver can now be executed with the numerical Jacobian replacing the previous
case of analytical Jacobian. The code takes exactly the same number of iterations and
converges to the same solution:
x(i+1) = x(i) − [B(i)]⁻¹ g(x(i))

B(i)k,l = [ gk(x(i)) − gk(x(i−1)) ] / ( x(i)l − x(i−1)l )    (6.41)

B(i) = B(i−1) + ( [g(i) − g(i−1)] − B(i−1) d(i) ) [d(i)]ᵀ / ( [d(i)]ᵀ [d(i)] )    (6.42)

where d(i) = x(i) − x(i−1)    (6.43)
Broyden’s method is mentioned for the sake of completeness and will not be investi-
gated further.
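Though the method is not pursued further in the text, the update formulas can be sketched compactly. The sketch below applies Broyden's update (6.42) to a small linear test system with the identity matrix as the initial estimate B(0); the test system and these choices are illustrative, not from the text.

```python
# Sketch of Broyden's method (Equations 6.41 through 6.43) for a 2x2
# system, with B(0) = I. Matrix algebra is written out by hand to stay
# dependency-free; the linear test system is illustrative.
def broyden2(g, x, etol=1e-12, max_iter=100):
    B = [[1.0, 0.0], [0.0, 1.0]]   # initial Jacobian estimate B(0) = I
    gx = g(x)
    for _ in range(max_iter):
        det = B[0][0] * B[1][1] - B[0][1] * B[1][0]
        # step d = -B^{-1} g(x) via Cramer's rule
        d0 = -(gx[0] * B[1][1] - B[0][1] * gx[1]) / det
        d1 = -(B[0][0] * gx[1] - gx[0] * B[1][0]) / det
        x = [x[0] + d0, x[1] + d1]
        gnew = g(x)
        if abs(d0) + abs(d1) < etol:
            break
        # Broyden update (6.42): B += ((dg - B d) d^T) / (d^T d)
        dg = [gnew[0] - gx[0], gnew[1] - gx[1]]
        Bd = [B[0][0] * d0 + B[0][1] * d1, B[1][0] * d0 + B[1][1] * d1]
        dd = d0 * d0 + d1 * d1
        r = [dg[0] - Bd[0], dg[1] - Bd[1]]
        B[0][0] += r[0] * d0 / dd
        B[0][1] += r[0] * d1 / dd
        B[1][0] += r[1] * d0 / dd
        B[1][1] += r[1] * d1 / dd
        gx = gnew
    return x

g = lambda x: [2 * x[0] + x[1] - 4.0, x[0] + 3 * x[1] - 6.0]
sol = broyden2(g, [0.0, 0.0])
```

Note that no Jacobian function is supplied: B is built up from the iterates themselves, which is the main attraction of the method.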
s(i) = [ J(x(i)) ]⁻¹ g(x(i))    (6.44)

x(i+1) = x(i) − s(i)    (6.45)

G(x) = (1/2) gᵀg    (6.46)
Since the above function is a quadratic function of a real-valued vector, its minimum
value is 0. It is easy to see that the roots of g(x) form the minima of G(x). It can be shown
that the vector −s above provides the direction of steepest decrease in the value of G(x), and
thus the direction of steepest approach of g(x) to a root. Instability in Newton-Raphson
arises because the complete step may be too aggressive. In order to improve robustness, a
partial step may be taken:
x(i+1) = x(i) − ω s(i)    (6.47)
There are various ways to obtain the line search parameter, ω. Several algorithms exist that
try to obtain ω in an “optimal” way. The key idea is to “search along the direction” −s such
that the most optimal reduction in the value of G(x(i + 1)(ω)) is obtained. Note that as per
Equation 6.47, the next iteration value x(i + 1) is written in terms of the parameter ω. This
is mentioned for the sake of completeness; the discussion of these methods is beyond the
scope of this text.
For the purpose of this section, I will discuss some heuristic ways to improve the performance of Newton-Raphson. One simple way is to start with the full Newton-Raphson step, that is, ω = 1. If x(i+1) is in bounds and |g(x(i+1))| has decreased, retain the step; else, ω is halved. This can be done progressively until a reasonable step is obtained. This backtracking process is repeated at each iteration.
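A sketch of this halving heuristic in Python (the book's code is in MATLAB), applied to a single equation. The test function arctan(x) and the starting point are illustrative choices for which the full Newton step overshoots; they are not an example from the text.

```python
import math

# Backtracking on the line-search parameter w (Equation 6.47): take the
# full Newton step, and halve w until |g| decreases. Illustrative only.
def damped_newton(g, dg, x, etol=1e-12, max_iter=100):
    gx = g(x)
    for _ in range(max_iter):
        s = gx / dg(x)                # full Newton step, Equation 6.27
        w = 1.0
        while w > 1e-4:
            xNew = x - w * s          # Equation 6.47
            gNew = g(xNew)
            if abs(gNew) < abs(gx):   # retain the step if |g| decreased
                break
            w *= 0.5                  # else halve w and retry
        x, gx = xNew, gNew
        if abs(w * s) < etol:
            break
    return x

# undamped Newton on arctan diverges from x0 = 3; damping recovers it
root = damped_newton(math.atan, lambda x: 1.0 / (1.0 + x * x), 3.0)
```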
Another option, which may work for certain specific problems (though it is not a gen-
eral purpose algorithm) is to choose an appropriate ω that works for the problem under
consideration. We often need to solve not a single problem but a family of problems involv-
ing the solution of nonlinear equations. For example, we may execute the simulation of
Example 6.5 for various different values of dilution rates (D), feed substrate concentration
(Sf ), or rate constants (μmax, Ks). One can set up a nominal case, and try the simulation
with ω = 1. If that does not yield a solution, choose ω = 0.5 and keep it constant for all
maxIter iterations. If that does not converge, repeat for ω = 0.25, etc. It should be noted
that using lower values of ω would necessitate a larger number of iterations; so maxIter
should be a reasonably large number (~100 × n, where n is the number of equations to
be solved). I have usually used ω > 0.1, and only very rarely have I had to use ω = 2⁻⁴ or 2⁻⁵, for reasonably tough problems.
The next example demonstrates one simple way to use the line-search parameter ω to
avoid negative values of concentrations. As we will see in Chapter 7, this idea of under-
relaxation is used in other iterative numerical techniques as well. It is widely used in com-
putational fluid dynamics (CFD) and finite element software to ensure stable performance
of the solver.
5.0281
-0.0211
-0.0183
Note that the step taken in value of S was 1.0281, while the maximum allowable step is
1 (since S cannot exceed Sf ). Based on this, we will redo Example 6.5 with ω = 0.9. The
following results are obtained:
As can be seen above, negative values are avoided in the iterations. Moreover, it takes
nine iterations for Newton-Raphson to converge.
PERSONAL NOTE:
I personally find the dichotomy between line search and under-relaxation quite interesting. I use the two terms interchangeably, since the core idea is the same: to use the parameter ω to slow down or speed up an iterative numerical technique. The term "line search" is often used with the Newton-Raphson method (or optimization methods) when the aim is to find the value of ω so that one approaches the solution (or minimum) in the "best" manner. The term "under-relaxation" is commonly used in Gauss-Seidel (Chapter 7) or in CFD, where the aim is to slow down the steps in order to ensure a robust and stable algorithm.
If the above heuristic ideas do not solve the problem, more advanced techniques based on trust regions or other methods (such as those used by MATLAB solvers) need to be used. Besides, there exist other methods, beyond the one shown in Example 6.7, that are designed for nonlinear equations in the presence of constraints. Neither of these is within the scope of this book.
6.4 MATLAB® SOLVERS
MATLAB solvers for nonlinear algebraic equations are provided in the Optimization
Toolbox and are not available in the base MATLAB package. If optimization toolbox is not
available, one may need to write one’s own solution code based on algorithms discussed in
the previous sections. This section discusses two MATLAB methods, fzero and fsolve,
to solve single-variable and multivariable nonlinear equation problems, respectively.
X = fzero(FUN,X0)
Find a zero of the function FUN near X0, if X0 is scalar.
X = fzero(FUN,X0), where X0 is a vector of length 2,
Assumes X0 is a finite interval where the sign of
FUN(X0(1)) differs from the sign of FUN(X0(2)).
Calling fzero with an initial interval (using X0 as a vector of length 2) guarantees fzero
will return a solution if it exists. The fzero algorithm first attempts an inverse quadratic
step; if that results in x(i + 1) that lies between x(L) and x(U), then the solution is accepted. If the
value x(i + 1) lies outside the interval, a bisection step is used instead. The solution v⋆ remains
bracketed all the time. I recommend using fzero supplying an initial interval (rather than
a scalar initial value), as seen in the next example.
% Operating conditions
param.P=1e7; % Pressure (Pa)
param.T=340; % Temperature (K)
Note that the underlined line was the only line that was added to solve the problem
using fzero. The result using the above code is the same as those in the previous
examples:
The above call to fzero uses default optimization parameters. User-specified options
can be supplied to fzero using
X = fzero(FUN,X0,OPTS)
where OPTS is a structure that can be created using the OPTIMSET command. Setting
options will be covered in the next section, while discussing fsolve.
If more information regarding the solution is required, the following output arguments
can be used with fzero:
[v,gVal,flag,info]=fzero(@(v) RK(v,param),[vL,vU]);
When used in the above example, the value returned for flag = 1, indicating that the
solution was found successfully by fzero. The following information is obtained from the
structure info: that fzero required seven iterations and it used a combination of bisec-
tion and Brent’s inverse quadratic interpolation methods.
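The combination just described, an inverse quadratic step that falls back to bisection whenever the trial point leaves the bracket, can be sketched in Python. This is only an illustration of the idea (with a bisection also forced on alternate iterations so that the bracket provably shrinks), not MATLAB's actual fzero implementation; the cubic test function is not from the text.

```python
# Safeguarded iteration: inverse quadratic interpolation (Equation 6.17)
# through the bracket endpoints and the most recently discarded point,
# with bisection as the fallback. Assumes g(xL) and g(xU) differ in sign.
def iq_bisect(g, xL, xU, etol=1e-12, max_iter=200):
    gL, gU = g(xL), g(xU)
    xM, gM = xL, gL                # third interpolation point
    use_iq = False                 # start with a bisection step
    for _ in range(max_iter):
        xNew = 0.5 * (xL + xU)     # default: bisection
        if use_iq and gL != gM and gL != gU and gM != gU:
            xTry = (xL * gM * gU / ((gL - gM) * (gL - gU))
                    + xM * gL * gU / ((gM - gL) * (gM - gU))
                    + xU * gL * gM / ((gU - gL) * (gU - gM)))
            if xL < xTry < xU:     # accept only if inside the bracket
                xNew = xTry
        use_iq = not use_iq        # force a bisection every other step
        gNew = g(xNew)
        if gL * gNew > 0:          # keep the root bracketed, as in (6.7)
            xM, gM = xL, gL
            xL, gL = xNew, gNew
        else:
            xM, gM = xU, gU
            xU, gU = xNew, gNew
        if abs(xU - xL) < etol:
            break
    return xL if abs(gL) < abs(gU) else xU

root = iq_bisect(lambda x: x ** 3 - 2.0, 1.0, 2.0)
```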
With the above expression, I am using another nonlinear equations solver, fsolve, avail-
able in MATLAB (optimization toolbox). fsolve solves the above problem and returns
the value
vSol =
1.6460e-04
This solution is same as the ones obtained using other techniques in this chapter.
The syntax for using fsolve is similar. Let us go over the similarities and differences in
the usage of fzero and fsolve: (i) Like fzero, it requires users to supply function and
an initial guess, and (ii) unlike fzero (and similar to Newton-Raphson), fsolve accepts
only a single initial guess and not an interval. The second point is quite important because
fsolve is a multivariate solver for nonlinear equations of the type
g ( x ) = 0, x Î R n , g : R n ® R n (6.1)
X = fsolve(FUN,X0)
where the size of vector X0 is n × 1, where n is the number of equations to be solved. Thus,
as per the above equation, the size of X0 and the size of function values returned by FUN
should be the same.
Coming back to the Redlich-Kwong example, the solution of the nonlinear equation is

v = 1.646 × 10⁻⁴
fsolve uses the following default tolerance values: tolX=1e-6 and tolFun=1e-6.
With this option, the solution is considered to be converged if
|x(i+1) − x(i)| < tolX    (6.48)

|g(x(i+1)) − g(x(i))| < tolFun    (6.49)
Since the value of molar volume is expected to be of the order of 10⁻⁴ for this problem, tighter tolerance values, at least four orders of magnitude lower, are recommended. The
tolerance values can be specified using the optimset command and then passing the
options structure to fsolve:
opts=optimset('tolX',1e-10,'tolFun',1e-10);
vSol=fsolve(@(v) RK(v,param),vU,opts);
With this change, the solution is now obtained with a more stringent set of error tolerance values. I bring to your attention that, in this particular example, the looser default tolerance values resulted in almost the same solution. However, this is not always the case, and the tolerance values should be carefully chosen based on the order of magnitude of the expected solution x⋆ and the order of magnitude of the function g(x(0)).
Since fsolve is a multivariate solver, I will now use it for the chemostat example below.
0.0909
3.6818
3.1909
Let us now analyze the above problem a bit further. First, the solutions are concentrations
of substrate, biomass, and product. All of these are expected to be of the order O(1). The
initial guess was C0=[0.25; 4.0; 4.0]. The function, monodFun, evaluated at this
initial condition gives
>> monodFun(C0,monodParam)
ans =
-0.5250
0.3500
0.2500
The typical values of the function g(x) computed by monodFun are of the order O(0.1).
Therefore, the default tolerance values of tolX=1e-6 and tolFun=1e-6 are suffi-
cient and need not be changed for this example. Although the readers know this, it still
is worth repeating that the initial guess and solution vectors C0 and X (respectively) are
3 × 1 vectors; the function monodFun returns g(x) as a column vector of the exact same
dimension.
P = RT / (v − b) − a / ( √T v(v + b) )    (6.2)
The above equation was put in the standard form of nonlinear equations in one unknown as

g(v) = P(v − b) − RT + (a/√T) · (v − b) / (v(v + b)) = 0    (6.4)
The above nonlinear equation was solved for given values of T and P to obtain the corre-
sponding molar volume, v.
Another popular EOS is Peng-Robinson EOS:
P = RT / (V − b) − a / [ V(V + b) + b(V − b) ]    (6.50)
An interested reader may use the above PR EOS and obtain the molar volume for given T
and P. You may choose the parameter values of a = 0.364 and b = 3 × 10−5.
ln(Pksat) = a − b / (T + c)    (6.51)
where
Pksat is the saturation pressure in kPa
T is the temperature in °C
Given a temperature, the above equation can be used to calculate the saturation vapor pres-
sure, or vice versa.
Now consider a mixture of two components. If the mixture obeys Raoult’s law, then the
vapor pressure of the component in the gas phase is simply the saturation pressure from
Equation 6.51 multiplied by its liquid-phase mole fraction:
pk = x k Pksat (6.52)
This can be used for vapor-liquid equilibrium (VLE) calculations. At the boiling point of
a pure fluid, the saturation pressure equals the total pressure.* In case of a mixture, we
equivalently calculate bubble point and dew point temperatures for the mixture at a given
pressure (or pressure at a given temperature) for various liquid- and gas-phase composi-
tions, respectively.
The value of temperature that solves the above equation is the bubble point. Note that the
function g(T) is a single function in one variable, given by
gB(T) = 1 − x1 P1sat(T)/P − (1 − x1) P2sat(T)/P = 0    (6.54)
Additionally, to obtain the initial guesses, we will use the property that the bubble point of
ideal mixture lies between the boiling points of the two individual species.
yk P = xk Pksat(T)    (6.55)
x1 =
yP 1 (1 - y1 ) P
, 1 - x1 = sat (6.56)
P (T )
1
sat
P1 (T )
P P
1 - y1 sat + (1 - y1 ) sat =0 (6.57)
P1 (T ) P2 (T )
gD (T )
* For example, at its boiling point of 100°C, the saturation pressure of water is 1 atm.
function gB=bublFun(T,par)
% Bubble point residual function (6.54) for fzero
pSat=antoine(T,par); % Get pSat for both species
x=par.x;
P=par.P;
gB=1-x*pSat(1)/P-(1-x)*pSat(2)/P;
end
function gD=dewFun(T,par)
% Dew point residual function (6.57) for fzero
pSat=antoine(T,par); % Get pSat for both species
y=par.y;
P=par.P;
gD=1-y*P/pSat(1)-(1-y)*P/pSat(2);
end
Both these functions require antoine.m, which takes in the temperature and cal-
culates the saturation pressure of all the species using Equation 6.51. This function is
given below:
function pSat=antoine(T,par)
a=par.a;
b=par.b;
c=par.c;
pSat=exp(a - b./(T+c));
end
Note that the command pSat=exp(a - b./(T+c)) is vectorized with respect to the species. Given Antoine's coefficients for n species, it will return the saturation pressures \(P_k^{sat}\) for all the species as a vector at the temperature T.
Both the bubble point and dew point calculations are performed in the same driver
script VLEdriver.m. The calculations are performed in a loop, for a range of values
of x1 ∈ [0, 1] for bubble point and y1 ∈ [0, 1] for dew point:
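The loop itself is short; a minimal sketch of the bubble point portion is given below (the Antoine coefficients and pressure are illustrative placeholders, not the values used in the text):

```matlab
% Sketch of the bubble point loop in VLEdriver.m
par.P=101.325;              % total pressure (kPa)
par.a=[14.9 14.7];          % Antoine a (placeholder values)
par.b=[3300 3350];          % Antoine b (placeholder values)
par.c=[230 227];            % Antoine c (placeholder values)
xRange=0:0.05:1;
Tbub=zeros(size(xRange));
Tguess=90;                  % initial temperature guess (degC)
for i=1:length(xRange)
    par.x=xRange(i);
    Tbub(i)=fzero(@(T) bublFun(T,par),Tguess);
    Tguess=Tbub(i);         % warm-start the next solve
end
plot(xRange,Tbub);
```

Warm-starting each solve with the previous solution keeps fzero close to the root as x1 is incremented; the dew point loop is identical, with dewFun and par.y in place of bublFun and par.x.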
Figure 6.5 shows the T–x–y curve for this system. The last point, corresponding to
x1 = 1, is the boiling point of acetonitrile, and the first point, corresponding to x1 = 0,
is the boiling point of nitromethane. The lower line is the bubble point curve, whereas
the upper line is the dew point curve.
This method can be easily extended to any system of m components. Standard thermodynamics texts use the example of a ternary mixture of acetonitrile, nitromethane, and acetone.
FIGURE 6.5 T–x–y diagram for acetonitrile-nitromethane mixture, with mole fraction of acetoni-
trile as the abscissa.
\[
\begin{aligned}
0 &= F\,(C_{A0} - C_A) - \left[k_0 \exp\!\left(-\frac{E}{RT}\right) C_A\right] V \\
0 &= F \rho c_p\,(T_0 - T) + (-\Delta H)\left[k_0 \exp\!\left(-\frac{E}{RT}\right) C_A\right] V - UA\,(T - T_j) \\
0 &= F_j \rho_j c_j\,(T_{j0} - T_j) + UA\,(T - T_j)
\end{aligned} \qquad (6.58)
\]
The parameters for this system are given in Table 6.1 (reproduced from Chapter 3). In
Chapter 3, an ODE solver (ode45) was used to obtain the transient simulation results
for the CSTR. After 50 s of operating time, the CSTR exit conditions were obtained as CA = 461.0 mol/m³, T = 647.6 K, and Tj = 472.6 K. Thereafter, the inlet temperature was reduced to 298 K, and the exit conditions obtained at the end of 200 s were CA = 3991 mol/m³, T = 298.8 K, and Tj = 298.4 K.
Note the qualitative difference between the solutions obtained with T0 = 350 K and T0 = 298 K. In the former case, there is significant conversion of the reactant and a correspondingly high reactor temperature.
TABLE 6.1 Model Parameters and Operating Conditions for Jacketed CSTR
CSTR Inlet Cooling Jacket Other Parameters
function gVal=cstrFunTry(x,par)
%% Inlet and current conditions
C0=par.C0; T0=par.T0; Tj0=par.Tj0;
Ca=x(1); T=x(2); Tj=x(3);
%% Key Variables
rxnRate=par.k0*exp(-par.E/T)*Ca;  % par.E stores E/R (in K)
hXfer=par.UA*(T-Tj);
FRhoCp=par.F*par.rho*par.cp;
FjRhoC_j=par.Fj*par.rhoj*par.cj;
%% Model Equations
gVal=zeros(3,1);
gVal(1)=par.F*(C0-Ca)-rxnRate*par.V;
gVal(2)=FRhoCp*(T0-T)+par.DH*rxnRate*par.V-hXfer;
gVal(3)=FjRhoC_j*(Tj0-Tj)+hXfer;
end
Before discussing the CSTR driver script, let us check the values of g(x(0)). Based on the solution obtained in the case study from Chapter 3, an initial guess x0 is chosen and the function is evaluated:
>> cstrFunTry(x0,modelPar)
ans =
1.0e+06 *
0.0000
-4.3979
-0.8400
The value of g1(x) is several orders of magnitude lower than that of the other two functions. The function file needs to be modified based on this observation; accordingly, the three equations are rescaled as follows:
\[
\begin{aligned}
0 &= F\,(C_{A0} - C_A) - r_A V \\
0 &= (T_0 - T) + \frac{(-\Delta H)\,r_A V - UA\,(T - T_j)}{F \rho c_p} \\
0 &= (T_{j0} - T_j) + \frac{UA\,(T - T_j)}{F_j \rho_j c_j}
\end{aligned} \qquad (6.59)
\]
The CSTR function file is now modified accordingly; the modified file is shown below.
function gVal=cstrFun(x,par)
%% Inlet and current conditions
C0=par.C0; T0=par.T0; Tj0=par.Tj0;
Ca=x(1); T=x(2); Tj=x(3);
%% Key Variables
rxnRate=par.k0*exp(-par.E/T)*Ca;
hXfer=par.UA*(T-Tj);
FRhoCp=par.F*par.rho*par.cp;
FjRhoC_j=par.Fj*par.rhoj*par.cj;
%% Model Equations
gVal=zeros(3,1);
gVal(1)=par.F*(C0-Ca)-rxnRate*par.V;
gVal(2)=(T0-T) + (par.DH*rxnRate*par.V-hXfer)/FRhoCp;
gVal(3)=(Tj0-Tj)+hXfer/FjRhoC_j;
>> cstrFun(x0,modelPar)
ans =
32.1916
-70.3663
-42.0000
Since the three elements of g(x) are now of the same order of magnitude as one another, this function will be used in solving the CSTR model equations. The driver script, cstrDriver.m, is given below:
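A sketch of such a driver is given below (it assumes modelPar has already been populated with the Table 6.1 values, which are not repeated here):

```matlab
% Sketch of cstrDriver.m
x0=[461.0; 647.6; 472.6];   % initial guess from the Chapter 3 simulation
opt=optimset('Display','iter','TolFun',1e-8);
[X,gval]=fsolve(@(x) cstrFun(x,modelPar),x0,opt);
disp(X); disp(gval);
```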
When the above script is executed, fsolve converges and returns the solution X. The value of gval at the converged solution is of the order of O(10⁻¹⁰), indicating that the solution is converged.
Furthermore, there isn’t just one solution at this value of T0 = 350 K. There are, in fact, three
solutions, as demonstrated in the example below.
This leads to the third and final solution, x = [2450.2 mol/m3; 476.4 K; 387.2 K].
In all the three cases, the value of gVal returned by fsolve was O(10⁻⁸), indicating that the function converged successfully to a solution x⋆.
6.5.4 Recap: Chemostat
The chemostat example was covered in Section 6.4.2 (Example 6.9). It involved solving
three equations in three unknowns simultaneously:
\[
\begin{aligned}
0 &= D\,(S_f - S) - r_g \\
0 &= -DX + r_g Y_{xs} \\
0 &= -DP + r_g Y_{ps}
\end{aligned} \qquad (6.39)
\]
where the growth term is
\[
r_g = \mu_{max}\,\frac{S X}{K_s + S}
\]
* At this stage, this is being presented as trial and error. The case study in Chapter 9 will provide specific details on how to
obtain a reasonable initial guess close to the solution.
\[
f(x) = \int_a^b K(x,s)\,\varphi(s)\,ds \qquad (6.60)
\]
Note that f(x) is a function of x only, and not of s. A related equation is the Volterra integral equation of the first kind:
\[
f(x) = \int_a^x K(x,s)\,\varphi(s)\,ds \qquad (6.61)
\]
The main difference is that one of the limits of integration is also the variable x. The crux of integral equations is that the left-hand side f(x) is known. The function K(x, s) is called the kernel function and is also known.
Consider the design equation of a plug flow reactor (PFR) (see Chapter 3) with an inlet
concentration of reactant as 1 mol/m3:
\[
\frac{V}{Q} = -\int_{1}^{C_{out}} \frac{dC}{r(C)} \qquad (6.62)
\]
In Chapter 3, we integrated the above to find the volume of the PFR that achieves 75% con-
version. In other words, Q and Cout were known, whereas V was unknown. This was there-
fore a straightforward numerical integration problem. We used an ODE solver in Chapter 3
when we asked the question, “find the conversion from a reactor of volume 0.125 m3.” Let
us now see an alternative way to pose the problem as nonlinear integral equation and solve
the problem using techniques from this chapter.
If the volume is given, the right-hand side is known, whereas the upper limit of integra-
tion (Cout) is the unknown to be found. Thus, x ≡ Cout in Equation 6.61. Rearranging the
above equation
\[
\underbrace{\frac{V}{Q} + \int_{1}^{C_{out}} \frac{dC}{r(C)}}_{g(C_{out})} = 0 \qquad (6.63)
\]
The aim of the nonlinear equation solver is to compute the Cout that is achieved in a PFR
with residence time, V/Q = 1.25 s. We will do this for two cases: first-order reaction and a
complex kinetic model from Chapter 3. In either case, the nonlinear equation (6.63) to be
solved may be written as
\[
g(C_{out}) \equiv \frac{V}{Q} + I(C_{out}) = 0 \qquad (6.64)
\]
In the first example, since the reaction is first-order, the integral I(Cout) can be solved analytically. The analytical and numerical ways of solving the first-order reaction will be considered in the next section for pedagogical reasons. Thereafter, a complex kinetic expression that requires numerical solution is covered in Section 6.5.5.2. A more experienced reader may skip Section 6.5.5.1 and go directly to Section 6.5.5.2.
6.5.5.1 First-Order Kinetics
When the reaction is a first-order reaction (r = kC), the integral is given by
\[
I(C_{out}) = \frac{1}{k}\Big[\ln(C)\Big]_{1}^{C_{out}} = \frac{1}{k}\ln(C_{out}) \qquad (6.65)
\]
When substituted in Equation 6.63, we get a nonlinear equation in one unknown (Cout),
which can be solved using fzero, as shown in the next example.
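Before doing so numerically, note that this case also admits a closed-form check. Setting g(Cout) = 0 in Equations 6.64 and 6.65 and solving for Cout gives

```latex
\frac{V}{Q} + \frac{1}{k}\ln C_{out} = 0
\;\;\Rightarrow\;\;
C_{out} = \exp\!\left(-\frac{kV}{Q}\right)
```

which is the expression used for Ctrue in the driver below to quantify the numerical error.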
X=(1-Csol)/1; % Conversion
X=round(X*100,1); % ... in percentage
disp(['Conversion = ',num2str(X),'%']);
We can compute the true value of concentration and error in numerical computation as
Ctrue = exp(-V*k/Q);
err=abs(Csol-Ctrue);
The above driver calls the function pfrIntFun.m, which calculates g(Cout) from
Equation 6.64. This function is given below:
function gVal=pfrIntFun(Cout,par)
% To calculate g(Cout) for simulating a PFR
% using integral equation approach
% where, g = V/Q + I(Cout)
% V/Q is the residence time
% I is integral [dC/r] at given Cout
%% Calculating integral
intValue=log(Cout)/par.k;
%% Calculate g(Cout)
gVal=par.V/par.Q + intValue;
The conversion is found to be 46.5%. The error using the above approach is ~O(10⁻¹⁶).
One of the important features of numerical techniques is that the function (g(Cout) or
I(Cout) in this example) does not need to be explicit; it can equivalently be implicitly cal-
culated. The design problem was solved in Chapter 3, wherein the volume was obtained
after calculating the integral using MATLAB function trapz.* Thus, the numerical
solution that employs trapz can be used instead of the analytical solution. This would
mean replacing the underlined sections of the function file pfrIntFun.m (in the code
given above) with a numerical approach that uses trapz. This is shown in the example
below.
* See Appendix D for more information on the trapezoidal rule for integration. The MATLAB function trapz that implements
the trapezoidal rule is also discussed in the appendix.
function gVal=pfrIntFun(Cout,par)
% To calculate g(Cout) for simulating a PFR
% using integral equation approach
% where, g = V/Q + I(Cout)
% V/Q is the residence time
% I is integral [dC/r] at given Cout
%% Calculating integral using trapz
n=50; % number of intervals
h=(1-Cout)/n; % step-size for integration
C=1:-h:Cout;
r=par.k * C;
intValue=trapz(C,1./r);
%% Calculate g(Cout)
gVal=par.V/par.Q + intValue;
Compare this modified code with the previous one to notice that the section of the previous code (titled "Calculating integral") is replaced with the section titled "Calculating integral using trapz". There is no other change.
With this change, the error in computing Cout is expected to be higher, since it will be governed by the error in computing the integral. With n = 50 intervals used to compute the integral using the trapezoidal rule, the following results were obtained:
Conversion = 46.5%
>> disp(err)
9.5955e-06
It is also possible to check the effect of the number of intervals on the solution. When
the number of intervals was increased by one order of magnitude to n = 500, the error
in calculation of Cout decreased by two orders of magnitude. With n = 500
>> disp(err)
9.5967e-08
This observation may be attributed to the fact that the global truncation error in the trapezoidal rule* is O(h²), where h = (Cin − Cout)/n.
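The two reported errors are consistent with this scaling:

```latex
err \propto h^2 = \left(\frac{C_{in}-C_{out}}{n}\right)^{\!2}
\;\;\Rightarrow\;\;
\frac{err(n=500)}{err(n=50)} = \left(\frac{50}{500}\right)^{\!2} = 10^{-2}
```

matching the drop from 9.6 × 10⁻⁶ to 9.6 × 10⁻⁸.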
The primary take-home message from this example is that a numerical technique is agnostic to how g(Cout) was computed: either as a closed-form expression as per Equations 6.64 and 6.65, or with the integral required in g(Cout) calculated numerically. Equipped with
this, we proceed to solve the original problem of simulating a PFR with complex kinetics.
* See Appendix D for more information on the trapezoidal rule for integration. The MATLAB function trapz that implements
the trapezoidal rule is also discussed in the appendix.
6.5.5.2 Complex Kinetics
We are now ready to solve the original problem, which was to find the variation in out-
let concentration from a PFR as a function of length of the PFR, given the kinetic rate
expression:
\[
r = \frac{kC}{1 + K_r C^2} \qquad (6.66)
\]
where k = 2 s⁻¹ and K_r = 1 mol⁻² m⁶.
The nonlinear equation to be solved, g(Cout) = 0, is given by Equation 6.64, where the integral I(Cout) is computed numerically using trapz:
\[
I(C_{out}) = \int_{1}^{C_{out}} \frac{dC}{r(C)} \qquad (6.67)
\]
which is evaluated in MATLAB as trapz(C,1./r).
modelParam.k=2;
modelParam.Kr=1;
In the function file, the reaction rate term from first-order kinetics is modified for the
rate expression (6.66), as was covered in Chapter 3. The modified code is given below,
with the changes underlined:
function gVal=pfrIntFunLH(Cout,par)
% This function calculates g(Cout) for simulating
% a PFR using integral equation approach
% where, g = V/Q + I(Cout), V/Q is the residence time,
% I is integral [dC/r] at given Cout
k=par.k;
Kr=par.Kr;
%% Calculating integral using trapz (rate from Equation 6.66)
n=50; h=(1-Cout)/n;
C=1:-h:Cout;
r=k*C./(1+Kr*C.^2);
intValue=trapz(C,1./r);
%% Calculate g(Cout)
gVal=par.V/par.Q + intValue;
end
The solution obtained using the above code is the same as that obtained in Chapter 3:
Conversion = 87.8%
Now that the above code gives the appropriate conversion from the PFR, we need to modify the driver script. We will solve the integral equation problem for a range of lengths (L=0:0.05:0.5), store the solutions in a vector (XOUT in the following script), and plot the conversion vs. length. The resulting script is given below.
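A sketch of such a driver follows; the cross-sectional area A and flow rate Q below are illustrative assumptions (only the ratio V/Q matters for the solution):

```matlab
% Sketch of the PFR length-sweep driver
modelParam.k=2; modelParam.Kr=1;
modelParam.Q=1e-4;          % volumetric flow (m^3/s), illustrative
A=1e-3;                     % cross-sectional area (m^2), illustrative
L=0:0.05:0.5;               % PFR lengths (m)
XOUT=zeros(size(L));        % conversions (zero at L=0)
Cguess=0.99;                % start just below the inlet concentration
for i=2:length(L)
    modelParam.V=A*L(i);    % reactor volume for this length
    Csol=fzero(@(C) pfrIntFunLH(C,modelParam),Cguess);
    XOUT(i)=(1-Csol)/1;     % conversion (C_in = 1 mol/m^3)
    Cguess=Csol;            % warm-start the next solve
end
plot(L,100*XOUT);
```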
The results are shown in Figure 6.7. These results are the same as those obtained for
the same problem using the ODE-solving approach in Chapter 3. This demonstrates
the validity of the integral equation solving approach shown here.
I will end this section with some perspective about this problem. Simulating the behavior
of reacting species in a PFR is more conveniently handled using ODE solution techniques.
FIGURE 6.7 Conversion vs. length of PFR for LH kinetics using integral equation solving approach.
The intent of this case study was to demonstrate alternative routes to solving the same
problem. However, more importantly, it was to introduce readers to the fact that nonlin-
ear equation solving techniques are more general than what we would typically catego-
rize as “root finding”; these methods can be used for problems that can be cast into the
form g(x) = 0, where one needs to find x that satisfies these equations. The nature of g(⋅)
need not be explicit; it could be implicitly specified or obtained as a solution using another
numerical procedure. Finally, this case study also introduces the reader to integral equations. Although uncommon in chemical engineering, integral equations sometimes arise in problems involving radiation.
6.6 EPILOGUE
Solving ODEs and nonlinear algebraic equations form two of the most important classes of
examples of interest to chemical engineers. While Chapter 3 focused on ODEs, this chapter
focused on solving nonlinear equations of the form
g ( x ) = 0 (6.1)
I used the same approach in this chapter as in Chapter 3. An example of finding the molar
volume of a real gas using EOS was used to motivate the problem. This is an example of a
single-variable nonlinear equation solving problem.
Thereafter, several methods for solving algebraic equations were discussed in Section 6.2.
Starting with the single-variable problem, theoretical results for selected numerical methods
were discussed. Rather than focusing on derivations, the aim of this chapter was to high-
light the practical implications in solving problems of interest. Numerical methods for solv-
ing general algebraic equations of the type (6.1) fall into two categories: bracketing methods
and open methods. In the case of the former, two initial guesses always “bracket” a solution,
whereas this requirement is not applicable for open methods.
MATLAB provides several algorithms for solving nonlinear equations. Two of the most popular and versatile algorithms provided in MATLAB are fzero and fsolve. The
former is a single-variable solver that preferably uses a bracketing method, whereas the
latter is a multivariable general-purpose solver. The single-variable bracketing methods were introduced in Section 6.2 to help readers make informed choices while using the fzero algorithm. Newton-Raphson and its variants, which are the most popular nonlinear equation solving algorithms, were discussed in some detail. This highlights features
and limitations of nonlinear equation solving algorithms and several variations used to
address those limitations.
Solving coupled nonlinear equations for large-dimensional problems can sometimes be
a tough problem to tackle. A few practical guidelines that I personally follow for MATLAB
are summarized below.
(B) Choosing appropriate step tolerances: Convergence metrics are an important point that needs to be stressed in this summary. The importance of using solver options was introduced in Section 6.4. Convergence is verified using a tolerance on the step,
\[
\left| x^{(i+1)} - x^{(i)} \right| < \epsilon \qquad (6.48)
\]
or a tolerance on the change in the function value,
\[
\left| f\!\left(x^{(i+1)}\right) - f\!\left(x^{(i)}\right) \right| < \delta \qquad (6.49)
\]
The default values for both the tolerances in MATLAB are 10⁻⁶. In the EOS example, since the molar volume was ~1.6 × 10⁻⁴, a more stringent stopping criterion for ε (the tolX optimizer option in MATLAB) was used. Similar care is required with respect to the stopping criterion δ as well (the tolFun optimizer option in MATLAB).
(C) Choosing appropriate function tolerances: Consider the function g(x) = (x − 1)³, plotted for x ∈ [0.5, 1.5].
>> opt=optimset('tolFun',1e-10);
* Several users implicitly do not trust fsolve for its "inability" to solve such apparently simple problems. A problem that is easy for humans to solve may not be easy for a numerical technique. Problems of the nature g(x) = (x − 1)ⁿ or g(x) = xⁿ − 1 are textbook examples for exercising caution while using gradient-based solvers (Newton-Raphson, secant, or fsolve). This book is intended to equip readers to understand why the method appears to fail and how to make it work.
Since the value of g(x) computed was small, experimenting with different values of tolFun indicates whether the solution returned by fsolve is reasonable.*
(E) When the solver does not converge: The solver may not converge even after attempting to solve the problem with several different initial guesses. In such a case, the problem may be reformulated. For example, the material balance equation for the CSTR,
\[
0 = F\,(C_{A0} - C_A) - \left[k_0 \exp\!\left(-\frac{E}{RT}\right) C_A\right] V \qquad (6.58)
\]
may be rearranged as
\[
0 = 1 - k_0\,\tau \exp\!\left(-\frac{E}{RT}\right)\left(\frac{C_A}{C_{A0} - C_A}\right) \qquad (6.68)
\]
where τ = V/F is the residence time.
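The rearrangement follows by dividing the material balance through by F(C_A0 − C_A) and identifying τ = V/F:

```latex
0 = F(C_{A0}-C_A) - k_0 e^{-E/RT} C_A V
\;\;\xrightarrow{\;\div\,F(C_{A0}-C_A)\;}\;\;
0 = 1 - k_0\,\tau\,e^{-E/RT}\,\frac{C_A}{C_{A0}-C_A}
```

The rescaled residual is dimensionless and O(1) near the solution, which parallels the scaling used in Equation 6.59.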
This brings us to the end of this chapter, where we focused on solving a generic system of
equations. Discretization of ODE-BVPs (boundary value problems) or partial differential
equations (PDEs) results in a set of nonlinear equations with a specific structure. Methods
that exploit the special structure and iterative methods that build upon the concept of
under-relaxation will be the focus of Chapter 7.
EXERCISES
Problem 6.1 Peng-Robinson Equation of State
\[
P = \frac{RT}{V - b} - \frac{a}{V\,(V + b) + b\,(V - b)} \qquad (6.50)
\]
* Changing the tolFun will only affect the results when the function tolerance criterion causes fsolve to converge
(i.e., change in the function value is less than the function tolerance). If instead fsolve convergence is due to other rea-
sons, it is sufficient to verify if the function value at the solution, g(x⋆) is lower than a desired threshold.
\[
x = G(x) \qquad (6.18)
\]
Starting with the initial condition, x⁽⁰⁾, the next iteration value is obtained as x⁽¹⁾ = G(x⁽⁰⁾). Thereafter, subsequent iteration values are obtained as
\[
x^{(i+1)} = \theta\,x^{(i)} + (1 - \theta)\,G\!\left(x^{(i)}\right) \qquad (6.69)
\]
where
\[
\theta = \frac{S}{S - 1}, \qquad S = \frac{G\!\left(x^{(i)}\right) - G\!\left(x^{(i-1)}\right)}{x^{(i)} - x^{(i-1)}}
\]
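To see why this choice of θ accelerates convergence, consider a scalar linear map G(x) = Sx + c, whose fixed point is x⋆ = c/(1 − S). With θ = S/(S − 1), a single relaxed step gives

```latex
x^{(i+1)} = \frac{S}{S-1}\,x^{(i)} - \frac{1}{S-1}\left(S\,x^{(i)} + c\right)
= \frac{-c}{S-1} = \frac{c}{1-S} = x^\star
```

so the iteration lands exactly on the fixed point in one step; for nonlinear G, the secant estimate S makes this approximately true near the solution.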
7.1 GENERAL SETUP
Numerical methods for solving algebraic equations were covered in Chapter 6, wherein a
generic problem of finding root(s) of the following set of equations was discussed:
g ( x ) = 0 (7.1)
where
x ∈ R n is an n-dimensional column vector
g : R n → R n is an n-dimensional vector function
A specific case is when the above system of equations is linear. One could then write
g = Ax − b, and the equations to be solved are a set of linear equations:
Ax = b (7.2)
with
A ∈ R n×n is a matrix
b ∈ R n is a column vector
The standard methods for solving linear equations that are based on Gauss Elimination are
introduced in Appendix C. A significant amount of time is devoted to Gauss Elimination
and related methods in typical numerical techniques books. Since MATLAB® provides
powerful tools for solving linear equations, we are simply going to use them in this book.
The methods for nonlinear equations in Chapter 6 and linear equations in Appendix
C are general-purpose methods. These methods do not make any assumption about the
structure of the function g(x) or the matrix A, respectively. However, in a large number of
engineering problems of interest, g(x) and/or A have a particular sparse structure. These
structures of the functions/matrix can be exploited to solve the problems efficiently. This
chapter focuses on problems where g(x) and/or A have banded structures.
A generic second-order ODE-BVP has the form
\[
\alpha\,\frac{d^2\phi}{dx^2} = f\!\left(x, \phi, \frac{d\phi}{dx}\right) \qquad (7.3)
\]
Two conditions are required for solving the above problem. If the two conditions are provided at the same initial location, the above can be converted into a set of two first-order ODE-IVPs. However, there are several occasions when the conditions are specified at two different boundaries. This leads to an ODE-BVP.
The boundary conditions are of three types:
\[
\text{Dirichlet:}\quad \phi(x_0) = a \qquad (7.4)
\]
\[
\text{Neumann:}\quad \phi'(x_0) = b \qquad (7.5)
\]
\[
\text{Mixed:}\quad \phi'(x_0) = a\,\phi(x_0) + b \qquad (7.6)
\]
One of these boundary conditions is applied at each end of the computational domain.
7.1.2 Elliptic PDEs
Another set of problems that have qualitatively similar numerical features are elliptic PDEs,
which were introduced in Chapter 4. Systems that are governed by elliptic PDEs are those
where diffusive behavior is important, and therefore, conditions at all the boundaries affect
the solution value at any point within the domain. Owing to these common physical fea-
tures, a similar approach to solving and analyzing these systems can be used. A typical
elliptic PDE in 2D is given by
\[
\Gamma_x \frac{\partial^2\phi}{\partial x^2} + \Gamma_y \frac{\partial^2\phi}{\partial y^2} = f\!\left(x, y, \phi, \frac{\partial\phi}{\partial x}, \frac{\partial\phi}{\partial y}\right) \qquad (7.7)
\]
Special Methods for Linear and Nonlinear Equations ◾ 275
The above PDE is solved subject to two boundary conditions in each direction. As in the
case of ODE-BVP, these may be Dirichlet (Equation 7.4), Neumann (Equation 7.5), or
mixed (Equation 7.6) boundary conditions.
If convection dominates in one of the directions (e.g., Γx ≈ 0), the problem reduces to a parabolic PDE; if diffusion is negligible in both directions, we get a first-order hyperbolic PDE. Methods for solving hyperbolic and parabolic PDEs were discussed in Chapter 4.
In a banded matrix with bandwidth ℓ, the ith equation involves only the unknowns within ℓ positions of xi:
\[
A_{i,i-\ell}\,x_{i-\ell} + \cdots + A_{i,i}\,x_i + \cdots + A_{i,i+\ell}\,x_{i+\ell} = b_i \qquad (7.8)
\]
Thus, the elements in the ith equation depend only on the previous and subsequent ℓ elements. Extending this idea to nonlinear equations, the ith function gi(x) is such that it depends only on (xi − ℓ, … , xi + ℓ), and it does not depend on (x1, … , xi − ℓ − 1) or (xi + ℓ + 1, … , xn). Alternatively, the Jacobian matrix, at any x ∈ ℛn, has a sparse structure with bandwidth ℓ.
* The number of subdiagonal and superdiagonal elements need not be equal. I have taken the two to be equal for ease of
discussion. The band could span i − ℓsub to i + ℓsuper for ith row.
7.2.1.1 Tridiagonal Matrix
A tridiagonal matrix is a matrix with ℓ = 1, that is, a matrix where only the diagonal element,
one subdiagonal element, and one superdiagonal element are nonzero. An example of a
tridiagonal system was seen in Chapter 4, when we implemented method of lines, which
used the finite difference formula on the spatial derivative ∂/∂z and ∂2/∂z2.
Sometimes, it becomes important to view the structure of a banded/sparse matrix while
solving or debugging a problem. MATLAB provides a command, spy(A), that can be
used to visually inspect the structure of a matrix. This generates a plot with n rows and
columns, with the nonzero elements of A indicated with markers on the plot. I will use this
command to investigate some of the common banded structures we are likely to encounter
in chemical engineering problems.
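For example, a tridiagonal pattern like the one in Figure 7.1a can be generated and inspected as follows (the size n = 20 is arbitrary):

```matlab
n=20;
e=ones(n,1);
A=spdiags([e -2*e e],-1:1,n,n);  % sub-, main, and superdiagonal
spy(A);                          % plot the nonzero pattern of A
```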
Figure 7.1 shows some common sparse structures. Tridiagonal and banded structures
were discussed earlier. For the specific band-diagonal example in the figure, ℓ = 3. The third
common example is that of block-diagonal structure. This is quite similar to the band-diag-
onal structure but with some additional elements zero as well. A closer look at Figure 7.1
(bottom left) reveals that the overall structure contains a single block that is repeated four times. The reason it is called block-diagonal is that one can imagine a 4 × 4 grid of blocks, each block being a matrix. Only the diagonal blocks are nonzero, whereas the others are matrices of zeros. These repeating diagonal blocks may have full or sparse structures. Finally, a
multibanded structure is self-explanatory in that it contains additional nonzero bands, as
seen in Figure 7.1.
FIGURE 7.1 Some of the common sparse structures, examined by MATLAB® spy command.
(a) Tridiagonal, (b) banded, (c) block-diagonal, and (d) multiband.
We first discuss an example and use it to set up the problem. Thereafter, the TDMA will be discussed in the context of this problem.
\[
k\,\frac{d^2 T}{dz^2} = h_\infty a_v\,(T - T_\infty) \qquad (7.9)
\]
\[
T(0) = T_0, \qquad T(L) = T_\infty \qquad (7.10)
\]
where
T is the temperature at any location in °C
T∞ is the temperature of the ambient
k is the thermal conductivity of the rod
h∞ is the heat loss coefficient
av is the external area of the rod per unit volume
For a cylindrical rod, av = 2/r. The above problem is an ODE-BVP, since the conditions are
specified at two separate boundary points, z = 0 and z = L. Before moving to the numerical
solution, let us discuss the true solution that can be obtained analytically since the above is
a linear ODE.
FIGURE 7.2 Heat conduction in a metal rod with heat loss to the surroundings and the domain with
finite difference discretization.
* The modern refrigerators use evaporative cooling, the model for which would be somewhat more complex than what we
need in this chapter.
7.2.2.1.1 Analytical Solution The above problem can be solved analytically by defining a deviation variable, θ = (T − T∞), which results in the following ODE-BVP:
\[
\frac{d^2\theta}{dz^2} - \underbrace{\frac{h_\infty a_v}{k}}_{m^2}\,\theta = 0, \qquad \theta(0) = \theta_0, \quad \theta(L) = 0 \qquad (7.11)
\]
Applying the boundary conditions, the solution is
\[
\theta = \theta_0\,\frac{\sinh\!\big(m\,(L - z)\big)}{\sinh(mL)} \qquad (7.13)
\]
or, in terms of the original variables,
\[
T = T_\infty + (T_0 - T_\infty)\,\frac{\sinh\!\big(m\,(L - z)\big)}{\sinh(mL)} \qquad (7.14)
\]
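It is easy to verify that Equation 7.13 satisfies the ODE-BVP (7.11): differentiating twice brings down a factor of m², and the boundary values follow from sinh(0) = 0:

```latex
\frac{d^2}{dz^2}\left[\theta_0\,\frac{\sinh\!\big(m(L-z)\big)}{\sinh(mL)}\right] = m^2\,\theta,
\qquad
\theta(0) = \theta_0, \quad \theta(L) = \theta_0\,\frac{\sinh(0)}{\sinh(mL)} = 0
```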
The numerical solutions can be compared against the above analytical solution of the
ODE-BVP. In order to solve the original problem (7.9) numerically, the domain is
discretized into n-intervals using the finite difference approach. Figure 7.2 shows the dis-
cretized domain. Since the array indices in MATLAB start with 1, the vertices of the
discretized domain are numbered from i = 1 to i = (n + 1). The interval size is given by
Δ = L/n, and application of the central difference formula (see Appendix B) results in the
following equations:
\[
k\,\frac{T_{i+1} - 2T_i + T_{i-1}}{\Delta^2} = h_\infty a_v\,(T_i - T_\infty), \qquad i = 2 \text{ to } n \qquad (7.15)
\]
The two boundary conditions are applied at i = 1 and i = (n + 1). Defining the parameter β = h∞avΔ²/k, the resulting (n + 1) linear equations are given by
\[
T_1 = T_0; \qquad
T_{i+1} - 2T_i + T_{i-1} = \beta\,(T_i - T_\infty), \quad i = 2 \text{ to } n; \qquad
T_{n+1} = T_\infty \qquad (7.16)
\]
* Please refer to any standard engineering math text for the derivation of the solution for this and related problems.
7.2.2.1.2 Tridiagonal Matrix Equation The above set of equations can be written in the standard linear equations form by defining the solution vector x = [T1 T2 ⋯ Tn Tn + 1]T:
\[
\underbrace{\begin{bmatrix}
1 & 0 & & & & 0 \\
1 & -(2+\beta) & 1 & & & \\
0 & 1 & -(2+\beta) & 1 & & \\
 & & \ddots & \ddots & \ddots & \\
 & & & 1 & -(2+\beta) & 1 \\
0 & & & & 0 & 1
\end{bmatrix}}_{A}
\underbrace{\begin{bmatrix} T_1 \\ T_2 \\ T_3 \\ \vdots \\ T_n \\ T_{n+1} \end{bmatrix}}_{x}
=
\underbrace{\begin{bmatrix} T_0 \\ -\beta T_\infty \\ -\beta T_\infty \\ \vdots \\ -\beta T_\infty \\ T_\infty \end{bmatrix}}_{b} \qquad (7.17)
\]
The above problem can be solved by specifying the matrix A and vector b and solving the resulting equations in a standard manner: X = A\b. This is demonstrated in the next example.
A(1,1)=1; b(1)=T_0;
for i=2:n
A(i,i-1)=1;
A(i,i)=-(2+beta);
A(i,i+1)=1;
b(i)=-Ta*beta;
end
A(n+1,n+1)=1; b(n+1)=T_L;
X=A\b;
plot(Z,Ttrue,'-k',Z,X,'-.m');
err=max(abs(Ttrue-X)./Ttrue);
The results are plotted in Figure 7.3. The solid (black) line is the true solution, whereas the other two lines are the solutions for n = 5 and n = 40 nodes, respectively. The latter solution closely matches the true solution. The solution with n = 5 nodes is also of reasonable accuracy.
The relative errors in temperature values were also calculated. Starting with n = 5, the
number of nodes was doubled until reasonable solution was obtained. The errors were
n err
5 0.0060
10 0.0015
20 3.8759e-04
40 9.7041e-05
The solution with n = 20 and n = 40 nodes are both reasonably accurate. It may also
be of interest to the reader to note that each time the step-size is halved (i.e., the
number of nodes is doubled), the error decreases by a factor of 4. This is because the O(h²) accurate central difference formula governs the overall errors in this problem.
FIGURE 7.3 Temperature profile in a rod with heat conduction and heat losses. Dirichlet boundary
conditions are applied, that is, the temperatures at the two ends of the rod are specified.
The standard method of solving a general linear equation was used in the previous example. This method works, and for a small-dimensional system such as the one in Example 7.1, the computational requirements are manageable. However, for larger-dimensional systems
and in cases where the problem needs to be solved multiple times, solving the full problem
is very inefficient. There are several methods that exploit the banded structure of matrix A
to solve the linear equation more efficiently. The TDMA, which is applicable to tridiagonal
systems, will be discussed in the following text.
7.2.2.2 Thomas Algorithm
Thomas algorithm, or TDMA, can be used to efficiently solve a linear system of equations
of the type (7.2), where A has a tridiagonal structure. A tridiagonal structure is when only
three diagonals of matrix A are nonzero:
\[
\begin{bmatrix}
d_1 & u_1 & & & 0 \\
l_2 & d_2 & u_2 & & \\
 & l_3 & d_3 & \ddots & \\
 & & \ddots & \ddots & u_{n-1} \\
0 & & & l_n & d_n
\end{bmatrix} x =
\begin{bmatrix} b_1 \\ b_2 \\ b_3 \\ \vdots \\ b_n \end{bmatrix} \qquad (7.18)
\]
Let the three diagonals be represented as vectors l, d, u. Thus, any equation may be written as
\[
l_i\,x_{i-1} + d_i\,x_i + u_i\,x_{i+1} = b_i \qquad (7.19)
\]
As in the case of Gauss Elimination (see Appendix C), the TDMA aims to convert matrix A into an upper triangular matrix using row operations, starting with the first row:
\[
d_1 x_1 + u_1 x_2 = b_1 \qquad (7.20)
\]
In the first step, this is the pivot row and d1 is the pivot element. The following two row operations are performed: (i) divide the entire row by the pivot element, and (ii) use the pivot row to make the subdiagonal element in the first column zero. These operations yield
\[
x_1 + \bar{u}_1 x_2 = \bar{b}_1 \qquad (7.21)
\]
The first row operation requires only two computations on the first row: \(\bar{u}_1 = u_1/d_1\) and \(\bar{b}_1 = b_1/d_1\), and the diagonal element is then set to unity, d1 = 1. The only nonzero element in the pivot column is the element l2, while all other elements are already zero. Thus, the row operation with the row R1 as the pivot row is R2 ← R2 − l2R1. The only computations required
in this row are d2 = d2 − l2u1 and b2 = b2 − l2b1. Thus, at the end of the first sequence of operations, we have the first two rows as
$$\begin{bmatrix} 1 & \bar{u}_1 & 0 & \cdots & 0 & \bar{b}_1 \\ 0 & \bar{d}_2 & u_2 & \cdots & 0 & \bar{b}_2 \end{bmatrix} \tag{7.22}$$
The second row of Equation 7.22 reads
$$\bar{d}_2 x_2 + u_2 x_3 = \bar{b}_2 \tag{7.23}$$
This modified row has the exact same form as Equation 7.20. Hence, the exact same proce-
dures are to be repeated. Generalizing, the operations involved in the ith step are
$$R_i \leftarrow \frac{R_i}{d_i}, \qquad R_{i+1} \leftarrow R_{i+1} - l_{i+1} R_i \tag{7.24}$$
I have dropped the over-bars and replaced the equality with an assignment operator “←”
to indicate that the right-hand side replaces the values on the left-hand side. Recall that the
above two operations have two calculations each:
$$u_i \leftarrow \frac{u_i}{d_i}, \quad b_i \leftarrow \frac{b_i}{d_i}, \quad d_i \leftarrow 1 \tag{7.25}$$
$$d_{i+1} \leftarrow d_{i+1} - l_{i+1}u_i, \quad b_{i+1} \leftarrow b_{i+1} - l_{i+1}b_i, \quad l_{i+1} \leftarrow 0 \tag{7.26}$$
The above calculations are repeated for all pivot rows, that is, i = 1 to (n − 1).
The number of mathematical operations required for TDMA is as follows: Equation 7.25
requires two division operations (the third one is simply an assignment since the result
di ← 1 is known a priori), whereas Equation 7.26 requires two multiplication and two sub-
traction operations (again li + 1 ← 0 is known a priori). Thus, the total number of operations
in the forward elimination steps is 4(n − 1) multiplication/division and 2(n − 1) addition/
subtraction operations. This is significantly lower than the O(n³) operations required in Gauss
Elimination.
At the end of elimination operations, the equations are converted to the following form:
$$\begin{bmatrix} 1 & u_1 & 0 & \cdots & 0 \\ 0 & 1 & u_2 & & \vdots \\ 0 & 0 & 1 & \ddots & \\ \vdots & & & 1 & u_{n-1} \\ 0 & \cdots & 0 & 0 & d_n \end{bmatrix} x = \begin{bmatrix} b_1 \\ b_2 \\ b_3 \\ \vdots \\ b_n \end{bmatrix} \tag{7.27}$$
Now that the equations are in upper triangular form, the solution can be obtained using
back-substitution. Starting from the last row, the back-substitution method (again, see
Appendix C and compare with the full back-substitution that follows Gauss Elimination) first
provides the solution xn:
xn = bn /dn (7.28)
Also notice that all the rows, starting from the second-last row, have the same structure:
$$x_i + u_i x_{i+1} = b_i \tag{7.29}$$
Thus, the following back-substitution steps are used to obtain the solution:
$$x_i = b_i - u_i x_{i+1}, \quad i = (n-1) \text{ to } 1 \tag{7.30}$$
Thus, the overall back-substitution steps require n multiplication/division steps and (n − 1)
addition/subtraction steps. The next example shows implementation of TDMA.
function x=thomasSolve(l,d,u,b)
% Implementation of Thomas algorithm
% BACKGROUND
% ----------
% Solves the equation of the form
% Ax = b
% Where A is a tri-diagonal matrix:
% [d(1) u(1) ]
% [l(2) d(2) u(2) ]
% [ l(3) d(3) u(3) ]
% [ ... ... ... ]
% [ ... ... ... ]
% [ l(n-1) d(n-1) u(n-1)]
% [ l(n) d(n)]
% The function takes in three vectors for A:
% l (n*1) sub-diagonal vector
% d (n*1) diagonal element vector
% u (n*1) super-diagonal vector
% b (n*1) right-hand side vector
n=length(d);
% Forward elimination
for i=1:n-1
% Divide row by diagonal
u(i)=u(i)/d(i);
b(i)=b(i)/d(i);
d(i)=1;
% Get zeros in pivot column
d(i+1)=d(i+1)-u(i)*l(i+1);
b(i+1)=b(i+1)-b(i)*l(i+1);
l(i+1)=0;
end
% Back-substitution
x=zeros(n,1);
x(n)=b(n)/d(n);
for i=n-1:-1:1
x(i)=b(i)-u(i)*x(i+1);
end
Solving the above equations results in the exact same solution as that obtained in
Example 7.1, albeit with lower computational effort.
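For reference, the same forward elimination and back-substitution arithmetic can be written compactly in Python as well. This is an illustrative sketch, not a replacement for the MATLAB function above; the argument convention (four length-n vectors, with the first subdiagonal and last superdiagonal entries unused) is the same.

```python
def thomas_solve(l, d, u, b):
    """Solve a tridiagonal system given sub-, main-, and super-diagonal vectors.

    All four lists have length n; l[0] and u[n-1] are never used.
    """
    n = len(d)
    d, u, b = list(d), list(u), list(b)  # work on copies
    # Forward elimination (Equations 7.25 and 7.26)
    for i in range(n - 1):
        u[i] /= d[i]
        b[i] /= d[i]
        d[i] = 1.0
        d[i + 1] -= l[i + 1] * u[i]
        b[i + 1] -= l[i + 1] * b[i]
    # Back-substitution (Equations 7.28 and 7.30)
    x = [0.0] * n
    x[-1] = b[-1] / d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = b[i] - u[i] * x[i + 1]
    return x
```

For example, the system with diagonals l = [0, 1, 1], d = [2, 2, 2], u = [1, 1, 0] and b = [3, 4, 3] has the solution x = [1, 1, 1].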
If the far end of the rod is insulated, the Neumann boundary condition applies:
$$\left.\frac{dT}{dz}\right|_{z=L} = 0 \tag{7.31}$$
whereas if it loses heat to the surroundings, the following mixed boundary condition is
obtained:
$$-k\left.\frac{dT}{dz}\right|_{z=L} = h_\infty\left(T_L - T_a\right) \tag{7.32}$$
Two approaches can be used to implement the boundary condition: (i) the three-point
backward difference formula, which is used to discretize Equations 7.31 and 7.32, or
(ii) the ghost-point approach. We will use the latter in this example. Recall that the
discretized equation for the (n + 1)th node is
$$T_n - (2+\beta)T_{n+1} + T_{n+2} = -\beta T_\infty \tag{7.33}$$
As discussed in Chapter 4, this introduces a “ghost-point” Tn + 2 outside the solution domain.
The central difference formula applied to the boundary condition will be used to rewrite
Equation 7.33. In case of an insulated boundary, Equation 7.31 reduces to Tn+2 = Tn. On the
other hand, substituting the central difference formula for Equation 7.32 yields
$$T_{n+2} - T_n = \frac{h_\infty(2\Delta)}{-k}\left(T_{n+1} - T_a\right) \tag{7.34}$$
$$T_{n+2} = T_n - \frac{h_\infty(2\Delta)}{k}\left(T_{n+1} - T_a\right) \tag{7.35}$$
The equation above replaces the last equation in Equation 7.16. The resulting system of
linear equations is
$$\underbrace{\begin{bmatrix} 1 & 0 & 0 & \cdots & 0 \\ 1 & -(2+\beta) & 1 & & \vdots \\ 0 & 1 & -(2+\beta) & \ddots & \\ \vdots & & \ddots & \ddots & 1 \\ 0 & \cdots & 0 & 2 & -(2+\gamma) \end{bmatrix}}_{A}\underbrace{\begin{bmatrix} T_1 \\ T_2 \\ T_3 \\ \vdots \\ T_{n+1} \end{bmatrix}}_{x} = \underbrace{\begin{bmatrix} T_0 \\ -\beta T_\infty \\ -\beta T_\infty \\ \vdots \\ -\gamma T_\infty \end{bmatrix}}_{b} \tag{7.37}$$
Note that only the last row of matrix A has changed compared to Equation 7.17. Let us solve
this problem in the next example.
* A more appropriate way to solve this problem is to use the TDMA code from Example 7.2; the reader is urged to
solve it this way as an exercise (and compare with the results in Example 7.3). The TDMA code from Example 7.2 takes
in the three diagonals of the matrix A and the vector b as four vectors, l, d, u, b.
b=zeros(n+1,1);
A(1,1)=1; b(1)=T_0;
for i=2:n
A(i,i-1)=1;
A(i,i)=-(2+beta);
A(i,i+1)=1;
b(i)=-Ta*beta;
end
A(n+1,n) =2;
A(n+1,n+1)=-(2+gamma);
b(n+1)=-Ta*gamma;
T=A\b;
plot(Z,T); hold on
xlabel('distance, L (m)'); ylabel('temperature (^oC)')
%% Solving using TDMA
T1=newThomasSolve(A,b);
% <newThomasSolve.m is left as student exercise>
plot(Z,T1,'ro');
Figure 7.4 shows the variation in the temperature profile for an axial rod with a mixed
boundary condition. The end temperature is slightly higher than the 30°C that was
assumed in the previous example.
Finally, Figure 7.5 shows the structure of matrix A, investigated using the MATLAB
spy command. As is clear from the figure, the matrix A indeed has a tridiagonal
structure.
FIGURE 7.4 Temperature profile in the rod with heat transfer to the surroundings. The line shows
the solution using the general linear equation solver (A\b in MATLAB®), whereas the circles show the
solution using the TDMA solver from Example 7.2.
FIGURE 7.5 The structure of matrix A observed through the MATLAB® spy(A) command (nz = 60 nonzero elements).
[Figure: temperature profiles in the rod for thermal conductivity values of 50, 15, and 5 W/m·K.]
pivot row, and only ℓ rows have nonzero elements in the pivot column. Thus, the first
row operation is executed as follows:
$$A_{k,k+i} \leftarrow \frac{A_{k,k+i}}{A_{k,k}}, \quad i = 1 \text{ to } \ell \tag{7.38}$$
$$b_k \leftarrow \frac{b_k}{A_{k,k}}, \qquad A_{k,k} \leftarrow 1 \tag{7.39}$$
Note that Equation 7.38 is implemented for only ℓ elements and not for all elements since
the remaining elements of the row are already zero. Since only ℓ subdiagonal elements are
nonzero, the row operations for eliminating subdiagonal elements in the pivot column need
to be done for only ℓ operations:
$$R_{k+i} \leftarrow R_{k+i} - A_{k+i,k} R_k, \quad i = 1:\ell \tag{7.40}$$
At this stage, it should be noted that the only nonzero elements in the kth row are elements
Ak,k to Ak,k + ℓ. Thus, the row operations need to be performed for only these elements and
not all. Thus, the expansion of Equation 7.40 in terms of individual elemental operations is
For i = 1 : ℓ, define
$$\alpha_{k+i,k} = A_{k+i,k} \tag{7.41}$$
$$A_{k+i,j} \leftarrow A_{k+i,j} - \alpha_{k+i,k} A_{k,j}, \quad j = k \text{ to } (k+\ell) \tag{7.42}$$
$$b_{k+i} \leftarrow b_{k+i} - \alpha_{k+i,k}\, b_k \tag{7.43}$$
Note that Equation 7.41 is different from standard Gauss Elimination (Appendix C) because
we had already used Equation 7.39 to ensure Ak , k ← 1.
Writing a code for the above operations is left as an exercise for the reader. An example
of a band-diagonal system will be taken up in the Case Studies section.
Block-diagonal system: The block-diagonal systems can be treated as band-diagonal. For
the example in Figure 7.1, the bandwidth is ℓ = 2 for superdiagonals and ℓ = 1 for subdiago-
nals. The procedure described above can be used for a block-diagonal system as well.
$$0 = k\nabla^2 T + S(T) \tag{7.44}$$
$$0 = k\left(\frac{\partial^2 T}{\partial x^2} + \frac{\partial^2 T}{\partial y^2}\right) + S(T) \tag{7.45}$$
$$\sigma \nabla \phi = \Gamma \nabla^2 \phi + S(\phi)$$
where
σ is the coefficient of convection
Γ is the diffusive coefficient
As long as the Péclet number, Pe = σL/Γ (where L is the characteristic length), is not very
large, diffusion may not be neglected and the PDE is elliptic in nature. Besides heat transfer
problems, elliptic PDEs are encountered in a large number of problems: diffusion in mul-
tiple dimensions, diffusion and reaction, momentum balance (Navier-Stokes), and so on.
Due to the qualitative similarity in the behavior of elliptic PDEs, we take a similar approach
as in the previous section to solve them. The domain is discretized in both the x and y directions. The temperature at any location (i, j) is related to its neighbors through the central
difference formula, which yields the following discretized equation:
$$k\left(\frac{T_{i-1,j} - 2T_{i,j} + T_{i+1,j}}{\Delta x^2} + \frac{T_{i,j-1} - 2T_{i,j} + T_{i,j+1}}{\Delta y^2}\right) + S(T_{i,j}) = 0 \tag{7.46}$$
where Δx = L/nx and Δy = H/ny, and L and H are the length and height of the domain, respectively. Including the boundary nodes, this will lead to (nx + 1)(ny + 1) equations in as many
unknowns. The rest of the solution procedure follows on similar lines as discussed in the
previous sections.
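To make the two-dimensional discretization concrete, the sketch below (in Python, for illustration; the grid size and boundary function are arbitrary choices, not from the text) assembles the five-point stencil into a dense system and solves it by Gaussian elimination. The boundary data g(x, y) = x + y is harmonic, so the stencil reproduces it exactly at the interior nodes.

```python
def gauss_solve(A, b):
    """Naive dense Gaussian elimination with partial pivoting (for illustration only)."""
    n = len(b)
    A = [row[:] for row in A]
    b = b[:]
    for k in range(n):
        p = max(range(k, n), key=lambda r: abs(A[r][k]))
        A[k], A[p] = A[p], A[k]
        b[k], b[p] = b[p], b[k]
        for r in range(k + 1, n):
            m = A[r][k] / A[k][k]
            for c in range(k, n):
                A[r][c] -= m * A[k][c]
            b[r] -= m * b[k]
    x = [0.0] * n
    for k in range(n - 1, -1, -1):
        s = sum(A[k][c] * x[c] for c in range(k + 1, n))
        x[k] = (b[k] - s) / A[k][k]
    return x

def solve_laplace(nx, ny, g):
    """Five-point stencil for 0 = Txx + Tyy on the unit square with Dirichlet data g(x, y)."""
    hx, hy = 1.0 / nx, 1.0 / ny

    def idx(i, j):  # flatten (i, j) with i = 0..nx, j = 0..ny
        return i * (ny + 1) + j

    N = (nx + 1) * (ny + 1)
    A = [[0.0] * N for _ in range(N)]
    b = [0.0] * N
    for i in range(nx + 1):
        for j in range(ny + 1):
            k = idx(i, j)
            if i in (0, nx) or j in (0, ny):       # boundary node: T = g
                A[k][k] = 1.0
                b[k] = g(i * hx, j * hy)
            else:                                   # interior node: Equation 7.46 with S = 0
                A[k][idx(i - 1, j)] = A[k][idx(i + 1, j)] = 1.0 / hx**2
                A[k][idx(i, j - 1)] = A[k][idx(i, j + 1)] = 1.0 / hy**2
                A[k][k] = -2.0 / hx**2 - 2.0 / hy**2
    return gauss_solve(A, b)
```

A dense solve is used only for clarity; the point of this chapter is that the assembled matrix is sparse (banded), so the methods that follow exploit that structure instead.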
7.3 ITERATIVE METHODS
I now switch my attention to a completely different class of numerical methods: iterative
methods. As seen in Chapter 6, iterative methods start with an initial guess of the solution
and use a sequence of equations to iteratively improve the solution. Fixed point iteration,
Newton-Raphson, secant method, etc., from Chapter 6 are examples of iterative methods.
We now apply such iterative methods to systems of equations, both linear and nonlinear.
7.3.1 Gauss-Seidel Method
The Gauss-Seidel method (and the related Jacobi method) is the linear-equation equivalent of fixed
point iteration. In the case of fixed point iteration, the nonlinear equation was modified appropriately, and this was then used as an update equation to iteratively improve the current
solution guess:
$$x^{(i+1)} = h\left(x^{(i)}\right)$$
The Gauss-Seidel method works in a similar manner for linear systems. Consider the kth equation:
$$A_{k,k}x_k = b_k - \left(A_{k,1}x_1 + \cdots + A_{k,k-1}x_{k-1} + A_{k,k+1}x_{k+1} + \cdots + A_{k,n}x_n\right) \tag{7.47}$$
Similar to the fixed point iteration, the above equation is used as an iterative update equation for the linear system of equations. Based on how the update equation is used,
there are two possibilities: (i) the previous values x(i) are used for all the update equations in
the Jacobi method, or (ii) the most recent values are used for the update equations in the
Gauss-Seidel method.
First consider the Jacobi iteration. When Equation 7.47 is used to update $x_k$, only the
previous iteration values $x_{k+1}^{(i)}, \ldots, x_n^{(i)}$ are available for the succeeding elements, because their updated values have not yet been calculated. However, for the preceding elements, both the updated values, $x_1^{(i+1)}, \ldots, x_{k-1}^{(i+1)}$, and the
previous iteration values, $x_1^{(i)}, \ldots, x_{k-1}^{(i)}$, are available. Jacobi iteration uses the previous iteration values for all elements of x, thus resulting in the following update equation:
$$x_k^{(i+1)} = \frac{b_k - \sum_{j=1,\, j\neq k}^{n} A_{k,j}\, x_j^{(i)}}{A_{k,k}} \tag{7.48}$$
Gauss-Seidel, on the other hand, uses the most updated values. Thus, $x_1^{(i+1)}$ is computed
using $x_2^{(i)}, \ldots, x_n^{(i)}$. However, the computation of $x_2^{(i+1)}$ uses $x_1^{(i+1)}$, since that is the most updated
value of the iterated solution vector. Likewise, $x_k^{(i+1)}$ is computed using $x_1^{(i+1)}, \ldots, x_{k-1}^{(i+1)}$ (the most
recently updated values) as well as $x_k^{(i)}, \ldots, x_n^{(i)}$ (which are not yet updated). Mathematically,
$$x_k^{(i+1)} = \frac{b_k - \sum_{j=1}^{k-1} A_{k,j}\, x_j^{(i+1)} - \sum_{j=k+1}^{n} A_{k,j}\, x_j^{(i)}}{A_{k,k}} \tag{7.49}$$
In Chapter 6, I derived the error propagation equation for fixed point iteration in a single
variable. This can be extended to multivariate fixed point iteration as
$$e_k^{(i+1)} = \frac{\partial h_k}{\partial x_1}\, e_1^{(i)} + \cdots + \frac{\partial h_k}{\partial x_n}\, e_n^{(i)} \tag{7.50}$$
Extending the arguments from the single-variable case of Chapter 6, the criterion for
convergence of fixed point iteration is
$$\sum_i \left|\frac{\partial h_k}{\partial x_i}\right| < 1 \tag{7.51}$$
This leads to the following diagonal dominance criterion for the Gauss-Seidel method:
$$\sum_{i\neq k}\left|\frac{A_{k,i}}{A_{k,k}}\right| < 1 \quad \Leftrightarrow \quad \left|A_{k,k}\right| > \sum_{i\neq k}\left|A_{k,i}\right| \tag{7.52}$$
The following example, which can be worked out with hand calculations, demonstrates this
criterion for stability of the Gauss-Seidel method.
x1 + 2 x2 = 1
x1 - x2 = 4
Solution-1: The above system of equations is not diagonally dominant. The Gauss-Seidel
iterations for this system of equations are
$$x_1^{(i+1)} = 1 - 2x_2^{(i)}$$
$$x_2^{(i+1)} = x_1^{(i+1)} - 4$$
The first five iterations of the Gauss-Seidel method give the following results:

Iteration   0     1     2     3     4     5
x1          0     1     7    −5    19   −29
x2          0    −3     3    −9    15   −33
As can be concluded from the above results, the iterations are diverging. Hence, the
equations can first be rearranged to make them diagonally dominant, followed by
using the Gauss-Seidel method.
Solution-2: The two equations are switched to make the system diagonally dominant. With this change, the Gauss-Seidel iterations are given by
$$x_1^{(i+1)} = 4 + x_2^{(i)}$$
$$x_2^{(i+1)} = \frac{1}{2}\left(1 - x_1^{(i+1)}\right)$$
The first five iterations for this diagonally dominant system of equations are given by

Iteration   0      1       2       3       4        5
x1          0      4       2.5     3.25    2.875    3.0625
x2          0     −1.5    −0.75   −1.125  −0.9375  −1.03125
Clearly, the above is converging to the solution, x = [3 −1]ᵀ. This example shows the
improved convergence behavior of the Gauss-Seidel method when the same system of
equations is rearranged in a diagonally dominant form.
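The hand calculations above are easy to reproduce in code. The short Python sketch below is an illustration (not from the text); each update closure implements one arrangement of the two equations, using the most recent value of x1 when updating x2.

```python
def gauss_seidel_2x2(update, x1=0.0, x2=0.0, iters=5):
    """Apply the given Gauss-Seidel update equations for a fixed number of iterations."""
    history = [(x1, x2)]
    for _ in range(iters):
        x1, x2 = update(x1, x2)
        history.append((x1, x2))
    return history

# Original ordering (not diagonally dominant): x1 = 1 - 2*x2, then x2 = x1 - 4
diverging = gauss_seidel_2x2(lambda x1, x2: (1 - 2 * x2, (1 - 2 * x2) - 4))
# Rows switched (diagonally dominant): x1 = 4 + x2, then x2 = (1 - x1)/2
converging = gauss_seidel_2x2(lambda x1, x2: (4 + x2, (1 - (4 + x2)) / 2))
```

The first sequence reproduces the diverging iterates, ending at x = [−29, −33] after five iterations; the second approaches the solution x = [3, −1].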
Let us turn to the implications of the above observation to solving a sparse system of
equations. The tridiagonal system of Section 7.2.2 already satisfies the diagonal dominance
condition. This is true for a large number of systems of practical interest.
Since only one subdiagonal and one superdiagonal element are nonzero, the Gauss-Seidel
iteration for the kth element becomes
$$x_k^{(i+1)} = \frac{b_k - l_k\, x_{k-1}^{(i+1)} - u_k\, x_{k+1}^{(i)}}{d_k} \tag{7.53}$$
For a banded system with bandwidth ℓ, the corresponding update equations are
$$x_1^{(i+1)} = \frac{b_1 - \sum_{j=2}^{\ell+1} A_{1,j}\, x_j^{(i)}}{A_{1,1}}$$
$$x_2^{(i+1)} = \frac{b_2 - A_{2,1}\, x_1^{(i+1)} - \sum_{j=3}^{\ell+1} A_{2,j}\, x_j^{(i)}}{A_{2,2}} \tag{7.54}$$
$$\vdots$$
$$x_k^{(i+1)} = \frac{b_k - \sum_{j=k-\ell}^{k-1} A_{k,j}\, x_j^{(i+1)} - \sum_{j=k+1}^{k+\ell} A_{k,j}\, x_j^{(i)}}{A_{k,k}}$$
The next example demonstrates the use of the Gauss-Seidel method for a heat conduction problem.
The above method took 486 iterations to converge to the desired tolerance. The solution obtained using this approach is close to the temperature values obtained using
the TDMA-based approach of the previous example.
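The tridiagonal update of Equation 7.53 is easy to express in code. The sketch below is an illustration in Python (not the book's MATLAB code); like the TDMA function, it stores only the three diagonals, and it returns the iteration count alongside the solution.

```python
def gauss_seidel_tridiag(l, d, u, b, x0=None, tol=1e-8, max_iter=10000):
    """Gauss-Seidel sweeps specialized to a tridiagonal system (Equation 7.53)."""
    n = len(d)
    x = list(x0) if x0 is not None else [0.0] * n
    for it in range(1, max_iter + 1):
        err = 0.0
        for k in range(n):
            # x[k-1] already holds the (i+1) value; x[k+1] still holds the (i) value
            left = l[k] * x[k - 1] if k > 0 else 0.0
            right = u[k] * x[k + 1] if k < n - 1 else 0.0
            xk = (b[k] - left - right) / d[k]
            err = max(err, abs(xk - x[k]))
            x[k] = xk
        if err < tol:
            return x, it
    return x, max_iter
```

For the diagonally dominant tridiagonal systems of this chapter, the sweeps converge; for a small test system with diagonals [2, 2, 2] and b = [3, 4, 3], the routine should recover x = [1, 1, 1] in a handful of iterations.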
The matrix A in this example was diagonally dominant, and the criterion in Equation 7.52
is satisfied. If the system is not diagonally dominant, it is possible to "slow down" the iterations using an under-relaxation factor. Conversely, an over-relaxation factor may be used to
"speed up" the convergence. These relaxation methods are introduced next.
$$x^{(i+1)} = x^{(i)} + \omega \Delta x \tag{7.55}$$
The update equation (7.48) in Jacobi iteration can be used to define Δxk:
$$\Delta x_k^{(i)} = \frac{b_k - \sum_{j=1,\, j\neq k}^{n} A_{k,j}\, x_j^{(i)}}{A_{k,k}} - x_k^{(i)} \tag{7.56}$$
Substituting into Equation 7.55, the relaxed Jacobi update becomes
$$x_k^{(i+1)} = (1-\omega)\, x_k^{(i)} + \omega\, \frac{b_k - \sum_{j=1,\, j\neq k}^{n} A_{k,j}\, x_j^{(i)}}{A_{k,k}} \tag{7.57}$$
The analogous relaxed Gauss-Seidel update is
$$x_k^{(i+1)} = (1-\omega)\, x_k^{(i)} + \omega\left(\frac{b_k - \sum_{j=1}^{k-1} A_{k,j}\, x_j^{(i+1)} - \sum_{j=k+1}^{n} A_{k,j}\, x_j^{(i)}}{A_{k,k}}\right) \tag{7.58}$$
The term successive over-relaxation (SOR) is used for the Gauss-Seidel method with the relaxation
factor 1 < ω < 2.
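A minimal sketch of the relaxed Gauss-Seidel update of Equation 7.58 follows, written in Python for illustration; the 2 × 2 test system and the value ω = 1.2 are arbitrary choices, not from the text. With ω = 1, the routine reduces to the plain Gauss-Seidel method.

```python
def sor(A, b, omega, x0=None, tol=1e-10, max_iter=10000):
    """Relaxed Gauss-Seidel sweeps (Equation 7.58); omega = 1 gives plain Gauss-Seidel."""
    n = len(b)
    x = list(x0) if x0 is not None else [0.0] * n
    for it in range(1, max_iter + 1):
        err = 0.0
        for k in range(n):
            # x is updated in place, so this sum uses x(i+1) for j < k and x(i) for j > k
            sigma = sum(A[k][j] * x[j] for j in range(n) if j != k)
            xk = (1 - omega) * x[k] + omega * (b[k] - sigma) / A[k][k]
            err = max(err, abs(xk - x[k]))
            x[k] = xk
        if err < tol:
            return x, it
    return x, max_iter

# Illustrative diagonally dominant system with solution x = [1, 1]
x, iters = sor([[4.0, 1.0], [1.0, 3.0]], [5.0, 4.0], omega=1.2)
```

Because the same routine covers under-relaxation (ω < 1), plain Gauss-Seidel (ω = 1), and over-relaxation (1 < ω < 2), it is convenient for experimenting with how ω affects the iteration count.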
Consider a general second-order ODE-BVP of the form
$$\alpha\frac{d^2\phi}{dx^2} = f\left(x, \phi, \frac{d\phi}{dx}\right) \tag{7.59}$$
When the first-derivative term is linear, this may be written as
$$\alpha\frac{d^2\phi}{dx^2} - u\frac{d\phi}{dx} - f(x,\phi) = 0 \tag{7.60}$$
In the heat conduction problem of Example 7.1, the source term
$$f(x, T) = h_\infty a_v\left(T - T_\infty\right) \tag{7.61}$$
is linear. For a narrow range of temperature (303 K ≤ T ≤ 373 K), the thermal conductivity
was considered constant. This led to a linear ODE-BVP with linear boundary conditions,
as seen in Equation 7.11. Discretizing using the finite difference approximation resulted in a
tridiagonal set of linear equations, as described in the chapter so far.
At times, the thermal conductivity may not be a constant but is expressed as a power law or
a polynomial function of temperature (Equation 7.62). Similarly, the heat loss term itself may be
nonlinear; radiative heat loss, for example, follows
$$\text{Flux} = \varepsilon\sigma\left(T^4 - T_\infty^4\right) \tag{7.63}$$
where
ε is the emissivity factor
σ = 5.67 × 10⁻⁸ W/m²·K⁴ is Stefan's constant
In fact, nonlinear source terms are common in the majority of chemical engineering examples, due to nonlinear heat loss by radiation, a nonlinear reaction or heat release term, a bilinear
(or nonlinear) interaction/adsorption term in two-phase systems, etc.
When the nonlinear differential equation(s) described here are discretized, we obtain a
set of nonlinear algebraic equations. These equations have a tridiagonal (sparse) structure,
which can be exploited for efficient solving of the nonlinear equations. Modifications to the
TDMA, Gauss-Seidel, or Jacobi methods and fsolve options are discussed in this section.
The nonlinearities are of two types: (i) variable diffusive/convective coefficients and (ii)
source term nonlinearity. The latter will be discussed at length in this section. The methods
discussed are rather general and are often applicable when coefficients are varying as well.
However, an example of variable coefficient will be left as an exercise for the reader.
$$k\frac{d^2T}{dz^2} = \varepsilon\sigma a_v\left(T^4 - T_\infty^4\right) \tag{7.64}$$
$$T(0) = T_0 \tag{7.65}$$
$$T(L) = T_\infty \tag{7.66}$$
Discretizing with the central difference formula, the equation for the interior nodes becomes
$$T_{k-1} - 2T_k + T_{k+1} = \underbrace{\frac{\varepsilon\sigma\Delta^2 a_v}{k}\left(T_k^4 - T_\infty^4\right)}_{\beta(T_k)}, \quad k = 2 \text{ to } n \tag{7.67}$$
Unlike the previous example, β(T) is a nonlinear function of T. The discretized boundary
conditions are
$$T_1 = T_0, \qquad T_{n+1} = T_\infty \tag{7.68}$$
In matrix form, the system of equations is
$$\underbrace{\begin{bmatrix} 1 & 0 & 0 & \cdots & 0 \\ 1 & -2 & 1 & & \vdots \\ 0 & 1 & -2 & \ddots & \\ \vdots & & \ddots & \ddots & 1 \\ 0 & \cdots & 0 & 0 & 1 \end{bmatrix}}_{A}\underbrace{\begin{bmatrix} T_1 \\ T_2 \\ T_3 \\ \vdots \\ T_{n+1} \end{bmatrix}}_{x} = \underbrace{\begin{bmatrix} T_0 \\ \beta(T_2) \\ \beta(T_3) \\ \vdots \\ T_\infty \end{bmatrix}}_{b(x)} \tag{7.69}$$
In the next couple of sections, we will discuss numerical techniques to solve this problem.
Recall the method of successive substitution (fixed point iteration):
$$x^{(i+1)} = G\left(x^{(i)}\right) \tag{7.70}$$
The above is used as an update equation. In the same manner, it is easy to rewrite
Equation 7.69 as
$$x^{(i+1)} = A^{-1}\, b\left(x^{(i)}\right) \tag{7.71}$$
Instead of inverting the matrix A, a TDMA solver can be used to solve the problem (7.69)
with a fixed value of b(i) to obtain x(i + 1); this value is then substituted to obtain b(i + 1) = b(x(i + 1)), and
the process is iterated until convergence, that is, ‖x(i + 1) − x(i)‖ < Etol.
It is also possible to combine the above approach with under-relaxation. The above
TDMA-based successive substitution is a rather naïve way to solve a nonlinear system of
equations, and it may not work for examples with a highly nonlinear source term. In such
cases, TDMA with under-relaxation may be used, where the update equation becomes
$$x^{(i+1)} = (1-\omega)\, x^{(i)} + \omega A^{-1}\, b\left(x^{(i)}\right) \tag{7.72}$$
In the above, the same relaxation factor ω was used for all variables xk. However, it is possible to choose different values of ω for different elements of x:
$$x^{(i+1)} = (I - W)\, x^{(i)} + W A^{-1}\, b\left(x^{(i)}\right), \qquad W = \mathrm{diag}\left(\begin{bmatrix}\omega_1 & \omega_2 & \cdots & \omega_{n+1}\end{bmatrix}\right) \tag{7.73}$$
Unlike the standard fixed point iteration, the above methods do not have a general criterion that guarantees convergence. So, this is a rather heuristic combination of linear equation solving with the method of successive substitution. The next example demonstrates the use
of this approach for heat conduction with radiative heat loss.
and vector b(x(i)) is computed at the current guess of temperature. The Thomas solver
explained earlier is used to obtain the next guess, and the process is repeated iteratively.
The above code took 30 iterations to converge to a solution, starting with initial guess
of T = T∞, for ε = 0.1. While the temperature could be expressed either in °C or K in
the convective heat loss case, we have to use K in this example. The temperature profile is plotted as a solid line in Figure 7.7. The temperature profiles with heat loss via
convection (solved using the code from Example 7.2) and with no heat loss are also plotted as a dotted line and a solid straight line for comparison.

FIGURE 7.7 Heat conduction in a metal rod with heat loss to the surroundings through radiation
for two different emissivity values (ε = 0.1: solid; ε = 0.9: dashed). The temperature profile when heat
loss is only by convection (h∞ = 2) is plotted as a dotted line for comparison. Also plotted is the case with no
heat loss (straight line).
The above code does not converge for the more realistic emissivity value of ε = 0.9.
The temperature diverges, and we get NaN as the resulting temperature profile. This is
because the fixed point iterations do not converge owing to the nonlinear radiation
term. Consequently, the under-relaxation-based approach, shown in Equation 7.72, was
used with the under-relaxation factor ω = 0.25.
Modification of the code above for implementing this under-relaxation is left to the
reader (see Problem 7.3).
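The under-relaxed TDMA iteration of Equation 7.72 can be sketched compactly. The Python code below is an illustration, not the book's radiative-rod code: the toy nonlinear problem y″ = 0.1y² with y = 1 at both boundaries is made up for the demonstration, and the thomas helper repeats the algorithm of Example 7.2.

```python
def thomas(l, d, u, b):
    """Thomas algorithm (as in Example 7.2), operating on copies of the inputs."""
    n = len(d)
    d, u, b = list(d), list(u), list(b)
    for i in range(n - 1):
        u[i] /= d[i]
        b[i] /= d[i]
        d[i] = 1.0
        d[i + 1] -= l[i + 1] * u[i]
        b[i + 1] -= l[i + 1] * b[i]
    x = [0.0] * n
    x[-1] = b[-1] / d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = b[i] - u[i] * x[i + 1]
    return x

def successive_substitution(rhs, l, d, u, x0, omega=1.0, tol=1e-10, max_iter=500):
    """x(i+1) = (1 - w) x(i) + w A^-1 b(x(i))  (Equation 7.72), with A solved by TDMA."""
    x = list(x0)
    for it in range(1, max_iter + 1):
        x_lin = thomas(l, d, u, rhs(x))
        x_new = [(1 - omega) * xo + omega * xn for xo, xn in zip(x, x_lin)]
        if max(abs(xn - xo) for xn, xo in zip(x_new, x)) < tol:
            return x_new, it
        x = x_new
    return x, max_iter

# Toy problem (made up for the demonstration, not the radiative rod):
# y'' = 0.1 y^2 on three interior nodes, with y = 1 at both boundaries.
def rhs(y):
    return [0.1 * y[0] ** 2 - 1.0, 0.1 * y[1] ** 2, 0.1 * y[2] ** 2 - 1.0]

y, iters = successive_substitution(rhs, [0, 1, 1], [-2, -2, -2], [1, 1, 0],
                                   [1.0, 1.0, 1.0], omega=0.8)
```

For this mildly nonlinear problem the under-relaxed iteration converges well within the iteration limit; the solution sags below the boundary value of 1, as the convex source term requires.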
The above approach combined linear equation solving with successive substitution in a
heuristic algorithm. Alternatively, a regular fixed point iteration may be used, treating each
equation in Equation 7.69 as a regular nonlinear equation. Thus, the update expression
$$x_k^{(i+1)} = G_k\left(x^{(i)}\right) \tag{7.74}$$
becomes
$$x_k^{(i+1)} = \frac{b\left(x_k^{(i)}\right) - A_{k,k-1}\, x_{k-1}^{(i+1)} - A_{k,k+1}\, x_{k+1}^{(i)}}{A_{k,k}} \tag{7.75}$$
With under-relaxation, the update equation is
$$x_k^{(i+1)} = (1-\omega_k)\, x_k^{(i)} + \omega_k\, \frac{b\left(x_k^{(i)}\right) - A_{k,k-1}\, x_{k-1}^{(i+1)} - A_{k,k+1}\, x_{k+1}^{(i)}}{A_{k,k}} \tag{7.76}$$
* In the same manner, Jacobi iteration or Jacobi under-relaxation may also be used.
In either case, the iterations are repeated until convergence, that is, ‖x(i + 1) − x(i)‖ < Etol. This
is the fixed point iteration approach (either with or without under-relaxation), modified for the tridiagonal problem represented in Equation 7.69.
Previously, the diagonal dominance of matrix A guaranteed convergence for a linear
tridiagonal system. However, due to the nonlinear source term, the Gauss-Seidel iterations
for Equation 7.69 do not have convergence guarantees. If the Gauss-Seidel method diverges,
the successive under-relaxation approach may be used to improve the convergence behavior of the system. The nonlinear heat conduction with radiative heat loss problem is solved
using Gauss-Seidel (with under-relaxation) in the next example.
Example 7.7 Heat Conduction with Radiative Heat Loss Using Gauss-Seidel
Problem: Solve the problem from Example 7.6 using the Gauss-Seidel method.
Solution: The code for the Gauss-Seidel iterations for this problem is given below. Only the iteration loop is shown; the variables A, b, T, n, omega, tol, TSTORE, and the iteration limit maxIter are assumed to be initialized as in Example 7.6:
for iter=1:maxIter
Told=T;
for k=1:n+1
if (k==1)
dT=(b(1)-A(1,2)*T(2))/A(1,1);
elseif (k==n+1)
dT=(b(n+1)-A(n+1,n)*T(n))/A(n+1,n+1);
else
dT=( b(k)-A(k,k+1)*T(k+1)-...
A(k,k-1)*T(k-1) ) / A(k,k);
end
T(k)=(1-omega)*T(k) + omega*dT;
end
TSTORE(:,iter)=T;
err=max(abs(T-Told));
if (err<tol)
break
end
end
The above code gives the exact same results as those shown in Figure 7.7. It took 394
Gauss-Seidel iterations for convergence when ε = 0.9 and 1059 iterations when ε = 0.1.
These results are for ω = 1 (i.e., without under-relaxation).
When the above code was modified for Jacobi iteration, the iterations did not converge for ε = 0.9, whereas it took as many as 8089 iterations for convergence for ε = 0.1.
In order to improve the convergence behavior, Jacobi under-relaxation was used. With an
under-relaxation factor of ω = 0.9, the method converged in 796 iterations for ε = 0.9.
The modification of the above code for Jacobi iterations is left as an exercise to the
reader.
Consider a fixed point problem in a single variable:
$$x = G(x) \tag{7.77}$$
Linearizing G about the current iterate $x^{(i)}$ using a Taylor series expansion,
$$G(x) \approx G\left(x^{(i)}\right) + G'\left(x^{(i)}\right)\left[x - x^{(i)}\right] = \underbrace{G\left(x^{(i)}\right) - x^{(i)}G'\left(x^{(i)}\right)}_{G_0} + \underbrace{G'\left(x^{(i)}\right)}_{G_1}\, x \tag{7.78}$$
Setting x = G0 + G1x and solving for x gives the update
$$x^{(i+1)} = \frac{G_0}{1 - G_1} \tag{7.79}$$
where $G_0 = G\left(x^{(i)}\right) - x^{(i)}G'\left(x^{(i)}\right)$ and $G_1 = G'\left(x^{(i)}\right)$.
Using the equivalence between this and the Newton-Raphson method from the previous
chapter, it can be shown that the linearization-based method has a quadratic rate of convergence for problems in a single variable. This is left as an exercise to the reader.
The same concept can be extended to the multivariable case. The equation with a nonlinear source term (such as that in Equation 7.64) can be written in the general form of
Equation 7.19 as
$$l_k x_{k-1} + d_k x_k + u_k x_{k+1} = b(x_k) \tag{7.80}$$
Using a single-variable Taylor series expansion to linearize the nonlinear source term
around $x_k^{(i)}$ yields
$$b(x_k) \approx b\left(x_k^{(i)}\right) + b'\left(x_k^{(i)}\right)\left[x_k - x_k^{(i)}\right] = \underbrace{b\left(x_k^{(i)}\right) - x_k^{(i)}\, b'\left(x_k^{(i)}\right)}_{b_{0,k}} + \underbrace{b'\left(x_k^{(i)}\right)}_{b_{1,k}}\, x_k \tag{7.81}$$
where
$$b'\left(x_k^{(i)}\right) = \left.\frac{\partial b}{\partial x_k}\right|_{x = x^{(i)}} \tag{7.82}$$
Substituting Equation 7.81 into Equation 7.80 and moving the linear term $b_{1,k}x_k$ to the left-hand side gives
$$l_k x_{k-1} + \left(d_k - b_{1,k}\right)x_k + u_k x_{k+1} = b_{0,k} \tag{7.83}$$
where $b_{0,k} = b\left(x_k^{(i)}\right) - x_k^{(i)}\, b'\left(x_k^{(i)}\right)$ and $b_{1,k} = b'\left(x_k^{(i)}\right)$.
Thus, source term linearization often results in improved convergence properties by ensur-
ing the tridiagonal system is diagonally dominant. For small- to medium-size problems,
this approach may not provide significant benefits over the regular Gauss-Siedel approach.
However, this approach is quite popular in large-scale commercial solvers, as it tends to
make the iterative solver more robust without slowing down the rate of convergence.
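In the single-variable case, the update of Equation 7.79 is exactly a Newton-Raphson step on g(x) = x − G(x), which is where the quadratic rate of convergence comes from. A short Python sketch (the test problem x = cos x is an arbitrary illustration, not from the text):

```python
import math

def linearized_fixed_point(G, dG, x, tol=1e-12, max_iter=50):
    """Iterate x(i+1) = G0/(1 - G1), with G0 and G1 as defined in Equation 7.79."""
    for it in range(1, max_iter + 1):
        G0 = G(x) - x * dG(x)   # G0 = G(x_i) - x_i G'(x_i)
        G1 = dG(x)              # G1 = G'(x_i)
        x_new = G0 / (1.0 - G1)
        if abs(x_new - x) < tol:
            return x_new, it
        x = x_new
    return x, max_iter

# Illustrative problem: x = cos(x). The plain fixed point iteration converges
# only linearly; the linearized update converges in a few iterations.
root, iters = linearized_fixed_point(math.cos, lambda x: -math.sin(x), 1.0)
```

Expanding G0/(1 − G1) shows that x(i+1) = x(i) − (x(i) − G(x(i)))/(1 − G′(x(i))), which is Newton's method applied to g(x) = x − G(x).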
>> opt=optimoptions('fsolve','JacobPattern',J);
7.5 EXAMPLES
7.5.1 Heat Conduction with Convective or Radiative Losses
The problem of heat conduction in a rod with convective or radiative heat losses was dis-
cussed earlier in this book. Recall that the problem solved in Example 7.1 was that of heat
conduction in a rod, modeled as the following ODE-BVP:
$$k\frac{d^2T}{dz^2} = h_\infty a_v\left(T - T_\infty\right) \tag{7.9}$$
T ( 0 ) = T0 , T ( L ) = T¥ (7.10)
where
T is the temperature at any location in °C
T∞ is the temperature of the ambient
k is the thermal conductivity of the rod
h∞ is the heat loss coefficient
av is the external area of the rod per unit volume
This problem was modified in Section 7.2.3 by replacing the Dirichlet boundary condition
with either a Neumann or a mixed boundary condition. If the far end of the rod is insulated,
a Neumann boundary condition is used:
$$\left.\frac{dT}{dz}\right|_{z=L} = 0 \tag{7.31}$$
On the other hand, if it loses heat to the surroundings, a mixed boundary condition is used
instead:
$$-k\left.\frac{dT}{dz}\right|_{z=L} = h_\infty\left(T_L - T_a\right) \tag{7.32}$$
Example 7.3 solved the problem (Equation 7.9) with a mixed boundary condition at one
end of the rod.
Both these problems were linear; Section 7.4 extended this to a nonlinear system.
Specifically, the problem of heat conduction with radiative heat losses was solved in
Example 7.6. The ODE-BVP model for this system, along with the relevant Dirichlet bound-
ary conditions, is given by
$$k\frac{d^2T}{dz^2} = \varepsilon\sigma a_v\left(T^4 - T_\infty^4\right) \tag{7.64}$$
$$T(0) = T_0 \tag{7.65}$$
$$T(L) = T_\infty \tag{7.66}$$
The interested reader is referred to these sections for details on these three simulation case
studies on the 1D heat conduction problem.
$$0 = \frac{D_e}{r^2}\frac{d}{dr}\left(r^2\frac{dC_A}{dr}\right) - r_A \tag{7.84}$$
where
CA is the concentration of the reactant in the pellet
De is the effective diffusivity
rA is the rate of reaction
$$D_e\frac{d^2C_A}{dr^2} + \frac{2D_e}{r}\frac{dC_A}{dr} - r_A = 0 \tag{7.85}$$
We will consider two different cases for solving and analyzing the system: (i) first-order
reaction and (ii) Langmuir-Hinshelwood (LH) rate kinetics.
$$\frac{d^2\psi}{d\xi^2} + \frac{2}{\xi}\frac{d\psi}{d\xi} - \phi^2\psi = 0 \tag{7.86}$$
where
φ is the Thiele modulus, with φ² = kR²/De
ψ = CA/CA0 is the dimensionless concentration
ξ = r/R is the dimensionless radius
The boundary conditions are the no-flux condition at the center, $\left[\frac{dC_A}{dr}\right]_{r=0} = 0$, and the concentration at the outer surface of the pellet, $[C_A]_{r=R} = C_{A0}$. The interested reader may refer to
any reaction engineering text (see Fogler, 2008) to verify that the concentration at any point
in the pellet is given by
$$\psi = \frac{1}{\xi}\frac{\sinh(\phi\xi)}{\sinh(\phi)} \tag{7.87}$$
We will write a MATLAB code to simulate mass transfer and reaction in the spherical pellet
using TDMA and compare the results with the above solution. As Equations 7.85 and 7.86
are equivalent, either of them can be simulated to obtain the same results. We will simulate
the dimensionless model in this example, and the original model (Equation 7.84) will be
simulated in the next section.
$$\frac{\psi_{i+1} - 2\psi_i + \psi_{i-1}}{h^2} + \frac{\psi_{i+1} - \psi_{i-1}}{\xi_i h} - \phi^2\psi_i = 0 \tag{7.88}$$
where h = 1/n, since we are working with dimensionless quantities, and ξi = (i − 1)/n.
Collecting appropriate terms together, the tridiagonal equation for the internal
nodes is
$$\underbrace{\left(\frac{1}{h^2} - \frac{1}{\xi_i h}\right)}_{\beta_1}\psi_{i-1} + \underbrace{\left(-\frac{2}{h^2} - \phi^2\right)}_{\beta_2}\psi_i + \underbrace{\left(\frac{1}{h^2} + \frac{1}{\xi_i h}\right)}_{\beta_3}\psi_{i+1} = 0 \tag{7.89}$$
At the first node, the ghost-point form of the no-flux boundary condition yields ψ0 = ψ2. Thus, β1 is
not needed, and the coefficient of ψ2 becomes β3 = 2/h².
The following code solves the ODE-BVP:
k=6.25; % rate constant (1/s)
C_A0=5e-5; % surface concentration (mol/cm^3)
diffCoeff=0.25; R=0.2; % assumed De (cm^2/s) and radius (cm), giving phi=1
phi=sqrt(k/diffCoeff)*R;
%% Discretization and Precomputing
n=20;
h=1/n; % Dimensionless
xi=0:h:1; % Dimensionless
The results are shown in Figure 7.8. Solid lines represent the numerical solution, and
dots represent the analytical result as per Equation 7.87. The effect of varying the
Thiele modulus is seen in the figure. Since the Thiele modulus is the ratio of reac-
tion rate to diffusion rate, large values of the modulus result in diffusion limitation.
Conversely, when ϕ ≲ 1, diffusion is fast and there is no diffusion limitation in the
system.
FIGURE 7.8 Simulation of reaction and diffusion in a spherical pellet for various values of the
Thiele modulus (φ = 1, 2, 4, and 10). Lines show numerical results, and symbols show the analytical solution.
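As a cross-check, the coefficients of Equation 7.89 can be assembled, solved with the Thomas algorithm, and compared against the analytical profile of Equation 7.87. The Python sketch below is an illustration, not the book's MATLAB code; the thomas helper repeats the algorithm of Example 7.2, and the center node uses the β3 = 2/h² treatment described above.

```python
import math

def thomas(l, d, u, b):
    """Thomas algorithm (as in Example 7.2), operating on copies of the inputs."""
    n = len(d)
    d, u, b = list(d), list(u), list(b)
    for i in range(n - 1):
        u[i] /= d[i]
        b[i] /= d[i]
        d[i] = 1.0
        d[i + 1] -= l[i + 1] * u[i]
        b[i + 1] -= l[i + 1] * b[i]
    x = [0.0] * n
    x[-1] = b[-1] / d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = b[i] - u[i] * x[i + 1]
    return x

def pellet_profile(phi, n=20):
    """Finite-difference solution of Equation 7.86 using the coefficients of Equation 7.89."""
    h = 1.0 / n
    l = [0.0] * (n + 1)
    d = [0.0] * (n + 1)
    u = [0.0] * (n + 1)
    b = [0.0] * (n + 1)
    # Center node: ghost-point (no-flux) condition, beta3 = 2/h^2
    d[0] = -2.0 / h ** 2 - phi ** 2
    u[0] = 2.0 / h ** 2
    # Interior nodes (Equation 7.89)
    for i in range(1, n):
        xi = i * h
        l[i] = 1.0 / h ** 2 - 1.0 / (xi * h)
        d[i] = -2.0 / h ** 2 - phi ** 2
        u[i] = 1.0 / h ** 2 + 1.0 / (xi * h)
    # Surface node: psi = 1
    d[n] = 1.0
    b[n] = 1.0
    return thomas(l, d, u, b)

def pellet_analytical(phi, xi):
    """Equation 7.87, with the xi -> 0 limit handled explicitly."""
    if xi == 0:
        return phi / math.sinh(phi)
    return math.sinh(phi * xi) / (xi * math.sinh(phi))
```

For φ = 1 and n = 20, the numerical profile should agree with Equation 7.87 to within a few percent; refining the grid improves the match.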
$$r = \frac{kC_A}{1 + K_r C_A^2} \tag{3.92}$$
This problem will be solved using the dimensional Equation 7.85 in the CGS system.
The concentration at the surface of the pellet is CA0 = 5 × 10⁻⁵ mol/cm³, the reaction rate
constant is k = 100 s⁻¹, and the constant Kr = 10⁹ cm⁶/mol². We will solve this using fsolve.
The equations after discretization are given by
De D
2 ( i +1
C - 2Ci + Ci -1 ) + e ( Ci +1 - Ci -1 ) - r ( Ci ) = 0, for i = 2 to n
h ri h
De
( 2Ci +1 - 2Ci ) - r (Ci ) = 0, for i = 1
h2
Cn+1 - C A0 = 0 (7.90)
An important aspect of writing codes is to benchmark them against known solutions.
Since the ODE-BVP with LH kinetics is complex, it is difficult to benchmark directly. So, we
will take an indirect approach here. We will write the code for the original problem and
execute it with Kr = 0. With this change, the LH kinetics become equivalent to the first-order
reaction seen in the previous example. After comparing these results with those from the
previous numerical code, we will analyze the effect of various parameters. The next example
demonstrates this approach.
function gval=thieleLHfun(Y,par)
%% Getting parameters
n=par.n;
h=par.h;
k=par.k;
Kr=par.Kr;
Da=par.diffCoeff;
The driver script will be written in a similar manner as before, using fsolve to obtain
the solution as the concentration profile.
% Simulation of porous catalyst pellet
% with Langmuir-Hinshelwood kinetic model
%% Model parameters
R=0.2;
n=100;
h=R/n;
catParam.R=R;
catParam.n=n;
catParam.h=h;
catParam.diffCoeff=0.25;
catParam.k=100;
catParam.Kr=1e10;
catParam.C0=5e-5;
%% Initializing and Solving
Yguess=catParam.C0*ones(n+1,1);
opt=optimset('tolx',1e-12,'tolfun',1e-12);
Ysol=fsolve(@(Y) thieleLHfun(Y,catParam), Yguess, opt);
%% Plotting the results
xi=0:1/n:1;
psi=Ysol/catParam.C0;
plot(xi,psi,'-.r');
xlabel('r/R (-)'); ylabel('C_A/C_{A0} (-)');
Figure 7.9 shows the results of benchmarking the nonlinear code, obtained by changing the
catParam.Kr line of the code to vary Kr. The results with Kr = 0 are compared with the first-order reaction case (with ϕ = 4). The numerical results from the nonlinear solver
match the analytical solution very well.
310 ◾ Computational Techniques for Process Simulation and Analysis Using MATLAB®
FIGURE 7.9 Benchmarking of code with LH kinetics with Kr = 0 (solid line) by comparing it with
the analytical solution for ϕ = 4 (symbols). The effect of varying Kr (curves for Kr = 1e9 and
Kr = 1e10) is also shown.
Figure 7.9 also presents the effect of varying Kr on the concentration profile in the pellet.
Note that the baseline case considered in Chapter 3 was Kr = 1 m⁶/mol², which equals
Kr = 10¹² cm⁶/mol². At this value of Kr, the diffusion is very fast, and nearly the entire pellet
concentration is the same as the surface concentration. This is because of the small pel-
let size. For larger pellets (or smaller pores, which reduces De ), mass transfer effects gain
importance. However, with highly porous catalyst particles of size ~1 mm, internal diffu-
sion may usually be assumed to be fast.
Finally, Figure 7.10 shows the effect of concentration at the pellet surface on concen-
tration profiles inside the pellet. Four cases are considered, with CA0 calculated using the
ideal gas law at 300 K and partial pressure, pA, in the gas phase being 1, 2, 5, and 10 bar.
[Figure 7.10: Concentration profiles C_A/C_{A0} versus r/R in the pellet for surface partial
pressures p_A0 = 1, 2, 5, and 10 bar.]
As the concentration increases, the inhibition term (denominator in the kinetic rate
expression) increases and the pellet increasingly becomes reaction-limited.
The above example brings us to the end of this section. Irrespective of the coordinate
system, the key approach in solving ODE-BVPs, elliptic PDEs, and other similar problems
remains the same: using appropriate numerical differentiation to convert the differential
equations into a set of algebraic equations and solving these equations either directly or by
exploiting their sparse structure.
7.6 EPILOGUE
This chapter dealt with numerical techniques for solving linear or nonlinear equations with
sparse structure. This implies that the matrix A in the linear equation
Ax = b
or the Jacobian, in case of the nonlinear function, J = ∇g(x) has a banded structure, as
described in Figure 7.1. A banded system of equations most commonly arises in process
simulation due to the discretization of BVPs.
In the case of a linear system of equations, we focused on two "special" methods to effi-
ciently solve the linear equations: (i) the TDMA and (ii) the Gauss-Seidel method. From a
numerical techniques perspective, we discussed how TDMA (or a banded solver) is more
efficient than a general-purpose Gauss elimination-based solver. Gauss-Seidel is intro-
duced as one of the key iterative methods for linear systems.
From a nonlinear systems perspective, a method of successive substitution was intro-
duced, which iteratively employs a TDMA algorithm to solve a tridiagonal set of model
equations. This method was heuristically motivated and is straightforward to use; however,
it may not provide desirable convergence behavior for several nonlinear problems.
The Gauss-Seidel method, on the other hand, was introduced for both linear and non-
linear equations. This iterative method is a general-purpose method, but it is also easy to
exploit the banded structure of matrix A to reduce the computational requirements of Gauss-
Seidel. This method benefits from the fact that the band-diagonal matrix obtained after
finite difference discretization is often diagonally dominant. Convergence properties can
further be improved using the under-relaxation approach or source term linearization.
Finally, we introduced the option 'jacobPattern' in MATLAB’s fsolve solver.
Based on the discussion in this chapter and experience, I make the following recommen-
dations: (i) TDMA or banded solver for linear systems, (ii) fsolve as a method of choice
for nonlinear banded systems, and (iii) Gauss-Seidel with appropriate under-relaxation if
the problem does not give the desired results with fsolve.
EXERCISES
Problem 7.1 Consider a nonlinear equation x = G(x), which is to be linearized and solved
using fixed-point iteration. The linearized equation is given by

x = G_0^(i) + G_1^(i) x

where

G_0^(i) = G(x^(i)) − x^(i) G′(x^(i)),   G_1^(i) = G′(x^(i))

Show that solving the linearized model at each iteration results in the fixed-point update

x^(i+1) = G_0^(i) / (1 − G_1^(i))
Implicit Methods: Differential and Differential-Algebraic Systems
8.1 GENERAL SETUP
The underlying theme of this chapter is implicit methods for solving differential equations.
Examples of systems covered in this chapter include ordinary differential equations (ODEs)
that are stiff in nature, partial differential equations (PDEs) that require implicit time step-
ping, and systems governed by differential algebraic equations (DAEs, such as differential
equations with algebraic constraints). However, DAEs will remain a major focus of this
chapter.
This is an advanced chapter that builds upon the analyses of ODEs in Chapter 3, PDEs
in Chapter 4, linear system dynamics in Chapter 5, and algebraic equations in Chapter 6.
The final case study of Chapter 3 presented an example where ode45 is unable to con-
verge for default ODE solver options. This is an example of a stiff system of ODE-IVP. The
Crank-Nicolson method was introduced in Chapter 4 for solving hyperbolic PDEs, but we
deferred it until this chapter. Finally, DAEs, which consist of both differential and algebraic
equations, are a class of problems of interest to chemical engineers. Traditionally, these
three topics are covered separately in textbooks. Since a unifying theme in this broad spec-
trum of topics is the use of implicit methods, I present a unified coverage in this chapter.
The remainder of this section will provide more details on each of the three topics. Since
ODE-IVP and PDEs were covered at length in Chapters 3 and 4, respectively, most of this
chapter will be devoted to DAEs.
response for the latter value. I mentioned in the example that this unstable behavior of
ode45 was because the set of ODEs becomes stiff.
This issue and its implications for numerical methods are analyzed presently.
A linear approximation of this nonlinear chemostat model around its steady state was
obtained in Chapter 5. The linearized model is given by
dy/dt = Jy   (8.1)
where y represents deviation variables, that is, y = [S − Sss  X − Xss]^T, where Sss and Xss
are the steady state exit concentrations of the substrate and biomass, respectively. For the
stiff ODE case (i.e., Ks = 0.005), the steady state concentrations are Sss = 0.002 mol/L and
Xss = 3.749 mol/L. Under these conditions, the Jacobian is
J = [ −201.7   −0.133
       151.2     0    ]   (8.2)
The eigenvalues of the Jacobian are −201.6 and −0.1, and the ratio of the two eigenvalues is
~2000. The system is stable because the two eigenvalues are negative.
Recall that the transient solution of a system y′ = λy is y(t) = y0eλt. We discussed in Chapter
5 that, for a linear ODE, each eigenvalue governs the dynamic response of the system along
the corresponding eigenvector. The larger the eigenvalue, the faster is the response. Thus, the
eigenvector corresponding to λ1 = −201.6 is the fast eigenmode. The time required to attain
steady state is ~50 min, since it is determined by the slow eigenvalue, λ2 = −0.1. Compared
to the overall process dynamics, the dynamics along the first eigenvector is almost instanta-
neous. The phase-plane plot for this system is shown in Figure 8.1. Note that the dynamics
along the direction v1 ≡ [−0.8; 0.6] stabilize rapidly, and the system then moves along the
slow eigenvector v2 ≡ [0.0007; 1].
FIGURE 8.1 Phase-plane plot for a moderately stiff system obtained by linearizing the chemostat
bioreactor model at its steady state value.
The upper limit on the step-size for an explicit ODE-IVP method is determined by the
largest eigenvalue. For example, the largest step-size for Euler’s explicit method is given by
h < 2/λ1. Thus, the step-size, h, needs to be less than 0.01, for Euler’s explicit method to be
stable. However, the time to reach steady state is ~50 min, since it is governed by the slower
λ2. Therefore, at least 5000 steps will be required by an explicit solver to reach the final solu-
tion. In fact, for the above example, ode45 required ~12,000 steps.
An implicit solver, ode15s, on the other hand, required only 93 steps. I will use Figure 8.1
to qualitatively explain the “mechanism” behind this observation. In the first 0.05 s, the sys-
tem responds along the first eigenvector (dashed line marked as “fast”), and the solver reaches
the slow eigenvector (dashed line marked as “slow”). The solver takes small steps during this
phase for accuracy. Now the response along eigenvector v1 has settled, and the system now
moves along the slow eigenvector v2. It is now possible to take larger steps, and still maintain
the desired accuracy. This is exactly what ode15s does. In contrast, ode45 takes a very
large number of steps because there is a stringent upper limit on step-size for stability.
The analysis of the original nonlinear chemostat from Section 3.5.4 follows the same
arguments as above, but is complicated by the nonlinear reaction term. After the evolution
of the system along the fast eigenvector, the solver attempts to take larger steps. Unlike the
linear case, ode45 does not account for the rapidly changing nonlinear reaction term and
fails to converge. Thus, the presence of fast and slow dynamics simultaneously made the
implicit ode15s the preferred solver for stiff ODEs.
Stiffness can arise even in a single-variable problem. Consider the ODE

dy/dt = −1000y + 999e^(−t),   y(0) = 2   (8.3)

whose solution is y(t) = e^(−1000t) + e^(−t).
Solving the above ODE using ode45 takes 12,000 steps, whereas ode15s takes only 67
steps. The cause is the same as in the multivariable case: λ = −1000 dictates the step-
size, while the e^(−t) term determines the time required for the dynamics to settle to a steady state.
The original chemostat problem of Section 3.5.4 showed how a stiff system of ODEs can
make an explicit solver unstable. Several examples of practical interest may be significantly
stiffer than the chemostat. I used this section to define stiffness and to demonstrate it from
a practical perspective. As in Chapter 3, we will devote Section 8.2 to implicit methods for
solving ODEs.
Consider a hyperbolic PDE of the form

∂ϕ/∂t + u ∂ϕ/∂z = S(ϕ)   (8.4)
Finite difference approximation is used in both spatial and temporal directions. The
spatial domain is split into n intervals of size Δz = L/n; a step-size of
Δt is used in time, and ϕ_{p,i} represents the solution at time p and location i. The solution is
known at all locations at time p, and this information is used to obtain the solution at
next time, p + 1.
The Crank-Nicolson method was derived in Chapter 4. A central difference formula is
used for the spatial derivative and an implicit formula similar to the trapezoidal rule is used
in time. This results in the following implicit formula for the Crank-Nicolson method:
dy/dt = f(t, y, x)   (8.6)

0 = g(y, x)   (8.7)
The variables, y, that appear in differential form in Equation 8.6 are differential variables,
whereas the remaining variables, x, are algebraic.
DAEs of the form of Equations 8.6 and 8.7 form one class of DAEs, known as semiexplicit
DAEs. A general form for representing DAEs is
F(t, Y, Y′) = 0   (8.8)
While several numerical methods are available to solve the above general form of DAE, we
will restrict ourselves to semiexplicit DAEs in this chapter.
Typically, fully implicit methods are used to solve DAEs. Some of the popular, so-called
linear multi-step methods will be discussed in Section 8.2. Analysis and classification of
DAEs and their implementation in MATLAB will be discussed in Section 8.4.
y_{k+1} = [a_1 y_k + a_2 y_{k−1} + ⋯ + a_n y_{k−n+1}] + h[b_0 f(t_{k+1}, y_{k+1}) + b_1 f(t_k, y_k) + ⋯ + b_n f(t_{k−n+1}, y_{k−n+1})]   (8.9)
The above equation leads to an explicit method if b0 = 0; if b0 is nonzero, then the method
is implicit. The most popular multistep methods are the following: (i) the implicit Adams-Moulton methods, (ii) the explicit Adams-Bashforth methods, and (iii) the backward difference formulae (BDF) and related numerical difference formulae (NDF).
While there are other solvers, we will primarily focus on these three classes of methods.
I will end this discussion by mentioning one more key property of multistep methods.
* This section is added for completeness. It may be skipped without any loss in continuity.
Non-self-starting property: The term “multistep” implies that the information at past n
points is used for computing the next point, yk+1. Let us say that we are at the initial time
t0 with y0 as the initial value. In order to compute the next solution, y1, we need the previ-
ous data points y0, y−1, and so on. Since the values y−1 (and earlier) are not available, it is
not feasible to directly compute y1. Thus, the higher-order multistep methods are non-
self-starting; they require a procedure to compute past values and start the ODE solution.
I will now briefly summarize the three classes of methods. The focus of this chapter is
not on numerical techniques per se; rather, we are interested in using differential equa-
tion solvers in MATLAB® for solving problems of practical interest. A brief discussion
of these methods is provided to equip readers to make an educated choice of solvers in
one’s work.
8.2.2 Implicit Adams-Moulton Method
Adams-Moulton methods are derived by integrating the ODE over one step, with an interpolating polynomial substituted for f:

∫_{y_k}^{y_{k+1}} dy = ∫_{kh}^{(k+1)h} f(t, y(t)) dt   (8.10)
The left-hand side evaluates simply as (yk + 1 − yk). The polynomial on the right-hand side
varies depending on the number of past terms included in interpolation, giving rise to dif-
ferent Adams-Moulton methods of varying order.
The first-order method, AM-1, is simply Euler's implicit method since it uses a single
point, f = f_{k+1}, for interpolation, resulting in

y_{k+1} = y_k + h f_{k+1}   (8.11)

where

f_i ≡ f(t_i, y_i)   (8.12)
y_{k+1} = y_k + ∫_{kh}^{(k+1)h} [ f_k + ((f_{k+1} − f_k)/h)(t − kh) ] dt   (8.13)
y_{k+1} = y_k + [f_k h] + ((f_{k+1} − f_k)/h) ( [t²/2]_{kh}^{(k+1)h} − kh [t]_{kh}^{(k+1)h} )   (8.14)

The term in the parentheses evaluates to

h² ( ((k+1)² − k²)/2 − k )
Since this evaluates to h²/2, the final equation for the AM-2 method is given by
y_{k+1} = y_k + (h/2) [ f(t_k, y_k) + f(t_{k+1}, y_{k+1}) ]   (8.15)
The second-order Adams-Moulton method is also known as the trapezoidal rule since it
looks like the trapezoidal rule from numerical integration (see Appendix E).
To derive the truncation error, define the real variable

α = (t − t_{k+1})/h   (8.16)

Thus, t_{k+1} corresponds to α = 0, t_k to α = −1, t_{k−1} to α = −2, and so on. In the case of the AM-2
method, a first-order polynomial is used. Using Newton's backward difference formula
f_B(t) = f_{k+1} + α ∇f_{k+1} + ⋯ + ( α(α+1)⋯(α+n−1) / n! ) ∇ⁿ f_{k+1} + R_B   (8.17)
where
R_B = ( α(α+1)⋯(α+n) / (n+1)! ) h^(n+1) f^(n+1)(ξ)   (8.18)
The AM-2 method is derived with n = 1, that is, by retaining the first two terms in Equation 8.17.
The equation for AM-2 is obtained by substituting this in Equation 8.10. Since dt = h dα
y_{k+1} = y_k + h ∫_{−1}^{0} [ f_{k+1} + α ∇f_{k+1} + (α(α+1)/2) h² f″(ξ) ] dα   (8.19)
y_{k+1} = y_k + h [ f_{k+1} [α]_{−1}^{0} + (f_{k+1} − f_k) [α²/2]_{−1}^{0} + (h²/2) f″(ξ) [α³/3 + α²/2]_{−1}^{0} ]

        = y_k + h [ f_{k+1} + (f_{k+1} − f_k)(−1/2) + (h²/2) f″(ξ)(−1/6) ]   (8.20)
Thus, the AM-2 method with the local truncation error is given by the following equation:
y_{k+1} = y_k + (h/2) [ f(t_k, y_k) + f(t_{k+1}, y_{k+1}) ] + ( −(h³/12) f″(ξ) )   (8.21)

(the first terms constitute the AM-2 method; the final term is the O(h³) local truncation error)
Thus, AM-2 has a local truncation error as shown above. As we observed in Chapter 3, the
global truncation error (GTE) in the ODE solver drops by one order, making the AM-2 method
O(h²) accurate with respect to the GTE.
Higher-order Adams-Moulton formulae are derived by progressively including an addi-
tional term in the backward difference formula of Equation 8.17. The first few Adams-
Moulton formulae are given in Table 8.1. Note that the AM-n method is nth-order accurate
since the GTE is ~O(hⁿ).
8.2.3 Explicit Adams-Bashforth Method
Explicit Adams-Bashforth methods follow a similar principle as Adams-Moulton methods,
except that b0 in Equation 8.9 is 0, making them explicit ODE-solving methods. (Table 8.1
shows methods only up to AM-5; interested readers are referred to numerical ODE texts for
coefficients of higher-order methods.) Thus, a Newton's backward difference polynomial is
fitted to the past n points. Readers can easily
notice that the first-order Adams-Bashforth method (AB-1) is simply the explicit Euler’s
method. As in Adams-Moulton methods, the following real variable will be defined for
deriving Adams-Bashforth formulae:
α = (t − t_k)/h   (8.22)
y_{k+1} = y_k + ∫_{0}^{1} f_{B,k}(α) h dα   (8.23)
f_B(t) = f_k + α ∇f_k + ⋯ + ( α(α+1)⋯(α+n−1) / n! ) ∇ⁿ f_k + R_B   (8.24)
The mathematical contrast between the AB-n and AM-n methods is evident on comparing
Equation 8.16 with Equation 8.22 and Equation 8.17 with Equation 8.24. The practical contrast is
that they are explicit and implicit methods, respectively. Notice also that integrating from k
to k + 1 implies integrating from α = 0 to 1 in the AB-n method. Substituting and integrating,
we get
y_{k+1} = y_k + h [ f_k [α]_{0}^{1} + (f_k − f_{k−1}) [α²/2]_{0}^{1} + (h²/2) f″(ξ) [α²/2 + α³/3]_{0}^{1} ]   (8.25)
y_{k+1} = y_k + (h/2) [ 3f(t_k, y_k) − f(t_{k−1}, y_{k−1}) ] + ( (5h³/12) f″(ξ) )   (8.26)

(the first terms constitute the AB-2 method; the final term is the O(h³) local truncation error)
Since the right-hand side does not depend on (k + 1), AB-2 is an explicit method. AB-2 is
not a self-starting method because y1 depends on y−1, and hence the method can only start
from computing y2 onward.
Higher-order AB-n methods include a larger number of terms in the formula as well as
yield higher accuracy. These are summarized in Table 8.2.
Interested readers are referred to numerical ODE texts for coefficients of higher-order methods. Since
AB-n methods are explicit, the coefficient b0 is zero.
Backward difference formulae (BDF) are derived by fitting a polynomial y_B(t) to the solution
values themselves and substituting its derivative into the ODE. For the first-order case, a
straight line through y_k and y_{k+1} is

y_B = y_{k+1} + ( (y_{k+1} − y_k) / (t_{k+1} − t_k) ) (t − t_{k+1})   (8.27)

Clearly, y_B′ is just the slope of the above line, and the denominator is simply h. Thus, we
obtain the first-order BDF as simply Euler's implicit formula:

(y_{k+1} − y_k)/h = f(t_{k+1}, y_{k+1})
i.e., y_{k+1} = y_k + h f(t_{k+1}, y_{k+1})   (8.28)
How does one derive higher-order BDF? They are derived in quite a similar manner as the
Adams-Moulton method, except that (i) y B ( t ) is fitted instead of f B ( t ), and (ii) the deriva-
tive y B¢ is substituted in the original ODE instead of integrating f B . The derivation of error
for first-order BDF is left as an exercise.
Second-order Newton’s backward difference formula is given by
y_B(t) = y_{k+1} + α ∇y_{k+1} + (α(α+1)/2) ∇² y_{k+1} + (α(α+1)(α+2)/6) h³ y‴(ξ)   (8.29)
* Recall that nth-order polynomial requires (n + 1) points: yk + 1 and the previous n points.
Since

dy/dt = (1/h) dy/dα   (8.30)

differentiating Equation 8.29 gives

y_B′ = (1/h) [ ∇y_{k+1} + ((2α+1)/2) ∇² y_{k+1} + (h³/6)(3α² + 6α + 2) y‴(ξ) ]
Note that the derivative is computed at time (k + 1), that is, α = 0. Substituting in the ODE
yields

(y_{k+1} − y_k) + (1/2)(y_{k+1} − 2y_k + y_{k−1}) + (h³/3) y‴(ξ) = h f(t_{k+1}, y_{k+1})   (8.31)

Rearranging gives the BDF-2 method with its truncation error:

y_{k+1} = (4y_k − y_{k−1})/3 + (2h/3) f(t_{k+1}, y_{k+1}) + (2h³/9) y‴(ξ)   (8.32)

or, equivalently,

y_{k+1} − (4y_k − y_{k−1})/3 = (2h/3) f(t_{k+1}, y_{k+1}) + (2h³/9) y‴(ξ)   (8.33)

(the final term is the O(h³) local truncation error of BDF-2)
Most commercial solvers use BDF methods of orders 1-5. Methods beyond BDF-6 are not used
since they are not zero-stable.
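As an illustrative aside (not from the text), a single BDF-2 step amounts to solving Equation 8.32 for y_{k+1} with a nonlinear solver; the ODE and the past values below are placeholders:

```matlab
% One BDF-2 step for y' = f(t,y), given the two past values yk and ykm1
f = @(t,y) -y;                             % example right-hand side (placeholder)
h = 0.1; tk = 0; yk = 1; ykm1 = exp(h);    % illustrative past values
g = @(yNew) yNew - (4*yk - ykm1)/3 - (2*h/3)*f(tk+h, yNew);  % Equation 8.32
yNew = fsolve(g, yk);                      % previous value as the initial guess
```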
In terms of α, the qth term of the backward difference polynomial y_B is

P_q = ( α(α+1)⋯(α+q−1) / q! ) ∇^q y_{k+1}   (8.34)
and y_B = Σ_{q=0}^{n} P_q. Recall that in deriving the BDF, we differentiated y_B since we required
[dy_B/dt] at (k + 1); moreover, at (k + 1), α = 0. Differentiating Equation 8.34, we get

dP_q/dα = ( ∇^q y_{k+1} / q! ) [ (α+1)(α+2)⋯(α+q−1) + (terms with α) ]   (8.35)
Since we compute the above at α = 0, all the terms in α disappear. The remaining term in the
bracket at α = 0 is (q − 1)!. Thus
[dP_q/dt]_{(k+1)} = (1/h) ( ∇^q y_{k+1} / q )   (8.36)
Σ_{q=1}^{n} ( ∇^q y_{k+1} / q ) = h f(t_{k+1}, y_{k+1})   (8.37)
Σ_{q=1}^{n} ( ∇^q y_{k+1} / q ) = h f(t_{k+1}, y_{k+1}) + κγ ( y_{k+1} − y_{k+1}^(0) )   (8.38)
where
κ is a constant,
γ = [1 + (1/2) + ⋯ + (1/n)], and
y_{k+1}^(0) is an initial guess of the solution.
The final term on the right-hand side is added to improve the convergence behavior of the
numerical method.
MATLAB’s ode15s solver uses a combination of BDF and NDF.
Recall that Euler's explicit method is stable when

−2 < λh ≤ 0   (8.39)
where λ denotes eigenvalues of the multivariable system. The largest eigenvalue determines
the limit on step-size, h. Eigenvalues can be complex numbers. Thus, if we plot λh on a
complex plane, the stability region for Euler’s explicit method is a circle in the left-half
plane, with the circle intersecting the real axis at −2.
A narrow stability region limits the applicability of the ODE solver, as observed for Euler’s
explicit method. Unfortunately, the stability region for Adams-Bashforth methods shrinks
with the increasing order of the method. Thus, higher-order Adams-Bashforth methods
are not very stable. While their higher accuracy can be exploited for nonstiff systems, they
are unsuitable for stiff ODEs and DAEs. They can be used as an alternative to RK methods
when a high degree of accuracy is required.
8.2.5.4 BDF/NDF Methods
Both BDF-1 (implicit Euler’s method) and BDF-2 are A-stable. Like AM-n method,
the stability region of BDF-n also shrinks as the order n increases. Stability regions for
BDF-3 and BDF-4 are also very large, but shrink for BDF-5 and BDF-6. Interestingly,
BDF methods beyond the sixth order are not zero-stable.* Consequently, BDF beyond
BDF-6 cannot be used. In fact, most commercial solvers use BDF methods of order
BDF-1 to BDF-5.
* Zero stability implies that a numerical technique is stable as h → 0. This is a very important property as a very small step-size
may be needed for higher accuracy. Hence, a method that is not zero stable is not useful in practice.
Implicit methods require the solution of a nonlinear equation at each time-step. Consider
again the AM-2 method:

y_{k+1} = y_k + (h/2) [ f(t_k, y_k) + f(t_{k+1}, y_{k+1}) ] + ( −(h³/12) f″(ξ) )   (8.21)
At time k, the solution yk is known and yk+1 is the unknown quantity to solve for. Thus, this
is a nonlinear equation, the solution of which is the value yk+1. Rewriting the above equation
in the form used in Chapter 6 gives
g(y_{k+1}) ≡ y_{k+1} − [ y_k + (h/2) ( f(t_k, y_k) + f(t_{k+1}, y_{k+1}) ) ] = 0   (8.40)
The stiff ODE (8.3) in single variable will be solved using AM-2 method in the next exam-
ple. The function f(t, y) in the above equation is simply the ODE function in y′ = f(t, y) and h
is the step-size, exactly as was seen in the explicit methods in Chapter 3.
Example 8.1 Implicit Trapezoidal Method for Stiff ODE in Single Variable
Problem: Solve the ODE-IVP in Equation 8.3 using the AM-2 trapezoidal method.
Compare the results with ode45.
Solution: The function f(t, y) = − 1000y + 999e−t will be defined using MATLAB
anonymous function definition:
f=@(t,y) -1000*y+999*exp(-t);
This ODE function is used to obtain g(yk + 1) in Equation 8.40. If the current value yk is
known at time tk, then
fk=f(tk,yk);
will give the value f(tk, yk), which is a known quantity. This quantity can be precom-
puted and used in g(yk + 1) defined in Equation 8.40. If we denote yk + 1 as variable yNew,
then function g(yk + 1) ≡ g(yNew) is given by the following function:
g=@(yNew) yNew-yk - h/2*(fk+f(tNew,yNew));
Note that tk, yk, and fk are known since they are the past values, while tNew,
the time point at which yNew is calculated, is also known (tNew=tk+h). Thus, the
unknown is yNew. We will use fsolve to find the root of Equation 8.40.
The function f(t,y) remains the same throughout the simulation, whereas the
function g(yNew) changes at each time point: The function is updated with most
recent values of tk, yk, and fk. The complete code for this problem is given below:
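The complete listing did not survive extraction; a minimal driver consistent with the description might look like the following (the step-size and final time are illustrative):

```matlab
% AM-2 (implicit trapezoidal) solution of Equation 8.3 using fsolve
f = @(t,y) -1000*y + 999*exp(-t);      % ODE right-hand side
h = 0.001; tN = 1;                     % step-size and final time (illustrative)
t = 0:h:tN; y = zeros(size(t)); y(1) = 2;
opt = optimoptions(@fsolve,'Display','none');
for k = 1:length(t)-1
    tk = t(k); yk = y(k); fk = f(tk,yk);   % known quantities at time tk
    tNew = tk + h;
    g = @(yNew) yNew - yk - h/2*(fk + f(tNew,yNew));   % Equation 8.40
    y(k+1) = fsolve(g, yk, opt);           % previous value as initial guess
end
plot(t,y);
```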
Figure 8.2 shows the results from the AM-2 trapezoidal method at two different step-
sizes. Since the fast dynamics stabilize within t = 0.003, a small step-size is required
for accuracy. When h=0.01 was used as the step-size, oscillatory behavior was
observed initially. Thus, h=0.001 was required for stability. With this step-size,
AM-2 required 1000 iterations. Despite being an adaptive step-size solver, ode45
required 1229 iterations.
FIGURE 8.2 Comparison of AM-2 trapezoidal method (with two different step-sizes) and ode45
solution for stiff ODE (8.3) in single variable.
8.3.1.1 Adaptive Step-Sizing
Adaptive step-size methods exist for trapezoidal and other Adams-Moulton methods.
These can be used to give better accuracy. The faster dynamics, in this example, have a time
constant of 0.001 (since λ = 1000). An adaptive step-size ode45 was unable to take large
steps. We now investigate whether it is possible to increase step-size of AM-2 method and
achieve stable performance. Adaptive step-size AM-2 methods are beyond the scope of this
text. But I will give a flavor of adaptive step-sizing using this example.
An “adaptive step-sizing” will be implemented in an ad hoc manner. We know that a
small step-size is needed for accuracy until the fast dynamics settle down, after which we
are at liberty to increase the step-size* for implicit AM-2 method. We will use this knowl-
edge for the next simulation, which shall be performed in three steps:
• Run the code in Example 8.1 with h=0.0005 for 20 iterations to yield solutions of
the ODE until tN=0.01.
• Now that the solutions at t = 0 and t = 0.01 are available, use a larger step-size of
h=0.01 for the next 10 iterations.
• Now with the solutions at t = 0 and t = 0.1, use a still larger step-size of h=0.1 for the
remaining 10 iterations until we reach tN.
The results from this procedure are not shown since they exactly overlap with the solution
from AM-2 with a step-size of h=0.001. No oscillatory behavior is obtained, even in the
third stage. With this procedure, an excellent solution can be obtained within 40 iterations.
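The three-stage procedure above can be sketched as follows (the helper amStep performs one trapezoidal step as in Example 8.1; all names are illustrative):

```matlab
% Ad hoc three-stage step-sizing for Equation 8.3 (sketch)
f = @(t,y) -1000*y + 999*exp(-t);
opt = optimoptions(@fsolve,'Display','none');
amStep = @(tk,yk,h) fsolve(@(yNew) yNew - yk - h/2*(f(tk,yk)+f(tk+h,yNew)), yk, opt);
t = 0; y = 2;
% 20 steps of h=0.0005, then 10 of h=0.01, then 10 of h=0.1 (40 total)
hSchedule = [0.0005*ones(1,20), 0.01*ones(1,10), 0.1*ones(1,10)];
for h = hSchedule
    y(end+1) = amStep(t(end), y(end), h);   % one AM-2 step of size h
    t(end+1) = t(end) + h;
end
```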
* If we were using an explicit method, such as explicit Euler or RK-2, the stability requirement would still prevent us from
using a large step-size. For example, the RK-2 method requires h ≤ 2/λ, that is, h < 0.002, for stability. An implicit method
such as the trapezoidal method allows a larger h.
MATLAB’s ode15s, ode23s, ode23t, and ode23tb are all adaptive step-size solv-
ers. Adaptive step-size implicit solvers can be used to obtain accurate solutions with good
computational efficiency.
8.3.1.2 Multivariable Example
We are now equipped to solve the bioreactor example from Section 3.5.4 using the AM-2
trapezoidal rule. The case with smaller value of saturation coefficient, Ks = 0.001, was stiff
and resulted in an unstable solution with ode45. The same example (with same values of
coefficients and initial condition) will now be simulated using the AM-2 trapezoidal rule.
The function funMonod.m remains the same as in the previous example. The anony-
mous function f can be defined as

f=@(t,y) funMonod(t,y,monodParam);
Since MATLAB naturally works with vectors, the definition of the function “g” remains
unchanged. Since the nonlinear functions to be solved are represented by g(yk + 1) = 0, one
simply needs to ensure that g(y) takes a 3 × 1 vector argument yNew and returns a 3 × 1
vector. Thus, g(yNew) is defined as

g=@(yNew) yNew-yk - h/2*(fk+f(tNew,yNew));

This is exactly the same as before. The only difference is that yNew, yk, and fk are all 3 × 1
vectors. The actual code for this example is left as an exercise.
Figure 8.3 compares the simulation results using the AM-2 trapezoidal method
and ode15s. The two results are very close to each other. Since the trapezoidal
method is implicit, it captures the sharp change in process trends observed at ~19 h. A step-
size of h=0.05 was used for the trapezoidal method; a larger value of h gives unstable
results.
FIGURE 8.3 Bioreactor with Monod kinetics simulated using AM-2 trapezoidal method (dashed
lines). The solution using ode15s is plotted as solid lines for comparison.
Hyperbolic PDEs of the form

∂ϕ/∂t + u ∂ϕ/∂z = S(ϕ)   (8.4)
were discussed in Chapter 4. The most versatile method is to convert the above PDE into a
set of ODEs in time by discretizing the spatial derivative. With the spatial domain split into
n intervals of size Δz = L/n, the ODE for the solution variable at the ith location
is given by
dϕ_i/dt = −C_i(ϕ) + S(ϕ_i)   (8.41)
The convection term, uϕ_z, is represented with a finite difference approximation. The above
expression for C_i(·) is written in a very general form. In reality, C_i(·) does not depend on the
entire vector ϕ but only on a few terms ϕ_{i−ℓ}, …, ϕ_i, …, ϕ_{i+ℓ} (as we had seen in the previ-
ous chapter). For example, if the central difference formula is used, then ℓ = 1 and
C_i(ϕ) = u (ϕ_{i+1} − ϕ_{i−1}) / (2Δz)   (8.42)
Let the right-hand side of Equation 8.41 be represented as fi(ϕ). Thus, the set of ODEs
obtained after discretization in space is given by
dϕ/dt = [ f_1(ϕ), f_2(ϕ), …, f_n(ϕ) ]^T   (8.43)
One of the options for solving the above set of ODEs is the AM-2 trapezoidal method. If Δt
is the time-step size and ϕp represents the solution vector at time p, then
ϕ_{p+1} = ϕ_p + (Δt/2) ( f(ϕ_{p+1}) + f(ϕ_p) )   (8.44)
or, for the ith component,

ϕ_{p+1,i} = ϕ_{p,i} + (Δt/2) ( f_i(ϕ_{p+1}) + f_i(ϕ_p) )   (8.45)
Substituting the central difference formula for C_i(·) from Equation 8.42 in the above equa-
tion yields the Crank-Nicolson equation.
∂C_A/∂t + u ∂C_A/∂z = −r   (8.47)

C_A(t = 0) = C_A0,   C_A(t, z = 0) = C_A,in
A first-order reaction, with r = kCA and k = 0.2 min−1, takes place in the reactor.
C_{p,i} = (u/(2Δz)) (ϕ_{p,i+1} − ϕ_0)         for i = 1
C_{p,i} = (u/Δz) (ϕ_{p,i} − ϕ_{p,i−1})        for i = n        (8.48)
C_{p,i} = (u/(2Δz)) (ϕ_{p,i+1} − ϕ_{p,i−1})   otherwise
whereas the reaction term is rp, i = kϕp, i. A function file can be written to return the
reaction and convection terms for all locations, given the vector ϕp at any time p.
(i) The function to calculate Cp,i and rp, i (pfrDiscreteFun.m) is given below:
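The listing itself did not survive extraction; a sketch consistent with Equation 8.48 follows. The field names par.n, par.dz, and par.C0 are assumptions; par.u0 and par.k follow the driver script shown later in this example.

```matlab
function [conv, rxn] = pfrDiscreteFun(phi, par)
% Convection (Equation 8.48) and reaction terms at all n grid points
n = par.n; u = par.u0; dz = par.dz;
conv = zeros(n,1);
conv(1) = u/(2*dz)*(phi(2) - par.C0);          % i = 1, uses the inlet value
for i = 2:n-1
    conv(i) = u/(2*dz)*(phi(i+1) - phi(i-1));  % central difference, interior
end
conv(n) = u/dz*(phi(n) - phi(n-1));            % backward difference at exit
rxn = par.k*phi;                               % first-order reaction, r = k*phi
end
```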
Since rp and Cp are computed at known (previous) values of ϕp, they can be cal-
culated once and passed to the function g(ϕ) as constants.
Let us use phi to represent the previous solution, ϕp, and Y to represent the
function argument in Equation 8.49, that is, g(Y). The MATLAB function that
calculates g(Y) consists of the following calculation:
g(Y) = (Y − ϕ) + (Δt/2) ( f_old + f_new )

where f_old = [C_p + r_p] and f_new = [C_{p+1} + r_{p+1}].
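The listing of this function is missing here; a sketch consistent with the calculation above might look as follows (the variable names fOld, Phi, dt, and par are taken from the surrounding description):

```matlab
function gval = pfrCrankNicolFun(Y)
% Residual g(Y) = (Y - Phi) + dt/2*(fOld + fNew) for the Crank-Nicolson step.
% As a nested function, it reads fOld, Phi, dt, and par from the enclosing scope.
    [convNew, rxnNew] = pfrDiscreteFun(Y, par);  % C_{p+1} and r_{p+1} at guess Y
    fNew = convNew + rxnNew;
    gval = (Y - Phi) + dt/2*(fOld + fNew);
end
```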
The above function is self-explanatory: the steps shown in the various equations are imple-
mented sequentially. Since pfrCrankNicolFun is an embedded (nested) function, the
variables in pfrDiscreteFun are accessible to it, thus allowing us to avoid passing the
values fOld, Phi, and par to it.
(iv) Overall driver function for solving the PFR problem follows a similar structure as
before. The function, pfrCN_Stepper, is called at each time-step to compute
ϕp + 1 in a loop, starting from the initial condition:
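The stepping loop itself is not reproduced above; a sketch might look like the following (nSteps and the exact signature of pfrCN_Stepper are assumptions based on the text):

```matlab
% March the Crank-Nicolson solution from t = 0 in steps of dt (sketch)
PHI = zeros(n, nSteps+1);
PHI(:,1) = C0*ones(n,1);               % initial condition, C_A(t=0) = C_A0
for p = 1:nSteps
    PHI(:,p+1) = pfrCN_Stepper(PHI(:,p), modelParam);  % solve g(Y)=0 for phi_{p+1}
end
```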
%% Plotting results
% Steady state plot
CModel=C0*exp(-modelParam.k/modelParam.u0*Z);
C_ss=PHI(:,end);
figure(1)
plot(Z,C_ss); hold on
plot(Z,CModel,'.');
xlabel('Length (m)'); ylabel('C_A (mol/L)');
FIGURE 8.4 Steady state results using the Crank-Nicolson method, and comparison with analytical solution (symbols; see Chapter 4 for derivation).
FIGURE 8.5 Transient variation in outlet concentration for two different step-sizes: Δt = 0.1 (solid line) and Δt = 0.2 (dashed line).
opt=optimoptions(@fsolve);
This structure opt can be used to provide various options for fsolve. Since Equation 8.5
resulting from the Crank-Nicolson method is tridiagonal, one of the methods discussed
in the previous chapter can be used to improve the computational efficiency by exploiting
the sparse nature of Equation 8.5. This can be done by providing “JacobPattern” option for
fsolve. The above underlined code is modified as
opt=optimoptions(@fsolve,'jacobPattern',par.Jpat);
The variable Jpat was defined in the driver script (see underlined part) as
Jpat=diag(Jvec,-1)+eye(n)+diag(Jvec,1);
Recall from Chapter 7 that the above command indicates to the solver that the problem has
a tridiagonal structure. The sparsity pattern can be verified using spy(Jpat).
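SciPy offers the same facility through the jac_sparsity option of scipy.optimize.least_squares (its trust-region reflective solver). The hypothetical tridiagonal system below is only an illustration of the mechanism, not the book's reactor equations:

```python
import numpy as np
from scipy.optimize import least_squares

n = 8
# Tridiagonal sparsity pattern: the analogue of
# Jpat = diag(Jvec,-1) + eye(n) + diag(Jvec,1)
Jpat = np.diag(np.ones(n-1), -1) + np.eye(n) + np.diag(np.ones(n-1), 1)

def F(x):
    """An illustrative tridiagonal nonlinear system:
    3*x[i] + x[i]**3 - x[i-1] - x[i+1] = 1."""
    r = 3*x + x**3 - 1.0
    r[1:] -= x[:-1]
    r[:-1] -= x[1:]
    return r

# jac_sparsity tells the solver which Jacobian entries can be nonzero,
# so finite-difference estimation needs only a few evaluations per step
sol = least_squares(F, np.zeros(n), jac_sparsity=Jpat)
```

With the pattern supplied, the finite-difference Jacobian is estimated column-group by column-group instead of one full evaluation per variable, which is where the computational saving comes from.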
The original code was executed again with Δt = 0.01 (to highlight the computational
gains clearly). Without Jacobian pattern, it took 33.7 s for the 1000 steps to reach tN = 10.
With the Jacobian pattern included, the computation time reduced* to 29.8 s. The computa-
tion time can be reduced even further by suppressing the display at the end of each iteration
using
opt=optimoptions(opt,'display','none');
$$F\!\left(t,\, Y,\, \frac{dY}{dt}\right) = 0 \tag{8.8}$$
* fsolve uses several different algorithms; only some of these can exploit the Jacobian pattern. The codes were run on a MacBook Pro with a 2.6 GHz Intel i5 processor, running MATLAB R2016a.
where $Y \in \mathbb{R}^n$. Fully implicit methods are used to solve the generic DAE of the form above.
Rather than discussing a general form of DAEs as above, this chapter focuses on the most
important class of DAEs for chemical engineers—semiexplicit DAEs:
$$\frac{d}{dt}y = f(t, y, x) \tag{8.6}$$

$$0 = g(t, y, x) \tag{8.7}$$
As described before, $y \in \mathbb{R}^{n_d}$ are differential variables since they are governed by ODEs, whereas the remaining variables, $x \in \mathbb{R}^{n-n_d}$, are algebraic. When n_d = 0, we have a purely algebraic system of equations, whereas when n_d = n, we have ODEs. Since these were covered in Chapters 6 and 3, respectively, we focus solely on DAEs.
$$\frac{d}{dt}C_A = \frac{F}{V}\left(C_{A,\mathrm{in}} - C_A\right) - r(C_A), \qquad C_A(0) = C_{A0} \tag{8.50}$$

$$C_{A,\mathrm{in}} = x(t) \tag{8.51}$$
8.4.1.1 Direct Substitution
It is possible to substitute Equation 8.51 in Equation 8.50:

$$\frac{d}{dt}C_A = \frac{F}{V}\left(x(t) - C_A\right) - r(C_A) \tag{8.52}$$
leading to an ODE-IVP. This was the approach we took in Chapter 5. For example, we considered a case study of a chemostat in Section 5.4, where the inlet concentration varied with time, whereas a boundary constraint in the form of a PFR temperature profile was imposed in Section 5.5. This approach may be used if the algebraic variables are explicit functions of time and/or the differential variables, or if they can be converted to explicit functions with some algebraic manipulations.
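As a concrete sketch of direct substitution, the CSTR of Equation 8.52 with a first-order reaction and an arbitrary time-varying inlet can be handed to an ordinary ODE-IVP solver. The Python/SciPy code below uses illustrative parameter values and an assumed sinusoidal inlet profile:

```python
import numpy as np
from scipy.integrate import solve_ivp

F_by_V, k, CA0 = 0.5, 0.2, 1.0       # illustrative parameter values

def x_of_t(t):
    """Specified inlet profile C_A,in = x(t); here a sinusoidal variation."""
    return 1.0 + 0.2*np.sin(t)

def rhs(t, CA):
    """Equation 8.52 after direct substitution, with r = k*C_A."""
    return F_by_V*(x_of_t(t) - CA) - k*CA

sol = solve_ivp(rhs, (0.0, 50.0), [CA0], rtol=1e-8, atol=1e-10)
```

Because x(t) was substituted directly, no algebraic variable remains and an explicit ODE-IVP solver suffices; this is precisely what distinguishes direct substitution from the combined DAE approaches discussed next.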
$$0 = \underbrace{C_{A,\mathrm{in}} - x(t)}_{g\left(t,\,C_A,\,C_{A,\mathrm{in}}\right)} \tag{8.53}$$

The above equation along with Equation 8.50 defines a DAE, with C_A as the differential variable and C_{A,in} as the algebraic variable.
8.4.1.2.1 Case Where Algebraic Variable Is Specified Implicitly Instead of being specified as an explicit function, the algebraic variable, C_{A,in}, may be specified implicitly in the form g(t, C_A, C_{A,in}) = 0. There are several instances when such a case arises. During the start-up of a reactor, the inlet condition may need to be varied based on a certain desired profile, borne out of some design optimization procedure. In highly exothermic systems, or in bioreactors to avoid toxicity, C_{A,in} may be specified implicitly. For example
$$\left[F\left(C_{A,\mathrm{in}} - C_A\right)\right]^{\eta} + aC_{A,\mathrm{in}} = 0 \tag{8.54}$$
for an arbitrary η ≠ 1 is a case where the DAE comprising Equations 8.50 and 8.54 needs to
be solved using a combined approach.
8.4.1.2.2 Case Where Algebraic Variable Is Specified Indirectly There are several problems
where the algebraic constraints do not contain the algebraic variables, x. For example, the
inlet composition may be varied so that the exit concentration meets a specific profile. The
DAE consists of differential Equation 8.50 and algebraic Equation 8.55 below:
$$\frac{d}{dt}C_A = \frac{F}{V}\left(C_{A,\mathrm{in}} - C_A\right) - r(C_A) \tag{8.50}$$

$$0 = C_A - x(t) \tag{8.55}$$
Since CA appears in the differential equation, it is the differential variable; the algebraic vari-
able is CA, in even though it does not appear in Equation 8.55. Thus, the algebraic variable
gets specified indirectly due to the algebraic constraint in Equation 8.55.
Such cases are routinely seen in design optimization problems and (off-line) control
formulations. For example, during reactor start-up, the concentration or temperature of
CSTR may need to follow a certain trajectory; inlet concentration CA, in is determined by
solving the DAE (8.50) with (8.55). Another example is that of a surge tank or blender kept
upstream of a critical reactor or equipment. The aim of this tank or blender is to dampen the
variations and ensure the downstream reactor or equipment is provided the process stream
at desired conditions. This is achieved by varying the algebraic variable CA, in to achieve the
constraint in Equation 8.55.
The first is an example of an ODE if the direct substitution approach is followed, the second is an index-1 DAE, and the third is a higher-index DAE (specifically, index-2).
$$\frac{d}{dt}C_{A,\mathrm{in}} - x'(t) = 0 \tag{8.56}$$
Since a single differentiation was required to convert the DAE into ODE, this is an example
of index-1 DAE.
The second case is where the outlet concentration is specified. Differentiating
Equation 8.55
$$\frac{d}{dt}C_A = x'(t) \tag{8.57}$$
The variable CA, in is still an algebraic variable. Substituting Equation 8.50, we get
$$\frac{F}{V}\left(C_{A,\mathrm{in}} - C_A\right) - r(C_A) = x'(t) \tag{8.58}$$
Differentiating Equation 8.58 once more with respect to time,

$$\frac{d}{dt}C_{A,\mathrm{in}} - C_A' - \frac{V}{F}\,\frac{\partial r}{\partial C_A}\,C_A' = x''(t) \tag{8.59}$$
$$\frac{d}{dt}C_{A,\mathrm{in}} = x'\left(1 + \frac{V}{F}\,\frac{\partial r}{\partial C_A}\right) + x''(t) \tag{8.60}$$
Differentiating the algebraic equation twice converts the original DAE into a set of two
ODEs—Equations 8.50 and 8.60. Therefore, this is an example of index-2 DAE.
Here, sin θ = x/L and cos θ = y/L. Let u represent the velocity in x-direction and v represent
the velocity in y-direction. Tension in the connecting string/rod is also unknown and is to
be found out. However, T is indirectly specified because of the constraint that the string
is rigid, and hence the pendulum traverses a circular path with constant radius. Thus, the
algebraic equation for pendulum is given by Equation 8.63 below and the differential equa-
tions (8.61) are written as a set of four ODEs in (8.62) below:
$$x' = u, \qquad u' = -\frac{Tx}{L}, \qquad y' = v, \qquad v' = g - \frac{Ty}{L} \tag{8.62}$$

$$0 = x^2 + y^2 - L^2 \tag{8.63}$$

The overall solution vector is $Y = \left[x \;\; u \;\; y \;\; v \;\; T\right]^{T}$; tension T is the algebraic variable, whereas the remaining four are differential variables. Clearly, this is an example of
higher-index DAE. Computing the index of the DAE is left as an exercise to the reader.
$$J_{\mathrm{diff}} = k_g\left(C_A - C_{As}\right) \tag{8.64}$$
The balance for bulk gas gives rise to the differential equation

$$u\,\frac{dC_A}{dz} = -k_g a\left(C_A - C_{As}\right) \tag{8.65}$$

$$0 = k_g a\left(C_A - C_{As}\right) - r(C_{As})\,a \tag{8.66}$$
With this introduction, we now focus on methods to solve semiexplicit DAEs in MATLAB.
We further assume that the algebraic equation g is such that its Jacobian $\partial g/\partial x$ is nonsingular.
is nonsingular. Most common DAE problems in chemical engineering are either in this
form or can be converted easily to this form.
$$u\,\frac{dC_A}{dz} = -k_g a\left(C_A - C_{As}\right) \tag{8.65}$$

$$0 = k_g\left(C_A - C_{As}\right) - kC_{As} \tag{8.66}$$
The above problem can be easily solved because Equation 8.66 is linear in the algebraic
variable CAs. The first step in solving the DAE by hand is to obtain CAs for any given value of
concentration CA by solving Equation 8.66 analytically to yield
$$C_{As} = \frac{k_g C_A}{k_g + k} \tag{8.67}$$
Thus, the value of CAs can be known for any value of CA. Knowing the value of CAs, Equation
8.65 becomes
$$\frac{d}{dz}C_A = -\frac{1}{u}\left(\frac{kk_g}{k + k_g}\,a\right)C_A \tag{8.68}$$
$$C_A(z) = C_{A0}\,e^{-\alpha z}, \qquad \alpha = \frac{1}{u}\left(\frac{kk_g}{k + k_g}\,a\right) \tag{8.69}$$

$$C_{As}(z) = \frac{k_g}{k_g + k}\,C_{A0}\,e^{-\alpha z}$$
This solution will be used in subsequent examples to compare the solution obtained using
numerical approaches.
Since the reaction term, r(CAs), can be complex for most practical problems, numerical
techniques need to be general enough to solve in cases where Equation 8.66 cannot be rear-
ranged to give an explicit expression for CAs. Two approaches for solving the DAE will be
discussed presently: nested approach and combined approach.
The above anonymous function call is passed to fsolve, which then computes CAs for the
specified value of CA, given the parameters in modelParam structure.
The versatility of numerical techniques is that the function to calculate the algebraic vari-
able need not be explicit x = ℊ(y); it can equally easily be presented as a numerical solution to
g(y, x) = 0. The following example demonstrates this nested approach of solving DAE.
Calculation of CAs by solving nonlinear equation: The nonlinear equation (8.66) will be
solved using MATLAB fsolve solver to compute CAs for any value of CA. This can
be compared with the analytically calculated CAs as per Equation 8.67. The function to
compute the algebraic function g(·) is given below:
In my experience, the underlined line of the code above is the source of a majority of
errors. The reaction rate is calculated at CAs; a common error is when the rate is calcu-
lated at CA instead. Another potential source of error is the order of input arguments.*
Please be careful when setting up the problem!
Let us now verify how this code works. Defining the parameters at the command line
>> par.k=0.01;
>> par.kg=0.02;
>> Ca0=1; CaTry=Ca0;
The analytically calculated CAs as per Equation 8.67 is 2/3, which is what we obtained. In
the fsolve call above, the underlined entity (Cas) is a variable that fsolve solves
for, and the result (Cas0) is the solution found for any given constant value of Ca0. To
summarize what fsolve does: (i) Ca0 is treated as a constant within the fsolve call,
(ii) the value of Cas is varied using the root-finding algorithm, (iii) the result Cas0 is
the solution that satisfies g(y, x) = 0 for that particular value Ca0, and (iv) CaTry is the
initial guess for the solver. If CA0 is changed to another value, CA, fsolve will find the
corresponding value of CAs as the root of the algebraic equation (8.66).
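The same check can be reproduced outside MATLAB with scipy.optimize.fsolve; the sketch below mirrors the call described above, using the parameter values of this example:

```python
from scipy.optimize import fsolve

k, kg = 0.01, 0.02            # parameters used in the worked example

def surface_bal(Cas, Ca):
    """Algebraic equation (8.66) with a first-order surface reaction:
    0 = kg*(Ca - Cas) - k*Cas."""
    return kg*(Ca - Cas) - k*Cas

Ca0 = 1.0                     # bulk concentration, held constant here
Cas0, = fsolve(surface_bal, Ca0, args=(Ca0,))   # initial guess CaTry = Ca0
# Cas0 = kg/(kg + k)*Ca0 = 2/3, in agreement with Equation 8.67
```

Note that the unknown (Cas) appears as the first argument of surface_bal, while the constant Ca is passed through args, the same separation of roles that the anonymous-function call achieves in MATLAB.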
Function for differential equation with nested algebraic solver: The differential equa-
tion (8.65) will be provided to the solver. Let us say that CAs was calculated analytically
as per Equation 8.67. The ODE function would simply involve the following two lines:
Cas=kg/(k+kg) * Ca;
dC=-(kg*abyv/u)*(Ca-Cas);
* This is another example of why I personally recommend using the powerful anonymous function approach of MATLAB. In the older versions of MATLAB, the first variable in the input arguments was the variable solved for by fsolve. The anonymous function allows us to write a MATLAB function in a manner similar to how we write it analytically.
Since the code developed needs to be general for any other forms of reaction rate
r(C), the underlined line will be replaced by the fsolve solution described on
the previous page. The final function file used for the differential equation is given
below:
function dC=pfrNestedFun(t,Ca,par)
% Differential function for heterogeneous
% reactor using nested approach
k=par.k; kg=par.kg;
u=par.u0; abyv=par.abyv;
% Nested solution of the algebraic equation (8.66) for Cas; this fsolve
% call replaces the analytical expression Cas=kg/(k+kg)*Ca
Cas=fsolve(@(Cas) surfaceBal(Ca,Cas,par), Ca);
dC=-(kg*abyv/u)*(Ca-Cas);
end
Driver script and results: This completes the main part of our code. The driver script to
run this code and the results are given below:
The computation of the CA values analytically is not shown in the code above.
Figure 8.6 shows the concentration profile calculated numerically (solid line) and is
compared with the analytically calculated value of CA as per Equation 8.69.
The nested approach is easy to follow and is a convenient learning tool. However, it is com-
putationally intensive since the nonlinear equation is solved at each integration step.
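The complete nested approach, an outer ODE integration whose right-hand side solves the algebraic equation at every call, can be sketched in Python as follows. The parameter values are illustrative, and fsolve is used even though the first-order case has the closed form of Equation 8.67, precisely to mimic the general procedure:

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import fsolve

k, kg, abyv, u0, Ca0 = 0.01, 0.02, 200.0, 1.0, 1.0   # illustrative values

def rhs(z, Y):
    Ca = Y[0]
    # Nested step: solve the algebraic equation (8.66) for Cas at this Ca
    Cas, = fsolve(lambda Cas: kg*(Ca - Cas) - k*Cas, Ca)
    # Differential equation (8.65)
    return [-(kg*abyv/u0)*(Ca - Cas)]

sol = solve_ivp(rhs, (0.0, 1.0), [Ca0], rtol=1e-8, atol=1e-10)

alpha = (k*kg/(k + kg))*(abyv/u0)       # Eq. 8.69
Ca_exact = Ca0*np.exp(-alpha*sol.t)
```

The numerical profile can then be compared against the analytical solution of Equation 8.69, exactly as done for Figure 8.6; the cost of the approach is visible in the nonlinear solve buried inside every right-hand-side evaluation.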
8.4.3.2 Combined Approach
The combined approach uses an appropriate implicit ODE-solving method to convert the
differential equations (8.6) into algebraic equations (as was done in Sections 8.2.1, 8.2.3,
and 8.3). Since the AM-2 trapezoidal method was used in Section 8.3.1, I will use it to illustrate the combined approach. The same arguments work for other implicit methods for ODE-IVP also.

FIGURE 8.6 Solution of DAE for heterogeneous reactor using nested approach (solid line), and comparison with the analytical solution (symbols).

Applying the trapezoidal method to Equation 8.6 yields
$$y_{k+1} = y_k + \frac{h}{2}\left[f_k + f\left((k+1)h,\; y_{k+1},\; x_{k+1}\right)\right] \tag{8.70}$$

where h is the step-size and $f_k = f(kh, y_k, x_k)$ is a known value. Equation 8.7 is written as
$$0 = g\left((k+1)h,\; y_{k+1},\; x_{k+1}\right) \tag{8.71}$$
$$F(Y) = \begin{bmatrix} y_{k+1} - y_k - \dfrac{h}{2}\left[f_k + f\left((k+1)h,\; y_{k+1},\; x_{k+1}\right)\right] \\[6pt] g\left((k+1)h,\; y_{k+1},\; x_{k+1}\right) \end{bmatrix} = 0 \tag{8.72}$$

can be solved to obtain the solution Y_{k+1} at each time-step. Viewed this way, both ODEs and nonlinear algebraic equations may be considered subsets of DAEs.
Function to compute f(·) for the differential function: As per Equation 8.65, the differential function is given by

$$f(C_A) = -\frac{k_g a}{u}\left(C_A - C_{As}\right)$$
AM-2 Solver for forward stepping: The solver that uses AM-2 trapezoidal rule will be
built on the same lines as Example 8.1. The first element of the vector F(Y) will use
the differential model above within an AM-2 discretization, whereas the second ele-
ment is the algebraic equation surfaceBal. Specifically, the two elements of F(Y)
will be
function [zNew,YNew]=pfrDAE_AM2(z,YOld,par)
% Function for single step for solving DAE PFR
% problem using AM-2 trapezoidal method
CaOld=YOld(1);
CasOld=YOld(2);
fOld=gasBalance(CaOld,CasOld,par);
% Compute next values using AM-2
zNew=z+par.h;
YNew=fsolve(@(Y) pfrAM2Fun(Y), YOld);
    function F=pfrAM2Fun(Y)
    % First element: AM-2 discretization of the gas balance
    fNew=gasBalance(Y(1),Y(2),par);
    F(1,1)=Y(1)-CaOld-(par.h/2)*(fOld+fNew);
    % Second element: the algebraic surface balance
    F(2,1)=surfaceBal(Y(1),Y(2),par);
    end
end
The driver script and results: Only the part of the driver script that solves the DAE
using AM-2 method is shown below. The script is self-explanatory.
The results obtained in this example were the same as that in Example 8.4 (Figure 8.6).
Hence, they are skipped for brevity.
The above example showed the feasibility of using an implicit method for solving a DAE. While this approach indeed works very well, it is also possible to use MATLAB's implicit solvers, such as ode15s, to solve semiexplicit DAEs. This will be discussed in Section 8.4.4.
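The combined approach, one trapezoidal step of Equation 8.72 per call, solved with a nonlinear-equation solver, can be sketched in Python for the heterogeneous-reactor DAE. The values below are illustrative; gas_balance and surface_bal play the roles of the MATLAB functions of the same names:

```python
import numpy as np
from scipy.optimize import fsolve

k, kg, abyv, u0, Ca0, h = 0.01, 0.02, 200.0, 1.0, 1.0, 0.01   # illustrative

def gas_balance(Ca, Cas):
    """Differential function f of Eq. 8.65, rearranged for dCa/dz."""
    return -(kg*abyv/u0)*(Ca - Cas)

def surface_bal(Ca, Cas):
    """Algebraic function g of Eq. 8.66 (first-order surface reaction)."""
    return kg*(Ca - Cas) - k*Cas

def dae_am2_step(Y_old):
    """One AM-2 (trapezoidal) step: solve F(Y) = 0 of Eq. 8.72."""
    f_old = gas_balance(Y_old[0], Y_old[1])
    def F(Y):
        return [Y[0] - Y_old[0] - (h/2)*(f_old + gas_balance(Y[0], Y[1])),
                surface_bal(Y[0], Y[1])]
    return fsolve(F, Y_old)

Y = np.array([Ca0, kg/(k + kg)*Ca0])    # consistent initial condition
for _ in range(100):                    # march from z = 0 to z = 1
    Y = dae_am2_step(Y)

alpha = (k*kg/(k + kg))*(abyv/u0)       # for comparison with Eq. 8.69
```

Marching 100 steps of h = 0.01 covers z from 0 to 1; both the differential and the algebraic equations are satisfied simultaneously at every step, which is the defining feature of the combined approach.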
$$M\,\frac{dY}{dt} = F(t, Y) \tag{8.73}$$
When M is an identity matrix (which is a default option in ode15s, when M is not speci-
fied), the above is an ODE-IVP. For a DAE consisting of nd differential equations and
na = n − nd algebraic equations, the mass matrix is
$$M = \begin{bmatrix} I_{n_d} & 0_{n_d \times n_a} \\ 0_{n_a \times n_d} & 0_{n_a \times n_a} \end{bmatrix} \tag{8.74}$$
where Ind is an nd × nd identity matrix and 0m × n is a zero matrix whose size is specified in the
subscript. The mass matrix is specified using the option 'Mass'.
$$\underbrace{\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}}_{M}\;\frac{d}{dt}\begin{bmatrix} C_A \\ C_{As} \end{bmatrix} = \underbrace{\begin{bmatrix} -\dfrac{k_g a}{u}\left(C_A - C_{As}\right) \\[6pt] k_g\left(C_A - C_{As}\right) - kC_{As} \end{bmatrix}}_{f(Y)} \tag{8.75}$$
The above equation is treated in a similar manner as the ODEs, the difference being
that an appropriate mass matrix is specified, and we have consistent initial conditions.
Since we have already created functions for gas and surface balance earlier, we will
simply use them. The function file for the DAE is given below:
function dY=pfrDAEFun(t,Y,par)
Ca=Y(1);
Cas=Y(2);
dY(1,1)=gasBalance(Ca,Cas,par); % Differential
dY(2,1)=surfaceBal(Ca,Cas,par); % Algebraic
end
Driver script: The mass matrix is specified in the driver script as follows:
MassMat=[1 0; 0 0];
opt=odeset('Mass',MassMat);
whereas the consistent initial condition for the algebraic variable is obtained by solving
the algebraic equation for inlet concentration, CA0:
Cas0=kg/(k+kg)*Ca0;
Thus, the overall driver script is given below (for the sake of completeness, the entire
script is given here):
kg=0.02; modelParam.kg=kg;
abyv=200; modelParam.abyv=abyv;
alpha=k*kg/(k+kg) * (abyv/u);
ZModel=[0:0.1:1]';
Ca_Model=Ca0*exp(-alpha*ZModel);
Cas_Model=kg/(k+kg)*Ca_Model;
plot(ZModel,[Ca_Model, Cas_Model],'o');
The results are shown in Figure 8.7. The solid line represents concentration CA and
the dashed line represents surface concentration CAs. The analytical solution is pro-
vided as symbols for comparison. The numerical results are very close to the analyti-
cal solution. Since kg ≈ k, the system is governed by both mass transfer and surface
reaction.
This completes our discussion on DAEs and numerical methods to solve the DAEs.
Specific case studies will be shown in the next section to demonstrate the principles
described in this chapter.
FIGURE 8.7 Solution of the 1D heterogeneous reactor model using ode15s (lines), with analytical solution (symbols) presented for comparison.
$$r(C_A) = \frac{kC_A}{\sqrt{1 + K_r C_A^2}} \tag{8.76}$$
$$u\,\frac{dC_A}{dz} = -k_g a\left(C_A - C_{As}\right) \tag{8.65}$$

$$0 = k_g\left(C_A - C_{As}\right) - r(C_{As}) \tag{8.66}$$
The results of the 1D heterogeneous model will be compared with the PFR model from Section 3.5.1. The reaction rate above is in terms of mol/(m²cat·s), which needs to be converted to mol/(m³·s) as required in the PFR model. Thus, the PFR model is given by
$$u\,\frac{dC_A}{dz} = -a\,r(C_A) \tag{8.77}$$
The main difference between Equations 8.65 and 8.77 is that the reaction rate is computed
at the surface concentration CAs, in the former, and at bulk concentration CA, in the latter.
function r=catalRxnRate(C,par)
k=par.k; Kr=par.Kr;
r=k*C/sqrt(1+Kr*C^2);
end
Additionally, the parameter, Kr, must be specified in the driver script. No other
change is required. Note that the DAE was initialized in Example 8.5 by specifying
$$C_{As,0} = \frac{k_g}{k + k_g}\,C_{A,0} \tag{8.78}$$
In this problem, ode15s is able to solve the DAE even when the problem is initialized as Y0 = [CA,0; CAs,0]. Although this CAs,0 does not provide a consistent initial condition, the ode15s solver computes one internally before solving the DAE.
Figure 8.8 shows the results of heterogeneous reactor model. The solid and dashed
lines represent bulk concentration CA and surface concentration CAs, respectively. The
results from a PFR simulation (see Section 3.5.1 for details) are shown as symbols
for comparison. Due to mass transfer effects, the conversion from the heterogeneous
reactor model is less than that from a PFR, where complete mixing (uniformity) in the
radial direction is assumed.
The mass transfer affects the net rate of reaction, as evident from Figure 8.8. The effect of
mass transfer coefficient, kg, on reactant concentration along the length of the reactor is
shown in Figure 8.9. The bulk concentration CA is shown as solid line and the surface con-
centration CAs as dashed line for two different values of kg. As the mass transfer coefficient
increases, the bulk and surface concentrations get closer to each other.
Furthermore, conversion of A also increases with an increase in the value of kg. This is
evident in Figure 8.10, where bulk concentration is plotted for various values of kg. The
value of kg increases in the direction of the arrow. As the value of kg is increased, conversion
from the heterogeneous reactor model progressively increases (i.e., CA decreases), with the
results approaching that of the PFR. Indeed, when the Damköhler number Da ∼ k/kg ≪ 1,
the bulk concentration from the heterogeneous model closely follows that from the PFR.
The concepts developed in this example will be further used in Section 9.4 to solve the
problem of simulating methanol synthesis in a tubular reactor.
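The qualitative conclusion, that mass transfer resistance lowers conversion relative to the PFR, can be checked with a short Python sketch of the two models. The rate expression follows catalRxnRate above; the parameter values are illustrative, not the book's:

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import fsolve

k, Kr, kg, abyv, u0, Ca0 = 0.005, 4.0, 0.02, 200.0, 1.0, 1.0  # illustrative

def rate(C):
    """Catalytic rate as coded in catalRxnRate: k*C/sqrt(1 + Kr*C^2)."""
    return k*C/np.sqrt(1.0 + Kr*C**2)

def hetero_rhs(z, Y):
    Ca = Y[0]
    # Surface balance (8.66): 0 = kg*(Ca - Cas) - r(Cas), solved for Cas
    Cas, = fsolve(lambda Cas: kg*(Ca - Cas) - rate(Cas), Ca)
    return [-(kg*abyv/u0)*(Ca - Cas)]

def pfr_rhs(z, Y):
    """PFR model (8.77): rate evaluated at the bulk concentration."""
    return [-(abyv/u0)*rate(Y[0])]

het = solve_ivp(hetero_rhs, (0.0, 0.6), [Ca0], rtol=1e-8)
pfr = solve_ivp(pfr_rhs, (0.0, 0.6), [Ca0], rtol=1e-8)
```

Since the surface concentration is always below the bulk value and the rate increases monotonically with concentration, the heterogeneous model necessarily leaves more unreacted A at the exit than the PFR, the trend seen in Figure 8.8.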
FIGURE 8.8 Simulation of heterogeneous reactor (lines) and comparison with PFR (symbols). The curves show CA for the PFR, and CA and CAs for the heterogeneous model.
FIGURE 8.9 Bulk and surface concentrations (CA and CAs) for two different values of kg.
FIGURE 8.10 Effect of varying kg on bulk concentration, CA. The four lines in the direction of the arrow represent kg = 0.005, 0.01, 0.02, and 0.5 m/s, respectively.
FIGURE 8.11 Schematic of flash separation, which will be used to develop the binary distillation model.
Due to familiarity with a single stage of distillation (from Chapter 5), let us consider the overall process as shown in Figure 8.11. If M is the molar holdup in the distillation tank, and D and L are the vapor (distillate) and liquid outlet molar flowrates, then the overall balance is given by
$$\frac{d}{dt}M = F - D - L \tag{8.79}$$

$$\frac{d}{dt}\left(Mx_B\right) = Fz_B - Dy_B - Lx_B \tag{8.80}$$
We have assumed that both the liquid and vapor phases are well mixed so that the outlet conditions are the same as those existing in the flash/distillation tank. The mole fractions in the gas and liquid phases are in equilibrium. By Raoult's law

$$p_B = x_B P_B^{sat}, \qquad p_{EB} = \left(1 - x_B\right)P_{EB}^{sat} \tag{8.81}$$

where $P_i^{sat}$ represents the saturation pressure of species i. The latter equation is for the binary mixture, as x_EB = 1 − x_B. The saturation pressure is related to the temperature by Antoine's equation:
$$\log_{10}\left(P_i^{sat}\right) = A_i - \frac{B_i}{C_i + T} \tag{8.82}$$
The mixture in the tank will be at its bubble point during the entire operation. For a given
liquid composition, xB, the temperature can be calculated as shown in Section 6.5.2. The
vapor is in equilibrium at that temperature and satisfies the following condition:
P = pB + pEB (8.83)
Implicit Methods ◾ 355
$$\underbrace{P - x_B P_B^{sat} - \left(1 - x_B\right)P_{EB}^{sat}}_{g(T)} = 0 \tag{8.84}$$

$$y_B = \frac{x_B P_B^{sat}}{P} \tag{8.85}$$
The differential equations (8.79) and (8.80), the algebraic equation (8.84), and the expres-
sion (8.85) together form the model for any of these processes: continuous flash separation,
single-stage continuous distillation, and batch/semibatch binary distillation.
Batch distillation model can be further simplified because there is no feed or liquid out-
let. Thus
$$\frac{dM}{dt} = -D \tag{8.86}$$

$$M\frac{dx_B}{dt} + x_B\frac{dM}{dt} = -Dy_B \;\;\Rightarrow\;\; \frac{dx_B}{dt} = \frac{D}{M}\left(x_B - y_B\right) \tag{8.87}$$
The solution variables for the binary batch distillation are defined as
$$Y = \begin{bmatrix} M \\ x_B \\ T \\ y_B \end{bmatrix}$$
with the first two being the differential variables and the last two algebraic variables. Note
that Equation 8.85 is simply an expression for calculating yB. It is therefore not necessary to
include yB as an algebraic variable.
Antoine’s coefficients for B and EB are
dPhi(1)=-D;
dPhi(2)=D/M*(xB-yB);
dPhi(3)=P - xB*PsatB-(1-xB)*PsatEB;
dPhi(4)=yB - xB*PsatB/P;
where D is the rate at which the distillate is drawn from the system. Antoine’s equa-
tion is used to compute PsatB and PsatEB, the saturation pressures of B and EB,
respectively, at the operating temperature, T. The operating temperature is equal to
the bubble point of the liquid in the distillation tank, whereas the operating pressure,
P, is taken as the atmospheric pressure.
The function file, batchDistFun.m, for this problem is given below:
function dPhi=batchDistFun(t,phi,par)
% Model for binary batch distillation
% of Benzene-Ethyl Benzene mixture
%% Model Parameters
P=par.P; % Operating pressure (bar)
D=par.D; % Distillate rate
% Antoine's coefficients
Ab=par.Ab; Bb=par.Bb; Cb=par.Cb;
Aeb=par.Aeb; Beb=par.Beb; Ceb=par.Ceb;
% Model variables
M=phi(1);
xB=phi(2);
T=phi(3);
yB=phi(4);
%% Model equations
PsatB=10^(Ab-Bb/(T+Cb));
PsatEB=10^(Aeb-Beb/(T+Ceb)); % Saturation P
dPhi=zeros(4,1);
dPhi(1)=-D;
dPhi(2)=D/M*(xB-yB);
dPhi(3)=P - xB*PsatB-(1-xB)*PsatEB;
dPhi(4)=yB - xB*PsatB/P;
The above function requires pressure, distillate rate, and Antoine’s coefficients as the
model parameters. These will be defined in the driver script and passed to the func-
tion above. The driver script for batch distillation is given below:
As time progresses, the vapors in the tank are drawn out at the rate of D mol/min.
The vapor phase is in equilibrium with the liquid in the tank. The vector Phi0
contains the guess for initial conditions. Since the boiling point of B is 80°C and that
of EB is 136°C, we chose 105°C = 378 K as the initial guess for the temperature. The
ode15s solver internally computed the correct initial conditions as
>> disp(PhiSol(1,:))
100.0000 0.5000 371.8551 0.8256
The first two elements are M = 100 and xB = 0.5, which are the initial conditions pro-
vided. The value of temperature that satisfies the algebraic Equation 8.84 is T = 371.86 K.
The corresponding value of yB = 0.826. Thus, the equimolar mixture of B and EB has its
bubble point at 372 K, and the vapor in equilibrium with the liquid contains 82.6% B.
When this vapor is drawn out, the distillate contains more B and the liquid left in the distillation tank is richer in EB. Thus, the bubble point temperature
increases, as shown in the top panel of Figure 8.12. Since both B and EB are drawn out
from the distillation tank, the amount of B and EB falls. The amount of B in the tank
equals MxB, whereas that of EB is M(1 − xB). After the 100 min distillation, 12.85 mol
of B and 37.15 mol of EB are remaining in the tank.
Since there is more B in the distillate than EB, the amount of B in the tank falls faster than that of EB. Equivalently, the flowrate of B in the distillate stream is higher than that of EB. Since D is the net distillate flowrate and the vapors exiting the distillation tank are in equilibrium with the liquid in the tank, the flowrates of B and EB in the distillate are Dy_B and D(1 − y_B), respectively. Figure 8.13 shows the flowrates of B and EB in the distillate.
FIGURE 8.12 Temporal variations in equilibrium temperature and the amount of B and EB left behind in the distillation tank.
FIGURE 8.13 Flowrates of B and EB in the distillate.
The total number of moles of B left in the distillation tank equals Mx_B = 12.85 mol. Since the batch distillation process started with 50 mol each of B and EB, the remaining 37.15 mol of B are collected as distillate over the 100 min operation.
The flowrates of B and EB in distillate are plotted in Figure 8.13. The total number of
moles collected in the distillate over the 100 min operation is also equal to the area under
the respective curves. Using numerical integration (trapezoidal rule with trapz com-
mand), the amount of B in the distillate is computed as 37.13 mol.
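The whole batch-distillation calculation, bubble point, the two ODEs, and the trapezoidal check on the distillate, can be reproduced in Python. The Antoine coefficients below are representative literature values for benzene and ethylbenzene (pressure in bar, temperature in K) and are assumptions for illustration, not necessarily the numbers used in the book:

```python
import numpy as np
from scipy.integrate import solve_ivp, trapezoid
from scipy.optimize import brentq

# Representative Antoine coefficients (assumed, not the book's)
Ab, Bb, Cb = 4.01814, 1203.835, -53.226       # benzene (B)
Aeb, Beb, Ceb = 4.07488, 1419.315, -60.539    # ethylbenzene (EB)
P, D = 1.01325, 0.5                # pressure (bar), distillate rate

def psat(A, B, C, T):
    """Antoine's equation (8.82)."""
    return 10.0**(A - B/(T + C))

def bubble_T(xB):
    """Solve g(T) = 0, Eq. 8.84, for the bubble-point temperature."""
    g = lambda T: P - xB*psat(Ab, Bb, Cb, T) - (1 - xB)*psat(Aeb, Beb, Ceb, T)
    return brentq(g, 300.0, 450.0)

def rhs(t, Y):
    """Differential equations (8.86) and (8.87)."""
    M, xB = Y
    T = bubble_T(xB)               # consistent algebraic variable
    yB = xB*psat(Ab, Bb, Cb, T)/P  # Eq. 8.85
    return [-D, D/M*(xB - yB)]

t_eval = np.linspace(0.0, 100.0, 201)
sol = solve_ivp(rhs, (0.0, 100.0), [100.0, 0.5], t_eval=t_eval, rtol=1e-8)
M, xB = sol.y
yB = np.array([x*psat(Ab, Bb, Cb, bubble_T(x))/P for x in xB])
B_left = M[-1]*xB[-1]              # benzene left in the tank
B_distilled = trapezoid(D*yB, t_eval)   # trapz-style integration
```

A useful consistency check, independent of the exact coefficient values, is the benzene balance: the moles left in the tank plus the integrated distillate flow of B must return the initial 50 mol.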
The distillate flowrate can be used as a design parameter. If it is increased to 1 mol/min (by setting D = 1.0), then 50 mol of the mixture will be collected in 50 min. Executing the above code for 50 min, one can observe that the same amount of B (i.e., 37.15 mol) is distilled as before. Likewise, the temperature at the end of 50 min is T = 385 K; this is the same temperature reached at the end of the batch distillation process in Example 8.7.
Finally, I would like to draw attention to the initial conditions. The initial conditions are
specified for the differential variables, M and xB. Since the algebraic variables, T and yB, are
not free variables, but depend implicitly on M and xB, they cannot be chosen arbitrarily. In
general, it is a good idea to provide consistent initial conditions to the DAE solver ode15s.
This involves solving the algebraic Equation 8.84 to obtain T and using Equation 8.85 to
obtain yB. Since this problem was not very stiff, it worked even when the initial conditions
were arbitrary.
With this, we come to the end of the case studies for this chapter. The next section summarizes the findings and provides some tips based on my experience.
8.6 EPILOGUE
The focus of this chapter was solving various problems that require an implicit time-stepping
solver to determine how the solution evolves in time.
The trifecta of Chapters 3 and 4 and the current one covers problems involving differ-
ential equations that evolve in time. Chapters 3 and 4 focused on explicit time-stepping
approaches to solve differential equations: ODE initial value problems in Chapter 3, whereas
hyperbolic and parabolic PDEs in Chapter 4. That left us with three types of problems for this chapter: stiff ODEs, hyperbolic/parabolic PDEs requiring implicit treatment, and DAEs that have both differential and algebraic components. The glue connecting these seemingly disparate examples is that all of them require an implicit-in-time solution procedure, which forms the central theme of this chapter.
The ODE-BVPs (boundary value problems) and elliptic PDEs form the final set of
differential equation problems. Discretization of the domain led to conversion of these
differential equations to linear or nonlinear equations. Methods to solve these equations,
which have special sparse matrix structure, were analyzed in Chapter 7.
I end this chapter with a brief note from my personal experience. I have found one of two methods most reliable in MATLAB: using ode15s as described in several examples in this chapter, or writing my own time-stepping solver based on the Crank-Nicolson method.
While ode15s has worked for me in a majority of examples,* there are typical stum-
bling blocks when one attempts to solve a new problem. Assuming that the code is free of
errors and typos, the first issue is “poor” problem formulation. Sometimes, reformulating
the problem in dimensionless or normalized quantities makes it numerically more tracta-
ble, for example, converting from partial pressures to mole fractions. Alternatively, a change
of coordinate system or a linear transformation may make the problem more tractable. It is
also possible to use a physical constraint to replace one of the ODEs. Problem 8.9 shows one
such example, where one may use an algebraic constraint in lieu of a differential equation.
The second issue may be poorly specified initial conditions (in the case of DAEs). It is therefore a good strategy to solve the algebraic equations

$$g(y_0, x_0) = 0$$

to obtain consistent initial conditions before invoking the DAE solver.
EXERCISES
Problem 8.1 Derive the error estimate for Euler’s implicit method using the procedure for
the Adams-Moulton method (Section 8.2.1) and BDF methods (Section 8.2.3).
This can be done by treating Euler’s implicit method as AM-1/BDF-1.
* As explained in Chapters 3 and 4, and earlier in this chapter, ode45 is the preferred ODE-IVP solver. Thus, ode15s is a
“go-to solver” if I suspect a problem to be stiff or if the nature of the problem (PDE with highly nonlinear source term or
DAE) calls for an implicit solver.
Problem 8.2 Solve the nonlinear chemostat problem from Section 8.3.1 using the second-
order Adams-Moulton method (also known as trapezoidal method). You
may modify the code from Example 8.1 or write a fresh code. The chemostat
model is
$$S' = D\left(S_f - S\right) - r_g$$
$$X' = -DX + r_g Y_{xs} \tag{8.88}$$
$$P' = -DP + r_g Y_{ps}$$

where $r_g = \left(\dfrac{\mu_m S}{K_S + S}\right)X$, μ_m = 0.5, K_S = 0.005, Y_xs = 0.75, and Y_ps = 0.65. The feed concentration of the substrate is S_f = 5, and the dilution rate is D = 0.1 h⁻¹. The initial conditions at t = 0 are S(0) = 5, X(0) = 0.02, and P(0) = 0.
Problem 8.3 Modify the Crank-Nicolson solution of Example 8.2 for packed bed reactor
with axial dispersion from Section 4.4. The nondimensional model for the
packed bed reactor is given by
∂ϕ/∂τ + ∂ϕ/∂ζ = (1/Pe) ∂²ϕ/∂ζ² − Da·ϕ
where
ϕ = C/C0 is dimensionless concentration
τ = tu/L is dimensionless time
ζ = z/L is dimensionless length
Pe is Péclet number
Da is Damköhler number
dC/dt = −0.1 ξ^0.7
362 ◾ Computational Techniques for Process Simulation and Analysis Using MATLAB®
ξ² − 3.45(C − ξ) = 0
with the initial condition C(0) = 1. Solve the above problem in two ways:
(i) Rewrite the second equation to get ξ = ξ(C) and substitute in the ODE.
(ii) Solve the DAE using ode15s and compare the results.
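For approach (i), the quadratic can be solved in closed form for the positive root, ξ(C) = (−3.45 + √(3.45² + 4 × 3.45 C))/2, and substituted into the ODE. A Python/SciPy sketch (the exercise itself is meant for MATLAB's solvers):

```python
import numpy as np
from scipy.integrate import solve_ivp

def xi_of_C(C):
    # positive root of xi^2 + 3.45*xi - 3.45*C = 0
    return (-3.45 + np.sqrt(3.45**2 + 4 * 3.45 * C)) / 2.0

def dCdt(t, y):
    # ODE after eliminating the algebraic variable
    return [-0.1 * xi_of_C(y[0]) ** 0.7]

sol = solve_ivp(dCdt, (0.0, 10.0), [1.0], rtol=1e-8, atol=1e-10)
C_end = sol.y[0, -1]
print(C_end)
```

The DAE route of part (ii) should reproduce this trajectory to solver tolerance.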
Problem 8.6 Repeat Example 8.7 for a transient flash separation process. Assume the tank
is initially at bubble point of equimolar benzene-ethylbenzene mixture. An
equimolar mixture enters the flash as feed at 10 mol/min. The amount of
vapor drawn is D = 5 mol/min and liquid drawn is B = 5 mol/min. Compute
the transient behavior of the binary flash column.
Problem 8.7 Repeat the above problem for various values of the distillate D = {1, 3, 5, 7, 9}
and with B = F − D.
Problem 8.8 Formulate and solve transient flash separation problem for a ternary mix-
ture of benzene-toluene-ethylbenzene. Antoine’s coefficients for toluene are
AT = 4.1, BT = 1344, and CT = − 53.8.
Problem 8.9 The following reaction takes place in a batch reactor:
A → B ⇌ C
where k1 = 0.01, k2 = 10^6, and k3 = 10^4. The reactor initially has CA0 = 1 mol/L of
species A, and no B or C. The mass balance equations lead to three ODEs in
three unknowns, CA, CB, and CC.
1. Solve the set of three ODEs simultaneously.
2. The three species also satisfy the condition: CA + CB + CC = CA0. Replace
the ODE for CB with the above algebraic constraint and solve the result-
ing DAE.
CHAPTER 9
Section Wrap-Up
Nonlinear Analysis
This chapter will provide a wrap-up of concepts discussed in Section II of the book. The
concepts for solving algebraic and differential equations, covered in the preceding chapters,
will be used for simulation and analysis of nonlinear systems. The first three sections of this
chapter will focus on methods to analyze the dynamic behavior of nonlinear systems, and
the subsequent sections will focus on specific simulation examples of relevance to process
engineers.
The first three sections will use specific examples of nonlinear dynamical systems to
introduce the readers to nonlinear analysis techniques. Nonlinear systems display very rich
dynamic behavior. Strogatz’s book Nonlinear Dynamics and Chaos is an excellent intro-
ductory text on this topic. The linear stability analysis (Section 5.2) and the examples in
the next three sections cover some basic and important concepts of nonlinear analysis.
Specifically, this involves studying the stability, dynamics, and bifurcation behavior of non-
linear systems. The examples of nonisothermal continuous stirred tank reactor (CSTR) and
chemostat (from Chapter 3) that show multiple steady state behavior will be discussed at
length to introduce the turning point and transcritical bifurcations, respectively. The third
section will study systems with cyclic dynamics (limit cycle).
Finally, two simulation case studies will conclude this part of the book. A tubular/fixed
bed reactor has been discussed at several points throughout this text. An example of gas-
phase reactor with multiple species and multiple reactions will be discussed in Section 9.4.
Both steady state and dynamical models, in the presence of volume change due to reaction,
will be contrasted. The final section will solve the problem stated in the first chapter: to compute the trajectory of a cricket ball.
y′ = f(y)    (9.1)

where f1 = D(Sf − S) − μSX/(K + S) and f2 = −DX + (μSX/(K + S))Yxs    (9.2)
At steady state, since y′ = 0, the steady state solutions can be found by solving the nonlinear
equations f(y) = 0. The standard method in undergraduate textbooks involves using the
second equation to get the two steady state values. Recall from Chapter 5 that these two
values are given by
S = DK/(μYxs − D),    X = 0    (9.3)
Here, we will follow a slightly different procedure that will allow us to qualitatively perform
the so-called bifurcation analysis later.
Adding (Yxs f1 + f2) at steady state gives us

0 = D(Sf − S)Yxs − DX    (9.4)
y_ss = [DK/(μYxs − D), (Sf − S)Yxs]^T,    y_ss = [Sf, 0]^T    (9.7)
The solutions in the above equation are the same as those obtained earlier in Equation 9.3.
Numerically, the two solutions exist for all conditions. However, negative values of either
substrate or biomass concentration are not physical. Thus, we will get nonphysical values
for steady state if 0 ≤ S ≤ Sf is not satisfied.
The linear stability analysis was performed in Chapter 3. The Jacobian was calculated as
J = [ −D − μXK/(K + S)²        −μS/(K + S)
      (μXK/(K + S)²)Yxs        −D + (μS/(K + S))Yxs ]    (9.8)
The nominal value of the dilution rate was chosen as D = 0.1. The two steady states under
these conditions are
y_ss = [0.091, 3.68]^T,    y_ss = [5, 0]^T    (9.9)
whereas the eigenvalues of the Jacobian corresponding to the two steady states are

λ = {−3.96, −0.1}    and    λ = {−0.1, 0.257}    (9.10)
The linear stability analysis indicates that the first steady state is a stable node, whereas the
second is an unstable (saddle) node.
The linear stability analysis above provides a local dynamical behavior of the system in
the vicinity of the steady state. We will use these results to present analysis of the dynamic
behavior of the overall system, followed by the effect of varying the parameter (dilution rate,
in this example) on qualitative behavior of system dynamics.
9.1.2 Phase-Plane Analysis
A phase-plane plot is a plot that visually depicts dynamics of the nonlinear system on a 2D
plane, with the two state variables as the two axes. Starting at arbitrary initial points within
the space, the ordinary differential equations (ODE) will be solved to obtain variation in the
state variables with time. A trajectory in the 2D phase plane then tracks the dynamic evolu-
tion of the system starting from that particular initial condition. A phase-plane plot is a col-
lection of such trajectories, which thus maps out the system dynamics in a visual manner.
The substrate concentration can take values between 0 and Sf (=5), whereas the biomass
concentration takes values between 0 and 3.75. We will choose 25 points on a 5 × 5 grid in
the S–X plane as initial conditions, solve the ODE, and plot the trajectory on the phase plane.
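The same construction can be sketched in Python/SciPy (the book works in MATLAB; the grid below spans only the interior of the S–X box, so every trajectory should approach the nontrivial steady state):

```python
import numpy as np
from scipy.integrate import solve_ivp

# Chemostat with the Section 9.1 parameter values
mu, K, Yxs, Sf, D = 0.5, 0.25, 0.75, 5.0, 0.1

def f(t, y):
    S, X = y
    growth = mu * S / (K + S) * X
    return [D * (Sf - S) - growth, -D * X + growth * Yxs]

# 5 x 5 grid of initial conditions in the S-X plane
finals = []
for S0 in np.linspace(0.5, 4.5, 5):
    for X0 in np.linspace(0.5, 3.5, 5):
        sol = solve_ivp(f, (0.0, 300.0), [S0, X0], rtol=1e-8)
        finals.append(sol.y[:, -1])   # end point of each trajectory
finals = np.array(finals)
print(finals.mean(axis=0))  # all trajectories approach the nontrivial steady state
```

Plotting `sol.y[0]` against `sol.y[1]` for each run reproduces the phase-plane trajectories of Figure 9.1.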
The phase-plane plot is shown in Figure 9.1. The circles represent starting points (initial
conditions), and the stars represent the two steady states. Based on the linear stability analysis,
the first steady state is linearly stable, implying that all trajectories starting from the vicinity of
this steady state converge toward it. The second steady state (“trivial” steady state because X = 0
FIGURE 9.1 Phase-plane plot of nonlinear chemostat (substrate, S, on the horizontal axis; biomass, X, on the vertical axis). Circles represent initial points, and the two stars represent the two steady states.
and no conversion of substrate) is a saddle node. Recall that the S-axis was the stable subspace
at the saddle node. Thus, if the initial condition lies on this axis, the nonlinear chemostat will go
to the trivial steady state; all other initial conditions in the S–X space will converge to the first
steady state. As can be seen from Figure 9.1, the nonlinear system shows these dynamic pat-
terns as well. Starting at X(0) = 0, all the trajectories converge to the trivial steady state. These
trajectories are shown by dotted lines. This axis remains the stable subspace for the nonlinear
system also. All other initial conditions converge to the first steady state. In fact, even the initial
conditions starting at a very small value of initial X(0) = 0.01 reach the first steady state. It is
also interesting to see that trajectories starting on the eigenvector v1 = [−0.8 0.6]T stay on the
eigenvector itself, while converging to the first steady state.
In summary, we have (i) obtained multiple steady states in chemostat, (ii) analyzed the
linear stability around each steady state, and (iii) used phase portrait to analyze the dynam-
ics of the nonlinear system. The next task is to determine how the steady states and stability
change with change in the operating parameters of the system.
y_ss = [DK/(μYxs − D), (Sf − S)Yxs]^T,    y_ss = [Sf, 0]^T    (9.7)
The first steady state is physically relevant only when the two concentrations are posi-
tive. For example, when the dilution rate is increased to D > μYxs, substrate concentration
becomes negative. This value is not physically relevant. Although the system attains a physi-
cally irrelevant value, the numerical solution still exists.
Nonetheless, the system shows an interesting behavior numerically as well. Let ε be a small
positive number. According to Equation 9.7, when D = μYxs − ε, S is a large positive number.
However, when the dilution rate is increased slightly to D = μYxs + ε, S becomes a large negative
number. Thus, there is a discontinuity in the solution at D = μYxs. Thus, this point represents a
bifurcation point because there is a change in the qualitative behavior of the system.
DEFINITION
A bifurcation point is defined as the value of the bifurcation parameter where a qualitative
change in the behavior of the system is observed.
The discontinuity in the root is not the only change that happens in the system. Consider
the Jacobian at the second (trivial) steady state:
J = [ −D        −μSf/(K + Sf)
      0         −D + (μSf/(K + Sf))Yxs ]    (9.11)
This is just Equation 9.8, with Sss = Sf and Xss = 0. Since D > μYxs, the second diagonal term
is also negative. Thus, both eigenvalues are negative, and the trivial steady state becomes
stable. This is shown in the next example.
mu=0.5; K=0.25;
Yxs=0.75; Sf=5;
D=0.1;
% First solution and Jacobian
S=D*K/(mu*Yxs-D); X=(Sf-S)*Yxs;
a1=mu*S/(K+S);
b1=mu*X*K/(K+S)^2;
J=[-D-b1, -a1;
b1*Yxs, -D+a1*Yxs];
sol1=[S;X];
l1=eig(J);
% Second solution
X=0; S=Sf;
a1=mu*S/(K+S);
b1=mu*X*K/(K+S)^2;
J=[-D-b1, -a1;
b1*Yxs, -D+a1*Yxs];
sol2=[S;X];
l2=eig(J);
The results from varying the dilution rate are presented in the table below:
D        0.1      0.2      0.3      0.35     0.4      0.45     0.5
SS-1  S  0.0909   0.286    1        3.5      −4       −1.5     −1
      X  3.682    3.536    3        1.125    6.75     4.875    4.339
      Λ  −3.96,   −1.54,   −0.3,    −0.35,
         −0.1     −0.2     −0.24    −0.01
SS-2  S  5        5        5        5        5        5        5
      X  0        0        0        0        0        0        0
      Λ  −0.1,    −0.2,    −0.3,    −0.35,   −0.4,    −0.45,   −0.5,
         0.257    0.157    0.057    0.007    −0.043   −0.093   −0.143
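The sweep behind this table can be reproduced with a short script; the Python/NumPy version below is a sketch equivalent to the MATLAB code above:

```python
import numpy as np

# Chemostat parameters of Section 9.1
mu, K, Yxs, Sf = 0.5, 0.25, 0.75, 5.0

def jacobian(S, X, D):
    # Equation 9.8 evaluated at a steady state (S, X)
    a1 = mu * S / (K + S)
    b1 = mu * X * K / (K + S) ** 2
    return np.array([[-D - b1, -a1],
                     [b1 * Yxs, -D + a1 * Yxs]])

results = {}
for D in [0.1, 0.2, 0.3, 0.35, 0.4, 0.45, 0.5]:
    S1 = D * K / (mu * Yxs - D)                      # first steady state
    X1 = (Sf - S1) * Yxs
    lam1 = np.linalg.eigvals(jacobian(S1, X1, D))
    lam2 = np.linalg.eigvals(jacobian(Sf, 0.0, D))   # trivial steady state
    results[D] = (S1, X1, lam1, lam2)
    print(D, round(S1, 4), round(X1, 3),
          np.round(np.real(lam1), 3), np.round(np.real(lam2), 3))
```

The second eigenvalue of the trivial steady state changes sign between D = 0.35 and 0.4, which is the stability exchange discussed next.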
The bifurcation point is somewhere between D = 0.35 and 0.4 (we will see below that it is at D ≈ 0.357). The first steady state becomes unphysical. However, more importantly, the second steady state exchanges stability; it goes from being unstable to stable when the dilution rate is increased beyond the bifurcation point.
9.1.4 Transcritical Bifurcation
The bifurcation observed in a chemostat is known as a transcritical bifurcation, and the point D = 0.3571 where the trivial solution becomes stable is the bifurcation point for the system (as we will see in this section, the dynamic behavior of the chemostat is even more complex and interesting!). The salient features of transcritical bifurcation are that both the solutions continue
to exist numerically on either side of the bifurcation point; however, there is an exchange of
stability in one or both types of solutions. Recall that the two types of solutions for a chemostat
are the regular solution (with generation of biomass) and the trivial solution (no conversion or
generation of biomass). At low values of dilution rate, the former is a stable solution whereas the
latter is unstable. At higher values of dilution rate, however, the trivial solution becomes stable.
A graphical method will be used to further demonstrate transcritical bifurcation. We will do so in a single dimension since multiple dimensions are difficult to visualize. The treatment here follows the same lines as that in the nonlinear dynamics book by Strogatz; however, I have extended his graphical treatment to two dimensions and to a practical system of interest to chemical engineers. At steady state
X = (Sf − S)Yxs    (9.12)

and

f1 = (Sf − S)(D − (μS/(K + S))Yxs)    (9.13)
Let us say that the system was moved slightly from steady state point by introducing a very
small change from the steady state concentrations. We wish to analyze how the system
responds to this change. The form of the equation
S′ = (Sf − S)(D − (μS/(K + S))Yxs)    (9.14)
broadly gives qualitative insights into the dynamic response of the system in the vicinity
of the steady state. The points of intersection of this curve f1 ( S ) with the S-axis gives the
steady state solutions of the system. When f1 ( S ) > 0, S′ is positive and substrate concentra-
tion S will increase with time. Conversely, S will decrease with time when f1 ( S ) < 0.
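These sign conditions are easy to check numerically; a Python sketch at the nominal D = 0.1 (the sample points S = 0.05 and S = 0.5 are arbitrary choices on either side of the steady state S = 0.0909):

```python
# Parameters of the Section 9.1 chemostat
mu, K, Yxs, Sf, D = 0.5, 0.25, 0.75, 5.0, 0.1

def f1(S):
    # Equation 9.14: S' = (Sf - S) * (D - mu*S*Yxs/(K + S))
    return (Sf - S) * (D - mu * S * Yxs / (K + S))

left, right = f1(0.05), f1(0.5)
print(left, right)  # positive below the steady state, negative above it
```

Since f1 > 0 below S = 0.0909 and f1 < 0 above it, small perturbations decay and this steady state is stable.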
Figure 9.2 shows the plot of f1 ( S ) vs. S for four different values of D. The top panel rep-
resents dilution rates below μYxs = 0.375, whereas the bottom panels represent dilution rates
exceeding this value. The arrows indicate the direction in which S would move for small
deviations from the steady state. If S′ is positive, the value of S increases. At D = 0.1, the two steady states are S = 0.09 and S = 5. When the value of S is decreased slightly (or conversely, X is increased slightly) from S = 0.09, the dynamics cause S to increase and reach steady state. Likewise, increasing S slightly will again move the system back to the steady state. This makes S = 0.09 a stable steady state.

FIGURE 9.2 Plot of f1(S) vs. S for four different values of dilution rate, D (0.1, 0.3, 0.45, and 0.6). Arrows indicate the direction in which S moves with time. Stable and unstable steady states are marked by a diamond and a circle, respectively.

When the dilution rate is increased to 0.3, the first steady state moves toward the second steady state. The steady state value of S keeps
increasing with D. At around D = 0.357, the first steady state almost reaches S = 5; when D is increased further, the solution becomes nonphysical. From a numerical viewpoint, the trivial steady state becomes stable at this bifurcation point.
(i) There exist two steady states for D < 0.3571. The first steady state is stable and the
trivial steady state is unstable.
(ii) Between 0.3571 < D < 0.375, the trivial steady state is the only physically relevant
steady state. Numerically, however, both the steady states continue to exist, though
the first one is nonphysical because S > Sf and X < 0.
(iii) For D > 0.375, again the first steady state is nonphysical because S < 0.
This brings me to a nuance, which is skipped in the undergraduate material. We often make
the assumption that the saturation constant K ≪ Sf. In this case, washout is observed very
close to D > μYxs. However, for larger values of K (say 1.0), the washout will happen reason-
ably below the D = μYxs condition.
The question then remains, which of the two points, D = 0.3571, 0.375, is the bifurcation
point. In order to answer this question, let us look at the plot of f1 ( S ) at an intermediate value
of D = 0.36 shown in Figure 9.3.
When the dilution rate is increased from 0.3 to 0.36, the two steady states collide at the bifurcation point D = 0.3571, exchange stability, and continue. This is a classic
example of transcritical bifurcation. The first steady state is stable and the trivial steady state
unstable (saddle node) below the bifurcation point. At this bifurcation, the two steady states
exchange stability. The first steady state becomes a saddle node, whereas the trivial steady
state becomes stable.
Another interesting phenomenon occurs as D is increased further and approaches 0.375.
As described before, between 0.3571 < D < 0.375, S is positive and X is negative. When the
value of D is increased beyond 0.375, the steady state S jumps to a large negative value, X
becomes positive, and the second solution becomes stable.
Linear stability analysis at dilution rates on either side of these two points verifies this behavior.
Thus, the first steady state goes from stable to saddle node and back to stable when the dilu-
tion rate is increased across the two bifurcation points, D = 0.3571 and D = 0.375.
Figure 9.4 is the bifurcation plot for the system. It represents various steady states of
the system and how they vary with the dilution rate. This plot is thus a locus of all steady
state points as the dilution rate is varied. Solid lines represent stable nodes, and dashed
lines represent unstable (saddle) nodes. The locus of the first steady state is plotted as thick
lines. Before the bifurcation point, this steady state is stable. After the bifurcation point, this
steady state is only numerical; it does not have a physical meaning. It is shown in Figure 9.4
for the sake of completeness. The thin lines represent the trivial steady state. This is unstable
until the bifurcation point. At the transcritical bifurcation, the two steady states exchange
stability and the trivial steady state becomes stable beyond D > 0.3571.
FIGURE 9.4 Bifurcation diagram for the chemostat (steady state substrate, S, vs. dilution rate, D). Thick lines represent the first steady state and thin lines represent the trivial steady state. Solid lines imply stable and dashed lines unstable steady states.
This is just a brief introduction to the wide world of nonlinear dynamics. The next exam-
ple, nonisothermal CSTR, displays another type of bifurcation behavior, called turning-
point bifurcation.
dCA/dt = (F/V)(CA,in − CA) − rA    (9.15)

Vρcp dT/dt = Fρcp(Tin − T) + (−ΔH) rA V − UA(T − Tj)    (9.16)

Vj ρj cj dTj/dt = Fj ρj cj (Tj,in − Tj) + UA(T − Tj)    (9.17)
The existence of two qualitatively different steady states was briefly presented in Chapter 3.
In this section, a nonlinear analysis of this system will be presented. The standard multiple
steady state behavior observed in this system is categorized as turning-point bifurcation.
The qualitative features of this bifurcation will be analyzed, and the results contrasted with
the transcritical bifurcation from the previous section. Our analysis will follow a familiar
pattern from the previous section.
Tj = ((Fj ρj cj)Tj,in + (UA)T) / (Fj ρj cj + UA)    (9.18)
The reaction is first order so that rA = k0 exp(−E/RT) CA. The first equation, f1(y) = 0, is used to express the concentration as a function of temperature. Substituting the value of Tj and CA in the second equation, we get

f2(T) = (−ΔH) (kτ/(1 + kτ)) (F CA,in) + Fρcp(Tin − T) − (UA/(1 + β)) (T − Tj,in) = 0    (9.19)

where τ = V/F and β = UA/(Fj ρj cj)    (9.20)
The derivation of the above equation is left as an exercise. Depending on the values of the
various parameters, the above equation can have one, two, or three solutions. These will be
investigated shortly.
X_A,EB = [(UA/(1 + β))(T − Tj,in) + Fρcp(T − Tin)] / [F CA,in (−ΔH)]    (9.24)
When XA is plotted against T, the former is an S-shaped curve, whereas the latter is a straight
line (see Figure 9.5). It is clear that the roots of the equations are the points of intersection of
FIGURE 9.5 Conversion, XA, vs. temperature, T, from material balance ("MB" from Equation 9.23) and energy balance ("EB" from Equation 9.24). The inlet temperature, Tin, is shown in brackets.
TABLE 9.1 Model Parameters and Operating Conditions for a Jacketed CSTR

CSTR:            F = 25 L/s,   V = 250 L,   ρ = 1000 kg/m³,   cp = 2500 J/kg·K,   Tin = 350 K
Cooling jacket:  Fj = 5 L/s,   Vj = 40 L,   ρj = 800 kg/m³,   cj = 5000 J/kg·K,   Tj,in = 25°C
Other:           CA,in = 4 mol/L,   k0 = 800 s⁻¹,   E/R = 4500 K,   (−ΔH) = 250 kJ/mol,   UA = 20 kW/K
the two curves. The parameters used in Chapter 3 are given in Table 9.1. The solid line and
solid S-shaped curve in Figure 9.5 represent the above two equations for the nominal values
of parameters shown in the table. The two curves intersect at three different points, which
are the three steady states of the system. When the inlet temperature is decreased to 298 K,
the material balance curve (denoted as MB) remains the same, while the energy balance
curve shifts to the left (dashed line, “EB (298)”). Now, there is only a single point of intersec-
tion of the two curves (and hence a single solution). Likewise, when the inlet temperature
is increased to 398 K (dash-dot line, “EB (398)”), again the MB and EB curves intersect at a
single point.
This analysis from standard textbooks is nothing but an analysis of multiple steady states
with the CSTR inlet temperature as a bifurcation parameter.
The discussion in the box is a popular approach to introduce multiple steady state
behavior in higher dimensions taken in standard undergraduate texts in reaction
engineering.
We will use another approach, similar to the one in the previous section, by plotting
f 2 (T ) vs. T. The points of intersection of this curve with the T-axis give the steady state
solutions of the original nonlinear system of equations. The curve f 2 (T ) vs. T is plotted
(see Table 9.1 for nominal values of parameters) in Figure 9.6. The curve intersects T-axis at
three points, {350, 478, 648 K}, which represent the three steady states. The corresponding
values of CA , Tj can be calculated from Equation 9.18.
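Using the Table 9.1 parameters converted to SI units, the roots of f2(T) can be located numerically. In this Python/SciPy sketch the bracketing intervals are assumptions based on where f2 changes sign:

```python
import numpy as np
from scipy.optimize import brentq

# Parameters from Table 9.1, in SI units
F, V = 0.025, 0.25              # m^3/s, m^3
rho, cp = 1000.0, 2500.0        # kg/m^3, J/(kg K)
Fj, rhoj, cj = 0.005, 800.0, 5000.0
CAin = 4000.0                   # mol/m^3
k0, EoR = 800.0, 4500.0         # 1/s, K
dH = 250000.0                   # (-Delta H), J/mol
UA = 20000.0                    # W/K
Tin, Tjin = 350.0, 298.0        # K

tau = V / F                     # Equation 9.20
beta = UA / (Fj * rhoj * cj)

def f2(T):
    # Equation 9.19
    ktau = k0 * np.exp(-EoR / T) * tau
    return (dH * ktau / (1 + ktau) * (F * CAin)
            + F * rho * cp * (Tin - T)
            - UA / (1 + beta) * (T - Tjin))

# bracketing intervals chosen from the sign changes of f2
roots = [brentq(f2, a, b) for a, b in [(300, 400), (400, 550), (550, 700)]]
print(roots)  # roughly 350, 478, and 648 K
```

Each root is a steady state temperature; CA and Tj then follow from the steady state relations, as in the text.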
Based on the approximate graphical method, the three steady state values are

T ≈ {350, 478, 648} K    (9.25)

with the corresponding CA and Tj computed from the steady state relations above. The actual steady state values can be computed using fsolve with these as initial guesses.
FIGURE 9.6 Function f2(T) from Equation 9.19 plotted against temperature, T (K). Steady state solutions are indicated by roots of f2(T). The arrows represent the direction of change in T for small variations from steady state value.
If the CSTR state variables are moved slightly from their steady state values, the approximate dynamics of the system is qualitatively governed by
T′ = f2(T)/(Vρcp)
This is because assuming Equation 9.15 to be at steady state yielded CA(T) and assuming
Equation 9.17 at steady state yielded Tj(T). The function f 2 (T ) was obtained by substituting
these expressions in Equation 9.16. Thus, if the value of f 2 (T ) is positive in the vicinity of
a steady state, the temperature is likely to increase, whereas if it is negative, then the tem-
perature will decrease. As can be seen in Figure 9.6, f 2 (T ) is positive for T < 350 and nega-
tive between 350 < T < 478. So, if the temperature is nudged away from the low-conversion
steady state in either direction, it will return back to the steady state. This qualitatively indi-
cates that the first steady state is stable. Similarly, f 2 (T ) is positive slightly below 648 K and
is negative slightly above 648 K. Again, this high-conversion steady state is stable because
nudging the system from this steady state in either direction will cause the system to return
to the steady state. Finally, the steady state at 478 K is unstable. This is because f 2 (T ) is neg-
ative between 350 < T < 478 and positive between 478 < T < 648, indicating that nudging the
system away from the steady state will make the system diverge away from the steady state.
This is the qualitative analysis of system stability. Since the 3D space is projected on a
single dimension, this analysis is only approximate. This needs to be corroborated with
stability analysis at each steady state.
The function cstrFun.m, discussed in Chapter 3, can be used with the fsolve
solver to find the actual steady state values. The approximate values in Equation 9.25 will
be used as initial guesses for fsolve solver. The reader can verify that the steady state
solutions are indeed very close to the values shown in Equation 9.25. Once these values are
known, the next task is computing the Jacobian at each of the three steady states. One may
either compute the Jacobian analytically or numerically. In case of the latter, the function
cstrFun.m itself can be used to compute the numerical Jacobian as
J(i, j) = [fi(y_ss + δj) − fi(y_ss − δj)] / (2ε)    (9.26)
where δj is a 3 × 1 vector such that its jth element is ε and the other elements are zero. For
example
δ2 = [0, ε, 0]^T
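A generic Python sketch of this central-difference Jacobian, checked here on a simple function with a known analytical Jacobian rather than on the CSTR model itself:

```python
import numpy as np

def numerical_jacobian(f, y, eps=1e-6):
    """Central-difference Jacobian, Equation 9.26:
    J[i, j] = (f_i(y + delta_j) - f_i(y - delta_j)) / (2 eps)."""
    y = np.asarray(y, dtype=float)
    n = len(f(y))
    J = np.zeros((n, len(y)))
    for j in range(len(y)):
        delta = np.zeros_like(y)
        delta[j] = eps              # perturb only the j-th component
        J[:, j] = (f(y + delta) - f(y - delta)) / (2 * eps)
    return J

# check on f(y) = [y0^2, y0*y1], whose Jacobian is [[2*y0, 0], [y1, y0]]
f = lambda y: np.array([y[0] ** 2, y[0] * y[1]])
J = numerical_jacobian(f, [2.0, 3.0])
print(J)  # approximately [[4, 0], [3, 2]]
```

The same routine, applied to the CSTR residual function at each steady state, yields the Jacobians quoted below.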
The actual steady state was computed using fsolve, followed by computing the Jacobian
numerically for each of the three steady states. The results are given below:
Low:
J = [ −0.102   −0.299   0
      0.0002   −0.102   0.032
      0         0.125   −0.25 ],    λ = {−0.099, −0.081, −0.273}    (9.27)

Mid:
J = [ −0.163   −3.072   0
      0.006     0.175   0.032
      0         0.125   −0.25 ],    λ = {0.117, −0.099, −0.256}    (9.28)

High:
J = [ −0.872   −3.794   0
      0.077     0.247   0.032
      0         0.125   −0.25 ],    λ = {−0.108, −0.402, −0.365}    (9.29)
The qualitative predictions based on Figure 9.6 are confirmed by the linear stability analysis
above. The low- and high-conversion steady states are stable, whereas the middle steady
state is unstable (in fact, a saddle node).
9.2.3 Phase-Plane Analysis
Phase-plane analysis is very useful to get a global picture of the system behavior. The con-
centration values are in the range 0 < CA < 4000, whereas temperature values are in the
range 300 < T < 700. Choosing several points in the state space as initial conditions, simula-
tions are performed. The dynamical trajectories are then plotted on a CA–T plane. Circles
represent initial starting points, and stars represent the three steady states. The phase por-
trait for the nonlinear CSTR is shown in Figure 9.7.
FIGURE 9.7 Phase portrait of a nonlinear CSTR with Tin = 350 K, plotted in the concentration, CA (mol/m³), vs. temperature, T (K), plane. The filled stars represent stable steady states, the open star represents the unstable steady state, and circles represent initial conditions.
9.2.4 Turning-Point Bifurcation
Like in the bioreactor case, the bifurcation analysis of the nonisothermal CSTR with respect
to the inlet temperature, Tin, as the bifurcation parameter will be discussed in this section.
Inlet conditions, which can be varied easily in practice, become the natural choice as bifur-
cation parameters. The curve f 2 (T ) intersects the T-axis at three points, indicating that
three different steady states coexist at the value of parameter Tin = 350 K. Since the inlet
temperature appears in Equation 9.19 as
F rc p (Tin - T )
increasing this value will simply move the curve f 2 (T ) upward. As the temperature Tin is
increased, the curve keeps moving upward. As seen in Figure 9.8, when the inlet tempera-
ture is increased to about 380 K, the lower and middle steady states merge. At this bifurca-
tion point, there are only two steady states. The high-conversion steady state is qualitatively
the same as before.* When the inlet temperature is further increased, the high-conversion
steady state remains the only steady state. Thus, the characteristic of turning-point bifurca-
tion is that two steady states come to merge at the bifurcation point and eventually disap-
pear when the bifurcation parameter is varied further. The turning-point bifurcation occurs
between the low- and mid-conversion steady states; the high-conversion steady state does
not undergo bifurcation when Tin is increased from 350 K.
A similar turning-point bifurcation behavior is seen when the inlet temperature Tin is
decreased from 350 K. Specifically, when this parameter reaches ~313 K, the high-conver-
sion and mid-steady states merge; further reducing the inlet temperature eliminates these
two steady states and only the low-conversion steady state remains. Thus, this is a second
turning point, often called extinction point, where two steady states (mid and high) disap-
pear, leaving only the low-conversion steady state.
* The high steady state remains stable as before. The characteristic of turning-point bifurcation in one dimension is that one
stable node (low steady state) and one unstable node (middle steady state) merge to give a saddle node, according to
Figure 9.7. For this reason, the turning-point bifurcation is also known as saddle node bifurcation. However, I think this is a
misnomer because in multiple dimensions, the middle steady state was not unstable; it was already a saddle node!
FIGURE 9.8 Effect of varying Tin on the multiple steady state behavior of CSTR (f2(T) plotted against T). The top panels show increase in Tin, and the bottom panels show decrease in Tin from the nominal value of 350 K. At the bifurcation point, two steady states merge (represented as star in left panels) and then disappear.
We are now equipped to make a bifurcation diagram, which is shown in Figure 9.9. The
actual numerics behind constructing a complete bifurcation diagram is beyond the scope
of this text. Nonetheless, we will use the tools studied so far to make this plot. Between
the two bifurcation points, 313 ≲ Tin ≲ 380, all the three steady states exist simultane-
ously; below the lower bifurcation point, only the low steady state exists, whereas above
the higher bifurcation point, only the high steady state exists. Starting with Tin = 298, Tin is
increased progressively, each time using fsolve to obtain the low steady state. The previ-
ous fsolve solution may be used as an initial guess. This continues until Tin ≈ 380, beyond
which the solution jumps to the higher branch (denoted by the upward arrow in Figure 9.9).
Increasing Tin further, the solution continues marching along the high-conversion branch.
FIGURE 9.9 Bifurcation plot for a nonlinear CSTR with Tin as the bifurcation parameter (steady state temperature, T, vs. inlet temperature, Tin). The two turning points are marked as stars.
After reaching 398 K, the temperature is decreased again. The procedure of incrementally
decreasing Tin, each time using fsolve to obtain the steady state value, may be repeated.
At this stage, a hysteresis is observed; the system does not jump to the low-conversion
steady state at 380 K. Instead, it continues along the high-conversion branch until the other
turning point is reached at Tin = 313 K.
The high- and low-conversion branches correspond to stable steady states and are there-
fore shown as solid lines. The bifurcation points are marked as stars in Figure 9.9. The mid-
dle steady state branch is denoted by dashed lines since the middle steady state is unstable.
Obtaining the middle steady state branch can be a little tricky. At the first bifurcation point
of 380 K, the steady state solution jumps from T = 402 K to T = 685 K. Thus, at Tin = 379,
fsolve can be used to obtain the middle steady state by assuming the initial guess of the
temperature to be T(0) = 450 K. Indeed, the solution at the middle branch is obtained as
T = 420.8 K. Thereafter, the inlet temperature Tin may be progressively decreased, each time
using fsolve to chart out the middle steady state branch.
This procedure to chart out the bifurcation plot is known as parametric continuation.
Recall that we were already introduced to natural parameter continuation in Chapter 3.
There, we increased the inlet temperature Tin in the same step-wise manner but used an
ODE solver instead. That procedure will allow us to trace the two stable branches; the alge-
braic solver with the procedure elaborated in this section is required to trace the unstable
branch.
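The continuation procedure can be sketched in Python/SciPy as follows; the residual function is my own restatement of Equations 9.15 through 9.17 at steady state with the Table 9.1 parameters in SI units, and the 2 K step in Tin is an arbitrary choice:

```python
import numpy as np
from scipy.optimize import fsolve

# Table 9.1 parameters in SI units
F, V = 0.025, 0.25
rho, cp = 1000.0, 2500.0
Fj, Vj, rhoj, cj = 0.005, 0.04, 800.0, 5000.0
CAin, k0, EoR = 4000.0, 800.0, 4500.0
dH, UA, Tjin = 250000.0, 20000.0, 298.0

def residuals(y, Tin):
    # Equations 9.15-9.17 with the time derivatives set to zero
    CA, T, Tj = y
    k = k0 * np.exp(-EoR / T)
    return [F / V * (CAin - CA) - k * CA,
            F / V * (Tin - T) + dH * k * CA / (rho * cp)
                - UA * (T - Tj) / (V * rho * cp),
            Fj / Vj * (Tjin - Tj) + UA * (T - Tj) / (Vj * rhoj * cj)]

# march Tin upward along the low-conversion branch,
# reusing the previous solution as the initial guess
y = np.array([4000.0, 300.0, 299.0])
branch = []
for Tin in np.arange(298.0, 376.0, 2.0):
    y = fsolve(residuals, y, args=(Tin,))
    branch.append((Tin, y[1]))
print(branch[-1])  # low-conversion steady state just below the turning point
```

Marching past the turning point would make fsolve jump to the high-conversion branch, which is exactly the hysteresis described above.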
This completes our introduction to steady state multiplicity. There are two other types of
local bifurcation—subcritical and supercritical pitchfork bifurcation. We will skip them in
this book. The next interesting example is that of systems exhibiting oscillations.
mx″ + cx′ + kx = 0   (9.30)
where
x″ is the acceleration
v = x′ is the velocity
The ODE models the motion of a body with mass m attached to a spring. The mass is dis-
placed initially to x = x0 and released. The second-order ODE is converted to the following
set of first-order ODE-IVP:
y′ = [v;  −(c/m)v − (k/m)x],   y(0) = [x0  0]T   (9.31)
380 ◾ Computational Techniques for Process Simulation and Analysis Using MATLAB®
The system matrix (taking m = 1) and its eigenvalues are

A = [0  1;  −k  −c],   λ = (−c ± √(c² − 4k))/2
Consider the case where the spring constant k = 1 and the damping coefficient is varied. When the damping coefficient is large, that is, c > 2√k, the system is overdamped. The origin is a stable node and trajectories are attracted to it. When the damping coefficient is decreased, the response becomes oscillatory. For example, with c = 0.5, the eigenvalues are λ = −0.25 ± 0.968i. The system is a stable spiral. As the damping coefficient is decreased further, the system response becomes increasingly underdamped and oscillations increase. This continues until the damping coefficient becomes zero, in which case the system oscillates without any damping. We had analyzed such linear dynamical systems in Section 5.2.
The time response and phase portrait for the system are shown in Figure 9.10.
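The eigenvalues quoted above can be verified in a couple of lines:

```matlab
k = 1;  c = 0.5;          % spring constant and damping coefficient
A = [0 1; -k -c];         % system matrix for m = 1
lambda = eig(A)           % -0.25 +/- 0.968i: a stable spiral
```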
FIGURE 9.10 Response of mass-spring-damper system for various values of damping coefficient.
(a) Show location (solid line) and velocity varying with time and (b) phase-plane plots. The top,
middle, and bottom rows indicate c = 3, c = 0.5, and c = 0, respectively.
The closed circles are a characteristic response of linear systems with purely imaginary
eigenvalues. The oscillating system charts out multiple concentric circles. Thus, with an ini-
tial displacement of 1, the system oscillates between −1 ≤ x ≤ 1 without any attenuation, and
the amplitude remains equal to 1 even after a long time. Likewise, if the initial displacement
were 0.5, the system would oscillate between −0.5 ≤ x ≤ 0.5, and so on.
x″ + μ(x² − 1)x′ + x = 0   (9.32)
A Dutch electrical engineer, Balthasar van der Pol, proposed this model and showed that
the system exhibited stable oscillatory behavior. He called these dynamics “relaxation oscil-
lations,” as will be described after the following example. In this example, the transient and
phase-plane plots of a van der Pol oscillator will be obtained, followed by a discussion of
their dynamics.
x′ = v
v′ = −μ(x² − 1)v − x   (9.33)
function dy=vanderPolFun(t,y,mu)
% van der Pol oscillator (Equation 9.33): y(1) = x, y(2) = v
dy(1,1)=y(2);
dy(2,1)=-mu*(y(1)^2-1)*y(2)-y(1);
end
The resulting problem is solved using ode45 for two different values of μ, 1 and 10.
The initial condition is chosen as [x v]T = [0.1 0]T. Figure 9.11 shows the transient
response and phase-plane plot of the van der Pol oscillator for two different values
of μ, that is, μ = 1 and μ = 10. Periodic oscillations are observed for both the values of
parameter μ.
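A minimal driver for these simulations, using vanderPolFun above, might look like:

```matlab
mu = 1;                                       % repeat with mu = 10
y0 = [0.1; 0];                                % initial condition [x v]'
[t,y] = ode45(@(t,y) vanderPolFun(t,y,mu), [0 50], y0);
subplot(2,1,1); plot(t,y(:,1));               % transient response of x
subplot(2,1,2); plot(y(:,1),y(:,2));          % phase-plane plot
```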
Note the difference between these oscillations and those in Figure 9.10: The van der
Pol oscillator has a single stable closed trajectory. This is shown in the top right panel,
where the response to three different initial conditions is plotted in the phase plane
FIGURE 9.11 Transient response of x (a) and phase-plane plots (b) for van der Pol oscillator. The
top row represents μ = 1 and bottom row represents μ = 10.
(as solid, dashed, and dash-dot lines). Irrespective of the initial condition, the system
settles at the same closed trajectory. This is known as limit cycle behavior.
The top row represents μ = 1. Notice how the closed trajectory is nearly circular. However, when the parameter value is increased to μ = 10, the transient trajectory acquires a peculiar qualitative behavior: there are long stretches of slowly changing response followed by a short period of rapid change. This behavior is discussed later in this section.
Linearizing Equation 9.33 around the origin gives the Jacobian

J = [0  1;  −1  μ]

and the characteristic equation

λ² − μλ + 1 = 0   (9.34)
FIGURE 9.12 Phase portrait of a van der Pol oscillator for small values of μ = 0.2 and 0.5.
For μ > 2 (e.g., μ = 10), Equation 9.34 has two positive real roots, which indicates that the origin is an unstable node. Starting from a point close to the origin, the trajectory will diverge. However, the nonlinear damping coefficient, c ≡ μ(x² − 1), grows as x increases. Thus, the state variables are not allowed to grow in an unbounded manner. This is the physical interpretation of the closed cyclic trajectories seen in Figure 9.11.
Figure 9.12 shows the phase portrait when the damping coefficient is small. When μ = 0.2,
the damping term is small. The eigenvalues from Equation 9.34 are 0.1 ± 0.995i. The origin
is now an unstable spiral. As seen in the left-hand panel of Figure 9.12, the trajectory spirals
out from the origin. As the value of x increases beyond 1, the damping, c ≡ μ(x² − 1), increases.
The trajectory spirals multiple times around the origin and settles at the limit cycle. When
μ is increased to 0.5 (and then to 1), the system still grows slowly away from the origin and
settles at the limit cycle. When the coefficient is increased further, damping is increased and
the shape of the limit cycle becomes highly skewed (Figure 9.11, bottom-right panel).
Finally, let’s discuss what the term “relaxation oscillation” implies. Consider the case
when μ = 10, indicating that the damping is quite high. Around time t = 15, as seen in
the bottom-left panel of Figure 9.11, x ≈ 2. The damping term is large, making the sys-
tem response highly overdamped (see previous example) and hence very sluggish. As x
approaches 1, the damping term reduces and the speed of response increases. When x falls
below 1, the damping c ≡ μ(x² − 1) becomes negative and the system tries to become unstable. Thus, there is a brief period when the value of x falls rapidly. Once it goes below −1, the damping term becomes positive, the van der Pol system becomes stable, and the response again becomes overdamped when x < −2. Thus, the system is characterized by a very long
“relaxation time” (when the response is overdamped), followed by a short “impulsive
period” where the state variable changes rapidly. The relaxation time governs the period of
oscillations for the system.
The van der Pol system shows limit cycle behavior, where a unique stable closed tra-
jectory exists. This manifests as constant-amplitude oscillations, irrespective of the initial
condition.
community. This work was picked up by Anatol Zhabotinsky, who further investigated the
reaction. Their work spawned a lot of activity in similar systems. Their oscillating reaction
between citric acid, bromate, and cerium as catalyst is now popular as the Belousov–Zhabotinsky
(B-Z) reaction.
Since then, several other oscillating chemical reactions have been discovered. The
Briggs-Rauscher reaction is a popular reaction for demonstrating this phenomenon. It is
a reaction between malonic acid, potassium iodate, and hydrogen peroxide in the presence of a Mn²⁺ catalyst. The bromate solution from the original B–Z system is replaced with an iodate solution.
solution. With starch added as indicator, the demo reaction is visually striking because it
oscillates between dark-blue color in the presence of iodide ion and amber color when it
gets converted to iodine.*
The basic mechanism that reproduces the oscillations observed in the B–Z reaction is

A → X  (rate constant k1)
2X + Y → 3X  (rate constant k2)   (9.35)
B + X → Y + C  (rate constant k3)
X → D  (rate constant k4)
This mechanism is called the Brusselator scheme. The second reaction in the above mechanism is an autocatalytic reaction. The overall reaction scheme remains oscillatory so long as A and B are in significant excess. When A and B are in excess, we can assume CA and CB to be constant at CA0 and CB0. Since C and D are products, they do not affect the material balances of X and Y. The ODEs for the two species X and Y are

dCX/dt = k1CA0 + k2CX²CY − (k3CB0 + k4)CX
dCY/dt = k3CB0CX − k2CX²CY   (9.36)

Dividing throughout by k1CA0 and defining x = CX/(k1CA0) and y = CY/(k1CA0), the two ODEs can be written as

x′ = 1 + αx²y − (β + k4)x
y′ = βx − αx²y   (9.37)

where α = k2(k1CA0)² and β = k3CB0.
The following example shows the code for the Brusselator system.
* A nice demonstration of this and other oscillating reactions is available on YouTube. For example, see https://www.youtube.com/watch?v=IggngxY3riU [Last accessed: January 10, 2017].
function dY=brusselatorFun(t,Y,par)
% Function file for Brusselator system
a=par.alpha; b=par.beta;
k4=par.k4;
x=Y(1); y=Y(2);
%% Model equations
dx=1+a*x^2*y-(b+k4)*x;
dy=b*x-a*x^2*y;
dY=[dx;dy];
The above function can be passed to an ode45 solver to obtain the transient dynamics. The transients of the system obtained for the initial conditions x0 = 1, y0 = 0.5 are shown in the top panel (Figure 9.13). The bottom panel shows the phase portrait, which was obtained starting from various initial conditions in the phase plane. The thick dashed line in the phase-plane plot corresponds to the initial conditions [1 0.5]T of the top panel. The point (1, 1) is a stable steady state. It is a spiral since the trajectories show some oscillations before settling down to the steady state.
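The transients in the top panel of Figure 9.13 can be reproduced with a short driver (a sketch; the parameter struct fields follow brusselatorFun above):

```matlab
par.alpha = 1;  par.beta = 1;  par.k4 = 1;
[t,Y] = ode45(@(t,Y) brusselatorFun(t,Y,par), [0 20], [1; 0.5]);
plot(t,Y(:,1),'-', t,Y(:,2),'--');     % transients of x and y
xlabel('Time'); ylabel('x and y');
```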
FIGURE 9.13 Transients of the Brusselator reactor and the phase portrait for α = 1 , β = 1 , k4 = 1. In
the top panel, the initial conditions are x0 = 1 , y0 = 0.5.
Let us analyze the Brusselator system. The steady state for the system can be obtained by equating the two transient equations to zero and adding them up. For constant values of α = 1, k4 = 1, the steady state is

YSS = [1  β]T   (9.38)
Linearizing the Brusselator equations at this steady state, we get the Jacobian

J = [β − 1   1;  −β   −1]   (9.39)

and the characteristic equation

λ² + (2 − β)λ + 1 = 0   (9.40)
When β = 1, as in Example 9.3, the steady state [1 1]T is a stable attractor. The eigenvalues
of the Jacobian are λ = − 0.5 ± 0.866i, which confirms our observation that the steady state
is a stable spiral.
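This is easily checked numerically:

```matlab
beta = 1;
J = [beta-1, 1; -beta, -1];   % Jacobian at the steady state [1 beta]'
lambda = eig(J)               % -0.5 +/- 0.866i: a stable spiral
```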
As in Section 9.1, we will analyze the effect of varying the inlet concentration CB0 on
the dynamics of the system. Changing CB0 affects the parameter β. When the bulk concen-
tration of B is increased threefold, the value of the parameter β is increased to 3. Under
these conditions, the system shows interesting dynamics. The steady state can be calculated
from Equation 9.38 as YSS = [1 3]T. From the characteristic equation (9.40), one can com-
pute the eigenvalues at the steady state to be λ = 0.5 ± 0.866i. This indicates the steady state
is an unstable spiral. Although this steady state is unstable, it is circumscribed by a limit
cycle, as shown in Figure 9.14. Any trajectory starting close to the steady state diverges
FIGURE 9.14 Transient response of the Brusselator system and phase portrait for β = 3.
outward. However, as we saw for the van der Pol oscillator, the trajectory does not grow unbounded. Instead, it settles onto a single closed trajectory, the limit cycle.
Thus, from β = 1 to 3, the system goes from a stable spiral to an unstable spiral, with
the unstable spiral being enclosed by a stable limit cycle. At β = 2, the eigenvalues of the
Jacobian are λ = ± i. When β < 2, the real part of the eigenvalue is negative and the steady
state is a stable spiral. When β > 2, the steady state becomes an unstable spiral. Thus, when
the eigenvalues of the Jacobian are plotted in a complex plane, they shift from the left-half
plane (stable) to the right-half plane (unstable), with the eigenvalues crossing over from
stable to unstable at the bifurcation point β = 2. Such a crossing-over of eigenvalues at the
imaginary axis (instead of origin) is indicative of the presence of limit cycle oscillations.
Specifically, the above qualitative behavior of a stable spiral transitioning to an unstable
spiral combined with the appearance of a stable limit cycle is an example of supercritical
Hopf bifurcation. When CB0 is increased across the Hopf bifurcation point, the stable system
suddenly transitions into limit cycle oscillations.
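The crossing of the imaginary axis can be visualized by tracking the largest real part of the eigenvalues of the Jacobian (Equation 9.39) as β is varied:

```matlab
betaVals = 1:0.05:3;
reLam = zeros(size(betaVals));
for i = 1:length(betaVals)
    J = [betaVals(i)-1, 1; -betaVals(i), -1];
    reLam(i) = max(real(eig(J)));
end
plot(betaVals,reLam);          % crosses zero at the Hopf point, beta = 2
xlabel('\beta'); ylabel('max Re(\lambda)');
```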
This ends our brief introduction to oscillations and limit cycles.
CO + 2H2 → CH3OH

Since carbon dioxide is also present in the syn gas, the (reverse) water-gas shift reaction

CO2 + H2 → CO + H2O

and the direct CO2 hydrogenation reaction

CO2 + 3H2 → CH3OH + H2O

also take place on the catalyst. The three reactions above are not linearly independent. The
third reaction is the sum of the first two reactions. Various kinetic models for different cata-
lysts, operating conditions, and reactor types have been developed in the past. In this case
study, the kinetic model of Vanden Bussche and Froment (1996) will be used.
Their model considers the methanol synthesis process comprising direct CO2 reduction
and water-gas shift reactions. The kinetics of the two reactions are given by
r1 = (k1 pCO2 pH2/Γ³) [1 − (1/Keq,1) ∏i pi^νi1]   (9.41)

r2 = (k2 pCO2/Γ) [1 − (1/Keq,2) ∏i pi^νi2]   (9.42)
where pi are partial pressures in bar, the reaction rate is in mol/kgcat·s, and the rate constants are given by

k1 = 1.07 exp(36696/RT),  k2 = 1.22 × 10¹⁰ exp(−94765/RT)   (9.43)

log10(Keq,1) = 3066/T − 10.592,  log10(Keq,2) = 2.029 − 2073/T   (9.44)

The inhibition term is

Γ = 1 + Ka (pH2O/pH2) + Kb √pH2 + Ke pH2O

with

Ka = 3453.4,  Kb = 0.499 exp(17197/RT),  Ke = 6.62 × 10⁻¹¹ exp(124119/RT)   (9.45)
In this section, we will consider two cases: steady state and transient simulations of an isothermal plug flow reactor (PFR). In the former, pressure drop through the packed bed is considered, whereas pressure drop will be neglected in the transient simulations. We will adopt a modular approach, so that the codes developed in the first part can be reused in the transient simulations.
9.4.1.1 Reaction Kinetics
The first step is to write a function that calculates the rates of the two reactions. The reaction rates depend on the temperature of the catalyst and the partial pressures of various species. These two form the input arguments for the function. The output argument could either be the rates of reaction, rj (as per Equations 9.41 and 9.42), or the rates of formation/consumption, Rk = Σj νkj rj. We choose the latter in this example. The following function computes the rates as per the model of Vanden Bussche and Froment (1996).
The following function, which is common to both cases, calculates reaction rates:
function rConv=surfRxn(T,pPress,par)
% To calculate rate for methanol synthesis
% Ref: vanden Bussche and Froment (1996) J. Catal.
R=par.R;
nu=par.stoich;
pPress=pPress/1e5; % Pressure in bar
% Rate constants
k1=1.07*exp(36696/(R*T));
k2=1.22e10*exp(-94765/R/T);
% Inhibition term
K1=3453.4;
K2=0.499 * exp(17197/R/T);
K5=6.62e-11 * exp(124119/R/T);
G=1 + K1*pPress(5)/pPress(2) + ...
K2*sqrt(pPress(2)) + K5*pPress(5);
% Forward rate
rate =zeros(2,1);
rate(1)=k1/G^3 *pPress(2)*pPress(3);
rate(2)=k2/G *pPress(3);
% Equilibrium
Keq(1)=10^(3066/T-10.592);
Keq(2)=10^(-2073/T+2.029);
for j=1:2
eqTerm=prod(pPress.^nu(:,j));
rate(j)=rate(j)*(1-eqTerm/Keq(j));
end
% Rate of conversion of each species
rConv=par.rhoCat * (nu*rate);
end
Note the third line of the function. The function accepts the inputs in SI units (Kelvin
and Pascal). Since the reaction rate requires partial pressure in bar, this conversion is done
within the surface reaction code itself. Different sources in the literature compute the reac-
tion rate expressions in different ways (e.g., based on concentration, partial pressures in
Pa, mole fraction, etc.). Hence, a good programming practice is that the main code must
consistently use SI units, whereas the unit conversion must be done in the surface reaction
code. In this case, Rk is in mol/m3 · s.
function param=methanolParam
% Computes parameters for methanol synthesis
%% Geometric parameters
d=0.016;              % Tube diameter (m)
param.Acs=pi/4*d^2;   % Cross-sectional area (m^2)
param.L=0.15;         % Reactor length (m)
param.dp=0.0005;      % Particle diameter (m)
param.epsi=0.5;       % Void fraction
%% Operating conditions
param.Tin=523;        % Inlet temperature (K)
param.Pin=5e6;        % Inlet pressure (Pa)
param.mFlow=2.8e-5;   % Mass flowrate (kg/s)
Most of the parameters computed in this function do not depend on the model assumptions, which are instead incorporated in the model function file. The discussion of Ergun's equation for pressure drop follows in the next section.
The steady state mole balance for species k in the PFR is

(1/Acs) dFk/dz = ρcat Σj νkj rj   (9.46)

The mole fractions and partial pressures are obtained from the molar flowrates as

Xk = Fk / Σi Fi,   pk = Xk P   (9.47)
In the examples so far, the total pressure was assumed to be constant. If the pressure drop along the reactor is significant, it needs to be accounted for using Ergun's equation for pressure drop:

−dP/dz = [150 μ(1 − ε)²/(dp² ε³)] vs + [1.75 ρ(1 − ε)/(dp ε³)] vs²   (9.48)

where the first and second bracketed groups are the Ergun coefficients Erg1 and Erg2, respectively.
Ergun's equation may be used for a packed bed. For the second coefficient above, we use the fact that the mass flowrate remains constant:

ρvs = (ρvs)in   (9.49)

For flow through an open tube, the pressure drop is instead computed using the Fanning friction factor:

dP/dz = −(2/d) ρv² ff   (9.50)

ff = 0.079/Re^0.25   (9.51)
The superficial velocity is obtained from the ideal gas law as

vs = (1/Acs)(RT/P) Σ(i=1 to n) Fi   (9.52)
392 ◾ Computational Techniques for Process Simulation and Analysis Using MATLAB®
Thus, the steady state pseudohomogeneous model of the tubular reactor consists of Equation 9.46, with the reaction kinetics given by Equations 9.41 and 9.42, the pressure drop by Equation 9.48, and the velocity by Equation 9.52.
The following example demonstrates the steady state simulation of gas-phase catalytic
methanol synthesis in a packed bed reactor.
Note that the constants used in computing the pressure drop using Ergun's equation were already computed in the inlet parameter function in the previous section. Since the flowrates are of the order of 10⁻⁴ mol/s, the default tolerances are not sufficient. Hence, the tolerance values are specified as ~10⁻¹⁰.
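Putting these pieces together, a driver along the following lines may be used. This is a sketch: it assumes methanolParam also supplies par.Fin, the vector of six inlet molar flowrates, among the fields used by methanolRxtrFun.

```matlab
par = methanolParam;
Y0  = [par.Fin; par.Pin];                  % six molar flowrates + pressure
opt = odeset('RelTol',1e-8,'AbsTol',1e-10);
[z,Y] = ode15s(@(z,Y) methanolRxtrFun(z,Y,par), [0 par.L], Y0, opt);
X = Y(:,1:6)./sum(Y(:,1:6),2);             % mole fraction profiles
plot(z,100*X);
xlabel('Axial location, z (m)'); ylabel('Mole (%)');
```

The element-wise division relies on implicit expansion (MATLAB R2016b or later); bsxfun may be used in older versions.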
Function file for solving the PFR problem: The next task is to write the function
file for computing dYdz. Here, Y represents the flowrates for all species and the
pressure:
function dYdz=methanolRxtrFun(t,Y,par)
% Function to calculate dY/dx for methanol reactor
% Isothermal reactor with pressure drop
%% Parameters and variables
T=par.Tin; R=par.R;
F=Y(1:6);
P=Y(7);
X=F/sum(F);
pPress=P*X;
Mavg=par.MW'*X;
rho=P*Mavg/(R*T);
%% Model
RConvRxn=surfRxn(T,pPress,par);
dF=par.Acs*RConvRxn;
% Pressure Drop
vs=par.rhouIn/rho;
dP=-(par.ergun1+par.ergun2)*vs;
% Final Model
dYdz=[dF;dP];
end
Figure 9.15 shows the steady state profile of various species in the reactor. The reactor
attains equilibrium mole fractions for all the reacting species. This can be verified by
comparing the final steady state values of mole fractions with the ones computed using
the equilibrium conditions. This forms one check on the overall model performance.
On the same lines, another sanity check is given by the flowrate of nitrogen, FN2.
Since it is inert, the molar flowrate of nitrogen at the reactor exit should equal the
molar flowrate at the reactor entrance. The final check is by comparing the results of
Figure 9.15 with that of the original paper. The qualitative match indicates that the
steady state results are valid.
Figure 9.16 shows the effect of temperature and pressure on methanol mole frac-
tion along the length of the reactor. The inlet consists of 25% of CO, 50% H2, 10% CO2,
and the rest nitrogen, which represents one of the input conditions from Graaf and
coworkers (1989). Under both cases, the pressure drop in the reactor is about 100 Pa,
that is, 0.001 bar. The low pressure drop is due to the low velocity in the bed. Hence,
pressure drop may be neglected in the rest of the simulations.
FIGURE 9.15 Steady state profiles of mole fractions of all species in a methanol synthesis reactor.
FIGURE 9.16 The effect of temperature and pressure on methanol mole fraction.
The methanol synthesis reaction is exothermic, with a decrease in the number of moles. Thus, lower temperatures and higher pressures increase the equilibrium conversion. The thin solid line in Figure 9.16 represents the base case at 523 K and 50 bar pressure. When the pressure is decreased to 25 bar and 10 bar (solid lines) at the same temperature, the equilibrium methanol concentration decreases. Because the temperature is high enough, equilibrium conversion is attained within 2–3 cm of the reactor length. On the other hand, when the temperature is decreased (dashed lines), the equilibrium conversion increases. At 500 K, conversion is higher than in the corresponding case at 523 K. When the temperature is further decreased to 473 K, the equilibrium conversion increases further. However, due to the low temperature, the reactor length is not sufficient for the reaction to approach equilibrium conversion.
With this, the steady state analysis is complete. Next, we focus on the transient reactor model.
9.4.2 Transient Model
The transient model for the pseudohomogeneous reactor is given by
∂Ck/∂t + (1/Acs) ∂Fk/∂z = ρcat Σj νkj rj   (9.53)
When we solved the steady state model, we related concentration Ck to the flowrate Fk as
Ck = (Fk / Σi Fi)(P/RT)   (9.54)
Thus, the steady state model was converted to a molar flowrate basis and solved in the previous section. However, this strategy gets complicated for a transient model because the flowrates Fi and pressure P all change with time. An alternative is to solve the equations in terms of concentrations. To do so, we write

Fk = (Acs v) Ck   (9.55)
Section Wrap-Up ◾ 395
where (Acsv) is the volumetric flowrate. This leads us to the familiar equation we encoun-
tered in Chapter 4:
∂Ck/∂t + ∂(vCk)/∂z = ρcat Σj νkj rj   (9.56)
When a liquid-phase system or a gas-phase system with no volume change was considered, the velocity was constant and could be taken out of the spatial derivative. Since this is not the case here, we need to compute vs. From the overall mass conservation, we can compute

v = Gin/ρ = Gin (RT/(PM))   (9.57)

where the average molar mass is M = Σi Xi Mi.
After spatial discretization, the vector of unknowns is

Yp = [CCO2,p  CH2,p  CCO,p  CH2O,p  CCH3OH,p  CN2,p]T

at each node.
With this knowledge, we are equipped to formulate and solve the methanol synthesis
problem.
function dY=methanolTFun(t,Y,par)
% Transient model for methanol synthesis reactor
%% Variables and Parameters
n=par.n;
h=par.h;
R=par.R;
C=reshape(Y,6,n);
dC=zeros(6,n);
%% Model
P=par.Pin; % Neglect pressure drop
T=par.Tin; % Isothermal
for i=1:n
% Reaction term
X=C(:,i)/sum(C(:,i));
pPress=P*X;
rxnTerm=surfRxn(T,pPress,par);
% Convection term
Mavg=par.MW'*X;
rho=P*Mavg/(R*T);
if (i==1)
uC_In=par.Fin/par.Acs;
u=par.rhouIn/rho;
convTerm=(u*C(:,i)-uC_In)/h;
else
uPrev=u;
u=par.rhouIn/rho;
convTerm=(u*C(:,i)-uPrev*C(:,i-1))/h;
end
dC(:,i)=-convTerm+rxnTerm;
end
dY=reshape(dC,6*n,1);
Notice the first two lines in the model section, P=par.Pin and T=par.Tin, which correspond to an isothermal PFR with negligible pressure drop. If pressure drop were significant, the appropriate balance equation would be used to compute the pressure at each location; likewise, an energy balance would be included for a nonisothermal reactor.
The convection term ∂(vCk)/∂z is discretized using a first-order backward difference:

(u*C(:,i)-uPrev*C(:,i-1))/h

where uPrev is the velocity computed at the previous axial location, i-1.
Driver script for methanol synthesis: The driver script for the transient simulation follows the same pattern as in the steady state case.
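A sketch of such a driver is given below. The grid size n, the use of the feed composition as the initial condition, and the reliance on methanolParam supplying par.Fin and par.R are assumptions for illustration.

```matlab
par = methanolParam;                   % assumed to also supply Fin, R, MW, etc.
par.n = 100;  par.h = par.L/par.n;     % spatial grid: n nodes, spacing h
Xin = par.Fin/sum(par.Fin);            % inlet mole fractions
Cin = par.Pin*Xin/(par.R*par.Tin);     % ideal-gas inlet concentrations
Y0  = repmat(Cin,par.n,1);             % reactor initially filled with feed gas
[t,Y] = ode15s(@(t,Y) methanolTFun(t,Y,par), [0 100], Y0);
Cend = reshape(Y(end,:)',6,par.n);     % concentration profiles at t = 100 s
```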
Figure 9.17 shows a comparison between steady state and transient simulations for the
same operating conditions. The thin solid lines represent the profiles from the steady
state model. The transient model is simulated for a long enough time (100 s, in this
example) for the system to attain steady state. The results from the transient model at
this end time are plotted as thick dashed lines. The two overlap, indicating that the tran-
sient model is likely to provide a reasonable prediction of methanol synthesis trends.
Figure 9.18 plots the axial profiles of methanol at various times from the start.
Initially, there is no methanol in the reactor (thick dashed line). As time progresses,
the amount of methanol along the reactor length increases and eventually reaches the
steady state value. The steady state profile is represented with thick lines.
FIGURE 9.17 Comparison between the steady state solution computed using Example 9.4 and the
same solution from the transient model.
FIGURE 9.18 Axial profiles of methanol mole fraction at various times from the start obtained
using the transient solver.
captain Mahendra Singh Dhoni remarked that it was easier to “hit sixers” in Dharamsala
cricket ground than at Mohali. Dharamsala is at an elevation of approximately 2000 m
above mean sea level. The students modified a homework problem into a project to find out how gravitational acceleration and air drag influence where the ball lands. We will consider the first part of the problem: to obtain the trajectory of a cricket ball hit with a certain velocity and at a certain angle and, hence, to find the location where the ball hits the ground.
The velocity of the ball has horizontal and vertical components: v = ui + vj
Since the initial speed and angle are given, the two components u0 and v0 are known at
t = 0. Air drag acts against the direction of motion of the ball, and gravity acts downward.
Thus
mu′ = −κ|v|u
mv′ = −mg − κ|v|v   (9.58)
form the overall balance equations for the ball. The net air drag is given by κ|v|v, which acts
against the direction of motion of the ball. Here, |v| is the magnitude of the velocity. The two
components of the air drag are, therefore, −κ|v|u and −κ|v|v (the negative sign indicating
the direction is opposite to the corresponding velocity component) in the x- and y-directions, respectively. Dividing through by the mass m (and redefining κ per unit mass), the overall ODE-IVP is given by
d/dt [x  y  u  v]T = [u  v  −κ|v|u  −g − κ|v|v]T   (9.59)

with the initial condition

Y(0) = [0  0  V0 cos(θ)  V0 sin(θ)]T   (9.60)
The following example shows the basic code for solving the ODE-IVP.
function dY = ballFun(t,Y,param)
% Function used with ODE solver to calculate trajectory
% of a cricket ball hit with a particular velocity
% Y(1) x (Horizontal displacement)
% Y(2) y (Vertical displacement)
% Y(3) u (Horizontal velocity component)
% Y(4) v (Vertical velocity component)
u=Y(3); % X-velocity
v=Y(4); % Y-velocity
vel=sqrt(u^2+v^2); % Velocity magnitude
% Model Parameters
g=param.g;
c=param.kappa;
%% Model equations
dY =zeros(4,1);
dY(1)=u;
dY(2)=v;
dY(3)=-c*vel*u;
dY(4)=-c*vel*v - g;
The driver code for this problem is straightforward and will be skipped in this section.
The next section provides a modified driver script.
The problem is solved with the initial velocity of 35 m/s and angle of π/4. Figure
9.19 shows the variation in x and y locations with time.
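The driver is only a few lines; note that the drag coefficient value below is a hypothetical placeholder, since the text does not fix κ here.

```matlab
param.g = 9.81;             % gravitational acceleration (m/s^2)
param.kappa = 0.006;        % drag coefficient per unit mass (assumed value)
V0 = 35;  theta = pi/4;     % initial speed (m/s) and angle
Y0 = [0; 0; V0*cos(theta); V0*sin(theta)];
[tOut,XOut] = ode45(@(t,Y) ballFun(t,Y,param), [0 5], Y0);
plot(tOut,XOut(:,1),'-', tOut,XOut(:,2),'--');
xlabel('Time (s)'); ylabel('Location (m)');
```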
The next aim will be to find the location and time at which the ball hits the ground, that is, the time t at which

y(t) = 0   (9.61)

is satisfied.
FIGURE 9.19 Variation of the x and y locations (m) of the ball with time (s).
An analytical case: To understand how the above equation works, consider the case where the air drag is neglected. In such a case, the ODE (9.59) can be solved analytically to get

u(t) = u0,   v(t) = v0 − gt
x(t) = u0t,   y(t) = v0t − (1/2)gt²   (9.62)
Solving Equation 9.61 means finding the value of t that is a root of the equation. In the case of no air drag, y(t) = 0 gives t = 2v0/g, so the ball lands at x = 2u0v0/g.
In the case of the nonlinear problem, Equation 9.59 along with (9.61) forms a higher-order
differential algebraic equation (DAE). This cannot be converted into an index-1 DAE.
Thankfully, MATLAB provides Events Options in ODE solvers, which allows us to specify
the events. This can be activated using the option
opt=odeset('events',@eventFun);
where eventFun(t,y) is a function that returns three output arguments: (i) event func-
tion of the type (9.61), (ii) a flag to indicate whether the ODE solver should be halted, and
(iii) a flag to indicate the direction of change of the event function. The use of the events
option to obtain the location of ball landing is shown in the example below.
Solution: We will use the event function in MATLAB to obtain the location where
the ball lands. The ODE function, ballFun.m, remains the same as in the previous
example. We will first write an event function, as given below:
function [pos,isTerminal,dir]=ballLanding(t,Y)
pos=Y(2); % Check if Y-location is zero
isTerminal=1; % Stop simulation on true
dir=0; % Ignore direction
The first output determines the condition to be checked. This is nothing but the left-hand side of Equation 9.61. The second output indicates that the simulation should be stopped when the event condition y(t) = 0 is met.
The overall driver function for solving this problem is given below:
The result of running the code is similar to that in the earlier part of this problem. The
main difference is that the code stops when the event is reached. The time and location
at which the ball lands can be obtained as
>> disp(tOut(end))
4.4430
>> disp(XOut(end,1))
81.3707
FIGURE 9.20 Trajectory of the ball hit in the X–Y plane of the cricket field.
Clearly, since the ball lands beyond 75 m, Dhoni has scored a six. The trajectory of the
ball is plotted in Figure 9.20. The origin is where Dhoni hit the ball. Locations in the
X–Y plane that the ball traverses are shown as circles.
Let us consider a slightly different problem: Find the height of the ball from the ground
when the ball reaches the boundary. In other words, we need to obtain the height y when
the horizontal location is 75 m.
The event function can be written as

x(t) − 75 = 0   (9.63)
function [pos,isTerminal,dir]=ballLanding(t,Y)
pos=Y(1)-75; % Check if X-location is 75
isTerminal=1; % Stop simulation on true
dir=0; % Ignore direction
With this change, the ODE code is executed again. The time required for the ball to reach
the boundary is found to be approximately 4 s. The height of the ball at that time is 8.16 m.
Thus, if there was a 6 ft tall fielder at the boundary, he would not be able to catch the ball
and Dhoni would still score a six!
9.5.3 Animation
Finally, let’s get to the fun stuff: animation. The way we will animate motion of the ball is to
obtain the location of the ball at every 0.1 s. This is already done in Figure 9.20. However,
all the symbols are plotted at the same time. In animation, what we need instead is to plot
the first circle at time 0, then replace that circle with the second circle at time 0.1, replace
that with a third circle at time 0.2, and so on. If we use the plot command at each time,
404 ◾ Computational Techniques for Process Simulation and Analysis Using MATLAB®
the previous plot will get overwritten and the animation will show a flicker. Hence, we are
not going to plot a separate circle; instead, we will only change its X- and Y-coordinates through the handle of the plotted circle (oHnd, returned by the plot command). We do so using

set(oHnd,'xData',XOut(i,1),'yData',XOut(i,2));
pause(0.1);
The last line ensures MATLAB will pause for 0.1 s before changing the location of the ball
for the animation.
There is, however, one more issue. As the location changes, MATLAB changes the limits
of the two axes automatically. To avoid this, we need to fix the X- and Y-limits using
set(gca,'xLim',[0 85], 'yLim',[0 25]);
This will ensure that the two axes will always remain fixed to the range of 0–85 and 0–25 m,
respectively.
Finally, you may notice the animation to still be a bit jerky. To make it smooth, we will
animate it every 0.02 s. The MATLAB code ballAnimation.m that does this is given
below:
function exitCode = ballAnimation(tOut,XOut)
% To animate trajectory of a ball
% Interpolate data every 0.02 s
tAnim = 0:0.02:tOut(end);
Xanim = spline(tOut,XOut(:,1),tAnim);
Yanim = spline(tOut,XOut(:,2),tAnim);
% Plot the first point and keep its handle for the animation
oHnd = plot(Xanim(1),Yanim(1),'o');
set(gca,'xLim',[0 85],'yLim',[0 25]); % Fix the axes limits
for i = 2:length(tAnim)
    set(oHnd,'xData',Xanim(i),'yData',Yanim(i));
    pause(0.02);
end
exitCode = 1;
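The resampling step can be illustrated outside MATLAB as well. The following Python sketch mimics the interpolation in ballAnimation.m on a made-up trajectory; note that np.interp is linear, whereas MATLAB's spline is cubic:

```python
import numpy as np

# Resample an irregularly sampled trajectory on a uniform 0.02 s grid,
# as ballAnimation.m does before animating.  The trajectory below is a
# made-up parabola, not the ODE solver output from the text.
tOut = np.array([0.0, 0.7, 1.6, 2.4, 3.1, 4.0])
xOut = 20.0 * tOut                      # horizontal position, m
yOut = 20.0 * tOut - 4.9 * tOut**2      # height, m

tAnim = np.arange(0.0, tOut[-1] + 1e-9, 0.02)
xAnim = np.interp(tAnim, tOut, xOut)    # linear interpolation
yAnim = np.interp(tAnim, tOut, yOut)
print(len(tAnim), xAnim[-1], yAnim[-1])
```

In an actual animation loop, each (xAnim[i], yAnim[i]) pair would be pushed to the plot object every 0.02 s, exactly as the MATLAB code does with set and pause.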
9.6 WRAP-UP
This chapter wraps up Part II of this book by bringing in various tools to analyze and simu-
late complex processes. The first three sections of this chapter presented nonlinear analysis
using several examples. The first tool in nonlinear analysis is used to obtain various steady
state solutions (Sections 9.1.1 and 9.2.1), also known as fixed points. The model can be
linearized, and the linear stability analysis (Section 5.2) around the fixed point is used to
study the local stability behavior of the system. Phase-plane analysis is a useful tool to study
the overall dynamic behavior. Such an analysis is also useful in studying oscillating sys-
tems (Section 9.3), both harmonic oscillations and limit cycle behavior. Indeed, nonlinear
dynamics (which also includes the study of chaos, logistic maps, etc.) is a rich field in itself; this
chapter barely scratches the surface.
This was followed by the simulation of a tubular reactor in Section 9.4. A steady state PFR
is an ODE system, whereas a transient PFR is a hyperbolic PDE. As the velocity changes due
to reaction, careful attention needs to be paid to the overall problem formulation. Finally,
Section 9.5 simulates the trajectory of a projectile (cricket ball, in this case), where the event-
tracking feature of MATLAB ODE solvers was also introduced. This section concluded with
a fun example of animation in MATLAB to visualize the trajectory of the cricket ball.
EXERCISES
Problem 9.1 Perform the bifurcation analysis for a chemostat, as shown in Section 9.1 for
a larger value of saturation constant, K = 1.0. All other parameters are kept
the same as in Section 9.1.
Problem 9.2 Transcritical bifurcation: Perform bifurcation analysis for the following
systems:
x′ = (1 − x)(a − x/(1 − x))
x′ = x − ax(1 − x)
Problem 9.3 Perform bifurcation analysis for the jacketed CSTR with the steady state energy balance

            f₂(T) = (−ΔH) [kτ/(1 + kτ)] (F C_A,in) + F ρ c_p (T_in − T) − [UA/(1 + β)] (T − T_j,in) = 0
Problem 9.4 (c.f., Strogatz’s book) Perform bifurcation analysis for the following autocata-
lytic series reactions:
A + X → 2X,  X + B → Products
CHAPTER 10
Regression and
Parameter Estimation
10.1 GENERAL SETUP
Several models of various types were discussed in the preceding two parts of this book.
These included differential equations, algebraic equations, expressions for certain proper-
ties, and differential algebraic equations. The aim of these models is to represent a relation-
ship between independent and dependent quantities and/or to predict the behavior of a
system. These models had several parameters, for example, rate constants to express reaction rates, thermal conductivity in heat transfer problems, drag coefficients, etc. The parameter values were assumed to be known, which allowed us to set up and solve the problems
of interest.
The aim of this chapter is to focus on the process of obtaining the model parameters that
“best represent” the observed data. This is known by various terms: regression or param-
eter estimation. We will use these terms to mean the same process: Given a finite number
of observations, how do we fit the parameters of a chosen model that best represent the
observed data?
Such a model may be phenomenological. For example, the Arrhenius rate law for the temperature dependence of reaction rate

k(T) = k₀ exp(−E/RT)   (10.1)
has a basis in statistical mechanics. Others may simply be a convenient form for representing a correlation observed in the data. For example, the dependence of specific heat on temperature
c_p = a + bT + cT² + dT³   (10.2)
is a convenient polynomial form that represents the observed trends for a wide range of
temperatures. Then there are other examples, such as Antoine’s relationship between satu-
ration pressure of a pure liquid and its temperature. Antoine’s equation
ln(p^sat) = A − B/(T + C)   (10.3)

In each case, the model can be cast, directly or after a transformation, in the straight-line form

y = mx + c
The abscissa x may be considered an “independent variable,” whereas the ordinate y may
be considered a “dependent variable.” The two may either be directly measured variables or
may be derived quantities. For example, you may recall taking a logarithm of reaction rate
ln(r) = ln(k₀) + (−E/R)(1/T)   (10.4)
10.1.1 Orientation
A set of observations—either from an experiment, any other measurement, or from a more
complex simulation—are available at certain discrete points. Let us consider that the data is
available at N different points, x represents the independent variable, and y represents the
dependent variable. In the three examples, (10.1) through (10.3), the choice of x and y was clear. Let the data be represented as

(x₁, y₁), (x₂, y₂), …, (x_N, y_N)   (10.5)

and let the model, with parameter vector Φ, be

y = f(x; Φ)   (10.6)

Then

ŷᵢ = f(xᵢ; Φ)   (10.7)

represents the model prediction of the ith data point when the corresponding observed value of the independent variable xᵢ is substituted in the model (10.6). The difference

εᵢ = yᵢ − ŷᵢ   (10.8)

is the residual, or modeling error, at that point.
10.1.2 Some Statistics
Let us recap some definitions. The arithmetic mean of the data is

x̄ = (1/N) Σᵢ₌₁ᴺ xᵢ = Sₓ/N   (10.9)

where Sₓ is the shorthand for the sum Σᵢ xᵢ.
The sum of squared deviations from the mean is represented as Sₓ, the variance as var(x), and the standard deviation in the data as sₓ:

Sₓ = Σᵢ₌₁ᴺ (xᵢ − x̄)²   (10.10)

var(x) = Sₓ/(N − 1),   sₓ = √(Sₓ/(N − 1)) = √( Σᵢ (xᵢ − x̄)² / (N − 1) )   (10.11)
The denominator in var(x) and sₓ is the degrees of freedom, (N − 1). For example, if there is
a single data point, the variance of that data around its mean does not have any meaning!
Next, we come to an important definition: the sum of squares of the residuals:

S_ε = Σᵢ (yᵢ − ŷᵢ)²   (10.12)

Since the modeling error is εᵢ = yᵢ − ŷᵢ, the above expression forms a convenient merit function, or objective function, for the parameter fitting process.
Finally, let us use the example of specific heat data for methane (given in Table 10.3) to
further understand some of these concepts. Since this data is obtained at various tempera-
tures, there is an underlying trend in the data, as seen in Panel (a) of Figure 10.1. The aim of
regression is to find that trend. Let us next consider the case where the specific heat is cal-
culated five times at the temperature of 400 K. The two cases are shown in Figure 10.1, with
the data plotted against the observation index in the abscissa. While there is a clear trend in
Panel (a), the data in Panel (b) is well spread around the value of 40.6 J/mol · K.
The data in Panel (a) was obtained from NIST WebBook of thermochemical data,
whereas that in Panel (b) is “synthetic” data that I generated randomly. The latter was
generated from an underlying normal or Gaussian distribution (also popularly known as
“the bell-shaped” curve of probability distribution) with a mean of μ = 40.6 and a standard
deviation of σ = 1.
The data is given as: c = [40.00 41.09 41.14 42.31 40.31]T. The underlying mean
and standard deviation (which are often unknown) are μ and σ. We distinguish them from
the sample mean y = 40.97 and the sample standard deviation sy = 0.897. Notice that since
we have a small number of data points, the sample mean and standard deviation are different from the inherent statistics behind the data. The data shown here is just one realization
of the stochastic process. After the measurements are taken, this realization is fixed. If we
were to take the measurements again, we will get a different set of values, even though the
underlying process is the same.
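The sample statistics quoted here are easy to verify. The following Python sketch applies Equations 10.9 through 10.11 to the five measurements:

```python
# Sample mean and standard deviation of the five synthetic cp
# measurements at 400 K, using Eqs. 10.9 through 10.11.
c = [40.00, 41.09, 41.14, 42.31, 40.31]

N = len(c)
c_bar = sum(c) / N                          # arithmetic mean, Eq. 10.9
S_c = sum((ci - c_bar) ** 2 for ci in c)    # sum of squared deviations, Eq. 10.10
var_c = S_c / (N - 1)                       # variance, N - 1 degrees of freedom
s_c = var_c ** 0.5                          # sample standard deviation, Eq. 10.11

print(c_bar, s_c)   # about 40.97 and 0.897
```

Running this reproduces the sample mean ȳ = 40.97 and sample standard deviation sy ≈ 0.897 quoted above, both of which differ from the underlying μ = 40.6 and σ = 1.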
FIGURE 10.1 Specific heat of methane (a) at various temperatures and (b) at 400 K obtained with
five different experiments.
Consider now a model that is linear in more than one independent variable:

y = a₁ + a₂x + a₃w + a₄u   (10.13)
The aim is to find parameters for the above linear model, for which we use multilinear
regression. As we shall show soon, multilinear regression is a straightforward extension of
linear regression in multiple variables.
The model for specific heat
c_p = a + bT + cT² + dT³   (10.2)
although nonlinear in T, is linear in parameters a , b , c , d. Since T, T2, and T3 are linearly
independent, this has the same form as Equation 10.13, with x ≡ T , w ≡ T2 , u ≡ T3. The
regression for polynomial data as in Equation 10.2 is called polynomial regression.
Likewise, if we define a1 = ln(k0) and a2 = − E/R, model (10.4) is linear in parameters a1
and a2. Recall that Equations 10.1 and 10.4 represent the same equation. However, Equation 10.1
is nonlinear in parameter space. Obtaining the values of k0 and E, with the model expressed
in the original form of Equation 10.1, requires nonlinear regression. It may sometimes not
be easy to convert a nonlinear regression into a linear regression. Equation 10.3 provides an
example where a nonlinear regression is recommended.
All these examples and analyses will be covered in the remainder of this chapter. The
problem of fitting a straight line to observed data will be discussed in Sections 10.2.1 and
10.2.2, and statistical properties related to goodness of fit will be discussed in Section 10.2.3.
The linear regression problem will be extended to multiple variables in Section 10.3. Section
10.4 will discuss nonlinear regression, including conversion of a nonlinear regression problem to linear regression (as we indicated for the reaction rate Equation 10.1 being converted to the linearized form (10.4)). Finally, Section 10.5 will present several case studies.
The straight-line model

y = a₁ + a₂x   (10.14)
is simple to understand, while still having all the key features of regression and obtaining goodness of fit estimates. It also covers a rather broad spectrum of regression problems.
Given the data at N discrete points, the model predictions are
ŷᵢ = a₁ + a₂xᵢ   (10.15)

and the corresponding errors are

εᵢ = yᵢ − ŷᵢ = yᵢ − (a₁ + a₂xᵢ)   (10.16)

so that the sum of squared errors becomes

S_ε = Σᵢ₌₁ᴺ εᵢ² = Σᵢ₌₁ᴺ (yᵢ − (a₁ + a₂xᵢ))²   (10.17)
Least squares solution: The aim of least squares solution is to find the values of a1 and a2 for
which Sε is minimum, that is
(a₁, a₂) = argmin over a₁, a₂ of S_ε   (10.18)
This is said to be a least squares problem because we minimize the sum of squared errors. For the linear model, the least squares parameter values can be obtained by differentiating Equation 10.17 with respect to a₁ and a₂ and equating the derivatives to zero. Differentiating with respect to a₁
∂S_ε/∂a₁ = Σᵢ₌₁ᴺ ∂/∂a₁ (yᵢ − (a₁ + a₂xᵢ))² = 0   (10.19)
yields
Σᵢ₌₁ᴺ −2(yᵢ − (a₁ + a₂xᵢ)) = 0   (10.20)
which simplifies to

a₁N + a₂ Σᵢ₌₁ᴺ xᵢ = Σᵢ₌₁ᴺ yᵢ   (10.21)
Similarly, differentiating with respect to a₂,

∂S_ε/∂a₂ = Σᵢ₌₁ᴺ ∂/∂a₂ (yᵢ − (a₁ + a₂xᵢ))² = 0   (10.22)
yielding
Σᵢ −2xᵢ(yᵢ − (a₁ + a₂xᵢ)) = 0   (10.23)
which simplifies to

a₁ Σᵢ₌₁ᴺ xᵢ + a₂ Σᵢ₌₁ᴺ xᵢ² = Σᵢ₌₁ᴺ xᵢyᵢ   (10.24)
Using the shorthand notation, Equations 10.21 and 10.24 are written as
[N  Sₓ; Sₓ  Sₓₓ] Φ = [Sᵧ; Sₓᵧ]   (10.25)
to form two linear equations. The above equation can be solved simultaneously to obtain
least squares estimates of the parameters, a1 and a2.
In matrix notation, the error for the ith data point is

εᵢ = yᵢ − [1  xᵢ] Φ,  where Φ = [a₁; a₂]   (10.26)

Writing this for each of the N data points,

ε₁ = y₁ − [1  x₁]Φ
ε₂ = y₂ − [1  x₂]Φ
  ⋮
ε_N = y_N − [1  x_N]Φ   (10.27)
Stacking these equations, with E = [ε₁ ε₂ ⋯ ε_N]ᵀ, Y = [y₁ y₂ ⋯ y_N]ᵀ, and X = [1 x₁; 1 x₂; ⋯; 1 x_N]   (10.28)

allows us to define the errors in a compact form, E = Y − XΦ. Recall that the model equation

Y = XΦ   (10.29)
is in the familiar form encountered in Chapter 2. In this example, the matrices X and Y
consist of N rows, corresponding to the N data points. Singular value decomposition (SVD)
on X matrix
X = UΣVᵀ   (10.30)
gives two nonzero singular values. When there are two equations in two unknowns, a
unique solution is found by inverting the matrix X. However, in this case with N rows, since
the matrix X has two nonzero singular values, its left inverse, given by (XTX)−1XT, exists and
will provide a least squares solution.
Let us derive this result using a procedure similar to the one used earlier. The sum of
squared errors is given by
S_ε = EᵀE = (Y − XΦ)ᵀ(Y − XΦ)   (10.31)
Recall that if we have a quadratic function f(x) = hx² − 2gx with h > 0, then x = g/h is its minimum. Analogously, the vector x = H⁻¹g minimizes the quadratic form xᵀHx − 2gᵀx for positive definite H. Expanding Equation 10.31 gives

Φ = argmin_Φ { ΦᵀXᵀXΦ − 2YᵀXΦ }   (10.32)
where the constant term YTY does not affect the value of Φ. By comparison, H ≡ XTX is a
positive definite matrix. Thus, the least squares solution that minimizes the sum of squared
errors, Sε, is
Φ = (XᵀX)⁻¹XᵀY   (10.33)
The following example demonstrates linear regression using Equations 10.25 and 10.33.
%% Method-1
N=length(T);
Sx=sum(T); Sy=sum(cp);
Sxx=sum(T.*T); Sxy=sum(T.*cp);
A=[N Sx; Sx Sxx];
b=[Sy; Sxy];
Phi1=inv(A)*b;
Using the above code, the specific heat of methane (in J/mol·K) is obtained as the following linear function of temperature (in Kelvin):

c_p ≈ 18.92 + 0.0546 T   (10.34)
Figure 10.2 shows a comparison between the data and the straight-line model fit
obtained above. One can visually see from this figure that the model does a reasonable
job of predicting the experimental data.
One can also verify that Phi1, Phi2, and Phi3 values are exactly the same.
Furthermore, both A and X'*X give the same value:
9 3598
3598 1476304
The example above was a recap of fitting a straight line to data. This also showed numeri-
cally the equivalence between Equations 10.25 and 10.33. In Problem 10.1, you are asked to
prove this equivalence for general N-dimensional X and Y matrices.
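Problem 10.1 asks you to prove the equivalence; it is also easy to spot-check numerically. The short Python sketch below (using made-up temperature and specific heat values, not the data of this example) solves the normal equations of Equation 10.25 and compares the result against a direct least squares solve of Y = XΦ:

```python
import numpy as np

# Made-up temperature/specific-heat data for illustration only.
T = np.array([300.0, 350.0, 400.0, 450.0, 500.0])
cp = np.array([35.7, 37.5, 40.5, 43.2, 46.1])

# Route 1: the 2x2 normal equations of Eq. 10.25
A = np.array([[len(T), T.sum()], [T.sum(), (T * T).sum()]])
b = np.array([cp.sum(), (T * cp).sum()])
Phi1 = np.linalg.solve(A, b)

# Route 2: least squares on Y = X*Phi (Eq. 10.33)
X = np.column_stack([np.ones_like(T), T])
Phi2 = np.linalg.lstsq(X, cp, rcond=None)[0]
print(Phi1, Phi2)
```

Any well-posed data set gives the same agreement, since the two routes solve the same minimization problem.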
Next, we will calculate some statistics and analyze how well the model fits the data.
FIGURE 10.2 Comparison of specific heat data (symbols) and a straight-line model fit (lines).
10.2.3 Goodness of Fit
10.2.3.1 Maximum Likelihood Solution*
Linear least squares described above implicitly assumes that the measurement errors are drawn from a normal distribution. The discussion here simply follows the one in the Numerical Recipes book by Press et al. The reader is referred to that book for more details. Let us assume that there exists a true model:

y_true(x; Φ),  Φ = [a₁ ⋯ a_m]ᵀ   (10.35)
from which the experimental data yi are derived. The data has a certain measurement error
associated with it. These measurement errors are independent, that is, the measurement error in
the ith data point does not influence the measurement error in the jth data point. Furthermore,
let us assume the measurement errors to be drawn from a normal (Gaussian) distribution.
Consider the problem of finding the parameters, Φ, that are most likely to produce the
experimental data ((x1, y1), (x2, y2), … , (xN, yN)). If σ is the actual standard deviation of the
measurement errors, then
Pᵢ ∝ exp( −(1/2) [(yᵢ − y_true(xᵢ; Φ))/σ]² )   (10.36)
is the probability at each data point. Since the measurement errors are independent, the
probability of all the data is a product of the individual probabilities. The most likely set of
parameters F are the ones that maximize this probability:
Φ^mle = max_Φ ∏ᵢ₌₁ᴺ α exp( −(1/2) [(yᵢ − y_true(xᵢ; Φ))/σ]² )   (10.37)
where “mle” represents that Φmle is the maximum likelihood estimate. Taking a logarithm
Φ^mle = max_Φ { N ln(α) − (1/(2σ²)) Σᵢ₌₁ᴺ (yᵢ − y_true(xᵢ; Φ))² }   (10.38)
Since the first term is constant, maximizing (10.38) is equivalent to minimizing the sum of squared residuals; the least squares solution is thus also the maximum likelihood estimate. The variance of the residual errors is estimated as

s_ε² = S_ε/(N − m)   (10.40)
Since the data has been used to compute m fitting parameters, the degrees of freedom
remaining are (N − m). For example, a unique line will pass exactly through two data points.
So, if N = 2 and m = 2, the variance is not defined.
The coefficient of determination is a popular way of quantifying how well the model fits
data. It is more popularly known as r2 value. It is commonly calculated as
r² = (S_y − S_ε)/S_y   (10.41)
where
Sy represents the variance in the data
Sε represents the variance not captured by the model
If the latter is low, r2 → 1 and the model is said to have a good fit. The coefficient of determi-
nation is a convenient way to quantify the performance of a model.
Another approach to qualitatively determine the goodness of fit is to plot the original
data and model fit on the same figure. Usually, the model is shown as a curve, whereas the
data are shown as points. Alternatively, a parity plot between the data yi as the abscissa and
the model prediction y i as the ordinate is obtained. If the data lies along the 45° line, we
have a good fit. Finally, the errors ei may be plotted against the data yi. Any trends in the
errors may become clear with such a plot. However, problems with “overfitting” the data (by
using many more parameters than necessary) are difficult to ascertain using this approach.
Likewise, it fails to compare two putative models to select the better one.
Quantitative information about the goodness of fit is provided by determining the vari-
ance in the estimates of the parameters ai . The diagonal elements of M = (XTX)−1 give the
variance of the corresponding parameters. Thus
σ_ai = √(M_i,i) s_ε   (10.42)
However, since the actual value of σε is not available, it needs to be computed from the data.
Thus, the standard deviation can only be estimated from the data.
If the actual standard deviation σ_ε is known, the standard results for the normal distribution may be used. For example, the range of parameter values with 95% confidence is given by (aᵢ ± 2σ_ai). However, since we have a limited amount of data, Student's t-distribution may be used to calculate the confidence interval for the parameters:

δci_ai = t_(N−m) σ_ai   (10.43)

Thus, the 95% confidence interval for the parameters is

aᵢ ± δci_ai   (10.44)
err=cp-cpHat;
sse=err'*err;
var_e=sse/(N-2);
stdErr=sqrt(var_e);
Coefficient of determination (“R-squared”): From the variance in the cp data, the term
Sy is calculated as
S_y = (N − 1) var(c_p) = 113.555

so that

r² = 1 − S_ε/S_y = 0.9953
Visualizing the model fit: The term r2 indicates that 99.5% of the variance in the original
data can be explained by the linear model. This can be visually seen from Figure 10.2.
Parity plot and error plots are two other ways of visualizing how well the model repre-
sents the data (Figure 10.3). The top panel is the parity plot of c_p (data) vs. ĉ_p (model). The closeness of the data points to the 45° line indicates a good model fit. Another way of visualizing the same data is to plot the residuals (errors), εᵢ = c_p,i − ĉ_p,i, as the ordinate vs. c_p,i as the abscissa. The errors are spread around the X-axis.
Although the errors are small, they form a convex curve, which may indicate the
existence of a higher-order polynomial term.
Confidence interval on least squares parameter values, Φ: The final task is to obtain confi-
dence intervals on the least squares parameters Φ. This will contain two parts: Estimate the
FIGURE 10.3 Parity plot and error εᵢ vs. c_p,i for analyzing model fit.
standard deviation (σai) of the parameters and use it to compute the confidence interval. We
will use Equation 10.42 to compute σai, whereas Student’s two-sided t-distribution will use
σai to provide the 95% confidence intervals on Φ.
The diagonal elements of (XTX)−1 are
>> sqrt(diag(inv(X'*X)))
ans =
2.0803
0.0051
The two elements are indicators of the standard deviations, σai, of the two parameters,
as given by Equation 10.42. However, in the absence of information about the mea-
surement errors, we replace the standard deviation of measurement error, σyi, with the
value sε computed from the data. Thus, the estimated value
s_ai^est ≈ √(M_i,i) s_ε
If we had a very large number of data points, sε → σε (i.e., sε approaches the true stan-
dard deviation of measurement noise sequence). Under this assumption that sε is the
true value, the 95% confidence interval is given by “mean ±2 × standard deviation.”
However, since we have a limited amount of data, the Student’s t-distribution is
appropriate. The multiplier for 95% confidence interval with a degree of freedom
(N − m) = 7 is 2.365. The t-distribution tables are easily available. For example, please
see Wikipedia link https://en.wikipedia.org/wiki/Student’s_t-distribution for this
information. If you have the Statistics toolbox in MATLAB, the value tN − m for 95%
confidence interval can be calculated as*
>> tVal=tinv(0.975,7);
Once the tVal is obtained, the confidence intervals can be computed using (10.43):
>> stdPhi=sqrt(diag(inv(X'*X)))*stdErr;
>> confValue=stdPhi*tVal;
>> confInterval=[Phi-confValue, Phi+confValue]
confInterval =
17.5573 20.2852
0.0512 0.0580
Thus, based on the above information, the 95% confidence interval for the two parameters is 17.56 ≤ a₁ ≤ 20.29 and 0.0512 ≤ a₂ ≤ 0.0580.
* The above form is because MATLAB uses the one-sided (cumulative) t-distribution. Thus, tinv(0.975,N-m) returns the value of t below which the cumulative probability is 97.5%, which corresponds to a two-sided 95% interval. Please see the table in the above link for more information.
Since the variance in both the parameters is rather small, the confidence interval computed
above is narrow. This indicates a good model fit to the data.
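The complete confidence-interval recipe can be condensed into a few lines. The following Python sketch (with made-up data chosen so that N − m = 7, matching the t value 2.365 used above) strings together Equations 10.40, 10.42, and 10.43:

```python
import numpy as np

# Made-up straight-line data with N = 9 points, so N - m = 7.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1, 18.0])

N, m = len(x), 2
X = np.column_stack([np.ones(N), x])
Phi = np.linalg.solve(X.T @ X, X.T @ y)          # least squares fit

err = y - X @ Phi
s_eps = np.sqrt(err @ err / (N - m))             # Eq. 10.40
std_Phi = np.sqrt(np.diag(np.linalg.inv(X.T @ X))) * s_eps   # Eq. 10.42
t_val = 2.365                                    # t(0.975, 7), hard-coded
conf = np.column_stack([Phi - t_val * std_Phi, Phi + t_val * std_Phi])
print(Phi, conf)
```

With the Statistics toolbox available, the hard-coded t value would instead come from tinv(0.975,N-m), as in the MATLAB example above.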
10.3 MULTILINEAR REGRESSION
Recall the sum of squared errors

S_ε = EᵀE   (10.31)
In the derivation of least squares solution, we made no assumption about the number of
parameters, m. The derivation is more general and is applicable for a larger dimensional Φ
as well. In general, if Φ is an m × 1 vector, then X will be an N × m matrix. The objective function can be expanded to give the following expression for the least squares solution for Φ:
Φ = argmin_Φ { ΦᵀXᵀXΦ − 2YᵀXΦ }   (10.32)
Φ = (XᵀX)⁻¹XᵀY   (10.33)
Consider the general linear model

y = a₁ + a₂x + a₃w + ⋯ + a_m u   (10.45)

with m parameters. The N data points are in the form (xᵢ, wᵢ, … , uᵢ; yᵢ). The prediction using
the above linear model is
ŷᵢ = [1  xᵢ  wᵢ  ⋯  uᵢ] [a₁  a₂  a₃  ⋯  a_m]ᵀ   (10.46)
Following the same procedure as above, the model predictions for all the data points can be
obtained, and the error equation written as
E = Y − XΦ, where now X = [1 x₁ ⋯ u₁; 1 x₂ ⋯ u₂; ⋯; 1 x_N ⋯ u_N] and Φ = [a₁ a₂ ⋯ a_m]ᵀ   (10.47)
With this definition, the same solution
Φ = (XᵀX)⁻¹XᵀY   (10.33)
gives the linear least squares solution for an m-dimensional parameter vector Φ.
10.3.2 Polynomial Regression
Consider the example of fitting the specific heat to a wider range of temperatures.
A straight-line fit does not work.
A polynomial of the form

c_p = a₁ + a₂T + a₃T² + ⋯ + a_m T^(m−1)

is linear in the parameters, so the least squares solution is again

Φ = (XᵀX)⁻¹XᵀY,  where X = [1 T₁ T₁² ⋯ T₁^(m−1); 1 T₂ T₂² ⋯ T₂^(m−1); ⋯; 1 T_N T_N² ⋯ T_N^(m−1)] and Y = [c_p1 c_p2 ⋯ c_pN]ᵀ   (10.50)
The next example aims to fit a higher-order polynomial to the data ranging from low to high
temperature in steps of ~100 K.
%% Matrix Method
Y=cp;
X=[ones(N,1), T, T.^2];
m=size(X,2);
M=inv(X'*X);
Phi=M*X'*Y;
%% Error analysis
err=Y-X*Phi;     % residuals, needed before plotting
sse=err'*err;
Yvar=Y-mean(Y);
sst=Yvar'*Yvar;
Rsquare=1-sse/sst;
plot(cp,err,'o');
xlabel('c_p (J/mol.K)'); ylabel('error');
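As a cross-check on the matrix method, the sketch below repeats the polynomial fit in Python on synthetic data generated from an assumed quadratic; since the data are exactly quadratic, the recovered coefficients should match the assumed ones very accurately:

```python
import numpy as np

# Polynomial regression as multilinear regression (Eq. 10.50): the
# columns of X are 1, T, T^2.  The "data" come from an assumed quadratic,
# so the fit should recover those coefficients.
T = np.linspace(300.0, 1100.0, 9)
cp = 20.0 + 0.05 * T - 1.5e-5 * T**2      # assumed "true" model

X = np.column_stack([np.ones_like(T), T, T**2])
Phi = np.linalg.lstsq(X, cp, rcond=None)[0]

err = cp - X @ Phi
sst = ((cp - cp.mean()) ** 2).sum()
Rsquare = 1.0 - (err @ err) / sst
print(Phi, Rsquare)
```

With noisy data, as in the example above, the residuals and R² become the diagnostics of interest instead of exact coefficient recovery.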
FIGURE 10.4 Polynomial fit to cp vs. T data for first-, second-, and third-order polynomials.
Figure 10.4 shows the fit for polynomials of first-, second-, and third-order to the cp
data. The corresponding R-squared values are 0.9923, 0.9992, 0.9998, respectively. All
the three R-squared values are excellent.
Figure 10.5 shows the parity and error plots for the three polynomial fits. Although
the straight line fit gives an excellent R-squared value, a higher-order polynomial gives
a better qualitative fit with regard to the spread of the errors across the X-axis.
Based on “eyeballing the errors” from the graphs, a second-order polynomial seems
a reasonable fit for the data.
The above procedure of eyeballing the errors is not very reliable. We need some quantitative
measure to determine the goodness of fit. If the variance of measurement error is known,
the Chi-squared value
χ² = Σᵢ₌₁ᴺ [(yᵢ − ŷᵢ)/σᵢ]² = [Σᵢ₌₁ᴺ (yᵢ − ŷᵢ)²]/σ²   (10.51)
provides a good estimate of goodness of fit. However, in the example above, we have no idea
of the estimate of measurement error. In such a case, the standard deviation of the param-
eters becomes a useful criterion for analyzing the model fit. For the straight line
Φ = [22.27; 0.0499],  σ_Φ = [1.254; 0.0014]
FIGURE 10.5 Parity plots (a) and error plots (b) for first-, second-, and third-order polynomials for
fitting cp vs. T data.
For the second-order polynomial,

Φ = [13.70; 0.0749; −1.52 × 10⁻⁵],  σ_Φ = [1.036; 0.0028; 1.68 × 10⁻⁶]
In both these cases, the standard deviation of parameters is an order of magnitude lower
than the parameter values. However, when a third-order polynomial is considered
Φ = [20.004; 0.0452; 2.542 × 10⁻⁵; −1.680 × 10⁻⁸],  σ_Φ = [1.436; 0.0064; 8.57 × 10⁻⁶; 3.52 × 10⁻⁹]

the standard deviations of the two highest-order coefficients are a substantial fraction of the corresponding parameter values.
Taking the singular value decomposition of the matrix X,

X = UΣVᵀ   (10.52)
Thus, using UᵀU = I and VᵀV = I, the least squares parameter estimates may be written as

Φ = V(ΣᵀΣ)⁻¹ΣᵀUᵀY   (10.54)

The term (ΣᵀΣ)⁻¹Σᵀ is diagonal with entries 1/σᵢ, where σᵢ is the ith singular value. Thus, according to Press et al., the least squares parameter estimate is given by

Φ = Σᵢ₌₁ᵐ (uᵢᵀY/σᵢ) vᵢ   (10.55)

where vᵢ and uᵢ are the ith column vectors of V and U, respectively. Furthermore, Press et al. suggest that if certain singular values are too small, they may be dropped. Let us say that only the first r singular values of X are considered to be important; then

Φ = Σᵢ₌₁ʳ (uᵢᵀY/σᵢ) vᵢ   (10.56)
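Equation 10.55 is easy to verify numerically. The Python sketch below (with made-up data) forms Φ from the singular vectors and compares it with the normal-equations solution:

```python
import numpy as np

# SVD route to least squares (Eqs. 10.52, 10.54, and 10.55):
# Phi = sum_i (u_i' Y / sigma_i) v_i, on made-up straight-line data.
x = np.array([0.5, 1.0, 1.5, 2.0, 2.5, 3.0])
Y = np.array([1.2, 1.9, 3.1, 3.8, 5.2, 5.9])
X = np.column_stack([np.ones_like(x), x])

U, s, Vt = np.linalg.svd(X, full_matrices=False)
Phi_svd = sum((U[:, i] @ Y / s[i]) * Vt[i, :] for i in range(len(s)))

Phi_ne = np.linalg.solve(X.T @ X, X.T @ Y)   # Eq. 10.33 for comparison
print(Phi_svd, Phi_ne)
```

Truncating the sum to the first r terms, as in Equation 10.56, amounts to replacing X by its best rank-r approximation, which is what makes the SVD route robust for ill-conditioned problems.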
10.4 NONLINEAR ESTIMATION
10.4.1 Functional Regression by Linearization
A classic example is that of the temperature-dependent Arrhenius rate law:
r = k₀ exp(−E/RT) Cⁿ   (10.57)
Taking the logarithm,

ln(r) = ln(k₀) + (−E/R)(1/T) + n ln(C)   (10.58)
which has the multilinear form

y = a₁ + a₂x + a₃w   (10.59)

with y = ln(r), x = 1/T, and w = ln(C).
Other commonly linearized models include the exponential and power-law forms:

y = a e^(bx)   (10.60)
y = a x^b   (10.61)
Another example is the Monod (Michaelis–Menten) rate expression

r = V C_S/(K_m + C_S)   (10.62)
Taking reciprocals gives the linear (Lineweaver–Burk) form

1/r = (K_m/V)(1/C_S) + 1/V   (10.63)

which is of the form y = a₁x + a₂, with a₁ = K_m/V and a₂ = 1/V.
Since the constant term is now the second parameter, the matrix X must reflect this appropriately. The parameters can be obtained by linear
least squares regression:
Φ = (XᵀX)⁻¹XᵀY,  where X = [1/C_S1 1; 1/C_S2 1; ⋯; 1/C_SN 1] and Y = [1/r₁; 1/r₂; ⋯; 1/r_N]   (10.64)
A related model with substrate inhibition is

r = μ_max C_S/(K + C_S + a C_S²)   (10.65)

whose reciprocal form is again linear in the parameters:

1/r = (K/μ_max)(1/C_S) + 1/μ_max + (a/μ_max) C_S   (10.66)
With a = 0, this reduces to

1/r = (K/μ_max)(1/C_S) + 1/μ_max
TABLE 10.1 Reaction Rate vs. Substrate Concentration Data for Monod Kinetics
[S] 0.24 0.32 0.70 1.37 1.58 2.04 2.26 2.28 2.39 2.41
r 0.3839 0.3935 0.911 1.4975 1.5735 1.849 1.9355 1.893 1.970 1.924
which is of the same form as Equation 10.63. Thereafter, the procedure is as described
in Equation 10.64. Specifically, we define y = 1/r and x = 1/Cs and perform the regression.
Once the least squares values are obtained, the original parameters are calculated as
μ_max = 1/a₂,  K = μ_max a₁
The values of parameters and their expected variance are obtained in a manner similar
to Example 10.1 as
Φ = [0.6210; 0.2530],  σ_Φ = [0.0376; 0.0662]
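The regression itself can be reproduced with the shorthand sums of Equation 10.25. The Python sketch below redoes the linearized fit on the Table 10.1 data and recovers the parameter values quoted above:

```python
# Linearized (Lineweaver-Burk) fit of the Table 10.1 data:
# 1/r = a1*(1/Cs) + a2, then mu_max = 1/a2 and K = mu_max*a1.
# Plain Python so the sums of Eq. 10.25 are visible.
Cs = [0.24, 0.32, 0.70, 1.37, 1.58, 2.04, 2.26, 2.28, 2.39, 2.41]
r = [0.3839, 0.3935, 0.911, 1.4975, 1.5735, 1.849, 1.9355, 1.893, 1.970, 1.924]

x = [1.0 / c for c in Cs]
y = [1.0 / ri for ri in r]
N = len(x)
Sx, Sy = sum(x), sum(y)
Sxx = sum(xi * xi for xi in x)
Sxy = sum(xi * yi for xi, yi in zip(x, y))

det = N * Sxx - Sx * Sx
a1 = (N * Sxy - Sx * Sy) / det      # slope, ~0.621
a2 = (Sy * Sxx - Sx * Sxy) / det    # intercept, ~0.253

mu_max = 1.0 / a2
K = mu_max * a1
print(a1, a2, mu_max, K)
```

Note that the fit minimizes the error in 1/r, not in r itself, which is one reason a direct nonlinear fit (Section 10.4.3) can give somewhat different parameter values.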
Caution!
One must, however, exercise caution in using a nonlinear model, either directly or linearized.
For example, consider the model
y = a × exp ( bx + c ) (10.67)
The parameters a and c cannot be obtained independently. This is because the above equa-
tion may be written as
y = (a e^c) e^(bx) ≡ a₁ e^(bx)   (10.68)
The parameters a1 and b can be obtained using linear or nonlinear regression. However, the
parameters a and c are not independently obtainable from x vs. y data. A practical example
of this will be discussed in Section 10.5.3.
Another example: Antoine’s equation
B
( )
log P sat = A +
T +C
(10.69)
T log(P^sat) = AT + (AC + B) − C log(P^sat)   (10.70)
There is nothing wrong with the above formulation, and the resulting X matrix is also
nonsingular. However, an important assumption made in our analysis of parameter
estimation was that of uncorrelated and independent Gaussian noise sequence. In the
above equation, y = T log(P sat) does not satisfy that condition of Gaussian noise; more
importantly, the terms w = log(P^sat) and y are now correlated. Thus, one should avoid a formulation where any experimentally measured quantity appears in both the definition of the dependent variable y and the independent variable(s) x/w.
The Statistics Toolbox function regress performs linear regression directly:

[Phi,confInterval,err,~,STATS]=regress(Y,X);
Rsquare=STATS(1);
stdErr=sqrt(STATS(4));
The above command can replace the calculations in Example 10.2. Here, Phi contains the least squares parameters, confInterval their 95% confidence intervals, err the residuals, and STATS a vector whose first and fourth elements are the R² value and the estimated error variance, respectively.
The advantage of using regress is that it can handle ill-conditioned X matrix better
than the direct matrix method of Example 10.3. For example, the matrix X in Example 10.3
for a third-order polynomial fit was ill-conditioned. The round-off error problem is avoided
by regress since it uses QR factorization for solving the problem. The other SVD-based
approach was presented in Section 10.3.3.
The Optimization Toolbox provides another set of tools for linear and nonlinear regres-
sion. These are optimization-based solvers. The solver useful for linear regression
is lsqlin. If the parameter estimation problem is unconstrained, then the method
described in Example 10.3 or Example 10.4 gives good results. However, if there are physi-
cal constraints on the parameter values, lsqlin provides a good option, since it solves
constrained optimization of the form*
min_Φ (1/2)‖XΦ − Y‖²  subject to  Φ_min ≤ Φ ≤ Φ_max   (10.71)
[Phi,sse,err]=lsqlin(C,d,[],[],[],[],PhiMin,PhiMax); % with C=X and d=Y
[Phi1,confInterval1,err1,~,STATS]=regress(Y,X);
Rsquare1=STATS(1);
stdErr1=sqrt(STATS(4));
* In addition to bounds on Φ, lsqlin also handles linear inequality and equality constraints. I favor this approach to keep
things simple and pertinent. Advanced readers may refer to MATLAB help or documentation of Optimization toolbox
for more information.
The reader can verify that the values of Phi1, confInterval1, err1, and Rsquare1
obtained here are the same as those obtained in Examples 10.1 and 10.2.
Using lsqlin:
[Phi2,sse2,err2]=lsqlin(C,d,[],[],[],[],PhiMin,PhiMax);
The readers can verify that Phi2 and sse2 are same as those in previous exam-
ples, and the residual error, err2, is just negative of that in previous examples (since
lsqlin computes (XΦ − Y)).
y = f(x; Φ)   (10.6)
where x and y are measured at N different data points. The aim of the least squares fitting is
to find the parameters that minimize the least squares objective function:
Φ = argmin_Φ Σᵢ₌₁ᴺ (yᵢ − f(xᵢ; Φ))² = argmin_Φ S_ε   (10.72)
with the model prediction ŷᵢ = f(xᵢ; Φ)   (10.73)
The three MATLAB solvers, lsqnonlin, lsqcurvefit, and fmincon require differ-
ent functions to be defined. The first solver, lsqnonlin, requires us to supply the residual
error as a vector:
$$E = \begin{bmatrix} e_1 \\ e_2 \\ \vdots \\ e_N \end{bmatrix} = \begin{bmatrix} y_1 - f(x_1;\Phi) \\ y_2 - f(x_2;\Phi) \\ \vdots \\ y_N - f(x_N;\Phi) \end{bmatrix} \tag{10.74}$$
with Φ being the adjustable parameter that lsqnonlin needs to determine. The syntax
for this code is
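In its simplest form (a sketch, assuming the residual function is saved as fittingFun.m):

```matlab
Phi = lsqnonlin(@fittingFun, Phi0);
```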
where Phi0 is the initial guess. The function, fittingFun.m, takes the parameter vector as its input argument and computes the residuals $e_i = y_i - \hat{y}_i$. The N × 1 vector of residual errors must be returned by the function. Since fittingFun will also require the data to calculate the residuals, the more appropriate syntax aligned with our way of using MATLAB is
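A sketch of this call, with xData and yData holding the measured data:

```matlab
Phi = lsqnonlin(@(X) fittingFun(X, xData, yData), Phi0);
```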
The second solver, lsqcurvefit, requires us to provide $y_i$ and $\hat{y}_i$, with the latter being calculated in a function that is supplied to the solver. In other words, we will write a function modelFun.m that will calculate $\hat{y}_i = f(x_i;\Phi)$ and return a vector
$$\hat{Y} = \begin{bmatrix} f(x_1;\Phi) \\ f(x_2;\Phi) \\ \vdots \\ f(x_N;\Phi) \end{bmatrix}$$
Phi=lsqcurvefit(@(X,xData) modelFun(X,xData),Phi0,xData,yData);
Personally, I find there is little to choose between the two solvers. Since lsqnonlin has a more general structure, I prefer it over lsqcurvefit.
The final option is to use any generic optimization solver to solve the least squares prob-
lem. If a solver, such as fmincon or fminunc is used, the objective function should pro-
vide the least squares objective, that is, Sε from Equation 10.72. The syntax for using them is
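A sketch of these calls (the bound vectors PhiMin and PhiMax in the fmincon variant are assumptions):

```matlab
% Unconstrained minimization of the sum of squared errors
Phi = fminunc(@(X) estimObj(X, xData, yData), Phi0);
% Constrained version, with lower/upper bounds on the parameters
Phi = fmincon(@(X) estimObj(X, xData, yData), Phi0, ...
              [], [], [], [], PhiMin, PhiMax);
```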
and the function estimObj.m should return the sum of squared errors.
The next example demonstrates the three methods for solving a nonlinear regression problem.
$$r = \frac{\mu_{\max} C_S}{K_S + C_S}$$
function Rate=modelFun(Phi,CsData)
mu=Phi(1); K=Phi(2);
Rate=mu*CsData./(K+CsData);
function err=fittingFun(Phi,CsData,rData)
mu=Phi(1); K=Phi(2);
Rate=mu*CsData./(K+CsData);
err=rData-Rate;
function sse=estimObj(Phi,CsData,rData)
mu=Phi(1); K=Phi(2);
Rate=mu*CsData./(K+CsData);
err=rData-Rate;
sse=err'*err;
Now that these functions are written, the following driver script is used to compute
the estimates Φ using these three methods:
%% Generic Minimization
Phi3=fminunc(@(X) estimObj(X,CsData,rData), Phi0);
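For completeness, the corresponding calls for the other two methods, using the functions defined above, are (a sketch):

```matlab
%% Nonlinear Least Squares
Phi1 = lsqnonlin(@(X) fittingFun(X, CsData, rData), Phi0);
%% Curve Fitting
Phi2 = lsqcurvefit(@(X, xData) modelFun(X, xData), Phi0, CsData, rData);
```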
The parameters estimated using all three methods are $\Phi = \begin{bmatrix} 3.80 & 2.23 \end{bmatrix}^T$.
Compare these with the parameter values found in Example 10.4 using linear
regression:
μmax = 3.95 , K = 2.45
Figure 10.6 compares the model predictions from nonlinear regression (top) and linear
regression (bottom) for this example. The coefficient of determination, R-square, was
0.996 for nonlinear regression. Thus, both the models provide reasonable performance.
FIGURE 10.6 Model fit (lines) for Monod kinetics data (symbols). (a) Shows results for lsqnonlin
and (b) for linear regression from Example 10.4.
goodness of fit based on R-squared values and 95% confidence intervals was calculated
(Example 10.2). In Example 10.3, polynomials of various orders were fitted to the specific
heat data for the full range of temperature from 298 to 1300 K. A quadratic or cubic poly-
nomial was found to give a good fit to the data. Interested readers will find it useful to
review these three examples to understand linear regression and goodness of fit estimates.
Finally, Example 10.5 showed that the same results are obtained from in-built MATLAB
functions from statistics and optimization toolboxes.
$$\ln\left(p^{\mathrm{sat}}\right) = A - \frac{B}{T + C} \tag{10.3}$$
%% Observed data
T=[50:10:140];
pSatB=[0.3751, 0.5027, 0.7112, 0.9883, 1.2529,...
1.8758, 2.3648, 2.9474, 3.9049, 4.5455];
%% Obtain Antoine's coefficients
C=250; % Coefficient C is given
Y=log(pSatB');
X=[ones(10,1), -1./(T'+C)];
PHI=inv(X'*X)*X'*Y;
A=PHI(1);
B=PHI(2);
pSatModel=exp(A-B./(T+C));
%% Plot results
plot(T,pSatB,'bo',T,pSatModel,'-r');
xlabel('Temperature (^oC)'); ylabel('p^{sat} (bar)');
figure(2);
plot(T,(pSatB-pSatModel),'bo');
xlabel('Temperature (^oC)'); ylabel('Residual, e_i');
The values of the parameters are A = 10.12 and B = 3343.0. The coefficient of deter-
mination is r2 = 0.998. Figure 10.7 shows the comparison between the model fit and
observed data (top panel) as well as how the model errors are spread.
Let us also look at 95% confidence intervals for the two parameters, which can be computed
as discussed in Section 10.2.3 to yield A = 10.12 ± 0.358 and B = 3343 ± 122.
Note how narrow the 95% confidence intervals are for the parameters A and B. Very narrow confidence intervals mean that the model is very likely to capture the actual variance underlying the observed data. This happens either when the model is overfitted, or when the measurement errors are low and all the measurements line up nicely, as we expect from the model. If these
values sound too good to be true, they really are! The vapor pressure data in Table 10.4 is
not actual observed data; it is synthetic data that I generated using the same model.
Thus, the statistical analysis of model parameters can reveal additional information
about the model fit and the data, which is not revealed by merely looking at Figure 10.7 or
FIGURE 10.7 Observed data (symbols) and model fit (lines) for the saturation pressure of benzene
and the residual errors at various temperatures.
the R-squared value. This is merely a brief introduction to this field. Interested readers are
referred to advanced texts on regression and data analysis for more information.
function errVal=antoineFun(Phi,xData,yData)
T=xData';
lnPsat=log(yData');
% Model calculation
A=Phi(1); B=Phi(2); C=Phi(3);
modelVal=A-B./(T+C);
errVal=lnPsat-modelVal;
end
%% Observed data
T=[50:10:140];
pSatEB=[0.0489, 0.0703, 0.1082, 0.1626, 0.2182, ...
0.3595, 0.4786, 0.6281, 0.8885, 1.0714];
%% Regression
PhiGuess=[10; 2500; 200];
PHI=lsqnonlin(@(X) antoineFun(X,T,pSatEB),PhiGuess);
A=PHI(1); B=PHI(2); C=PHI(3);
The values of the three parameters are A = 11.3604, B = 4583.9, and C = 267.9; Figure 10.8 shows the model fit for ethylbenzene.
A → B
A series of experiments were run and the reaction rate vs. concentration data was obtained.*
We will do this problem in two parts. In the first case, the reactor was operated with only
A in the feed at an initial concentration of CA0 = 1. With only this data available, we would be unable to fit all three parameters to the data. Next, we will use experimental data at
various initial concentrations of A and B to obtain all the rate constants.
* This is also synthetic data used for demonstration purpose. It was generated with the same underlying model to highlight
the issues discussed in this example.
FIGURE 10.8 Observed data (symbols) and model fit (lines) for saturation pressure of ethylbenzene
and the residual errors at various temperatures.
Since the only reaction taking place is A → B, the concentration of B is obtained from over-
all mass balance, noting that CA + CB = CA0 holds for the duration of the experiment. As a
first try, let us naïvely use the above data for performing linear regression of the LHHW
model. The nonlinear expression (10.75) can be converted to a linear form by inverting and
taking a square root:
$$\frac{1}{\sqrt{r}} = \underbrace{\frac{1}{\sqrt{k}}}_{\phi_1}\left(\frac{1}{\sqrt{C_A}}\right) + \underbrace{\frac{K_A}{\sqrt{k}}}_{\phi_2}\sqrt{C_A} + \underbrace{\frac{K_B}{\sqrt{k}}}_{\phi_3}\left(\frac{C_B}{\sqrt{C_A}}\right) \tag{10.76}$$
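A sketch of this naive linear regression (CaData and rData are assumed to be column vectors of the measured concentrations and rates, with CB = 1 − CA for this data set):

```matlab
% Linear regression of Equation 10.76 with CB = 1 - CA
Y = 1./sqrt(rData);
X = [1./sqrt(CaData), sqrt(CaData), (1-CaData)./sqrt(CaData)];
phi = X\Y;   % triggers the rank-deficiency warning
```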
When the above code is executed, MATLAB shows a warning of the form Warning: Rank deficient, rank = 2. This message should not be ignored. Since we have used the left division operator “\” to compute Φ, MATLAB will still give a solution; the software does not know whether the rank-deficient condition was intended. Some students would look at the result vector:
>> phi
phi =
6.7348
6.6996
0
If the importance of the above example is not yet clear, let me write it again in bold font:
Do not ignore warnings in MATLAB.
Let’s see why this happens—due to the material balance, CB = 1 − CA. Substituting this in
Equation 10.75, we get
$$r = \frac{kC_A}{\big(1 + K_A C_A + K_B(1-C_A)\big)^2} = \frac{kC_A}{\Big(\underbrace{(1+K_B)}_{a_1} + \underbrace{(K_A - K_B)}_{a_2}\,C_A\Big)^2} \tag{10.77}$$
As seen in the above expression, only a1 and a2 can be identified independently with the
data used in this example. When we write the rate expression in this form, the problem per-
haps becomes obvious. But what about the modified expression, Equation 10.76:
$$\frac{1}{\sqrt{r}} = \underbrace{\frac{1}{\sqrt{k}}}_{\phi_1}\left(\frac{1}{\sqrt{C_A}}\right) + \underbrace{\frac{K_A}{\sqrt{k}}}_{\phi_2}\sqrt{C_A} + \underbrace{\frac{K_B}{\sqrt{k}}}_{\phi_3}\left(\frac{1-C_A}{\sqrt{C_A}}\right) \tag{10.78}$$
$$x = \frac{1}{\sqrt{C_A}}, \quad u = \sqrt{C_A}, \quad w = \left(\frac{1}{\sqrt{C_A}} - \sqrt{C_A}\right)$$
When written this way, we can immediately observe that x, u, and w are linearly dependent: w = x − u.
While this problem is fairly clear when we use linear regression, naïve use of nonlinear
regression will mask these problems. Recall our statement in Section 10.4.1: One must exer-
cise caution when using regression.
FIGURE 10.9 Parity plot for the test data, with estimated values plotted vs. observed values.
%% Testing results
NTest=length(rateTest);
rHatTest=(k*CaTest)./(1+Ka*CaTest+Kb*CbTest).^2;
err=rateTest-rHatTest;
sse=err*err';
ssy=var(rateTest)*(NTest-1);
Rsquare=1-sse/ssy;
plot(rateTest,rHatTest,'bo','markersize',3);
The rate parameters obtained are k = 0.051, KA = 2.03, and KB = 0.49. Figure 10.9 shows
a comparison between observed data and model predictions. The symbols lie close
to the 45° line, indicating a reasonable fit. The R-squared value of r2 = 0.97 with test-
ing data indicates a good fit. Statistical information about confidence intervals also
indicates a good fit for the kinetic model. Thus, including additional data at different
initial CB0 concentrations improves the model performance.
$$r = kC_A^n \tag{10.79}$$
TABLE 10.2 Concentration vs. Time Data for Determining Kinetic Rate Constants
t 5 30 60 90 120 150 180 210 240
CA 0.988 0.828 0.651 0.525 0.379 0.253 0.160 0.086 0.063
There are two different approaches that chemical engineers are familiar with. The first
one is the differential approach. Here, we use the definition of the reaction rate in constant
volume batch reactor as
$$r = \frac{1}{V}\frac{dN_A}{dt} = \frac{dC_A}{dt} \tag{10.80}$$
Given the above data, we may be tempted to use numerical differentiation to compute the
reaction rate defined above. However, this approach is discouraged for two reasons. First,
the experimental data is available at sparse intervals, making the rate computations highly
approximate. However, much more importantly, there are measurement errors in the data.
High frequency errors (such as measurement errors) get highly accentuated in numerical
differentiation. For example, at time t = 120, the (central-difference) numerical derivative gives

$$r_{120} = \frac{0.253 - 0.525}{60} = -4.53 \times 10^{-3}\ \text{mol/L·s}$$
However, if there were up to a 2.5% error in the experimental data (which is actually on the lower side of what one can expect in experiments), we would instead obtain

$$r_{120} = \frac{0.246 - 0.535}{60} = -4.82 \times 10^{-3}\ \text{mol/L·s}$$
The computed rate thus changes by over 6% for a 2.5% error in the data. A better alternative is to first fit a smooth curve, such as a quadratic polynomial, to the observed data. The following part of the code does that:
n=length(CaData);
X=[ones(n,1), tData, tData.^2];
p=X\CaData;
% Plot and verify the fit
Y=p(1)+p(2)*tData+p(3)*tData.^2;
plot(tData,CaData,'bo',tData,Y,'-r');
Figure 10.10 shows that the quadratic polynomial represents the data quite well. We
are not done yet. This is just a correlation that we fitted for convenience. We want to
use it for obtaining reaction rate parameters. It is easy to see that for the fitted polyno-
mial, the reaction rate is
$$r = -\left( a_2 + 2a_3 t \right)$$
where the negative sign is because A is a reactant. The following part computes the
corresponding reaction rate:
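A sketch of this step, consistent with the fitted polynomial above (tData and CaData are assumed to be column vectors; rData holds the rate at each measurement time):

```matlab
% Rate from the fitted polynomial: r = -(p(2) + 2*p(3)*t)
rData = -(p(2) + 2*p(3)*tData);
disp([tData'; CaData'; 1000*rData']);   % last row scaled by 1000 for display
```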
The last row is multiplied by 1000 for display. Now that we have reaction rate and concentration data at each time point, we can proceed with parameter estimation.
FIGURE 10.10 Observed data (symbols) and quadratic polynomial fit (line) to the data.
FIGURE 10.11 Observed data (symbols) and the fit for kinetic rate expression (dashed line).
The overall code for the differential method of rate analysis consists of the above three parts, combined in the same script file. Figure 10.11 shows that the model fits the overall data very well. The kinetic rate expression obtained through this approach is
10.6 EPILOGUE
10.6.1 Summary
In this chapter, we covered methods for obtaining model parameters that fit a set of obser-
vations. We also provided a basic introduction to analyzing statistical properties to ver-
ify goodness of fit. I would like to impress upon the reader that obtaining least squares
parameters is the easy part of the regression process. These parameters are not useful unless
accompanied by some goodness-of-fit measures. Three different ways were introduced: (i) a qualitative method of plotting experiments and model fits together; (ii) simple statistics, such as the coefficient of determination (“R-squared”) and the sum of squared errors; and (iii) statistical bounds on parameters and confidence intervals.
I close this chapter by mentioning matrix-based and singular value decomposition–
based approaches as my recommended approaches for linear least squares. If the parameter
estimation problem is not linearizable, then the nonlinear estimator may be used. I would
recommend lsqnonlin as a preferred algorithm (Optimization toolbox).
If you have access to the Statistics toolbox, using the full power of the toolbox is highly
recommended. The treatment in this chapter will help you build the foundation for under-
standing more advanced material in statistical analysis.
10.6.2 Data Tables
Table 10.3 shows the data for specific heat of methane at two different temperature ranges
(298–500 K and 600–1300 K). The data for pure component vapor pressures of benzene and
ethylbenzene are shown in Table 10.4, whereas reaction rate data is shown in Table 10.5.
TABLE 10.4 Saturation Pressures of Benzene and Ethylbenzene
T (°C)   psat, benzene (bar)   psat, ethylbenzene (bar)
50       0.3751                0.0489
60       0.5027                0.0703
70       0.7112                0.1082
80       0.9883                0.1626
90       1.2529                0.2182
100      1.8758                0.3595
110      2.3648                0.4786
120      2.9474                0.6281
130      3.9049                0.8885
140      4.5455                1.0714
Sources: Benzene: NIST WebBook, http://webbook.nist.gov/cgi/
cbook.cgi?ID=C71432&Mask=4#Thermo-Phase; Ethyl
benzene: NIST WebBook, http://webbook.nist.gov/cgi/
cbook.cgi?ID=C100414&Mask=4#Thermo-Phase.
EXERCISES
Problem 10.1 Show that Equations 10.25 and 10.33 are equivalent. In other words, show
that
$$X^T X = \begin{bmatrix} N & S_x \\ S_x & S_{xx} \end{bmatrix}, \qquad X^T Y = \begin{bmatrix} S_y \\ S_{xy} \end{bmatrix}$$
Problem 10.2 Write your code for Example 10.4 and derive the parameter values and the
standard deviations of these parameters.
For the data in Example 10.4, fit substrate inhibited rate law:
$$r = \frac{\mu_{\max} C_S}{K + C_S + aC_S^2}$$
Appendix A: MATLAB® Primer
If your code replaces any of the above variables, then that new assigned value stays with
that variable and it loses its original (default) value. Thus,
>> i = 2
i =
2
NPTEL offers its online web and video courses free. Video lectures on MATLAB and related material can be accessed at: http://nptel.ac.in/courses/103106118/ (August 1, 2016).
From this point onward, i will not represent the imaginary unit but instead shall represent
the value assigned to it.
Note that the value of i above is echoed (i.e., displayed) on the screen. The echo can be
suppressed by ending the statement with a semicolon:
The variable ans is a special variable. When an expression is executed but the value is not
assigned to any variable, the value is captured in the variable ans. For example
>> 3*i
ans =
6
Since the expression 3*i is not assigned to any variable, the latest such value is captured in
ans and the value is echoed on the screen (since there is no semicolon to end the statement).
$$A = \begin{bmatrix} 4 & 2 & 3 & 4 \\ 3 & 1 & 4 & 7 \\ 4 & 0 & 5 & 8 \end{bmatrix}, \qquad E = \begin{bmatrix} 2 & 3 \\ 1 & 4 \end{bmatrix}$$
However, E should be obtained from array A, and not by entering a new matrix.
A.2.1.1 Building Up Matrices
Let us now look at MATLAB commands that achieve the above. Matrix A can be assigned as
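One way to do this (a sketch) is:

```matlab
>> A = [4, 2, 3, 4; 3, 1, 4, 7; 4, 0, 5, 8];
```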
The semicolons separate multiple rows. Individual cells on each row can be separated by
comma (or spaces). It is important to know a couple of things. First, much as mathematical operations follow the brackets–exponent–multiplication–addition order of precedence, MATLAB parses a matrix row by row, and each row column by column; every row must therefore contain the same number of elements. For example
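Consider a command of the following shape (a sketch; the exact error text varies across MATLAB versions):

```matlab
>> A = [4; 3; 4, 2]
Error using vertcat
Dimensions of arrays being concatenated are not consistent.
```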
The error signifies that I am trying to concatenate the first two rows with one element each
and the third row with two elements, hence the error. However, one may use brackets to
create the matrix:
>> A = [[4; 3; 4], [2; 1; 0], [3; 4; 5], [4; 7; 8]]
A =
4 2 3 4
3 1 4 7
4 0 5 8
Here, each bracket creates a 3 × 1 vector; four such vectors are concatenated to give a
3 × 4 matrix. Note that square brackets enclose a vector/matrix/array. On the other hand,
parentheses are used to query a location in the array. Thus, the command
>> A(3,2)
ans =
0
will result in a row vector [1 2 3]. You may try the following commands in MATLAB,
thinking about what the result would be before you execute the command in MATLAB
(and hence check your understanding):
>> p=1:3
>> q1=1:3:5
>> q2=1:1.5:5
>> q3=5:-1:1
>> q4=5:1
Perhaps the last command needs explanation. It says “5 to 1 in steps of +1.” However, the first value, 5, is greater than the final value; hence, no values can be generated and q4 becomes an empty matrix. Note that this does not give an error; instead, an empty matrix is returned.
>> diag(A)
ans =
4
1
5
The above result is fairly straightforward—the vector returned contains diagonal elements of A:
>> diag(diag(A))
ans =
4 0 0
0 1 0
0 0 5
From the above description of the diag command, does your understanding of diag
match MATLAB results?
>> A(2,p)
ans =
3 1 4
The above command returns from the second row of A, the elements corresponding to
column numbers indicated by p vector, [1 2 3]. In other words, it returns the first three
elements of the second row of A.
The command to extract the entire second row of A is
>> d=A(2,:)
d =
3 1 4 7
Note that simply using colon implies “all.” Thus, the above command implies elements in A
at row 2—and all elements in that row.
It should be fairly clear how to get matrix E that contains entries from first two rows and
columns 2 and 3 of matrix A:
>> E = A(1:2,2:3)
E =
2 3
1 4
Finally, here is a command for the reader to try. It is perhaps a good idea for you to think of
what result you would expect and then execute the command in MATLAB:
>> F = A([3,2],[1,3])
Does the result match your reasoning? If still confused, think what the commands below
will give:
>> A(3,[1,3])
Next, consider
>> A(2,[1,3])
The result F above is nothing but the result of the above two commands placed in the first
and second rows of F, respectively.
Finally, there is a special keyword, end, to represent the last element of a vector/matrix:
>> d(end)
ans =
7
>> A(end-1,:)
ans =
3 1 4 7
>> 2*A;
>> A/2;
>> A+2;
In the commands above, each element of A is doubled and halved, and 2 is added to each
element of A, respectively. Note that in MATLAB, (A + 2) is not equivalent to (A + 2I).*
Transpose in MATLAB is obtained using single quotation (apostrophe) key. Thus, the
reader can verify the following commands:
>> Atrans=A';
>> Cr=A*A';
>> Cl=A'*A;
* The origin of this misunderstanding among my students is not clear to me. In matrix algebra, one does not add a scalar
to a matrix. In this book, however, we will understand (A+2) in the context of the MATLAB definition: 2 added to each
element of A.
Since A is a 3 × 4 matrix, the above three matrices have size 4 × 3, 3 × 3, and 4 × 4, respec-
tively. The variable Cr is
Cr =
45 54 63
54 75 88
63 88 105
The commands sum(d) and prod(d) compute the sum and product of all elements of
the vector, d, respectively.* Since the vector d=[3 1 4 7], the two values are 15 and 84,
respectively. The cumulative sum is computed using cumsum(d), the result being a vector
of the same dimension as d itself. The first element of the resulting vector is the first element
of d, the second element is the sum of the first two elements, the third element is the sum of
the first three elements, and so on:
>> cumsum(d)
ans =
3 4 8 15
The command cumprod works in a similar way and gives the cumulative product.
The command to compute the inverse is inv. The following two commands are equivalent:
>> F=inv(E)
F =
0.8000 -0.6000
-0.2000 0.4000
>> H=E^(-1)
H =
0.8000 -0.6000
-0.2000 0.4000
>> E.*E
ans =
4 9
1 16
* The sum (or prod) command operates on arrays as well, where the sum (or product) of all elements in each of the columns
of the array is computed.
† MATLAB calls the element-wise operator an array operator to distinguish it from the matrix operator. Thus, E*E is “matrix multiplication,” whereas E.*E is “array multiplication” of individual elements of the first array with the corresponding elements of the second array.
Note that the product operates on each element of E individually. Since E = [2 3; 1 4],
the result of the above operation is a matrix of the same dimension, with each individual
element squared. Contrast that with the regular matrix product:
>> E*E
ans =
7 18
6 19
>> G=1./E
G =
0.5000 0.3333
1.0000 0.2500
>> I=E.^(-1)
I =
0.5000 0.3333
1.0000 0.2500
The element-by-element power operation (array operation) not only works to get scalar
power of a vector, it can also be employed as follows:
>> 2.^d
ans =
8 2 16 128
Note that each element of the resulting ans is the scalar 2 raised to the power of the cor-
responding element of the vector d.
>> 1/E
Error using /
Matrix dimensions must agree.
The forward slash operator is the so-called right division. Roughly speaking, A/B evaluates
as A*inv(B) if B is a square matrix.* Likewise, the backward slash operator is the left divi-
sion, with A\B roughly evaluating as inv(A)*B.
* A right division is a least-squares solution of a linear equation xB = A, whereas a left division is a least-squares solution of a
linear equation Ax = B. Please refer to Chapter 10 for more information on least-squares solutions.
A.2.4 Test Yourself
The percentage sign (%) signifies that all text following it is a comment, which is
ignored by MATLAB. The vector v is defined using the colon notation described in
Section A.2.1. The array operations described in Section A.2.2 give the vectors f and g,
and the final result is calculated using the command sum. The result is
>> res
res =
0.6665
The above code can be saved in a MATLAB file. As discussed in Section A.5, such a file is
known as “MATLAB Script.”
A.3 MATHEMATICAL FUNCTIONS
I have already covered a few vector functions (sum, cumsum, etc.) and a couple of matrix
functions (inv and slash). Let us now cover a few more functions.
>> l=length(d);
>> [m,n]=size(A);
The first command returns the length of the vector d, whereas the second command returns
the number of rows (in variable m) and columns (in variable n). Thus, l=4, m=3, and n=4.
Applying the length command to an array returns the larger of m and n:
>> length(A)
ans =
4
Power, logarithm, and exponent are other important functions in MATLAB, which have
both matrix and array versions. In the previous section, I explained matrix multiplication,
division, and power and contrasted them with array (i.e., element-wise) versions of the
same. An alternate command for power include mpower (matrix power, E^n) and power
(array power, E.^n):
>> power(E,2)
ans =
4 9
1 16
>> mpower(E,2)
ans =
7 18
6 19
In the same manner, the sqrtm and sqrt functions are used to find the matrix square root
and element-wise square root of an array. The commands E^0.5, mpower(E,0.5),
and sqrtm(E) all give the same results.
The functions for finding the element-wise exponent or natural logarithm of an array are exp and log; their matrix counterparts are expm and logm. Unlike sqrt, the matrix square root sqrtm requires a square matrix:
>> sqrtm(A)
Error using sqrtm
Expected input to be a square matrix.
The matrix functions mpower, expm, logm, and sqrtm can be applied to square
matrices only.
>> display(res)
res =
0.6665
Either echoing a variable or using display causes MATLAB to display the name of the
variable as well. Moreover, the entire display takes three lines.† Alternatively, the disp
command is a more succinct method since the variable name is suppressed:
>> disp(res)
0.6665
* In reality, when one does not use a trailing semicolon, MATLAB internally calls the display command itself. This is hid-
den to the user but is the reason why one sees the same result.
† The example shown above is exactly how MATLAB echoes a variable or uses display. A blank line is left before the dis-
played information and another blank line between the variable name and values. In all of this text, these superfluous blank
lines are not shown for brevity.
Since the above method does not print any information other than the variable value, MATLAB string operations are used instead. For example
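A sketch of the command, concatenating two strings with square brackets:

```matlab
>> str = ['first string', 'second string'];
```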
will yield a string str = 'first stringsecond string'. Note the lack of space
between the two strings that are concatenated. The above string can be displayed using
disp(str). Let us now add a space at the end of the first string and replace the second
string with a numeric string:
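A sketch of the commands (s2 is now a numeric string):

```matlab
>> s1 = 'The result is ';
>> s2 = '0.6665';
>> disp([s1, s2])
The result is 0.6665
```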
Here are two important points to note: (i) the use of square brackets to define a concatenated
string and (ii) the use of single quotes enclosing '0.6665'. The command is disp([s1,s2]), where s1 and s2 are string variables. If the square brackets are eliminated, MATLAB reads
this as disp(s1,s2). Since disp takes only one input argument, this will result in an
error: Too many input arguments. Since the two strings need to be printed side by
side, horizontal catenation is used (exactly in a similar way as that with numeric variables).
Thus, the command used is disp([s1,s2]). The second point is regarding the differ-
ence between 0.6665 and '0.6665': The former is a number, whereas the latter is a
string. Numbers cannot be concatenated with strings.
So, what do we do if we have the result in numerical variable, res? This number is con-
verted to a string using the num2str command. Thus, a better way of displaying the result is
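A sketch, with res holding the numeric result from earlier:

```matlab
>> disp(['The result is ', num2str(res)])
The result is 0.6665
```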
A recap of the various things just discussed: (i) using num2str to convert numeric
variable res to a string; (ii) concatenation of this to a string 'The result is ';
(iii) enclosing the strings to be concatenated within square brackets; and (iv) inclusion
of a space within the prefix-string to ensure there is a space between the string and
displayed value.
Next, consider the command:
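A sketch of such a command (the exact error text varies by MATLAB version):

```matlab
>> disp(['Matrix E is: ', num2str(E)])
Error using horzcat
Dimensions of arrays being concatenated are not consistent.
```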
The reason for this error is that num2str(E) results in a string array with two rows; this
cannot be concatenated with a string ('Matrix E is: ') having a single row.
An alternative way of displaying results is to use the command sprintf. Since this
command is not used much in this book, it will be skipped for brevity.
A.4.2 Plotting Results
Try the following problems yourself:
1. We want to plot y = 2x2 + 0.5 for the values of 0 ≤ x ≤ 1 in steps of 0.01. Generate x
using colon “:” notations. This line should be dashed-red line.
2. Please use help plot to learn how to plot in MATLAB, and to find out how to plot
with a dashed-red line instead of solid blue.
3. Label the two axes as “x” and “quadratic,” respectively, using xlabel and ylabel
commands in MATLAB.
4. On the same figure, plot z = x2 + 2x as a default line.
The following video on the MathWorks site gives an introduction to basic plotting functions:
http://in.mathworks.com/videos/using-basic-plotting-functions-69018.html
The following example solves the above problem.
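A sketch of the solution:

```matlab
x = 0:0.01:1;          % x from 0 to 1 in steps of 0.01
y = 2*x.^2 + 0.5;      % first quadratic
z = x.^2 + 2*x;        % second quadratic
plot(x, y, '--r');     % dashed red line
hold on;               % retain the current plot
plot(x, z);            % default solid blue line
xlabel('x');
ylabel('Quadratic');
hold off;
```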
The first plot command tells MATLAB the format for the line to be plotted: '--r'
implies dashed line (--) with red color (r). The resulting plot is shown in Figure A.1.
Note the use of the command hold on. Without the use of this command, the
current plot will be overwritten. Since I wanted to plot the second line on the same
plot as the first one, I used this command to hold the previous plot. The hold on
command needs to be used only once for each figure. To toggle back* to the default
mode of overwriting a previous plot, use hold off.
Note that symbols may be displayed by using the appropriate plot formatting string. The last
item in the format string signifies the symbol. Thus, '-b.' indicates solid line, blue color,
with dot symbols, whereas 'ko' implies no line, black color with circle symbols.
* Simply using hold will toggle between on and off states. I personally discourage such use in favor of using hold on
and hold off explicitly.
The code can be written in a MATLAB file (say “arrayOperations1.m”) and executed by typing
>> arrayOperations1
at the command prompt. All the variables declared and used in the script file are available
in the MATLAB workspace after the script completes its execution.
A MATLAB function, on the other hand, is a file that takes certain sets of inputs, executes
a sequence of steps defined in that file, and returns the outputs to the calling entity. The
set of inputs and outputs are known as input / output arguments. A MATLAB function file
(usually) has the same name as the name of the function itself. The first line of a function
(myFirstFun.m) will be
function [out1,out2,...] = myFirstFun(in1,in2,...)
There can be zero or more input arguments. These input arguments (in1, in2, etc.) are
written after the function name, and are comma-separated and enclosed within parenthe-
ses. Likewise, there can be zero or more output arguments (out1, out2, etc.), which are
comma-separated and enclosed within square-brackets and written before the function
name. Each MATLAB function has its own workspace and exchanges information through
these input and output arguments. Any other variables defined or changed within the func-
tion have a local scope, meaning that they are not accessible outside the function and are
“lost” once the function finishes execution.*,†
A THOUGHT EXERCISE
Consider a thought exercise: You and your friend have a set of instructions on how to solve
a problem. Your friend will assist you in solving the problem. You have a notepad where you
have written down several values that are required for solving the problem. You have access
to your own set of instructions, but not to those of the other person. After executing some of
your instructions, you are told to seek your friend’s help.
Scenario 1: Consider the first scenario, where you and your friend use the same shared note-
pad. At this stage, you pass on this notepad to your friend, who then executes their instructions
and does all the computations on this notepad. On completion, your friend returns the notepad
to you. Since you shared the notepad, you have access to not only the result but also to all the
number-crunching work done by your friend. In this case, your friend is a MATLAB Script.
Scenario 2: Consider another scenario, where your friend has their own notepad that is not
accessible to you. For helping you, they only need a couple of pieces of information from you.
You write them on a yellow post-it note and give the post-it to your friend. Your friend then
implements instructions and does all the computation on their own notepad. On completion,
the information you seek is returned to you written on another green post-it note. In this case,
your friend is a MATLAB function, their notepad is the function’s own workspace, the yellow
post-it note is a set of input arguments and the green post-it note is a set of output arguments.
* An exception to this rule is global variables, which must be declared as global in the function. They are accessible to any function or script that uses the global declaration.
† MATLAB also allows defining persistent variables, which are not lost once the function finishes execution. They are mentioned here for the sake of completeness but will not be discussed in this book.
466 ◾ Appendix A: MATLAB® Primer
Let us now consider an example where we will use MATLAB files. In Chapter 10, the specific heat of methane gas at various temperatures is obtained in the form of the following quadratic function:

c_p = 13.7 + 0.075\,T - 1.52 \times 10^{-5}\,T^2

where
cp is the specific heat in J/mol · K
T is the temperature in Kelvin
Let us consider two cases: (i) where we want to compute the specific heat for a given tem-
perature and (ii) where this computation of cp(T) needs to be used as a part of a larger code.
The next example distinguishes the two cases.
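A script consistent with the output shown below might look as follows (a sketch; the coefficient values are those used in the spHeat function later in this section):

```matlab
% spHeat.m: compute specific heat of methane at a user-specified temperature
T = input('Please enter temperature (K): ');
a1 = 13.7; a2 = 0.075; a3 = -1.52e-5;    % quadratic coefficients from Chapter 10
cp = a1 + a2*T + a3*T^2;                 % specific heat, J/mol.K
fprintf('At T = %d K, c_p = %.3f J/mol.K\n', T, cp);
```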
The above lines are typed in the MATLAB editor, and the file is saved as spHeat.m. This
file is a MATLAB script. It can be executed by typing the file name at the command
prompt (or in any other file). This gives
>> spHeat
Please enter temperature (K): 450
At T = 450 K, c_p = 44.372 J/mol.K
Note that the value of 450 (underlined for highlighting) was entered by the user.
Specific heat was computed at that temperature and assigned to the variable cp.
A script is a perfect way to write a sequence of MATLAB statements that one needs to
execute for a particular job. In examples in Chapters 4 and 7, the specific heat will be used
for other computations. In such a case, a MATLAB function will be more appropriate.
function cp=spHeat(T)
%% Compute specific heat at a given temperature
a1=13.7;
a2=0.075;
a3=-1.52e-5;
cp=a1+a2*T+a3*T.^2;
end
The computation of cp is vectorized by using the array operator ".^" so that the specific heat can be computed at multiple temperature values if T is a vector. The above commands were saved as a function. The function needs to be invoked from the command line by providing input arguments and capturing the results as output arguments:
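The invocation might look as follows (a sketch; it assumes the spHeat function above is saved on the MATLAB path, and T_gas = 450 matches the earlier example):

```matlab
clear                    % clear all workspace variables
T_gas = 450;             % temperature, K
cp = spHeat(T_gas)       % compute specific heat; no semicolon, so cp is displayed
```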
The first command clears all the workspace variables, whereas the final command
calls the function to compute specific heat. The input argument may either be a vari-
able or just a value (scalar, vector, or array, as required in the function).
The first difference between the script and function is based on scope of the variables.
Variables of a function are local to that function itself. So, after executing the function,
if we were to check variables in our workspace, we will find that there are only two
variables in the workspace, T_gas and cp, because the parameters a1, a2, and a3
are local to the function spHeat and hence unavailable in the MATLAB workspace.
Another difference is the way the two are invoked. If the function is simply called
by the file name (without providing the adequate number and type of input argu-
ments), MATLAB will return an error:
>> spHeat
Not enough input arguments.
Error in spHeat (line 7)
cp=a1+a2*T+a3*T^2;
The final thing to note is that the file name should be the same as the name of the
function.
This completes our discussion on MATLAB scripts and functions. Before I close this sec-
tion, I would like to draw the attention of the user to the variables used. Note that the
temperature was named in the MATLAB workspace as T_gas. This value was passed on to
the function. As described in the pullout text box, this is equivalent to the “workspace”
writing the value of T_gas on a post-it note and handing it to “spHeat.” The function
“spHeat” refers to this value as variable T, does its computations using its own copy of the
variable (equivalent to doing this “on its own notepad”), and has no access to any variable
from the MATLAB workspace.
The most common use is when a block of commands needs to be repeated a certain number of times. For example, if the block is to be repeated five times, the first line is written as
for i=1:5
so that the loop is repeated five times, and the index i is incremented every time the termi-
nal end statement is encountered. The increment of the index need not be in steps of 1; any
other increment is also possible, such as
for j=1:2:10
In this case, the index j takes the values 1, 3, 5, 7, and 9 in each iteration; clearly, the loop is
executed five times. Note that the value of the index is available within the loop, but it may
not be modified when the loop is being executed. When the loop completes execution, the
index variable takes the last value used in the loop. This value is accessible outside the loop.
Thus, after the two loops mentioned above finish execution, the variables i and j have the
values of 5 and 9, respectively.
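This behavior can be verified with a quick sketch:

```matlab
for i=1:5
end
for j=1:2:10
end
disp([i j])    % displays 5 9
```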
With this background and the discussion in previous sections, the reader may try answer-
ing the following questions. For the several for statements below, what are the values that
the index will take each time the loop executes?
for p=0:0.2:1
for q=5:1
for r=5:-1:1
for s=[1 -2 0 2]
In each of the above examples, the following lines of code may be executed:
for p=0:0.2:1
disp(p);
end
1. The first case should be fairly clear based on the preceding discussions. Here, the loop
will execute six times; the variable p takes the values of 0, 0.2, 0.4, 0.6, 0.8, and
1 each time the statements in the loop get executed.
2. Since the command [5:1] returns an empty matrix, the loop will not get executed. In other words, the commands in the for block, before the corresponding end statement, will be skipped, and the variable q will become an empty variable (i.e., q=[ ]).
3. The loop will get executed five times, with the value of r starting at 5 and decrement-
ing with each iteration. In other words, in each of the five iterations of the for loop,
r will consecutively take the values of 5, 4, 3, 2, and 1.
4. The final example is a uniquely MATLAB-specific implementation of the for loop. In
each iteration, the index variable s consecutively takes the values from each column
of the argument. In other words, the loop will get executed four times. During the first
iteration, s takes the value of the first element, i.e., s=1; in the second iteration, it
takes the value of s=-2; in the third iteration, s=0; and in the fourth iteration, s=2.
While the last example shows the versatility of the for loop in MATLAB, it can make the
code somewhat confusing to debug. Instead, I suggest using the following, rather tradi-
tional, approach:
A=[1 -2 0 2];
for i=1:length(A)
s=A(i);
disp(s);
end
Although the variable s takes the same values in each iteration as in the previous case, the
latter code is more readable. The readers can compare the two codes and verify that they
give the same results.
What do you expect when the following code is used?
for s=A
disp(s);
end
The reader may verify that the above code also gives the same results.
Next, we will write a small code that uses the for loop for computing the terms of a
Fibonacci series. A Fibonacci series is a series comprising [1 1 2 3 5 8 13 21 ...],
where each element is the sum of the preceding two elements.
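A script fibo.m along the following lines (a sketch consistent with the output shown below) computes these terms:

```matlab
% fibo.m: first ten terms of the Fibonacci series
f = zeros(1,10);
f(1) = 1; f(2) = 1;            % the first two terms are given
for i = 3:10                   % calculation starts from the third term
    f(i) = f(i-1) + f(i-2);    % each term is the sum of the preceding two
end
disp(f);
```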
The result of executing the above code is the first ten terms of the series:
>> fibo
1 1 2 3 5 8 13 21 34 55
Astute readers will notice that the for loop starts with an initial value of 3. This is
done because the first two terms of the series are given to be 1, and the calculation of
the series starts from the third term.
The other construct that can be used for iteratively executing a block of code is the
while loop:
while <condition>
<block of commands that are repeated>
end
The while loop keeps executing as long as the abovementioned <condition> is true. If the condition becomes false at any time, MATLAB finishes executing the block of code, reaches the end line, and exits the loop. If the <condition> evaluates as false at the first instance, the block of commands within the while construct is not executed. If the condition never becomes false, we get an infinite loop (which needs to be interrupted using Ctrl-C).
The reader is encouraged to use the next example as a test example to check their under-
standing of the while loop.
i=1;
while i<5
disp(i);
i=i+1;
end
i=10;
while i<5
disp(i);
i=i+1;
end
i=1;
while i<5
disp(i);
i=i-1;
end
Solutions:
1. The loop will run four times. In the fourth iteration, the value of i is incre-
mented to 5; the condition i<5 computes as false, and the loop does not execute
the fifth time.
2. The commands within the loop will not execute at all because, at the first instance,
the condition i<5 is false.
3. The third loop will run infinitely because the condition i<5 always evaluates as true, since i is decremented rather than incremented.
This completes our discussion of iterative loops.
A.6.2 Conditional if Block
The syntax for the if-then-else block is
if <condition>
<statements-1 evaluated for true>
else
<statements-2 evaluated for false>
end
Here, <condition> refers to any comparison (including Boolean expressions) that evaluates as either true or false. If the condition is true, the first block, statements-1, is executed; otherwise, the second block, statements-2, is executed. Nesting of multiple if-blocks is permitted. Let us write a code for printing all prime numbers between 1 and n.
n=20;
for i=1:n
if isprime(i)
disp([num2str(i), ' is a prime number']);
end
end
2 is a prime number
3 is a prime number
5 is a prime number
7 is a prime number
11 is a prime number
13 is a prime number
17 is a prime number
19 is a prime number
This concludes our introduction to a conditional if statement. If there are multiple cases
to be analyzed, the switch-case statements are more efficient and readable. Interested
readers may look up MATLAB help for the switch statement, if required.
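As an illustration (a sketch, not one of the book's examples), a switch block dispatches on the value of a variable:

```matlab
n = 2;
switch n
    case 1
        disp('one');
    case 2
        disp('two');
    otherwise
        disp('many');
end
```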
A.7 FUNCTION HANDLES
MATLAB functions were discussed in Section A.5. A function to calculate specific heat was
written in Example A.3. The function was used to calculate the specific heat at any desired
temperature as
>> cp=spHeat(T_gas);
There are several occasions, as we will see throughout this book, where a function needs to be provided to another function for performing numerical computations. A function handle is MATLAB's way to provide or pass on a function to another function. As per the MATLAB help documentation, a function handle is a "data type that stores an association to a function." The function handle for the spHeat function may be written in one of the
following two ways:
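Consistent with the discussion that follows, the two forms are (the handle name fCp is illustrative):

```matlab
fCp = @spHeat;           % named-function handle
fCp = @(T) spHeat(T);    % anonymous function
```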
Let us deconstruct the second approach, which is called the anonymous function method and was introduced in MATLAB quite recently.* The symbol @ indicates a function handle and is immediately followed by the comma-separated input arguments in round brackets and then the function name (note that in the spHeat example, the function name is the same as the file name). This function handle can then be passed to another function for further computations.
For example, the enthalpy at a temperature, say 450 K, is computed as

H_{450} = H_{298} + \int_{298}^{450} c_p(T)\,\mathrm{d}T    (A.2)
The integral in the above equation can be computed using the MATLAB command
integral (see Appendix D). This command requires one to pass the function handle
to the function cp(T). This can be done as follows:
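A sketch of this computation (H298, the reference enthalpy at 298 K, is a placeholder value assumed here for illustration):

```matlab
H298 = 0;                                             % reference enthalpy, J/mol (assumed)
H450 = H298 + integral(@(T) spHeat(T), 298, 450);     % integrate c_p from 298 to 450 K
```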
The anonymous function handle is passed on to the MATLAB solver integral.
This brings us to the end of this primer on MATLAB.
* For this reason, students and practitioners still prefer the @spHeat form of the function handle. The anonymous function method is, however, more modern and versatile and will be followed in this book. Readers are free to follow the method of their choice.
Appendix B: Numerical Differentiation
B.1 GENERAL SETUP
Differentiation, on which the foundation of calculus rests, is one of the essential tools in engineering. For a function y = f(t), the derivative is defined as

\frac{dy}{dt} = \lim_{\Delta t \to 0} \frac{f(t + \Delta t) - f(t)}{\Delta t}    (B.1)
The shorthand representation of the first derivative is f ′(t). The value of the derivative eval-
uated at a certain point, t = ti, is represented as f ′(ti). The following notations will be used
throughout:
f'(t_i) \equiv \left.\frac{d}{dt} f(t)\right|_{t = t_i}

whereas

\frac{d}{dt} f(t_i) = 0

The former indicates the value of the derivative of f(t) at t = ti, whereas the latter indicates the derivative of the (constant) value that the function attains at ti.
The second derivative is defined as

f''(t) \equiv \frac{d^2 y}{dt^2} = \frac{d}{dt}\left(\frac{dy}{dt}\right)    (B.2)
FIGURE B.1 Schematic of the first derivative, f′(t), and its numerical approximation using (a) forward and (b) central difference formulae.
f'(t_i) \approx \frac{f(t_i + h) - f(t_i)}{h}
This is the forward difference formula for finding the first derivative of a function. A formal
derivation, error analysis, and trade-off between round-off and truncation errors (also see
Chapter 1) will be presented in the rest of this chapter.
Figure B.1 shows a schematic representation of the derivative as the slope of a tangent to the curve in the y–t plane. The slope of the line connecting the points at a and (a + h) is the approximation using the forward difference formula. The approximation can be improved by reducing h. Alternatively, a different method (called the central difference formula) may be used, where the derivative is approximated as the slope of the line connecting (a − h) and (a + h). As indicated in the schematic in Figure B.1, the central difference formula better approximates the true derivative.
Numerical differentiation finds several applications. A large number of numerical techniques for solving nonlinear algebraic equations (Chapter 6), implicit ODEs and differential algebraic equations (Chapter 8), and regression (Chapter 10) require derivatives (or Jacobians) to be calculated. For example, in the Newton–Raphson method of Chapter 6,

x^{(i+1)} = x^{(i)} - \frac{f\left(x^{(i)}\right)}{f'\left(x^{(i)}\right)}    (B.3)

if the derivative f′(x) is not available, numerical differentiation may be used instead.
The solution of partial differential equations (Chapters 4 and 10) using the finite difference approximations discussed in this chapter is popular for generic engineering problems.
Another application is in the calculation of heat or mass flux. If the walls of a heat exchanger are at a higher temperature than the fluid in the channel, a temperature gradient exists. For even slightly complex geometries, the net heat flux into the fluid,

J = -\lambda_f \frac{dT}{dz}    (B.4)

often needs to be calculated numerically. Other applications include finding the rate-determining step in a complex network of reactions and analyzing the sensitivity of system performance to various parameters, as well as data analytics, where the gradient (first derivative) and curvature (second derivative) can reveal critical information about the qualitative nature of systems.
B.2.1 First Derivatives
Consider the Taylor's series expansion of f(a + h):

f(a + h) = f(a) + h f'(a) + \frac{h^2}{2} f''(a) + \frac{h^3}{3!} f'''(a) + \cdots    (B.5)

Retaining the term in f′(a) on the right-hand side and moving the other terms across, we get

f'(a) = \frac{f(a + h) - f(a)}{h} - \left[\frac{h}{2} f''(a) + \frac{h^2}{6} f'''(a) + \cdots\right]    (B.6)

The first term on the right-hand side is the forward difference approximation of f′(a), whereas the terms in square brackets represent the error. Here, (h/2) f″(a) is the leading error term. Hence, the forward difference formula is O(h^1) accurate. Using the mean value theorem, the above equation may be written as

f'(a) = \underbrace{\frac{f(a + h) - f(a)}{h}}_{\text{Fwd. diff.}} - \underbrace{\frac{h}{2} f''(\xi)}_{\text{Error}}    (B.7)

There exists a point ξ ∈ [a, (a + h)] such that the terms in the brackets of Equation B.6 may be equivalently written in the above form.
Additional nodal points are required to derive a more accurate formula for the first derivative. The central difference formula uses f(a − h), in addition to f(a) and f(a + h), to approximate the derivative f′(a) (hence the name). Again, the Taylor's series expansion is

f(a - h) = f(a) - h f'(a) + \frac{h^2}{2} f''(a) - \frac{h^3}{3!} f'''(a) + \cdots    (B.8)

Subtracting this from Equation B.5 cancels the even-powered terms:

f(a + h) - f(a - h) = 2h f'(a) + \frac{h^3}{3} f'''(a) + \cdots    (B.9)

Dividing by 2h and applying the mean value theorem,

f'(a) = \underbrace{\frac{f(a + h) - f(a - h)}{2h}}_{\text{Central diff.}} - \underbrace{\frac{h^2}{6} f'''(\xi)}_{\text{Error}}    (B.10)

Comparing the forward and central difference formulae in Equations B.7 and B.10 demonstrates what I argued in the schematic of Figure B.1. Since the forward difference formula scales as O(h), whereas the central difference formula scales as O(h^2), the latter is more accurate for the same choice of step-size h. This has already been observed in Chapter 1 and will be demonstrated later in this section.
Similar to the forward difference formula, it is easy to observe that the backward difference formula for the first derivative is

f'(a) = \underbrace{\frac{f(a) - f(a - h)}{h}}_{\text{Bkd. diff.}} + \underbrace{\frac{h}{2} f''(\xi)}_{\text{Error}}    (B.11)
The backward difference formula has the same accuracy as the forward difference formula.
In certain situations, a forward (or backward) difference formula with a higher accuracy
is needed, for example, to calculate the heat flux at the boundary with higher accuracy or in
solving PDEs (see Chapter 4). A three-point forward difference formula would be used in
such a case. Here, the values f(a), f(a + h), and f(a + 2h) are used to calculate the derivative f′(a). This is in contrast to f(a − h), f(a), and f(a + h) used in the central difference formula.
The Taylor's series expansion of f(a + 2h) is

f(a + 2h) = f(a) + 2h f'(a) + \frac{4h^2}{2!} f''(a) + \frac{8h^3}{3!} f'''(a) + \cdots    (B.12)
Note that subtracting 4f(a + h) from the above equation eliminates the term in f ″(a), and
the third-order term becomes the leading term. This can indeed be done by observation.
However, as the functions become more complex, or if higher-order formulae are required, a more versatile method called the method of undetermined coefficients may be used to compute the approximation. For example, the three-point forward difference formula is written as a weighted sum:

f'(a) = w_1 f(a) + w_2 f(a + h) + w_3 f(a + 2h)    (B.13)

Substituting the Taylor's series expansions of f(a + h) and f(a + 2h),

f'(a) = w_1 f(a) + w_2 \left[f(a) + h f'(a) + \frac{h^2}{2} f''(a) + \frac{h^3}{3!} f'''(a) + \cdots\right] + w_3 \left[f(a) + 2h f'(a) + \frac{4h^2}{2} f''(a) + \frac{8h^3}{3!} f'''(a) + \cdots\right]    (B.14)

f'(a) = (w_1 + w_2 + w_3) f(a) + (w_2 + 2w_3)\, h f'(a) + (w_2 + 4w_3) \frac{h^2}{2} f''(a) + (w_2 + 8w_3) \frac{h^3}{3!} f'''(a) + \cdots    (B.15)

Comparing with the left-hand side, the coefficient of the second term on the right-hand side should equal 1, whereas the others should be zero. Thus, we obtain three equations in three unknowns:

w_1 + w_2 + w_3 = 0
w_2 + 2w_3 = 1/h    (B.16)
w_2 + 4w_3 = 0

Solving these gives

w_1 = \frac{-3}{2h}, \quad w_2 = \frac{2}{h}, \quad w_3 = \frac{-1}{2h}    (B.17)

The coefficient of the leading error term is then

w_2 + 8w_3 = \frac{2}{h} - \frac{8}{2h} = -\frac{2}{h}
The resulting leading error term is

E = -\frac{h^2}{3} f'''(\xi)    (B.18)

so that the three-point forward difference formula is

f'(a) = \frac{-3 f(a) + 4 f(a + h) - f(a + 2h)}{2h} + \frac{h^2}{3} f'''(\xi)    (B.19)
B.2.2 Second Derivative
Having derived the formulae for f′(a), let us now determine higher-order differentiation formulae. Unlike first derivatives, at least three nodes are required for computing the second derivative. The central difference formula for f″(a) can be derived by inspection from Equations B.5 and B.8. Note that adding the two eliminates the term in f′(a):

f(a + h) + f(a - h) = 2 \left[f(a) + \frac{h^2}{2} f''(a) + \frac{h^4}{4!} f''''(a) + \cdots\right]    (B.20)

The odd-powered terms of the Taylor's series vanish, and the fourth-order term becomes the leading error term. Thus, the formula and error estimate for the central difference are given by

f''(a) = \underbrace{\frac{f(a + h) - 2 f(a) + f(a - h)}{h^2}}_{\text{Formula}} - \underbrace{\frac{h^2}{12} f''''(\xi)}_{\text{Error}}    (B.21)

Thus, the central difference formula for f″(a) is O(h^2) accurate. This formula can also be derived using the method of undetermined coefficients, discussed earlier, by writing

f''(a) = w_1 f(a - h) + w_2 f(a) + w_3 f(a + h)

An interested reader can verify that this method yields the same formula and error information as derived in Equation B.21.
In a similar manner, formulae for higher-order derivatives can also be derived. Since these formulae are not required in this book, I will not discuss them further. Still, the method of undetermined coefficients forms a useful template to derive such formulae, including cases with unequal grid sizes.
% Problem Setup
a=1; h=0.01;
trueVal=1/(1+a^2);
%% Compute numerical derivatives and errors
% Forward difference for f'(x)
d1_fwd=(atan(a+h)-atan(a))/h;
err1_fwd=abs(d1_fwd-trueVal);
% Central difference for f'(x)
d1_ctr=(atan(a+h)-atan(a-h))/(2*h);
err1_ctr=abs(d1_ctr-trueVal);
% 3-point forward difference for f'(x)
d1_3pt=(-3*atan(a)+4*atan(a+h)-atan(a+2*h))/(2*h);
err1_3pt=abs(d1_3pt-trueVal);
% Display Results
disp([d1_fwd, d1_ctr, d1_3pt]);
disp([err1_fwd, err1_ctr, err1_3pt]);
With the step-size h = 0.01, the forward difference formula gave the numerical deriva-
tive value as 0.4975. For all other cases, the displayed value was 0.5000. Hence,
it is more instructive to see the errors. The results below show the errors for forward,
central, and 3-point forward difference formulae for various values of h:
Clearly, the central and three-point forward difference formulae are more accurate
than the standard forward difference formula.
Moreover, when the step-size is reduced by one order of magnitude, the error in the forward difference formula decreases by one order of magnitude, whereas the errors in the central and three-point forward difference formulae decrease by two orders of magnitude. This is a consequence of the fact that the latter two are O(h^2) accurate, whereas the standard forward difference formula is O(h) accurate.
For the initial few step-sizes, the error in the central difference formula is approxi-
mately half of that in three-point forward difference formula. This approximate
relationship is expected, comparing the error terms for the two formulae from
Equations B.10 and B.19.
Finally, notice that the errors in the central and three-point forward difference formulae are greater at h = 10⁻⁶ than at h = 10⁻⁵. This will be the topic of discussion in the next section.
Recall the forward difference formula:

f'(a) = \frac{f(a + h) - f(a)}{h} - \frac{h}{2} f''(\xi)    (B.7)
The computer representation of a value differs from its true value due to machine precision. So, the true value of f(a) differs from the value represented in the computer, \tilde{f}(a).
Consider the example of tan⁻¹(1). The true value is 0.785398… (additional digits are ignored for this example). Presume that there is a computer using the decimal number system (as discussed in Chapter 1) with a four-digit representation of the mantissa; this number is represented in the "decimal computer" as \widetilde{\tan^{-1}}(1) = 0.7853. The relative error between the two is bounded by the machine precision. Thus, one can get the following relationship between the true value and the computer representation:
\tilde{f}(a) = f(a)(1 + \varepsilon)    (B.22)

The numerically computed derivative is

\tilde{f}'(a) = \frac{\tilde{f}(a + h) - \tilde{f}(a)}{h}    (B.23)

Subtracting the two, the net error is given by

E_{net} = f'(a) - \tilde{f}'(a)    (B.24)

= \frac{f(a + h) - f(a)}{h}\,\varepsilon + \frac{h}{2} f''(\xi)    (B.25)

\leq \frac{2\varepsilon_{mach}}{h}\left|f(\xi_1)\right| + \frac{h}{2}\left|f''(\xi)\right|    (B.26)
where both ξ, ξ₁ ∈ [a, a + h]. The first term is the round-off error, and the second term is the truncation error. The optimum value of h that minimizes the net error is obtained by differentiating the above equation with respect to h and equating it to zero:

\frac{d}{dh} E_{net} = -\frac{2\varepsilon_{mach}}{h^2}\left|f(\xi_1)\right| + \frac{1}{2}\left|f''(\xi)\right| = 0

Thus

h_{opt} = 2\sqrt{\varepsilon_{mach}\,\frac{\left|f(\xi_1)\right|}{\left|f''(\xi)\right|}}    (B.27)

In other words, for the forward difference formula,

h_{opt} \sim \left[\varepsilon_{mach}\right]^{1/2}    (B.28)
Arguing along similar lines, the truncation error in the central difference formula is

E_{trunc} = c_1 h^2

whereas the round-off error is

E_{r\text{-}off} = \frac{c_2}{h}

Since E_net = |E_trunc| + |E_r-off|, it is easy to show that the optimum value of the step-size for the central difference formula is

h_{opt} \sim \left[\varepsilon_{mach}\right]^{1/3}    (B.29)

Since the machine precision in MATLAB is 2 × 10⁻¹⁶, the optimum value of the step-size is of the order of h_opt ∼ 0.6 × 10⁻⁵. The observation in Example B.1 is thus in agreement with the optimal step-size value derived herein.
Appendix C: Gauss Elimination for Linear Equations
x_1 + x_2 + x_3 = 4
2x_1 + x_2 + 3x_3 = 7    (C.1)
3x_1 + 4x_2 - 2x_3 = 9

The aim in this example is to solve the system of linear equations given above. This can be expressed in the following standard form:

\underbrace{\begin{bmatrix} 1 & 1 & 1 \\ 2 & 1 & 3 \\ 3 & 4 & -2 \end{bmatrix}}_{A} \underbrace{\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}}_{x} = \underbrace{\begin{bmatrix} 4 \\ 7 \\ 9 \end{bmatrix}}_{b}    (C.2)
The above system of equations has a unique solution if the determinant of A is nonzero. A more general way of expressing the same statement is that a unique solution exists if A is a full-rank matrix, that is, rank(A) = n. If the rank of A is not n, then the system of equations may have infinitely many solutions or no solution. If rank(A) = m < n, it implies that only m rows of A are linearly independent, whereas the other rows can be written as a linear combination of these m rows. If the vector b also satisfies the same linear combination, then there are infinitely many solutions to the equations. Consider the following equations:
x1 + x2 = 1
2 x1 + 2 x2 = a
Clearly, the left-hand side of the second equation is twice that of the first. Thus, rank(A) = 1 and the system of equations does not have a unique solution. If the right-hand side, a, is also twice that of the first equation (i.e., a = 2), then the two lines are collinear and there are infinitely many solutions. If a ≠ 2, then there is no solution. Summarizing, a unique solution exists when rank(A) = n; when rank(A) < n, the system has either infinitely many solutions (if b satisfies the same linear dependence) or no solution.
Consider how we solve the above set of three equations given in Equation C.1. We will use the first equation to eliminate x1 from the second and third equations, that is, R2 ← R2 − 2R1 and R3 ← R3 − 3R1:
x1 + x2 + x3 = 4
0 - x2 + x3 = -1 (C.3)
0 + x2 - 5x3 = -3
Now, we can use the last two equations to eliminate x2 and obtain the value of x3:
-4 x3 = -4
x1 + x2 + x3 = 4
0 - x2 + x3 = -1 (C.4)
0 + 0 - 4 x3 = -4
The above steps have reduced the overall system of equations to one where the matrix A has an upper triangular form. The process we followed is called Gauss Elimination. The final solution is obtained using a procedure called back-substitution. The final equation is used to obtain the value of x3 = 1. Thereafter, this value can be substituted in the second equation, which yields the value of x2 = 2. Substituting these values in the first equation yields x1 = 1.
Let us revisit the steps used in solving the above set of equations:
• In the first sequence of steps, the first equation was used to eliminate x1 from all the
remaining equations.
• Thereafter, the second equation was used to eliminate x2 from the remaining equa-
tions. No change was made to the first equation.
• For a larger system of equations, the procedure continues until the right-hand side
involves an upper triangular matrix. In each sequence of steps, the ith equation is used
to eliminate xi from all subsequent equations.
The following code demonstrates the use of Gauss Elimination with back-substitution
for this problem. This is not a generalized code. It simply follows the steps laid out in the
example above.
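A sketch of the elimination portion (the published listing may differ; the row operations are those of Equations C.3 and C.4):

```matlab
%% Naive Gauss Elimination for the 3x3 example
A = [1 1 1; 2 1 3; 3 4 -2]; b = [4; 7; 9];
n = 3; Ab = [A, b];                  % augmented matrix
% Eliminate x1 from rows 2 and 3: R2 <- R2 - 2*R1, R3 <- R3 - 3*R1
alpha = Ab(2,1)/Ab(1,1); Ab(2,:) = Ab(2,:) - alpha*Ab(1,:);
alpha = Ab(3,1)/Ab(1,1); Ab(3,:) = Ab(3,:) - alpha*Ab(1,:);
% Eliminate x2 from row 3: R3 <- R3 - (-1)*R2
alpha = Ab(3,2)/Ab(2,2); Ab(3,:) = Ab(3,:) - alpha*Ab(2,:);
Ab                                   % display the reduced augmented matrix
```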
Ab =
1 1 1 4
0 -1 1 -1
0 0 -4 -4
* The reason why we call this naïve Gauss Elimination will become clear in Section C.3.
Now that the above results are in the reduced upper triangular form of Equation C.4, we can back-substitute the values to obtain x. First, x3 can be calculated from the last row:

x_3 = \frac{Ab_{3,end}}{Ab_{3,3}} = 1    (C.5)

Thereafter,

x_2 = \frac{Ab_{2,end} - \left(Ab_{2,3}\, x_3\right)}{Ab_{2,2}} = 2    (C.6)
followed by calculation of x1 as

x_1 = \frac{Ab_{1,end} - Ab_{1,2}\, x_2 - Ab_{1,3}\, x_3}{Ab_{1,1}} = 1
%% Back-Substitution
x=zeros(3,1);
for i=n:-1:1
x(i)=( Ab(i,end)-Ab(i,i+1:n)*x(i+1:n) ) / Ab(i,i);
end
x =
1
2
1
The strategy I have followed in Example C.1 is the following: The Gauss Elimination part of
the code is written for a specific 3 × 3 system example, whereas the back-substitution part is
written as a general-purpose code for any n × n system. This is intentional. Converting the
Gauss Elimination code to a general n × n system of equations is left as an exercise for the
reader. The building blocks are already present in the above code, and the algorithm for a
generic solver is discussed presently.
• Perform the row operation Ri ← Ri − αikRk, where "←" represents the assignment operator.* Here, Ri represents the ith row, that is, Ab(i,:). Note that since the pivot row elements Abk,1, Abk,2, …, Abk,k−1 are already zero due to previous row operations, the operation needs to be performed only for the elements from the pivot column onward. In other words,

Ab_{i,j} \leftarrow Ab_{i,j} - \alpha_{ik}\, Ab_{k,j}, \quad j = k, \ldots, n+1
In MATLAB, the above row operation can be done efficiently using the colon
notation to indicate entire row:
Ab(i,k:end)=Ab(i,k:end)-alpha*Ab(k,k:end);
Compare this with the underlined statements in Example C.1 to understand how
the row operation is coded to simplify the calculation of Equation C.9.
• Increment i and repeat the previous two steps until all the elements in the pivot
column are eliminated, that is, until i = n.
• Increment k so that the next row is the pivot row and Ak, k is the pivot element.
Repeat the elimination steps until Ab is an upper triangular matrix, that is, until
k reaches (n − 1).
* I use the assignment operator ← in the linear equations chapters to avoid ambiguity, that is, to avoid readers interpreting R2 = R2 − αR1 as a linear equation. When I need to use the notation R_2^{(i+1)} = R_2^{(i)} - \alpha R_1^{(i)}, the assignment operator is not needed.
After the Gauss Elimination steps, Ab is an upper triangular matrix. Let us say that this reduced matrix represents the following modified equation:

Ux = \beta    (C.10)

Note that the matrix modified after the row eliminations is related to the above as Ab = [U β]. Row eliminations do not affect the desired solution, x, as we have seen in the reduction of the example in Equation C.2. For this example, the reader may verify that

\underbrace{\begin{bmatrix} 1 & 1 & 1 \\ 0 & -1 & 1 \\ 0 & 0 & -4 \end{bmatrix}}_{U} \underbrace{\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}}_{x} = \underbrace{\begin{bmatrix} 4 \\ -1 \\ -4 \end{bmatrix}}_{\beta}    (C.11)

Note the equivalence between the row operations described in the algorithm above and the familiar equation-solving procedure through elimination that I used to reduce the set of equations (C.1 through C.4).
C.1.3 Back-Substitution
Due to the upper triangular form of the matrix in Equation C.11, the solution is obtained by starting from the last row. Specifically,

x_3 = \frac{\beta_3}{U_{3,3}} = \frac{Ab_{3,end}}{Ab_{3,3}} = 1

Thereafter

x_2 = \frac{Ab_{2,end} - \left(Ab_{2,3}\, x_3\right)}{Ab_{2,2}} = 2    (C.12)

and so on. In general, the last unknown is obtained as

x_n = \frac{Ab_{n,end}}{Ab_{n,n}}    (C.14)

and the remaining unknowns are computed, for i = n − 1, …, 1, as

x_i = \frac{Ab_{i,end} - \sum_{j=i+1}^{n} Ab_{i,j}\, x_j}{Ab_{i,i}}    (C.15)
A = LU    (C.16)

It should be noted that the decomposition is not unique. Several different factorizations into L and U matrices can be obtained. If we were to multiply Equation C.11 by L on both sides,

\underbrace{LU}_{A}\, x = \underbrace{L\beta}_{b}    (C.17)
This is the algorithm behind the Doolittle method for LU decomposition. The upper trian-
gular matrix U is obtained using Gauss Elimination. It turns out that the elements of the
lower triangular matrix are the values of αik used during factorization, as per Equation C.8.
The lower triangular matrix to complete the LU decomposition is then given by
L = \begin{bmatrix} 1 & 0 & 0 & \cdots & 0 \\ \alpha_{21} & 1 & 0 & \cdots & 0 \\ \alpha_{31} & \alpha_{32} & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \alpha_{n1} & \alpha_{n2} & \alpha_{n3} & \cdots & 1 \end{bmatrix}    (C.18)
%% Gauss Elimination
% Get augmented matrix
Ab=[A, b];
L=eye(n);
%% Back-Substitution
x = zeros(3,1);
for i=n:-1:1
x(i)=( Ab(i,end)-Ab(i,i+1:n)*x(i+1:n) ) / Ab(i,i);
end
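For this 3 × 3 example, the elimination steps that also record the multipliers αik in L can be sketched as follows (the published listing may differ):

```matlab
A = [1 1 1; 2 1 3; 3 4 -2]; b = [4; 7; 9]; n = 3;
Ab = [A, b];
L = eye(n);                          % initialize L as identity
% Eliminations; each multiplier alpha_ik is stored in L
alpha = Ab(2,1)/Ab(1,1); L(2,1) = alpha; Ab(2,:) = Ab(2,:) - alpha*Ab(1,:);
alpha = Ab(3,1)/Ab(1,1); L(3,1) = alpha; Ab(3,:) = Ab(3,:) - alpha*Ab(1,:);
alpha = Ab(3,2)/Ab(2,2); L(3,2) = alpha; Ab(3,:) = Ab(3,:) - alpha*Ab(2,:);
```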
The core of the code remains the same, with additional statements: the matrix L is first initialized as identity, after which the values of αik are inserted in the first column and then in the second column during the eliminations. This completes the L matrix:
L =
1 0 0
2 1 0
3 -1 1
It is easy to verify that the product L*U indeed gives back the matrix A:
>> U=Ab(1:n,1:n);
>> Acheck=L*U;
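The Doolittle bookkeeping can also be checked with a short Python sketch (my own illustration; the names are arbitrary): perform Gauss Elimination, store each multiplier αik in L, and confirm that the product LU reproduces A.

```python
# Doolittle LU via Gauss Elimination: U is the eliminated matrix,
# L collects the multipliers alpha_ik below a unit diagonal.
def lu_doolittle(A):
    n = len(A)
    U = [row[:] for row in A]               # work on a copy
    L = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for k in range(n - 1):                  # pivot column k
        for i in range(k + 1, n):
            alpha = U[i][k] / U[k][k]
            L[i][k] = alpha                 # store the multiplier
            for j in range(k, n):
                U[i][j] -= alpha * U[k][j]
    return L, U

A = [[1.0, 1.0, 1.0],
     [2.0, 1.0, 3.0],
     [3.0, 4.0, -2.0]]
L, U = lu_doolittle(A)
# For this example, L is [[1,0,0],[2,1,0],[3,-1,1]], and L*U gives back A
```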
LU decomposition can be used to solve a linear equation Ax = b. It consists of three steps.
First, LU decomposition is used to obtain the matrices L and U. Since Lβ = b, as per Equation
C.17, the second step is to obtain β from L and b. This is done using forward substitution,
which works exactly like the backward-substitution method but starts with the first row. Note
that the vector β resulting from the forward substitution step is nothing but the last column of
the Ab matrix. Finally, the third step is backward substitution to solve Ux = β, which was
discussed earlier.
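The three-step procedure can be sketched in Python as follows (illustrative only; the L and U values are taken from the worked example above):

```python
def forward_substitute(L, b):
    # Solve L*beta = b for unit lower-triangular L, starting from the first row
    n = len(b)
    beta = [0.0] * n
    for i in range(n):
        beta[i] = b[i] - sum(L[i][j] * beta[j] for j in range(i))
    return beta

def back_substitute(U, beta):
    # Solve U*x = beta, starting from the last row
    n = len(beta)
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(U[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (beta[i] - s) / U[i][i]
    return x

# Factors of the example matrix A = [[1,1,1],[2,1,3],[3,4,-2]]
L = [[1.0, 0.0, 0.0], [2.0, 1.0, 0.0], [3.0, -1.0, 1.0]]
U = [[1.0, 1.0, 1.0], [0.0, -1.0, 1.0], [0.0, 0.0, -4.0]]
x = back_substitute(U, forward_substitute(L, [4.0, 7.0, 9.0]))
print(x)   # -> [1.0, 2.0, 1.0]
```

The intermediate vector β comes out as [4, −1, −4], the last column of the reduced Ab matrix, as noted above.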
The utility of LU decomposition is not really in solving a single equation, Ax = b.
Instead, it is very useful if we need to solve a series of equations Ax[i] = b[i], where the
matrix A remains constant while the vector b changes. The greatest effort in solving a linear
equation is in the factorization/elimination. Hence, factorizing the matrix once and using
forward/backward substitutions multiple times reduces the effort. An example of a sys-
tem where Ax[i] = b[i] is solved multiple times is covered in Chapter 7, where the right-
hand side is a nonlinear term; b[i + 1] = g(x[i]) is nothing but the function evaluated at
the current guess, followed by solving the linear system to obtain the new guess, x[i + 1].
Another use from a numerical methods perspective is that sparse matrices, of the
type discussed in Chapter 7, can sometimes be factorized into L and U using efficient
algorithms. Thus, this method becomes a computationally useful means to solve such
equations.
The MATLAB code for obtaining LU decomposition is lu(A). The result of this code is
different from the algorithm discussed here, since it uses partial pivoting (see next section).
Thus, the final result is of the form LU = PA, where P is a permutation matrix that accounts
for row exchanges done in the partial pivoting method.
C.3 PARTIAL PIVOTING
I have referred to partial pivoting method a couple of times. Let us look into what this
means. As a simple, motivating example, consider that the linear system of equations was
instead given as
$$\begin{aligned} x_1 + x_2 + x_3 &= 4\\ 2x_1 + 2x_2 + 3x_3 &= 9 \qquad (C.19)\\ 3x_1 + 4x_2 - 2x_3 &= 9 \end{aligned}$$
The first set of row operations using first row as the pivot row yields
$$Ab = \begin{bmatrix} 1 & 1 & 1 & 4\\ 0 & 0 & 1 & 1\\ 0 & 1 & -5 & -3 \end{bmatrix} \quad (C.20)$$
The naïve Gauss Elimination, as discussed earlier, cannot continue from this stage because
the next pivot element, $A_{2,2}$, is zero. However, if we were to solve the equations manually, we
would just switch the last two equations and use the second equation to obtain the value
of x3. The matrix after exchange of the last two rows becomes
$$Ab = \begin{bmatrix} 1 & 1 & 1 & 4\\ 0 & 1 & -5 & -3\\ 0 & 0 & 1 & 1 \end{bmatrix} \quad (C.21)$$
This is the crux of Gauss Elimination with partial pivoting. Partial pivoting is basically a row
exchange operation carried out at each step. The aim of partial pivoting, however, is broader
than this example suggests: a premature end to Gauss Elimination is not the only problem.
Another issue is round-off errors. In cases where two large, nearly equal numbers are
subtracted, pivoting improves the performance of Gauss Elimination. For example, if the
second equation was
$$2x_1 + (2 + \delta)x_2 + 3x_3 = 9$$
the diagonal element, $A_{2,2}$, would become δ ≪ 1 after the first elimination step. This can lead
to round-off errors because of the subtraction of two close numbers and/or division by a small
number. Recall that in back-substitution, we compute $x_2 = (\text{numerator})/A_{2,2}$.
Theoretical guarantees exist for Gauss Elimination with full pivoting, where both row
and column interchanges are allowed. However, partial pivoting, where only row inter-
changes are executed, is found to be equally useful in practice. Hence, all the numerical
solvers use Gauss Elimination with partial pivoting.
In the algorithm given in Section C.1.2, at any kth step, $A_{k,k}$ is the pivot element. In
naïve Gauss Elimination, the pivot element is used to eliminate the coefficients $A_{i,k}$ in the pivot
column below the diagonal. In the case of Gauss Elimination with pivoting, an additional step
is added before this: among the elements $A_{k,k}, A_{k+1,k}, \ldots, A_{n,k}$ of the pivot column on or
below the diagonal, find the one with the largest absolute value and interchange its row with
the kth row, so that the pivot element becomes the dominant element in its column.
Consider the original example from Equation C.1. In the first step, with k = 1, we per-
form the pivoting step. The largest absolute value in the pivot column (column 1) is A3 , 1.
Therefore, rows 3 and 1 are interchanged so that the new pivot element is the dominant
element in that column. The matrix Ab after the row interchange is
Ab =
     3     4    -2     9
     2     1     3     7
     1     1     1     4
The elimination is then performed with $\alpha_{21} = 2/3$ and $\alpha_{31} = 1/3$. After appropriate row operations, the matrix becomes
Ab =
3.0000 4.0000 -2.0000 9.0000
0 -1.6667 4.3333 1.0000
0 -0.3333 1.6667 1.0000
Next, $A_{2,2}$ is the pivot element. The elements in column 2 at and below the diagonal are
−1.6667 and −0.3333. Since the former has the larger absolute value, no row interchange
needs to be performed. The algorithm continues with $\alpha_{32} = 0.2$ and yields
Ab =
3.0000 4.0000 -2.0000 9.0000
0 -1.6667 4.3333 1.0000
0 0 0.8000 0.8000
The matrix is now in upper triangular form. We also note that the α values used were
$$\alpha = \begin{bmatrix} * & * & *\\ 0.6667 & * & *\\ 0.3333 & 0.2 & * \end{bmatrix}$$
The following code is a modification of Example C.1 with partial pivoting included.
%% Gauss Elimination
% Get augmented matrix
Ab = [A, b];
% With A(1,1) as pivot element
k=1;
pivotCol=Ab(k:n,k);
[~,idx]=max(abs(pivotCol));
idx=(k-1)+idx;
if (idx~=k) % Row interchange
    pivotRow=Ab(k,:);
    Ab(k,:)=Ab(idx,:);
    Ab(idx,:)=pivotRow;
end
for i=2:3
    alpha=Ab(i,1)/Ab(1,1);
    Ab(i,1:end)=Ab(i,1:end) - alpha*Ab(1,1:end);
end
% With A(2,2) as pivot element
k=2;
pivotCol=Ab(k:n,k);
[~,idx]=max(abs(pivotCol));
idx=(k-1)+idx;
if (idx~=k) % Row interchange
    pivotRow=Ab(k,:);
    Ab(k,:)=Ab(idx,:);
    Ab(idx,:)=pivotRow;
end
for i=3:3
    alpha=Ab(i,2)/Ab(2,2);
    Ab(i,2:end)=Ab(i,2:end) - alpha*Ab(2,2:end);
end
%% Back-Substitution
x = zeros(3,1);
for i = 3:-1:1
    x(i) = ( Ab(i,end)-Ab(i,i+1:n)*x(i+1:n) ) / Ab(i,i);
end
A pivoting block has been added before each elimination step. The first line of that block
extracts the pivot column, from the diagonal to the last row, into a vector pivotCol. The
next line uses the max function to find the index of the largest absolute value in pivotCol.
Since pivotCol contains elements from rows k to n, the corresponding row index in the
Ab matrix is (k-1)+idx. The next three lines perform the row interchange: the kth row is
stored in a dummy variable pivotRow, the idxth row overwrites the kth row, and the stored
pivotRow overwrites the idxth row.
The readers can verify that the results from this code match the one obtained from
hand calculations and described immediately preceding this example.
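For reference, the complete procedure — pivoting, elimination, and back-substitution — can be condensed into a few lines of Python (a language-neutral sketch of the same steps, not the book's code):

```python
def gauss_pivot(A, b):
    # Gauss Elimination with partial pivoting on the augmented matrix [A | b]
    n = len(b)
    Ab = [row[:] + [bi] for row, bi in zip(A, b)]
    for k in range(n - 1):
        # pivoting: bring the largest |entry| of column k (rows k..n-1) to row k
        idx = max(range(k, n), key=lambda i: abs(Ab[i][k]))
        if idx != k:
            Ab[k], Ab[idx] = Ab[idx], Ab[k]
        for i in range(k + 1, n):
            alpha = Ab[i][k] / Ab[k][k]
            for j in range(k, n + 1):
                Ab[i][j] -= alpha * Ab[k][j]
    # back-substitution
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(Ab[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (Ab[i][n] - s) / Ab[i][i]
    return x

A = [[1.0, 1.0, 1.0], [2.0, 1.0, 3.0], [3.0, 4.0, -2.0]]
print(gauss_pivot(A, [4.0, 7.0, 9.0]))
```

For the example system, this recovers the solution x = (1, 2, 1), matching the hand calculations.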
Recall that I had mentioned that the MATLAB function lu gives LU decomposition with
partial pivoting, in the form PA = LU. Permutation matrix P takes care of the row inter-
changes due to pivoting. In this example, only rows 3 and 1 were exchanged. Therefore
$$P = \begin{bmatrix} 0 & 0 & 1\\ 0 & 1 & 0\\ 1 & 0 & 0 \end{bmatrix}$$
The corresponding factors obtained above are
$$U = \begin{bmatrix} 3 & 4 & -2\\ 0 & -1.6667 & 4.3333\\ 0 & 0 & 0.8 \end{bmatrix}, \qquad L = \begin{bmatrix} 1 & 0 & 0\\ 0.6667 & 1 & 0\\ 0.3333 & 0.2 & 1 \end{bmatrix}$$
This matches the output of the lu function:
>> [L,U,P]=lu(A)
L =
1.0000 0 0
0.6667 1.0000 0
0.3333 0.2000 1.0000
U =
3.0000 4.0000 -2.0000
0 -1.6667 4.3333
0 0 0.8000
P =
0 0 1
0 1 0
1 0 0
Most commercial linear algebra solvers use the Gauss Elimination method with partial piv-
oting. It is not recommended to use naïve Gauss Elimination. In any case, linear algebra is a
very strong suit of MATLAB. Unless you have reasons otherwise, I recommend using only
the inbuilt MATLAB methods for linear algebra, since they are well optimized.
C.4 MATRIX INVERSION
Matrix inversion is done by writing the following linear equations:
$$Ac_1 = e_1, \quad Ac_2 = e_2, \quad \ldots, \quad Ac_n = e_n$$
where ei is the ith unit vector (i.e., a column vector of all zeros, except that the ith element is 1).
The inverse of the matrix A contains the vector ci as its ith column. Thus, $A^{-1} = [c_1, c_2, \ldots, c_n]$.
The inverse is obtained by writing the augmented matrix
$$A_{\mathrm{Aug}} = \big[\,A \mid I_{n\times n}\,\big]$$
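Solving the n systems Aci = ei column by column is easy to sketch in Python (an illustration of the idea, not the book's code):

```python
def solve(A, b):
    # Gauss Elimination with partial pivoting followed by back-substitution
    n = len(b)
    Ab = [row[:] + [bi] for row, bi in zip(A, b)]
    for k in range(n - 1):
        idx = max(range(k, n), key=lambda i: abs(Ab[i][k]))
        Ab[k], Ab[idx] = Ab[idx], Ab[k]
        for i in range(k + 1, n):
            alpha = Ab[i][k] / Ab[k][k]
            for j in range(k, n + 1):
                Ab[i][j] -= alpha * Ab[k][j]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(Ab[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (Ab[i][n] - s) / Ab[i][i]
    return x

def invert(A):
    n = len(A)
    # Solve A*c_k = e_k for each unit vector; c_k becomes the kth column of A^-1
    cols = [solve(A, [1.0 if i == k else 0.0 for i in range(n)]) for k in range(n)]
    return [[cols[j][i] for j in range(n)] for i in range(n)]

A = [[1.0, 1.0, 1.0], [2.0, 1.0, 3.0], [3.0, 4.0, -2.0]]
Ainv = invert(A)
# The product A * Ainv should be (numerically) the identity matrix
```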
Two methods are popular among students and practitioners to solve a linear problem Ax = b
in MATLAB:
>> x=inv(A)*b;
or
>> x=A\b;
If solving a linear equation is the aim, one should always prefer the second method.*
Calculating the inverse, as described above, takes a lot more computational effort than solving
a linear equation using the backslash operator.
* Note that the backslash operator also solves a linear least squares problem (see Chapter 10) if A is a non-square "tall" matrix with
nrows > ncols. So be careful!
Appendix D: Interpolation
D.1 GENERAL SETUP
Consider a car moving on a city road. The speed of the car recorded at every 10 s interval
is given in Table D.1.
Interpolation is used if one is interested in knowing the speed of the car at any time
between 0 and 90 s. Interpolation involves “joining the dots” with a smooth curve and read-
ing off the data on this fitted curve at any time of interest.
The above example of interpolating vehicle speed might be closer to chemical engi-
neering than you think. It is indeed used in simulating exhaust emissions performance
of a vehicle under test conditions. For example, the U.S. FTP-75 cycle is U.S. Federal Test
Protocol for emission testing of light-duty vehicles.* It involves testing a vehicle under
simulated urban driving conditions. Numerical simulation of a process under U.S. FTP-75
cycle would involve simulating the behavior at any arbitrary computational point within
the entire cycle. Since discrete measurements are limited to the measurement frequency,
interpolation is used to obtain data as a smooth continuous function of time. Figure D.1
shows the U.S. FTP-75 test cycle. As can be seen from the figure, the vehicle speed changes
frequently in the 1833 s test cycle. The data may be sampled at a rate of 10 to 0.1 Hz. If
simulation requires data between the sampling instances, interpolation is used to fill in
that data.
Another example would be in the modeling of solar or wind power sources. Solar radia-
tion or wind speeds may be available at discrete time instances, whereas numerical simula-
tion would require the information as a continuous function of time. Similar to the example
above, the intermediate data may be filled in using interpolation.
Let the tabulated data above be represented as n pairs: (t1, y1) , (t2, y2) , … , (tn, yn). Here, t
is the independent variable, y is dependent variable, and ti and yi represent the data points.
Furthermore, we will try to fit some functional form, y = p ( t ) to the above data, such that
the fitted function passes through all the data points. In other words, each of the n data
points exactly satisfies the equation yi = p ( t i ).
TABLE D.1 Data for Speed of a Moving Vehicle in Moderate Traffic Urban Conditions
Time (s) 0 10 20 30 40 50 60 70 80 90
Speed (km/h) 45 32 0 0 7 12 20 15 29 55
[Plot: vehicle speed (mile/h) vs. time (s); the cycle comprises a cold start phase, a stabilized phase, and a hot start phase.]
FIGURE D.1 Vehicle speed vs. time in a U.S. FTP-75 test cycle.
D.2 INTERPOLATING POLYNOMIALS
One of the first attempts at interpolation was to fit a polynomial function p(t). When
there are n data points, one can fit an (n − 1)th order polynomial exactly to these data
points. Indeed, we can choose to write this polynomial as
$$p(t) = a_0 + a_1 t + a_2 t^2 + \cdots + a_{n-1} t^{n-1} \quad (D.1)$$
However, more efficient forms of the polynomial function have been introduced so that the
coefficients ai can be obtained in a more straightforward manner. Newton's interpolating
polynomials take the form
$$p_N(t) = a_0 + a_1(t - t_1) + a_2(t - t_1)(t - t_2) + \cdots + a_{n-1}(t - t_1)(t - t_2)\cdots(t - t_{n-1})$$
Thus, at the first point, only the first (constant) term on the right-hand side is retained;
at the second point, the first two terms are retained; and so on. The coefficients a0 to an − 1
can be obtained as Newton’s divided difference polynomials. While this formula is more
generic, it will not be discussed in this section. Instead, conceptually simpler Lagrange
interpolating polynomials will be discussed, followed by Newton’s forward difference for-
mula. The latter is applicable when the independent variable, t, is equispaced. In spite of
this strong requirement, this formula is useful for the derivation of integration and differ-
ential equation schemes discussed elsewhere.
$$p_L(t) = a_1\,\frac{(t - t_2)(t - t_3)\cdots}{(t_1 - t_2)(t_1 - t_3)\cdots} + a_2\,\frac{(t - t_1)(t - t_3)(t - t_4)\cdots}{(t_2 - t_1)(t_2 - t_3)(t_2 - t_4)\cdots} + \cdots \quad (D.2)$$
This may be written compactly as
$$p_L(t) = \sum_{i=1}^{n} a_i\,L_{i,n}(t) \quad (D.3)$$
$$L_{i,n}(t) = \prod_{\substack{j=1\\ j\neq i}}^{n} \frac{t - t_j}{t_i - t_j} \quad (D.4)$$
Since the polynomial pL(t) passes through the first point, (t1, y1) satisfies Equation D.2.
When we substitute t = t1, the numerator of the first term becomes exactly equal to the
denominator; thus, the first term becomes a1. In all the other terms, the numerator contains
the factor (t1 − t1), whereas the denominator is nonzero. Thus, all the terms except the first
vanish, and y1 = a1. One can easily see the same pattern repeating for the other polynomials
as well:
$$L_{i,n}(t_i) = \prod_{\substack{j=1\\ j\neq i}}^{n} \frac{t_i - t_j}{t_i - t_j} = 1$$
whereas
$$L_{i,n}(t_m) = \frac{(t_m - t_1)\cdots(t_m - t_m)\cdots(t_m - t_n)}{\text{Denominator}} = 0, \qquad i \neq m$$
Consequently, the interpolating polynomial may be written directly in terms of the data values:
$$p_L(t) = \sum_{i=1}^{n} y_i\,L_{i,n}(t) \quad (D.5)$$
where
$$L_{i,n}(t) = \prod_{\substack{j=1\\ j\neq i}}^{n} \frac{t - t_j}{t_i - t_j} \quad (D.6)$$
The application is demonstrated in the following example. This example (for divided differ-
ence formula) is also shown in video lectures on NPTEL.
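Equations D.5 and D.6 translate almost line for line into code. Here is a short Python sketch (illustrative only; the book works in MATLAB) of the Lagrange interpolating polynomial:

```python
def lagrange(tq, tdata, ydata):
    # p_L(tq) = sum_i y_i * L_{i,n}(tq), per Equations D.5 and D.6
    n = len(tdata)
    total = 0.0
    for i in range(n):
        Li = 1.0
        for j in range(n):
            if j != i:
                Li *= (tq - tdata[j]) / (tdata[i] - tdata[j])
        total += ydata[i] * Li
    return total

# Three points on y = t^2; the quadratic interpolant reproduces it exactly
print(lagrange(2.5, [1.0, 2.0, 3.0], [1.0, 4.0, 9.0]))   # -> 6.25
```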
For equispaced data with spacing h, define
$$\alpha = \frac{t - t_1}{h} \quad (D.7)$$
so that
$$t - t_i = (t - t_1) - (t_i - t_1) = h\big(\alpha - (i - 1)\big)$$
Similar to Equation D.1, the interpolating polynomial is now written as follows:
$$p_F(\alpha) = a_0 + a_1 h\alpha + a_2 h^2\,\alpha(\alpha - 1) + \cdots + a_{n-1}h^{n-1}\,\alpha(\alpha - 1)\cdots(\alpha - n + 2) \quad (D.8)$$
The above polynomial passes through the point (t1, y1), which corresponds to α = 0. At this
value of α, all the terms on the right-hand side, except the constant term, drop out. Thus
y1 = a0 (D.9)
The polynomial also satisfies the condition y = y2 at α = 1. At this condition, the first two
terms on the right-hand side are retained, whereas the other terms drop out:
y2 = a0 + a1h (D.10)
$$y_{i+1} = a_0 + a_1 h\,i + a_2 h^2\,i(i-1) + \cdots + a_{i-1}h^{i-1}\,\frac{i!}{1!} + a_i h^i\,i! \quad (D.15)$$
As before, the first-order forward differences can be computed by subtracting the consecu-
tive equations. From Equations D.12 and D.10, we obtain
$$\Delta y_2 = y_3 - y_2 = a_1 h + 2a_2 h^2$$
So far, the values of the first two parameters are known from the original data (see
Equation D.9) and first-order forward difference Δy1 (see Equation D.11). The next Newton’s
forward difference parameters can be computed from the higher-order forward difference
formula. Specifically,
$$\Delta^2 y_1 = \Delta y_2 - \Delta y_1 = 2a_2h^2 \quad (D.19)$$
Continuing further,
and so on. Without going into further details, one can observe that a clear pattern
emerges:
$$y_1 = a_0 \quad (D.9)$$
$$\Delta^2 y_1 = 2a_2h^2 \quad (D.19)$$
$$\Delta^3 y_1 = 3!\,a_3h^3 \quad (D.21)$$
$$\Delta^i y_1 = i!\,a_i h^i \quad (D.22)$$
$$p_F(\alpha) = y_1 + \Delta y_1\,\alpha + \frac{\Delta^2 y_1}{2!}\,\alpha(\alpha - 1) + \cdots + \frac{\Delta^{n-1} y_1}{(n-1)!}\prod_{k=0}^{n-2}(\alpha - k) \quad (D.23)$$
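Equation D.23 can likewise be checked with a small Python sketch (my own illustration): build the forward differences Δᵏy₁ and accumulate the series.

```python
from math import factorial

def newton_forward(tq, tdata, ydata):
    # Evaluate p_F(alpha) of Equation D.23 for equispaced tdata
    h = tdata[1] - tdata[0]
    alpha = (tq - tdata[0]) / h
    # forward-difference table: diffs[k] holds Delta^k y_1
    diffs, col = [ydata[0]], list(ydata)
    for k in range(1, len(ydata)):
        col = [col[i + 1] - col[i] for i in range(len(col) - 1)]
        diffs.append(col[0])
    total, poly = 0.0, 1.0
    for k in range(len(diffs)):
        total += diffs[k] / factorial(k) * poly
        poly *= (alpha - k)          # running product alpha(alpha-1)...(alpha-k)
    return total

# Data from y = t^2 at t = 0, 1, 2: the quadratic is recovered exactly
print(newton_forward(1.5, [0.0, 1.0, 2.0], [0.0, 1.0, 4.0]))   # -> 2.25
```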
Now consider that there exists some function f(t) from which the data points for the inter-
polation were generated. The difference between a value generated from the “true” function
f(t) and the interpolated polynomial pF ( a ) is the error in Newton’s forward difference for-
mula. Consider that a new point, (tn + 1, yn + 1) also becomes available. The error in ignoring
this additional data while computing pF ( a ) is given by the additional term
$$R_F = \frac{\Delta^n y_1}{n!}\,\alpha(\alpha - 1)\cdots(\alpha - n + 1) \quad (D.24)$$
The nth-order forward difference is related to the nth derivative of the underlying function at some point ξ:
$$\frac{d^n f}{dt^n}(\xi) = \frac{\Delta^n y_1}{h^n}, \qquad t_1 \le \xi \le t_n \quad (D.25)$$
Thus, the residual term in computing Newton’s forward difference formula from
Equation D.23 is
$$R_F = \frac{h^n}{n!}\,f^{(n)}(\xi)\,\alpha(\alpha - 1)\cdots(\alpha - n + 1) \quad (D.26)$$
The basic MATLAB function for interpolation is interp1, invoked as
>> yq = interp1(tData,yData,tq,method);
where tData and yData form the original data, and the interpolated value yq is returned
for the query point, tq. The help text for interp1 (>> help interp1) describes the
available interpolation methods, which include 'linear', 'previous', 'next', 'spline', and 'pchip'.
The first option fits a piecewise linear curve to the datapoints. For the data in Table D.1, the
value at tq = 45 is the average between the values at 40 and 50, as shown below:
>> yq = interp1(tData,yData,45,'linear')
yq =
9.5000
The method ‘previous’ provides the so-called zero-order hold (also known as piece-
wise constant) approximation, wherein the value of yq is kept constant at the value yi for the
entire interval ti ≤ t < ti + 1. Thus, the value of yq at tq = 45 is the same as the value at t = 40, that
is, yq=7. The most popular options, though, are cubic spline and piecewise cubic Hermite
interpolating polynomial (PCHIP). The preferred MATLAB functions for cubic spline and
PCHIP algorithms are spline and pchip, respectively. The following example illustrates
their use.
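The behavior of the 'linear' and 'previous' options is easy to reproduce in plain Python (a sketch of the underlying formulas, not MATLAB's implementation):

```python
import bisect

def interp_linear(tq, tdata, ydata):
    # piecewise linear: straight line between the two bounding points
    i = bisect.bisect_right(tdata, tq) - 1
    i = min(max(i, 0), len(tdata) - 2)
    frac = (tq - tdata[i]) / (tdata[i + 1] - tdata[i])
    return ydata[i] + frac * (ydata[i + 1] - ydata[i])

def interp_previous(tq, tdata, ydata):
    # zero-order hold: keep y_i over the whole interval t_i <= t < t_{i+1}
    return ydata[min(max(bisect.bisect_right(tdata, tq) - 1, 0), len(ydata) - 1)]

tData = list(range(0, 100, 10))
yData = [45, 32, 0, 0, 7, 12, 20, 15, 29, 55]
print(interp_linear(45, tData, yData))     # -> 9.5, matching interp1(...,'linear')
print(interp_previous(45, tData, yData))   # -> 7, matching 'previous'
```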
It is interesting to note that for this data, cubic spline happens to give an interpolated
value close to that obtained using Lagrange and Newton's interpolating polynomials,
whereas PCHIP provides a significantly different value.
Spline interpolation fits piecewise polynomial functions to successive points. The sim-
plest one is the "linear spline." Here, two successive points are connected by a straight line.
The interpolating function is just a combination of these lines, given by the following piecewise
linear function:
$$f(t) = \begin{cases} y_1 + (t - t_1)\,\dfrac{y_2 - y_1}{t_2 - t_1}, & t_1 \le t < t_2\\[2mm] y_2 + (t - t_2)\,\dfrac{y_3 - y_2}{t_3 - t_2}, & t_2 \le t < t_3\\[1mm] \;\vdots \end{cases} \quad (D.27)$$
The first function above is clearly a straight line connecting (t1, y1) and (t2, y2). Any point
between t1 and t2 is then interpolated based on this equation. The same applies to the rest
of the data intervals.
The term spline is typically not used for the function above. Instead, this is referred to as
linear interpolation, or more appropriately, piecewise linear interpolation. The linear func-
tions result in a curve that is nonsmooth. The term “spline” evokes an idea similar to using
the Bezier curves in an art class to smoothly connect dots on a canvas. The connecting
curves are smooth. While different splines exist, the practically most relevant is cubic spline
interpolation. When the term “spline” is used without any qualifier, it usually means cubic
spline, where piecewise cubic polynomials are used for interpolation. Thus, a (cubic) spline
would be represented as
$$f(t) = \begin{cases} q_1(t), & t_1 \le t \le t_2\\ q_2(t), & t_2 \le t \le t_3\\ \;\vdots\\ q_{n-1}(t), & t_{n-1} \le t \le t_n \end{cases} \quad (D.28)$$
qi(t) is a cubic function. The choice of qi(t) can be any cubic function; however, since this func-
tion connects the points (ti, yi) and (ti+1, yi+1), it must satisfy qi(ti) = yi and qi(ti+1) = yi+1.
Borrowing an idea from Lagrange interpolation, let us define
$$\ell_i = \frac{t - t_i}{t_{i+1} - t_i} \quad (D.29)$$
so that ℓi = 0 at ti and (1 − ℓi) = 0 at ti+1. The cubic polynomial may then be written as
$$q_i(t) = (1 - \ell_i)\,y_i + \ell_i\,y_{i+1} + a_i\,\ell_i(1 - \ell_i)^2 + b_i\,\ell_i^2(1 - \ell_i) \quad (D.30)$$
The above function has the advantage that two of the four coefficients of a cubic polynomial
are conveniently defined. Note that when ℓi = 0, only the first term remains and the other
three are zero, whereas when ℓi = 1, only the second term is nonzero.
A total of (n − 1) polynomials of the form (D.30) define the cubic spline. Hence, 2(n − 1)
parameters ai and bi have to be obtained. There are two boundary nodes, t1 and tn, and (n − 2)
internal nodes. In order to ensure that the curve is smooth, the slope and curvature of the two
curves qi−1(t) and qi(t) should be equal at each internal node ti. This leads to the following
2(n − 2) equations:
$$q'_{i-1}(t_i) = q'_i(t_i), \qquad i = 2, \ldots, n-1 \quad (D.31)$$
$$q''_{i-1}(t_i) = q''_i(t_i), \qquad i = 2, \ldots, n-1 \quad (D.32)$$
whereas the final two equations are obtained by imposing the no-curvature condition at the two
boundary nodes:
$$q''_1(t_1) = 0 \quad (D.33)$$
$$q''_{n-1}(t_n) = 0 \quad (D.34)$$
The derivation of the above equations is straightforward, though tedious. I refer the inter-
ested reader to any numerical methods text for the derivation. The intent of this discussion
was to give a flavor of cubic spline interpolation.
The cubic spline option is available in the interp1 function discussed earlier in this
section. Rather than the generic interp1 function, the preferred means of using cubic
spline is the function spline, as demonstrated in Example D.3. It should be noted that all
these functions allow the independent query variable tq to be a vector. In such a case, the
code returns a vector yq of the same length such that each element of yq is the interpolated
value at the corresponding element of the vector tq. Thus, if interpolated values are required
at multiple points, the function spline or pchip may be used with a vector-valued tq.
The primary advantage of spline function is in the cases where interpolants are
required to be computed multiple times in any MATLAB application. Typically, if the val-
ues of independent variables are known a priori, a vector-valued tq can be used. However,
when the query points for interpolation may be generated within the code, the function
spline (or pchip) needs to be used multiple times. Since calling these functions involves
computing the parameters of Equation D.30 each time for the same data sets tData and
yData, this would be computationally inefficient. A better way to do this is to use the
spline function once to define the piecewise cubic polynomials and store them:
>> splineParam = spline(tData,yData)
Invoking spline without the query vector tq returns a pp structure, which defines the
node points (the original values of t in Table D.1) as well as the coefficients of the nine cubic
polynomials; it also indicates that this pp structure is a cubic spline interpolation (order: 4)
for 1D interpolation (dim: 1). Once this structure is obtained, it may be used multiple
times through the ppval command:
>> yq = ppval(splineParam,45)
yq =
9.0185
Note that this result is the same as that in Example D.3. The next example demonstrates the
use of spline interpolation through multiple applications.
% Given data
tData = [0:10:90]';
yData = [45 32 0 0 7 12 20 15 29 55]';
plot(tData,yData,'bo'); hold on; % Plot original data
[Plot: cubic spline fit through the data of Table D.1; speed (km/h) vs. time (s).]
Figure D.2 shows the result of the above code. Spline interpolation results in a smooth
curve fitting to the initial data. The smoothness is due to the conditions of (D.31) and
(D.32) imposed at all interior nodes.
FIGURE D.3 Comparison of cubic spline (solid line) and monotonicity shape preserving PCHIP
interpolation (dashed line) for vehicle speed data.
The resulting interpolated values are plotted in Figure D.3. The dashed line shows
that the pchip interpolation preserves monotonicity. Between t = 20 and t = 30, the
interpolated values remain at 0 km/h because the interpolant does not allow yq to get
values outside the range of the bounding points.
This completes our discussion on interpolation. While Newton's difference formulae were
derived in this appendix, the practical implementation of interpolation in MATLAB is
through the spline and pchip functions.
Appendix E: Numerical Integration
E.1 GENERAL SETUP
Numerical integration aims to find an approximate solution to the problem:
$$I = \int_a^b f(x)\,dx \quad (E.1)$$
If y(x) is an antiderivative of f(x), that is,
$$f(x) = \frac{d}{dx}\,y(x) \quad (E.2)$$
then integral I is equal to I = y(b) − y(a). Numerical methods for computing the integral
can be used if f(x) either is known as an explicit function of x, or can be calculated numeri-
cally or indirectly (for any given value of x), or is available as a tabulated pair (x, f(x)).
Additionally, note that Equation E.2 is an ordinary differential equation. The aim of an
ODE solver is to obtain y(x) given an initial value y(a) at a point x = a. Thus, evaluating
the integral I is equivalent to solving the ODE (E.2) for y(a) = 0. An equivalence between
numerical integration and solving ODE is provided in a plug flow reactor (PFR) case study
in Chapter 3.
[Figure E.1: the integral I as the area under the curve f(x) between x = a and x = b.]
For example, the standard enthalpy at a temperature T is obtained from its value at a reference temperature T0 as
$$H_T^0 = H_{T_0}^0 + \int_{T_0}^{T} c_p\,dT \quad (E.3)$$
Similarly, the design equation of a plug flow reactor (PFR) requires computing
$$V_{pfr} = F_{A0}\int_0^{X_{set}} \frac{dX}{\big(-r_A(X)\big)} \quad (E.4)$$
In the above equation, Vpfr is the desired volume of the PFR, whereas the rate of chemical
reaction taking place in the system depends on the concentration. It can be converted into
a function of conversion, X, and is given by the function −rA(X). In the rest of this chapter,
the example of PFR design equation will be used to demonstrate the concepts of numerical
integration.
A single application of the trapezoidal rule over the interval [a, b], with h = (b − a), approximates the integral by the area of a trapezoid:
$$I \approx \frac{h}{2}\,(f_a + f_b) \quad (E.5)$$
FIGURE E.2 Schematic showing single application of the trapezoidal and Simpson’s 1/3rd rules.
the value of I. In the case of the trapezoidal rule, the polynomial connecting the two endpoints is a
straight line, T(x). The equation of this line is given by
$$\frac{T(x) - f_a}{x - a} = \frac{f_b - f_a}{b - a} \quad (E.6)$$
Integrating the above equation yields the approximate value of I using the trapezoidal rule:
$$I \approx \int_a^b T(x)\,dx \quad (E.7)$$
(E.7)
$$I = f_a\Big[x\Big]_a^b + \frac{f_b - f_a}{h}\left[\frac{x^2}{2} - ax\right]_a^b$$
$$I = f_a h + \frac{f_b - f_a}{h}\left[\frac{b^2 - a^2}{2} - a(b - a)\right]$$
$$I \approx f_a h + (f_b - f_a)\underbrace{\left(\frac{a + b}{2} - a\right)}_{h/2} \quad (E.8)$$
It is easy to verify that this is the same as the trapezoidal rule written in Equation E.5. ■
Thus, the trapezoidal rule has been derived in two different ways: geometrically using the
analogy with the area of a trapezoid and algebraically integrating the equation of a straight
line connecting the two points. While these derivations are elegant, they still do not reveal
the numerical error in computing the approximate value of integral using Equation E.5.
A formal derivation of the trapezoidal rule including the error estimates will be discussed
presently in Section E.2.2. The overall trapezoidal rule may be written as follows (also see
Equation E.17):
$$I = \frac{h}{2}\,(f_1 + f_2) - \frac{h^3}{12}\,f''(\xi) \quad (E.9)$$
This means that the error in the trapezoidal rule scales as a cube of the interval size. In other
words, the error reduces by a factor of 8 if the interval size, (b − a), is halved. Moreover,
the integral is exact if f″(ξ) = 0, ξ ∈ [a, b]. Obviously, when f(x) is a straight line, then the
trapezoidal rule is exact. Practical implementation of the trapezoidal rule, including dem-
onstration of the error analysis is presented in Example E.1.
Example E.1: Use a single application of the trapezoidal rule to compute
$$I = \int_1^b \big(2 - x + \ln(x)\big)\,dx$$
Solution (direct): The solution to the problem involves computing fa and fb and then
using Equation E.5 to compute the integral. The true value of the integral is obtained
from the antiderivative
$$y(x) = \int \big(2 - x + \ln(x)\big)\,dx = x\ln(x) + x - \frac{x^2}{2}$$
Thereafter, the true value is computed as y(b) − y(1) and the absolute error can be
found. The MATLAB code to do this is given below:
% Problem Setup
a=1; b=2;
h=(b-a);
% Function values at the end-points
fa=2-a+log(a); fb=2-b+log(b);
% Trapezoidal Rule (single application)
I_trap=h/2*(fa+fb);
% True value and absolute error
trueVal=(b*log(b)+b-b^2/2)-(a*log(a)+a-a^2/2);
err_trap=abs(trueVal-I_trap)
The value of integral for b = 2 is 0.8466, whereas the numerical error is 3.9721E-2.
Thus, using the trapezoidal rule with a large step-size of 1 results in nearly 4% error.
Next, the error behavior is analyzed for various values of b. The results are given
below:
b h error
2 1 3.972e-2
1.5 0.5 6.831e-3
1.1 0.1 7.569e-5
Thus, when the interval width is decreased by one order of magnitude, from 1 to 0.1,
the error decreases by almost three orders of magnitude. This is because, as seen in
Equation E.9, the trapezoidal rule has an accuracy of O(h³).
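The error table above can be reproduced in a few lines of Python (a quick check of Equation E.5, not the book's MATLAB code):

```python
from math import log

f = lambda x: 2 - x + log(x)
y = lambda x: x * log(x) + x - x**2 / 2          # antiderivative of f

def trap_single(a, b):
    # single application of the trapezoidal rule, Equation E.5
    h = b - a
    return h / 2 * (f(a) + f(b))

for b in (2.0, 1.5, 1.1):
    err = abs((y(b) - y(1.0)) - trap_single(1.0, b))
    print(b, err)   # errors shrink roughly as h^3: ~4e-2, ~7e-3, ~8e-5
```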
While the above solution works, it can be made more elegant using a MATLAB function. A
separate MATLAB function fun4Int may be defined to calculate f(x) = 2 − x + ln(x) as a
function of the independent variable x. There are two ways to do this: as a function file
(fun4Int.m) containing
function f = fun4Int(x)
    f = 2 - x + log(x);
end
or as an anonymous function:
fun4Int = @(x) 2 - x + log(x);
Once the function has been defined, the code in Example E.1 may be modified as follows :
% Problem Setup
a=1; b=2;
h = (b-a);
% Trapezoidal Rule (single application)
I_trap=h/2*(fun4Int(a)+fun4Int(b));
err_trap=abs(trueVal-I_trap);
The solution above is more elegant than the "direct" solution. It is modular, in that it allows the
core integration code to be used for integrating other functions with little change. Since the
function f(x) is written once and called multiple times as needed, the code is also less prone to errors.
Simpson's 1/3rd rule uses three equispaced points, with step-size h = (b − a)/2:
$$t_1 = a, \qquad f_1 = f(t_1)$$
$$t_2 = \frac{a + b}{2}, \qquad f_2 = f(t_2) \qquad (E.10)$$
$$t_3 = b, \qquad f_3 = f(t_3)$$
Simpson’s 1/3rd rule along with the numerical error is given by the following expression:
$$I = \frac{h}{3}\,(f_1 + 4f_2 + f_3) - \frac{h^5}{90}\,f^{(4)}(\xi) \quad (E.11)$$
Simpson's 1/3rd rule requires, for each interval, one additional function evaluation compared
with the trapezoidal rule. However, the accuracy of the numerical integration formula increases
by two orders of step-size, from O(h³) in the case of the trapezoidal rule to O(h⁵) in the case of
Simpson's 1/3rd rule. The derivation of Simpson's 1/3rd rule and its error estimate will be
discussed in Section E.2.2. Use of this method for numerical integration is demonstrated
in the next example.
b h error
2 0.5 4.598e-4
1.5 0.25 2.772e-5
1.1 0.05 1.718e-8
In this example, when the interval width is decreased by one order of magnitude, from
1 to 0.1, the error decreases by between four and five orders of magnitude: The ratio of
the two errors is 3.7 × 10⁻⁵, consistent with the observation that Simpson's 1/3rd rule
is O(h⁵) accurate.
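A similar quick check in Python (illustrative only) confirms the first row of the table for Simpson's 1/3rd rule:

```python
from math import log

f = lambda x: 2 - x + log(x)
y = lambda x: x * log(x) + x - x**2 / 2          # antiderivative of f

def simpson13(a, b):
    # single application of Simpson's 1/3rd rule, Equation E.11 (formula part)
    h = (b - a) / 2
    return h / 3 * (f(a) + 4 * f((a + b) / 2) + f(b))

err = abs((y(2.0) - y(1.0)) - simpson13(1.0, 2.0))
print(err)   # ~ 4.6e-4, versus ~ 4e-2 for the single-application trapezoidal rule
```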
Direct comparison between the trapezoidal and Simpson's 1/3rd rules reveals that the latter
is more accurate. This rather significant improvement in accuracy comes at a cost of only
one additional function evaluation. Another example, discussed in Section E.2.3, will dem-
onstrate the significance of the order of accuracy: The additional computational require-
ment is justified by a highly significant reduction in the error of the numerical method.
$$I = \frac{3h}{8}\,(f_1 + 3f_2 + 3f_3 + f_4) - \frac{3h^5}{80}\,f^{(4)}(\xi) \quad (E.12)$$
Equation E.12 gives Simpson's 3/8th rule. Comparing with Equation E.11, Simpson's 3/8th
rule has the same order of accuracy as the 1/3rd rule, since the error in both methods
scales as O(h⁵). Note that Simpson's 1/3rd rule gives the same order of accuracy as the 3/8th
rule, but with one less function evaluation in each application.
Since Simpson’s 3/8th rule follows the same general pattern as the trapezoidal and
Simpson’s 1/3rd rules, a demonstration will be skipped for brevity. The reader can verify
that the error using Simpson’s 3/8th rule for Example E.1 is of the same order of magnitude
as the 1/3rd rule, albeit slightly lower.
$$I = \int_a^b f(x)\,dx \approx \int_a^b p(x)\,dx \quad (E.13)$$
In the case of the trapezoidal rule, a first-order interpolating polynomial is fitted to the two data points.
Including its error term, the polynomial p takes the following form:
$$p(\alpha) = f_1 + \Delta f_1\,\alpha + \frac{f''(\xi)}{2}\,\alpha(\alpha - 1)\,h^2 \quad (E.14)$$
Recall from Appendix D that for Newton's forward difference interpolation, we define
$$\alpha = \frac{x - a}{h}, \qquad \text{with } h = (b - a) \quad (E.15)$$
Changing the variable of integration to α, the limits of integration become α = 0 to α = 1:
$$I = \int_0^1 p(\alpha)\,h\,d\alpha$$
Substituting for p(α) from Equation E.14,
$$I = h\int_0^1 \left(f_1 + \Delta f_1\,\alpha + \frac{f''(\xi)}{2!}\,\alpha(\alpha - 1)\,h^2\right) d\alpha$$
$$= h\left[f_1\,\alpha + \Delta f_1\,\frac{\alpha^2}{2} + \frac{f''(\xi)}{2!}\,h^2\left(\frac{\alpha^3}{3} - \frac{\alpha^2}{2}\right)\right]_0^1 \quad (E.16)$$
Evaluating at the limits,
$$I = \underbrace{h\left[\frac{f_1 + f_2}{2}\right]}_{\text{Formula}} + \underbrace{h^3\,\frac{f''(\xi)}{2}\left[-\frac{1}{6}\right]}_{\text{Error}} \quad (E.17)$$
This demonstrates the O(h³) accuracy of the trapezoidal rule mentioned in Section E.2.1
and demonstrated in Example E.1.
In a similar manner, Simpson's 1/3rd rule may also be derived. Here, the step-size is given by
$$h = \frac{b - a}{2}$$
With the same definition of α as before, the limits of integration, x = a and x = b, correspond
to α = 0 and α = 2, respectively. Thus, integral I in Simpson's 1/3rd rule is written as
$$I = \int_0^2 p(\alpha)\,h\,d\alpha$$
Substituting the second-order interpolating polynomial with its error term,
$$I = h\int_0^2 \left(f_1 + \Delta f_1\,\alpha + \frac{1}{2}\,\Delta^2 f_1\,\alpha(\alpha - 1) + \frac{f'''(\xi)}{3!}\,\alpha(\alpha - 1)(\alpha - 2)\,h^3\right) d\alpha$$
$$= h\left[f_1\,\alpha + \Delta f_1\,\frac{\alpha^2}{2} + \Delta^2 f_1\left(\frac{\alpha^3}{6} - \frac{\alpha^2}{4}\right) + \frac{f'''(\xi)}{6}\,h^3\left(\frac{\alpha^4}{4} - \alpha^3 + \alpha^2\right)\right]_0^2 \quad (E.18)$$
$$I = \underbrace{h\left[2f_1 + 2\Delta f_1 + \Delta^2 f_1\left(\frac{4}{3} - 1\right)\right]}_{\text{Formula}} + \underbrace{h^4\,\frac{f'''(\xi)}{6}\,\big[4 - 8 + 4\big]}_{\text{Is Error}\,=\,0?} \quad (E.19)$$
The above equation gives a surprising result. The first part gives Simpson's 1/3rd rule.
However, notice that the bracketed expression in the final term equates to zero. Does that
mean that Simpson's 1/3rd rule is exact and does not have any numerical error? As
demonstrated in Example E.2, there is a numerical error in implementing Simpson's 1/3rd
rule. What the above derivation implies is that the leading error term is not the term
involving the third derivative; instead, the analysis needs to be carried out including an
additional term in the Newton's forward difference polynomial used in (E.18). Doing so yields
I = h \int_0^2 \left( f_1 + \Delta f_1\,\alpha + \frac{1}{2!}\,\Delta^2 f_1\,\alpha(\alpha - 1) + \frac{1}{3!}\,\Delta^3 f_1\,\alpha(\alpha - 1)(\alpha - 2) + \frac{f^{(4)}(\xi)}{4!}\,\alpha(\alpha - 1)(\alpha - 2)(\alpha - 3)\,h^4 \right) d\alpha    (E.20)
Note that the first four terms are the same as those obtained in Equation E.19. Thus
I = h \left[ 2 f_1 + 2 \Delta f_1 + \frac{1}{3}\,\Delta^2 f_1 + \Delta^3 f_1 \cdot 0 + \frac{f^{(4)}(\xi)}{24}\, h^4 \left( \frac{\alpha^5}{5} - \frac{3}{2}\,\alpha^4 + \frac{11}{3}\,\alpha^3 - 3 \alpha^2 \right) \right]_0^2

I \approx h \left[ 2 f_1 + 2 \left( f_2 - f_1 \right) + \frac{1}{3} \left( f_3 - 2 f_2 + f_1 \right) \right]
and
E = h^5\, \frac{f^{(4)}(\xi)}{24} \left( \frac{32}{5} - 24 + \frac{88}{3} - 12 \right) = -h^5\, \frac{f^{(4)}(\xi)}{90}
I = \underbrace{\frac{h}{3} \left[ f_1 + 4 f_2 + f_3 \right]}_{\text{Formula}} \;\underbrace{-\, h^5\, \frac{f^{(4)}(\xi)}{90}}_{\text{Error}}    (E.21)
The above equation shows Simpson's 1/3rd rule along with its error estimate. This justifies
the observation in Example E.2 regarding the effect of interval size on the accuracy of
Simpson's 1/3rd rule.
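Equation E.21 can be illustrated with a short pure-Python sketch. The integrand 2 − x + ln(x) and the interval [1, 2] are assumed from the appendix's running example (the values 0.8851 and 0.8863 reported later correspond to this integral, whose exact value is 2 ln 2 − 1/2); the variable names are mine:

```python
import math

def f(x):
    # assumed example integrand; exact integral over [1, 2] is 2*ln(2) - 1/2
    return 2.0 - x + math.log(x)

I_exact = 2.0 * math.log(2.0) - 0.5

# single application of the trapezoidal rule, h = (b - a)
I_trap = (2.0 - 1.0) * (f(1.0) + f(2.0)) / 2.0

# single application of Simpson's 1/3rd rule, h = (b - a)/2: formula part of E.21
h = 0.5
I_simp = (h / 3.0) * (f(1.0) + 4.0 * f(1.5) + f(2.0))

print(I_exact - I_trap, I_exact - I_simp)   # ~4.0e-2 vs ~4.6e-4
```

With the same two end-points plus one midpoint evaluation, Simpson's rule is nearly two orders of magnitude more accurate than the trapezoidal rule here, in line with the h^5 versus h^3 error terms.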
h = \frac{b - a}{n}, \quad x_1 = a, \quad x_{n+1} = b
where I_i^{trap} is the integral obtained over the ith interval using data at the ith and (i + 1)th nodes:
I_i^{trap} = \frac{h}{2} \left( f_i + f_{i+1} \right) - \frac{h^3}{12}\, f''(\xi_i)    (E.23)
I = \frac{h}{2} \left[ \left( f_1 + f_2 \right) + \left( f_2 + f_3 \right) + \cdots + \left( f_n + f_{n+1} \right) \right] - \frac{h^3}{12} \sum_{i=1}^{n} f''(\xi_i)    (E.25)
I = \frac{h}{2} \left[ f_1 + 2 \left( f_2 + \cdots + f_n \right) + f_{n+1} \right] - \frac{h^3}{12}\, n f''(\bar{\xi})    (E.26)
The last term in the above equation is derived using the mean value theorem, with \bar{\xi} \in [a, b].
The first sum gives the integral using multiple applications of the trapezoidal rule. Using the
relationship h = (b − a)/n, the error term may be written as
E_{net} = -\frac{(b - a)^3}{12 n^3}\, n f''(\bar{\xi})    (E.27)
E_{net} = -\frac{(b - a)}{12}\, h^2 f''(\bar{\xi})    (E.28)
There are a few things to note from the error analysis of multiple applications of the trapezoidal
rule. The first thing is that the net error is proportional to h2 and not h3. This means that the
error reduces by two orders of magnitude when the step-size is reduced by one order of magni-
tude. This issue of local vs. global truncation errors (GTEs) is discussed at length in Chapter 1.
Another important point to note is regarding this drop in the order of accuracy for the GTE.
Throughout this book, I have discussed the importance of the order of accuracy: an
O(h^n) method would often be a preferred numerical method over an O(h^m) one, when n > m.
So, what does it mean when the local truncation error (LTE) scales as O(h^3) while the
GTE scales as O(h^2)? Does that mean that multiple applications of the trapezoidal rule
are not useful? In reality, this is not the case, because for a single application, h = (b − a),
whereas for multiple applications of the trapezoidal rule, h = (b − a)/n. So, the comparison
between the two options is valid only if the step-size h refers to the same quantity.
In the LTE, h refers to the entire interval size, whereas in the GTE it refers to the individual
step-size. On the other hand, comparison between the trapezoidal and Simpson's rules is still
valid because in both these cases h refers to the same thing.
So, how does one compare single and multiple applications of the trapezoidal rule?
In the case of the former, n = 1 in Equation E.27. Thus, the error in a single application of the
trapezoidal rule is
E_{net} = -\frac{(b - a)^3}{12}\, f''(\xi)    (E.29)
whereas for n applications of the trapezoidal rule,

E_{net} = -\frac{(b - a)^3}{12 n^2}\, f''(\bar{\xi})    (E.27)
Thus, the error in multiple applications of the trapezoidal rule is expected to improve by a
factor of 4 when two intervals are used, and by two orders of magnitude when n ∼ 10 inter-
vals are used. The next example demonstrates this.
\int_1 \left( 2 - x + \ln(x) \right) dx
When the code is executed for n = 10, the error in the trapezoidal rule is 4.164 × 10−4.
Comparing with Example E.1, the error has fallen by nearly two orders of magnitude,
as predicted in Equation E.27.
Next, the step-size was halved (with n = 20). The resulting error was
In summary, the error decreased by a factor of 4 when the step-size was reduced by
half, whereas it decreased by two orders of magnitude when the step-size was reduced
by one order of magnitude. This observation is a consequence of the fact that the GTE
of the trapezoidal rule scales as O(h^2).
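The numbers quoted above can be reproduced with a few lines of pure Python (a sketch, not the book's MATLAB code; the integrand 2 − x + ln(x) and interval [1, 2] are assumed from the example, and `trap_composite` is my own helper name):

```python
import math

def f(x):
    return 2.0 - x + math.log(x)

def trap_composite(a, b, n):
    # formula part of Equation E.26: (h/2)[f1 + 2(f2 + ... + fn) + f(n+1)]
    h = (b - a) / n
    return (h / 2.0) * (f(a) + f(b) + 2.0 * sum(f(a + i * h) for i in range(1, n)))

I_exact = 2.0 * math.log(2.0) - 0.5        # exact value of the integral on [1, 2]
err10 = I_exact - trap_composite(1.0, 2.0, 10)
err20 = I_exact - trap_composite(1.0, 2.0, 20)
print(err10, err10 / err20)   # err10 ≈ 4.164e-4 and the ratio ≈ 4: GTE is O(h^2)
```

Doubling n halves h, and the error drops by almost exactly 2² = 4, confirming the O(h²) scaling of the global truncation error.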
Another interesting observation on comparing the above results with Example E.2 is
that the error in a single application of Simpson’s 1/3rd rule happened to be similar to the
error using 10 applications of the trapezoidal rule. This demonstrates the power of using
higher-order formulae: Three function evaluations in Simpson’s 1/3rd rule were sufficient
to provide similar accuracy as eleven function evaluations in the trapezoidal rule.
I = \frac{h}{3} \left[ \left( f_1 + 4 f_2 + f_3 \right) + \left( f_3 + 4 f_4 + f_5 \right) + \cdots + \left( f_{n-1} + 4 f_n + f_{n+1} \right) \right] + E_{net}    (E.30)
The reader can verify that the formula for, and the error in, multiple applications of Simpson's 1/3rd rule are given by
I = \frac{h}{3} \left[ f_1 + 4 \left( f_2 + f_4 + \cdots + f_n \right) + 2 \left( f_3 + f_5 + \cdots + f_{n-1} \right) + f_{n+1} \right] + E_{net}    (E.31)
E_{net} = -h^4\, \frac{(b - a)}{180}\, f^{(4)}(\bar{\xi})    (E.32)
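Equation E.31 can be sketched directly in Python (again assuming the example's integrand and the interval [1, 2]; `simpson_composite` is a hypothetical helper, not the book's code). With n = 10 intervals it should agree with the exact value 2 ln 2 − 1/2 far more closely than the trapezoidal rule at the same n:

```python
import math

def f(x):
    return 2.0 - x + math.log(x)

def simpson_composite(a, b, n):
    # Equation E.31: (h/3)[f1 + 4(f2 + f4 + ...) + 2(f3 + f5 + ...) + f(n+1)]
    assert n % 2 == 0, "Simpson's 1/3rd rule needs an even number of intervals"
    h = (b - a) / n
    s = f(a) + f(b)
    s += 4.0 * sum(f(a + i * h) for i in range(1, n, 2))  # f2, f4, ..., fn
    s += 2.0 * sum(f(a + i * h) for i in range(2, n, 2))  # f3, f5, ..., f(n-1)
    return (h / 3.0) * s

I_exact = 2.0 * math.log(2.0) - 0.5
I10 = simpson_composite(1.0, 2.0, 10)
print(abs(I_exact - I10))   # about 1e-6, versus ~4e-4 for the trapezoidal rule
```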
With fVec available, Equation E.26 is used for the trapezoidal rule, Equation E.31
for Simpson’s 1/3rd rule, and a similar equation (derivation is left as an exercise to the
reader) for Simpson’s 3/8th rule. With the true value of the integral known, the errors
can also be computed. The results are given below.
Value of integral using the three methods:
0.8851 (trapezoidal)    0.8863 (Simpson's 1/3rd)    0.8863 (Simpson's 3/8th)
will give the exact same result as the trapezoidal rule implemented in Example E.3 earlier.
Simply replacing the third line from the end of Example E.3 will provide the same results
as multiple applications of the trapezoidal rule.
A more powerful computation of the integral is provided by the command quad. The syntax
for quad is

where the first argument passes the function fun4Int that calculates f(x), while the last
two arguments are the limits of integration. This uses Simpson's quadrature method.
In later versions of MATLAB, quad will be superseded by an improved algorithm, integral.
This uses a global adaptive quadrature method. The regular implementation is similar:
There are two main differences between trapz and the two quadrature methods mentioned
here. The first one is that trapz takes the actual values [x_1, x_2, …, x_{n+1}] and the
corresponding function values [f_1, f_2, …, f_{n+1}] as input arguments. On the other hand, quad
and integral require the actual function f(x), in the form of a MATLAB function file
(or MATLAB anonymous function), as an input argument, and compute the integral for the
specified integration limits (also input arguments). The second difference is that trapz
simply calculates the numerical integral value as such; quad and integral calculate
the numerical integral such that the error E < ε_tol. In other words, if I_num is the numerical
integration value and I_true is the true value of the integral, then the step-size h is chosen such
that the error |I_true − I_num| < ε_tol.
The error tolerance value ε_tol can be specified by the user. If it is not specified (as in the
example above), a default value of ε_tol = 10^{-6} is used.
Recall that according to Equation E.32, it is possible to know the error E_net. However, in
most practical examples, it is either not possible or not convenient to know f^{(4)}(\bar{\xi}). Hence,
there needs to be an alternate method to estimate the error.
Simpson's quadrature method, implemented in quad, utilizes the fact that although the
error cannot be known exactly, there are ways to estimate it even when f^{(4)}(\bar{\xi}) is unknown.
I will use Simpson's 1/3rd rule to explain the concept behind this.
The true value of the integral and its relationship with Simpson's 1/3rd rule is

I_{true} = I_{simp}(h) + c_1 h^4    (E.33)

where c_1 = -\frac{1}{180}\,(b - a)\, f^{(4)}(\bar{\xi}) is a constant.
If the same calculation is repeated using a step-size of h/2,

I_{true} = I_{simp}(h/2) + c_{1,h/2}\, \frac{h^4}{2^4}    (E.34)

where again c_{1,h/2} = -\frac{1}{180}\,(b - a)\, f^{(4)}(\bar{\xi}).
Usually, f^{(4)}(\bar{\xi}) from Equation E.33 and f^{(4)}(\bar{\xi}) from Equation E.34 are reasonably constant
for most practical problems. Thus, the difference between the two will provide an
estimate of the error:
0 = I_{simp}(h) - I_{simp}(h/2) + \frac{15 h^4}{16}\, c_1    (E.35)
Since E_{net} = c_1 h^4,

E_{net} \sim \frac{16}{15} \left[ I_{simp}(h) - I_{simp}(h/2) \right]    (E.36)
gives an estimate of the error. Simpson’s quadrature method involves calculating the integral
at a chosen value of h, then repeating the calculations with half step-size h/2. Thereafter, the
error is estimated as per Equation E.36. If this value is above the required tolerance, the
algorithm sets h ← h/2 and the procedure is repeated until the error tolerance is met.
When this condition is met, the more accurate of the two values is returned as the integral.
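The halving strategy described above can be sketched in a few lines of pure Python. This is an illustration of the idea, not MATLAB's actual quad source; the test integrand, the interval [1, 2], and the helper names `simpson` and `quad_like` are assumptions of the sketch:

```python
import math

def f(x):
    return 2.0 - x + math.log(x)

def simpson(a, b, n):
    # composite Simpson's 1/3rd rule with n (even) intervals of width h = (b - a)/n
    h = (b - a) / n
    s = f(a) + f(b)
    s += 4.0 * sum(f(a + i * h) for i in range(1, n, 2))
    s += 2.0 * sum(f(a + i * h) for i in range(2, n, 2))
    return (h / 3.0) * s

def quad_like(a, b, tol=1e-6):
    # Halve the step until the Equation E.36 estimate (16/15)|I(h) - I(h/2)|
    # falls below tol, then return the more accurate value I(h/2).
    n = 2
    I_h = simpson(a, b, n)
    while True:
        n *= 2
        I_h2 = simpson(a, b, n)
        if (16.0 / 15.0) * abs(I_h - I_h2) < tol:
            return I_h2
        I_h = I_h2

I_exact = 2.0 * math.log(2.0) - 0.5
I_adapt = quad_like(1.0, 2.0)
print(abs(I_exact - I_adapt))   # comfortably below the 1e-6 tolerance
```

Because the returned value I(h/2) carries roughly 1/16 of the estimated error of I(h), the final answer typically lands well inside the requested tolerance.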
The command integral uses a different algorithm. Since it is not aligned with the
objective of this textbook, it will not be discussed further. This chapter ends with the fol-
lowing example of implementation of quad and integral.