Solving Linear and Nonlinear Equations with Python
Prepared by:
Dr. Gokhan Bingol ([email protected])
November 16, 2024
(Initial publication: February 10, 2022)
Document version: 4.0
Before the use of computers, there were several ways to solve algebraic and transcendental equations. In
some cases, the roots could be obtained by direct methods; however, many other equations could not be
solved directly (Chapra & Canale 2013). Linear and nonlinear equations arise not only in many
aspects of process engineering analysis but also in many machine learning (ML) and artificial intelligence
(AI) methods. Therefore, mastering the methods to obtain the solutions of these equations is not only
essential to understand, analyze and design engineering systems, but also enables engineers to
tackle a range of ML/AI challenges, from predictive modeling to feature extraction.
Given the abundance of tools at our disposal, students generally question whether it is necessary to learn
the methods presented in this work. The answer is, it depends. If one only needs to solve a single equation
that can be conveniently plotted, then plotting would probably be the best approach. On the other hand, if the
equation is generated by, say, Process A and Process B needs the solution to continue its computation, then
the need to numerically solve the equations arises. Even without the knowledge of any of the methods in
the following sections, it is still possible to “numerically” find the root of a given function by writing a
simple code. Let’s work on finding the root of f(x) = x² − 5 = 0 in the interval [0, 4].
Script 1.1
import numpy as np

n = 1
while True:
    x = np.linspace(start=0, stop=4, num=10**n)
    y = np.fabs(x**2 - 5)
    index = np.argwhere(y < 1E-5)
    if len(index) == 0:
        n += 1
        continue
    print(f"Generated {10**n:,} numbers and root={x[index]}")
    break
Generated 1,000,000 numbers and root=[[2.23607]]
Note that somewhere between 100,000 and one million linearly spaced numbers have to be generated to be
able to find the root with a tolerance of 10⁻⁵. Needless to say, this also means that the function has to be
evaluated at over 100,000 points.
If all we were interested in was finding the solution of the equation, then the above approach should work
fairly well; however, it suffers drastically from a performance point of view. If the solution were needed
many times, such an approach would clearly be the bottleneck. Of course, the above approach was a
rather naive way of solving the equation, but it clearly demonstrates the need for better approaches. Let’s try
a different way to solve the same problem:
Script 1.2
f = lambda x: x**2 - 5

x0 = 5 #initial guess
length, TOL = 1, 1E-5
iter = 0 #number of iterations

while True:
    iter += 1
    fx = f(x0)
    if abs(fx) < TOL:
        break
    #step logic (omitted in the original listing; a plausible reconstruction):
    #if stepping by the current length would cross the root, halve the step,
    #otherwise take the step towards the root
    if f(x0 - length)*fx < 0:
        length /= 2
    else:
        x0 -= length
print(x0, iter)
2.23607 24
This approach is significantly better than our first attempt (Script 1.1), since the number of function
evaluations is reduced from over 100,000 to only 24.
In the following sections, you'll find methods that, with the help of modern computers, make solving
complex equations efficient. The target audience of the current work is engineers; therefore, this
document assumes the reader already has some background in calculus and numerical analysis and a basic-to-
intermediate knowledge of the Python programming language. Throughout, you’ll encounter detailed, real-
world engineering examples designed to bring these methods to life and deepen your understanding.
2. Bracketing Methods
Intermediate value theorem: A continuous function whose domain contains the interval [a, b] takes any
value between f(a) and f(b) at some point in the interval (Wikipedia, 2023)1.
Bolzano's theorem: This is a corollary to intermediate value theorem and states that if a continuous
function has values of opposite sign inside an interval, say [a, b], then it has a root in that interval.
It is due to Bolzano’s theorem that bracketing methods require the signs of f(a) and f(b) to be
different. Once a proper choice of [a, b] is made, bracketing methods guarantee that a root will be found.
For example, if we attempt to solve f(x) = x² − 5 = 0 using bracketing methods and the interval is chosen
to be [3, 5], then since f(3)·f(5) > 0 an error will be raised:
Script 2.1
from scisuit.roots import bisect
f = lambda x: x**2-5
result = bisect(f=f, a = 3, b = 5 )
RuntimeError: Func has same sign at both bounds.
One should note that the condition f(a)·f(b)>0 does not necessarily imply that there is no root; it only tells
us that we are not guaranteed to find one. Instead of choosing a=3, if one chooses a=0:
Script 2.2
result = bisect(f=f, a = 0, b = 5 )
print(result)
Bisection using brute-force method
Root=2.23607, Error=9.54e-06 (18 iterations).
It is observed that a root has been found since f(a)·f(b)<0, and therefore, according to Bolzano’s theorem,
there lies at least one root in the interval. However, note that f(a)·f(b)<0 does not guarantee that there is
only a single root in the interval, and according to Smith (1998)², when an equation has multiple roots, it is
the choice of the initial interval which determines which root is located.
1 https://en.wikipedia.org/wiki/Intermediate_value_theorem
2 Smith, MD (1998). https://web.mit.edu/10.001/Web/Course_Notes/NLAE/node2.html
2.1. Bisection method
Bisection method uses two different approaches to locate the root:
A) Brute force,
B) False position (Regula falsi).
It is seen that the brute-force approach required 18 iterations to reach the root (compare with Scripts 1.1 &
1.2). Oliveira & Takahashi (2020) remark that the bisection method typically produces an estimate with a
precision higher than needed. In line with that, although the tolerance level was set to 10⁻⁵, the method
yielded an error on the order of 10⁻⁶. It is insightful to take a look under the hood to see how the bisect
function actually found the root:
Table 2.1: Working internals of bisection method

iter    a           b           xn          f(a)        f(b)        f(xn)       Error
1       0           4           2           -5          11          -1          -
2       2           4           3           -1          11           4          1
...
6       2.125       2.25        2.1875      -0.484375   0.0625      -0.214844   0.0625
...
17      2.236053    2.236084    2.236069    -0.000065   0.000072     0.000003   0.000015
18      2.236053    2.236069    2.236061    -0.000065   0.000003    -0.000031   0.000008
In the 1st iteration the bounds were [0, 4], which were provided by the user, and therefore the midpoint
X1=2. Since in the 1st iteration f(X1) = −1 < 0, in the 2nd iteration the lower bound, a, was replaced
with X1, and therefore [2, 4] became the new bracket containing the root. Observe that the length of the
interval is always halved (1st: 4 − 0 = 4, 2nd: 2).
We might assume that each and every iteration strictly brings us closer to the true root. Although the
overall trend indeed does that, as seen from Fig. (2.2) the true error might show some oscillations
on the way to the root.
Figure 2.2: % true relative error vs. number of iterations. Note that while in the 1st iteration the true error
was around 10% (X1=2), in the 2nd iteration it suddenly increased to approximately 35% (X2=3).
The question we would like to pose here: Was the perfect fit of the exponential trendline by chance, or was
there some rationale behind it?
Noting that in brute-force the interval is halved, we can write a simple differential equation as follows:

dE/dn = k·E

which can simply be read as “the change of error per iteration is proportional to the existing error”,
which completely overlaps with the definition of brute-force. Using the data in Table 2.1, the value of k
can easily be computed as −0.69314 (= −ln 2). Note that the negative sign of k tells us that the error
decreases. Solving the simple differential equation, the error at the nth iteration is found to be:

En = E0 / 2ⁿ

In a process engineering sense, the differential equation approach can be interpreted as the
error in the brute-force methodology following 1st order “reaction” kinetics.
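As a quick sanity check, the short sketch below recovers k from the halving interval lengths; E0 = 4 is the
initial bracket length from Table 2.1:

import numpy as np

#interval length halves at every iteration: E_n = E0 / 2**n
E0 = 4.0
n = np.arange(0, 10)
E = E0 / 2**n

#slope of ln(E) vs n recovers k
k = np.polyfit(n, np.log(E), 1)[0]
print(k)   #-0.69314... = -ln(2)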
2.1.2) Regula falsi
The brute force approach only halves the interval and does not use any information coming from the
values of f(a) and f(b). For example, if the value of f(a) is closer to zero than f(b), then it is very likely that
a is closer to the root than b.
The improved estimate of the true root, xr, can easily be found by two approaches:
1) Writing the equation of the line between [a, f(a)] and [b, f(b)], which will yield Eq. (2.1):

x_r = b − f(b)·(a − b) / (f(a) − f(b))    (2.1)
2) Using the triangle similarity between [a, f(a), x_r] and [b, f(b), x_r] in Fig. (2.4), which will give Eq. (2.2):

x_r = (a·f(b) − b·f(a)) / (f(b) − f(a))    (2.2)
Eq. (2.1) is recommended by Chapra & Canale (2013) and Press et al. (2007) whereas Eq. (2.2) is used
by more recent literature such as Gupta (2019) and Wikipedia (2022)3.
If the function is strictly monotone in the chosen interval, the regula falsi approach requires fewer iterations
than brute force. For example, in the interval [0, 4], f(x) = x² − 5 = 0 is strictly monotone, and therefore
regula falsi locates the root in fewer iterations than brute force (12 vs. 19; see Table 2.2).
Unlike Example (2.2), if the function’s value is nearly constant in the chosen interval, such as f(x) = x¹⁰ − 1 in
[0, 1.3], then the steps taken at each iteration will be very small and can thus lead to very slow
convergence. In these cases, the brute-force approach can yield considerably faster convergence than
regula falsi.
Example 2.3: Finding the root of f(x) = x¹⁰ − 1 = 0 using brute force and regula falsi.

#brute force (this call is omitted in the original listing but is implied by print(bf))
bf = bisect(lambda x: x**10-1, a = 0, b = 1.3)

#regula falsi
rf = bisect(lambda x: x**10-1, a = 0, b = 1.3, method=("rf", False))

print(bf)
print(rf)
Bisection using brute-force method
Root=1.00000, Error=9.92e-06 (17 iterations).
Fig. (2.5) shows the change of bounds during each iteration for f(x) = x¹⁰ − 1 in the interval [0, 1.3] when
the regula falsi method was used (61 iterations; cf. Table 2.2). It is seen that only the lower bound (a)
changed whereas the upper bound (b) never changed. In other words, the upper bound was stagnant, and
this stagnation caused the slow convergence.
Figure 2.5: When regula falsi without modification was used, the upper bound’s value remained the same,
whereas only the lower bound’s value changed.
One way to alleviate the problem of slow convergence for functions with nearly constant values in the
interval is to modify the regula falsi method by detecting the stagnant bound (the bound whose value never
changes while the other bound is repeatedly replaced by x_r) and dividing the function value at that bound
by 2, as recommended by Chapra and Canale (2013).
Example 2.4: Finding the root of f(x) = x10-1 = 0 using modified regula falsi.
#modified regula falsi
rf_modified = bisect(lambda x: x**10-1, a = 0, b = 1.3, method=("rf", True))
print(rf_modified)
Bisection using regula falsi method
Using Modified regula-falsi
Root=1.00000, Error=4.17e-09 (13 iterations).
It is seen that, compared with regula falsi (Example 2.3), modified regula falsi considerably improved the
convergence rate for f(x) = x¹⁰ − 1. We might wonder why.
If the number of iterations vs f(b) is plotted, the effect of the strategy recommended by Chapra and Canale
(2013) can be observed, as seen from Fig. (2.6-B).
Figure 2.6: The application of the strategy recommended by Chapra and Canale (2013) to alleviate
stagnation, A) value of b (upper bound) vs iterations, B) value of f(b) vs iterations
It is seen from Fig. (2.6-B) that in the 2nd, 4th, ... iterations, when stagnation was detected, the value of
f(b) was halved. On the other hand, from Fig. (2.6-A) it is observed that even though the value of f(b) was
halved, the value of b itself did not change, since any attempt to halve the value of b might cause the root to be
missed.
2.2. Ridders method
Ridders’ method is a powerful variant of the false position method. The root is assumed to be in the
interval [x0, x2], and the midpoint is computed:

x1 = (x0 + x2) / 2    (2.3)

Then a non-linear function is defined:

H(x) = F(x)·e^(mx)    (2.4)

where the exponent m is chosen such that H0 = H(x0), H1 = H(x1) and H2 = H(x2) fall on a straight line
(zero second difference over the equally spaced points):

H2 − 2·H1 + H0 = 0    (2.5)

F2·e^(m·x2) − 2·F1·e^(m·x1) + F0·e^(m·x0) = 0    (2.6)
Dividing Eq. (2.6) by e^(m·x0) and defining d = x1 − x0 = x2 − x1:

F2·e^(2m·d) − 2·F1·e^(m·d) + F0 = 0    (2.7)

Eq. (2.7) is a quadratic equation in terms of e^(md). If we define α = e^(md), then Eq. (2.7) can be rewritten as
follows:

F2·α² − 2·F1·α + F0 = 0    (2.8)

The discriminant Δ = 4·(F1² − F0·F2) > 0, since F2·F0 < 0 (the root is bracketed), ensuring real roots. The
solution of Eq. (2.8) is then:

α = (F1 − sign(F0)·√(F1² − F0·F2)) / F2 > 0    (2.9)
Keeping in mind that H1<0, the final step is to apply regula falsi (Eq. 2.2) between (x1, H1) and (x2, H2) to
find the new point x3:

x3 = (H2·x1 − H1·x2) / (H2 − H1)    (2.10)

After dividing Eq. (2.10) by H1, adding and subtracting x1 in the numerator, and performing
straightforward algebraic manipulation:

x3 = x1 − d / (H2/H1 − 1)    (2.11)

H2/H1 = F(x2)·e^(2md) / (F(x1)·e^(md)) = (F(x2)/F(x1))·e^(md)    (2.12)
Placing Eqs. (2.9) and (2.12) into Eq. (2.11) yields the following equation⁴,⁵:

x3 = x1 + (x1 − x0)·sign[F(x0) − F(x2)]·F(x1) / √(F(x1)² − F(x0)·F(x2))    (2.13)
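A minimal plain-Python sketch of Eq. (2.13) might look as follows (the bracket-update logic is a standard
choice and not necessarily the one used by scisuit):

import math

def ridders(f, a, b, tol=1E-8, maxiter=60):
    fa, fb = f(a), f(b)
    assert fa*fb < 0, "root must be bracketed"
    for i in range(1, maxiter + 1):
        m = 0.5*(a + b)   #x1 of Eq. (2.3)
        fm = f(m)
        s = math.sqrt(fm*fm - fa*fb)
        if s == 0.0:
            return m, i
        #Eq. (2.13)
        x3 = m + (m - a)*math.copysign(1.0, fa - fb)*fm/s
        f3 = f(x3)
        if abs(f3) < tol:
            return x3, i
        #keep a sign-changing bracket for the next iteration
        if fm*f3 < 0:
            a, fa, b, fb = m, fm, x3, f3
        elif fa*f3 < 0:
            b, fb = x3, f3
        else:
            a, fa = x3, f3
    raise RuntimeError("Max iterations exceeded")

print(ridders(lambda x: x**2 - 5, 0, 4))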
Example 2.5: Find the roots of f(x) = x² − 5 = 0 in [0, 4] and f(x) = x¹⁰ − 1 = 0 in [0, 1.3] using Ridders’ method.

from scisuit.roots import ridder   #import and calls assumed; the original listing is not shown

res_x2 = ridder(lambda x: x**2 - 5, a=0, b=4)
res_x10 = ridder(lambda x: x**10 - 1, a=0, b=1.3)
print(res_x2)
print(res_x10)

Ridder's method
Root=2.23607, (3 iterations).
Ridder's method
Root=1.00000, (4 iterations).
Example (2.5) clearly illustrates that, regardless of the function’s behavior, Ridders’ method was able to
find the root after just a few iterations.
The convergence of Ridders’ method is quadratic; however, since it requires two function evaluations per
iteration, the actual order of the method is √2 (Press et al. 2007). Press et al. (2007) state that Ridders’
method is an extraordinarily robust algorithm in both reliability and speed and is therefore generally
competitive with the more highly developed and better established methods such as Van Wijngaarden,
Dekker, and Brent.
⁴ The derivation is based on Ridders (1979); in Eq. (2.13), Ridders (1979) uses sign[F(x0)] whereas Press et al.
(2007) use sign[F(x0)−F(x2)].
⁵ To avoid the sign function in Eq. (2.13), Ridders (1979) recommends dividing the numerator and denominator by F(x0).
However, for small F(x0) values this can cause numerical instabilities; therefore scisuit uses the equation proposed by
Press et al. (2007).
2.3. Discussion
The above-mentioned bracketing methods were tested on 12 different functions at different intervals. The
tabulated data is presented in the format of number of iterations with the total running times in parentheses.
Table 2.2: Number of iterations and running times (ms) required by different bracketing methods
f(x)                                  Interval         brute-force   regula falsi   modified rf   Ridder
f1 = x²·(x²/3 + √2·sin(x)) − √3/18    [0, 1.2]         17 (0.039)    43 (0.148)     10 (0.039)    3 (0.018)
f2 = 11·x¹¹ − 1                       [0.4, 1.6]       17 (0.015)    IE             23 (0.029)    4 (0.008)
f3 = 35·x³⁵ − 1                       [-0.5, 1.9]      18 (0.014)    IE             83 (0.098)    6 (0.010)
f4 = 2·(x·e⁻⁹ − e⁻⁹ˣ) + 1             [-0.5, 0.7]      17 (0.023)    514 (0.806)    18 (0.036)    4 (0.012)
f5 = x² − (1 − x)⁹                    [-1.4, 1]        18 (0.021)    IE             26 (0.047)    5 (0.015)
f6 = (x − 1)·e⁻⁹ˣ + x⁹                [-0.8, 1.6]      18 (0.024)    IE             33 (0.062)    5 (0.014)
f7 = x² + sin(x/9) − 1/4              [-0.5, 1.9]      18 (0.019)    28 (0.056)     11 (0.023)    4 (0.013)
f8 = (1/8)·(9 − 1/x)                  [0.001, 1.201]   17 (0.012)    IE             21 (0.017)    7 (0.007)
f9 = tan(x) − x − 0.046302            [-0.9, 1.5]      18 (0.013)    518 (0.388)    19 (0.022)    3 (0.007)
f10 = x² + x·sin(√(75)·x) − 0.2       [0.4, 1]         16 (0.023)    7 (0.024)      9 (0.024)     3 (0.012)
f11 = x¹⁰ − 1                         [0, 1.3]         17 (0.012)    61 (0.060)     13 (0.017)    4 (0.008)
f12 = x² − 5                          [0, 4]           19 (0.015)    12 (0.017)     8 (0.014)     3 (0.007)
IE: Max iterations exceeded (max iteration = 1000)
It is seen that for all tested functions the brute-force, modified regula falsi and Ridders’ methods converged
to a root, whereas regula falsi had problems converging even when allowed to iterate up to 1000 times.
For example, for f2 and f5, regula falsi required 2789 and 13990 iterations, respectively.
In cases where the function’s value does not change considerably in the chosen interval, e.g. f11 (see
appendix), in terms of number of iterations modified regula falsi performed better than brute force, which
in turn performed considerably better than regula falsi. For a monotone function such as f12, as expected,
both regula falsi and modified rf required fewer iterations than the brute-force approach.
Regardless of the function’s behavior, in all cases Ridders’ method outperformed the regula falsi, modified rf
and brute-force methods in terms of number of iterations. However, one should note that in every iteration
Ridders’ method has to evaluate the function twice, whereas, for example, brute force only evaluates the
function once per iteration.
From Table (2.3), it is seen that regardless of the method applied, the runtime costs of the different functions
differ. Of all tested methods, brute force had the least runtime cost per iteration whereas Ridders’
method had the most, since it has to evaluate the function twice. Except for one function, namely f10,
modified regula falsi had higher runtime costs than regula falsi.
Although looking at the number of iterations it takes to find the root is a good strategy, and is employed in
the literature to judge the applicability of a method to a certain problem, it does not suffice on its own. For
example, in all cases Ridders’ method had the least number of iterations; however, in all cases its runtime
cost per iteration was the highest, e.g. 2.5 to 3 times higher than brute force. Therefore, it is suggested that
the overall runtime performance of a method should also be taken into account when making a selection.
Looking at Table (2.2), it is seen that for all the tested functions Ridders’ method had the least overall
runtime cost although it had the highest runtime cost per iteration.
3. Open Methods
In the previous section we talked about bracketing methods, where the root was always located within an
interval, [a, b]. It is due to this reason that bracketing methods always converge.
Unlike bracketing methods, open methods require initial value(s) that do not need to bracket the root. As
such, open methods can diverge; however, when they converge, they generally converge faster than
bracketing methods. Let’s investigate what we mean by “generally” by finding the root of f(x) = x¹⁰ − 1
using both bracketing and open methods:
Script 3.1
from scisuit.roots import bisect, newton

f = lambda x: x**10-1

res_bisect = bisect(f=f, a = 0, b = 5)
print(res_bisect)

#assumed call (omitted in the original listing): Newton's method with a poor
#initial guess; exceeds the default maximum of 100 iterations
res_newton = newton(f=f, x0 = 0.01, fprime=lambda x: 10*x**9)
print(res_newton)
It is seen that while the bisection method located the root after 19 iterations, Newton’s method exceeded the
maximum number of iterations, which is set to 100 by default (for scipy.optimize⁶, maxiter=50). Now, if we
set maxiter to 500 and run Newton’s method again:
Script 3.2
res_newton = newton(f=f, x0 = 0.01, fprime=lambda x: 10*x**9, maxiter=500)
print(res_newton)
Newton method using (Newton-Raphson)
Root=1.00000, Error=7.41e-09 (377 iterations).
377 iterations!!
6 https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.newton.html
Finally, let’s change our initial guess x0 to 1.3 and observe the effect:
Script 3.3
res_newton = newton(f=f, x0 = 1.3, fprime=lambda x: 10*x**9, maxiter=500)
print(res_newton)
Newton method using (Newton-Raphson)
Root=1.00000, Error=2.92e-09 (7 iterations).
7 iterations!! Considerably faster than the bisection method. It is clearly seen that when using open methods,
an intuition about the function can make a considerable difference not only in the performance but also in
the outcome of the method.
3.1. Newton-Raphson method
It is probably the most widely used root finding method (Chapra and Canale 2013; Press et al. 2007). As
seen from Fig. (3.1), the iteration process starts with an initial guess of the root, x_i. Then, using the
derivative of the function, a tangent line is drawn and its intersection with the x-axis is taken as an
improved estimate of the true root. Therefore,

x_(i+1) = x_i − f(x_i) / f'(x_i)    (3.1)
Eq. (3.1) tells us that in order to find the root, besides an initial guess, x0, the value of the function, f(x), and
the value of its derivative, f'(x), are needed at each iteration step.
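A minimal plain-Python sketch of Eq. (3.1); the tolerance and iteration cap are illustrative choices:

def newton_raphson(f, fprime, x0, tol=1E-5, maxiter=100):
    x = x0
    for i in range(1, maxiter + 1):
        fx = f(x)
        if abs(fx) < tol:
            return x, i
        x -= fx / fprime(x)   #Eq. (3.1)
    raise RuntimeError("Max iterations exceeded")

#f(x) = x**2 - 5 with x0 = 1 converges in a handful of iterations
print(newton_raphson(lambda x: x*x - 5, lambda x: 2*x, 1.0))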
f(x) = x² − 5 = 0 is monotonically increasing in the neighborhood of x0=1; therefore, only 5 iterations were
required to find the root with an error of 9.18·10⁻⁷. It should also be noted that for f(x) = x² − 5, the initial
estimate would have little effect on the performance of the method.
Example 3.2: Find the root of f(x) = x10 – 1 = 0.
res = newton(lambda x: x**10-1, x0=1.3, fprime=lambda x: 10*x**9)
print(res)
Newton method (Newton-Raphson)
Root=1.00000, Error=2.92e-09 (7 iterations).
However, unlike f(x) = x² − 5 = 0, where the initial estimate of the root had a minor effect on the
performance of the Newton-Raphson method, for f(x) = x¹⁰ − 1 = 0 the initial estimate of the root could have
a significant effect on performance, especially when the initial estimate is in a location where the function’s
value is nearly constant.
Script 3.4
from scisuit.roots import newton

f = lambda x: x**10 - 1
df = lambda x: 10*x**9

#assumed calls (omitted in the original listing): two contrasting initial guesses
res_far = newton(f=f, x0 = 100, fprime=df, maxiter=500)
res_flat = newton(f=f, x0 = 0.01, fprime=df, maxiter=500)
It is observed from the output that when the value of the derivative is large (x0=100), the steps taken by
the method are small, yielding a relatively slow convergence. On the other hand, when
the derivative is small (x0=0.01), the initial step taken is rather large and therefore yields
such a large Xn that the convergence thereafter is considerably slow, as seen in Fig. (3.2).
Figure 3.2: Effect of initial estimate for f(x) = x10 – 1 = 0 on performance A) x0=100, B) x0=0.01
If the initial estimate (or some part of the iteration process) hits a local extremum, Newton’s method can
diverge badly; however, when it converges, it converges quadratically and therefore becomes the method of
choice for any function whose derivative can be evaluated efficiently and whose derivative is continuous and
nonzero in the neighborhood of a root (Press et al. 2007), as seen in Example 3.1.
3.2. Secant Method
In the previous section, it was mentioned that Newton’s method is the method of choice for “… any function
whose derivative can be evaluated efficiently”. This is not a problem for polynomials and many other
functions, but there are functions whose derivative may be very difficult to find.
In such cases, the derivative can be approximated by a backward finite difference:

f'(x_i) ≈ (f(x_(i−1)) − f(x_i)) / (x_(i−1) − x_i)    (3.2)

Substituting Eq. (3.2) into Eq. (3.1):

x_(i+1) = x_i − f(x_i)·(x_(i−1) − x_i) / (f(x_(i−1)) − f(x_i))    (3.3)
Remembering that the regula falsi method also requires two starting points and draws a line between them,
you might be wondering what the difference between regula falsi and the secant method is. The difference is
that regula falsi always keeps the root bracketed (the two points must have function values of opposite sign),
whereas the secant method always retains the two most recent points, regardless of sign.
Notice in Fig. (3.5) that, for this particular figure, the secant method would diverge from the root as of the
second iteration, whereas regula falsi would always converge, as long as enough iterations are allowed.
3.3. Halley’s Method
It is similar to the Newton-Raphson method, but converges more rapidly in the neighborhood of a root⁷. The
derivation is straightforward and involves a Taylor series expansion in the neighborhood of the root, x:

f(x) ≈ f(x_n) + f'(x_n)·(x − x_n) + (f''(x_n)/2)·(x − x_n)²    (3.4)

We know that when f(x)=0, then x = x_(n+1). Substituting these knowns into Eq. (3.4), using
Eq. (3.1), and after some algebraic manipulation, the following equation is found [a detailed derivation is
presented by Weisstein (2024)⁸]:

x_(i+1) = x_i − f(x_i) / ( f'(x_i)·[1 − f(x_i)·f''(x_i) / (2·f'(x_i)²)] )    (3.5)

It should be noted from Eq. (3.5) that Halley’s method requires not only the first derivative but the
second derivative as well.
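A minimal sketch of Eq. (3.5), here applied to f(x) = x¹⁰ − 1 with its first and second derivatives:

def halley(f, df, ddf, x0, tol=1E-8, maxiter=100):
    x = x0
    for i in range(1, maxiter + 1):
        fx = f(x)
        if abs(fx) < tol:
            return x, i
        d1, d2 = df(x), ddf(x)
        x -= fx / (d1*(1 - fx*d2/(2*d1*d1)))   #Eq. (3.5)
    raise RuntimeError("Max iterations exceeded")

print(halley(lambda x: x**10 - 1,
             lambda x: 10*x**9,
             lambda x: 90*x**8, 1.3))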
It is seen from the outputs that Halley’s method required fewer iterations than both the Newton-Raphson
(7 iterations) and the secant method.
7 https://blogs.sas.com/content/iml/2016/08/24/halleys-method-finding-roots.html
8 Weisstein, Eric W. "Halley's Method." From MathWorld--A Wolfram Web Resource.
https://mathworld.wolfram.com/HalleysMethod.html
3.4. Müller-Traub method
So far, the methods discussed can only be used to find real roots. Müller’s method can find both real
and complex roots. We have already seen that the secant method draws a straight line to obtain the new
point (xnew). Müller’s method takes a similar approach, but draws a parabola through three points (in the
accompanying figure, the blue line shows the parabola drawn through the 3 points on f(x), the black line).
Let’s see how we can find the new point, namely xnew (Chapra and Canale 2013 present a detailed derivation). We
can write the equation of the parabolic function as follows:

f(x) = a·(x − x2)² + b·(x − x2) + c    (3.6)

If one replaces x with x0, x1 and x2, then 3 equations will be obtained. After a straightforward algebraic
manipulation, the following set of equations will be obtained (two unknowns, a and b):

h0 = x1 − x0,  h1 = x2 − x1
δ0 = (f(x1) − f(x0)) / (x1 − x0),  δ1 = (f(x2) − f(x1)) / (x2 − x1)    (3.8)

Therefore, the unknowns of the parabolic equation (a, b and c) in terms of Eqs. (3.8):

a = (δ1 − δ0) / (h1 + h0),  b = a·h1 + δ1,  c = f(x2)    (3.9)

Solving the quadratic equation (Eq. 3.6) using Eq. (3.9) will yield the new point:

xnew = x2 − 2c / (b ± √(b² − 4ac))    (3.10)
(the solution of the quadratic equation uses a modified form of the equation to take into account loss of significance ).
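A minimal plain-Python sketch of Eqs. (3.8)-(3.10); complex arithmetic (cmath) lets the iteration leave the
real axis, and the sign in the denominator is chosen to maximize its magnitude, per the note above:

import cmath

def muller(f, x0, x1, x2, tol=1E-8, maxiter=100):
    for i in range(1, maxiter + 1):
        h0, h1 = x1 - x0, x2 - x1
        d0 = (f(x1) - f(x0))/h0            #Eq. (3.8)
        d1 = (f(x2) - f(x1))/h1
        a = (d1 - d0)/(h1 + h0)            #Eq. (3.9)
        b = a*h1 + d1
        c = f(x2)
        disc = cmath.sqrt(b*b - 4*a*c)
        den = b + disc if abs(b + disc) > abs(b - disc) else b - disc
        xnew = x2 - 2*c/den                #Eq. (3.10)
        if abs(xnew - x2) < tol:
            return xnew, i
        x0, x1, x2 = x1, x2, xnew
    raise RuntimeError("Max iterations exceeded")

print(muller(lambda x: x**2 + 1, 0.5, 1.0, 1.5))   #converges to a complex root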
Muller method
Root=-1j (2 iterations).
It is easily seen that the solution of f(x) = x² + 1 requires complex roots, and therefore none of the
methods except Müller’s could have located the root. Please note the effect of the initial guess, x0, on
the root (x0=−1 yielded 1j whereas for x0=1 the root was −1j).
Muller method
Root=(0.259205+0j) (7 iterations).
It should be noted that, when a function has both real and complex roots, even if the initial estimate of the
root is a real number, it is not guaranteed that Müller’s method will converge to a real root.
3.5. Discussion
Similar to the bracketing methods, the open methods were also tested on 12 different functions. The tabulated
data is presented in the format of number of iterations with the running times in parentheses.
Table 3.1: Number of iterations and running times (ms) required by different open methods
It is seen that in all cases Newton’s method converged to a root, whereas out of 12 functions the secant
method failed to converge 5 times and Müller’s method failed to converge twice (f3 and f11) and did not
converge to a real root 4 times (f4, f5, f6 and f9).
In terms of the overall runtime costs, Newton’s method outperformed Müller’s method in 5 functions (f4,
f5, f6, f8 and f10) whereas Müller’s method was the winner for 4 functions (f1, f2, f7 and f12).
The similarities and differences between the secant and regula falsi methods have already been mentioned. If
one compares both methods (Tables 2.2 & 3.1), it is seen that, in general, when it converged, the secant method
performed faster than regula falsi. For example, for f1, regula falsi required 43 iterations and 0.148 ms
whereas the secant method located the root in 13 iterations and 0.045 ms.
It is seen from Table (3.1) that functions f3 and f11 posed problems to all methods, i.e. Newton-Raphson,
secant and Müller. Plots of f3 and f11 are presented below.
It is seen from Fig. (3.7) that for both f3 and f11 the functions’ values either did not change considerably over
part of the selected interval or changed very quickly. When the function value stagnates, it yields a very
small derivative, which can then cause numerical bounds to be exceeded during the iterations (see
Fig. 3.2).
On the other hand, for a strictly monotone function, i.e. f(x) = x² − 5, not only did all methods converge to
a root, they also required fewer than 10 iterations. Müller’s method, which uses a parabola to estimate the
new root, required only 2 iterations to locate the root of f12, which itself is a parabola.
Table 3.2: Runtime cost per iteration (microseconds / iteration)
It is clearly seen from Table (3.2) that, over all the functions tested, the runtime cost per iteration of Müller’s
method was up to ~3 times higher than that of the Newton-Raphson and secant methods. It has already been
mentioned that Müller’s method can be used for both real and complex numbers. In order to achieve this,
the code that powers Müller’s method uses a double precision complex number data structure⁹ rather than
a plain double precision data type.
Besides, for complex arguments one has to use Python’s cmath rather than the math library, and for the
functions required here math performs better than cmath. For example, if one computes the square root of 25
using the math and cmath libraries (for the purpose of benchmarking, the computation was run one million
times), it is seen that the math library provides results ~1.5 times faster than the cmath library.
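A quick benchmark sketch (exact timings will vary by machine):

import timeit

t_math = timeit.timeit("sqrt(25)", setup="from math import sqrt", number=10**6)
t_cmath = timeit.timeit("sqrt(25)", setup="from cmath import sqrt", number=10**6)
print(f"math: {t_math:.3f} s, cmath: {t_cmath:.3f} s, ratio: {t_cmath/t_math:.2f}")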
All these could also be among the reasons why Müller’s method’s runtime cost per iteration was higher
than the other two methods, namely Newton-Raphson and secant.
9 https://en.cppreference.com/w/cpp/numeric/complex
4. Hybrid Methods
Brent's method is due to Richard Brent and builds on an earlier algorithm by Theodorus Dekker; it is
therefore also known as the Brent–Dekker method (Wikipedia 2024¹⁰). It is sometimes also called the van
Wijngaarden-Dekker-Brent method (Wolfram MathWorld 2024¹¹).
Inverse Quadratic Interpolation: Similar to Müller’s method, when there are 3 points on f(x), it is
possible to define a quadratic function and find its intersection with the x-axis. However, the parabola
y = q(x) might not intersect the x-axis at all, for example y = x² + 1. A sideways parabola, x = q(y), on the
other hand, will always intersect the x-axis, as shown below:
Writing the Lagrange polynomial (in y) through the given 3 points:

x(y) = x_(i−2)·(y − y_(i−1))·(y − y_i) / [(y_(i−2) − y_(i−1))·(y_(i−2) − y_i)]
     + x_(i−1)·(y − y_(i−2))·(y − y_i) / [(y_(i−1) − y_(i−2))·(y_(i−1) − y_i)]
     + x_i·(y − y_(i−2))·(y − y_(i−1)) / [(y_i − y_(i−2))·(y_i − y_(i−1))]    (4.1)

Remembering that the root, x_(i+1), corresponds to y = 0, Eq. (4.1) can be rewritten as:

x_(i+1) = x_(i−2)·y_(i−1)·y_i / [(y_(i−2) − y_(i−1))·(y_(i−2) − y_i)]
        + x_(i−1)·y_(i−2)·y_i / [(y_(i−1) − y_(i−2))·(y_(i−1) − y_i)]
        + x_i·y_(i−2)·y_(i−1) / [(y_i − y_(i−2))·(y_i − y_(i−1))]    (4.2)

From Eq. (4.2), one should note that if y_(i−2), y_(i−1) and y_i are not distinct, then x_(i+1) does not exist.
Table 4.1: Number of iterations and running times (ms) of brute-force, secant and brentq methods

f(x)                                  brute-force   secant        Brent
f1 = x²·(x²/3 + √2·sin(x)) − √3/18    17 (0.039)    13 (0.045)    8 (0.020)
f2 = 11·x¹¹ − 1                       17 (0.015)    NC            11 (0.013)
f3 = 35·x³⁵ − 1                       18 (0.014)    NC            14 (0.013)
f4 = 2·(x·e⁻⁹ − e⁻⁹ˣ) + 1             17 (0.023)    NC            8 (0.012)
f5 = x² − (1 − x)⁹                    18 (0.021)    9 (0.017)     8 (0.011)
f6 = (x − 1)·e⁻⁹ˣ + x⁹                18 (0.024)    19 (0.034)    13 (0.016)
f7 = x² + sin(x/9) − 1/4              18 (0.019)    8 (0.015)     9 (0.012)
f8 = (1/8)·(9 − 1/x)                  17 (0.012)    NC            11 (0.007)
f9 = tan(x) − x − 0.046302            18 (0.013)    22 (0.022)    8 (0.008)
f10 = x² + x·sin(√(75)·x) − 0.2       16 (0.023)    9 (0.023)     8 (0.015)
f11 = x¹⁰ − 1                         17 (0.012)    NC            7 (0.007)
f12 = x² − 5                          19 (0.015)    8 (0.011)     7 (0.007)
NC: Not converged
It is seen that in all cases the brute-force method converged to a root within at most 19 iterations. On the
other hand, out of 12 functions, the secant method did not converge to a root for 5 functions. Since Brent’s
method has the sureness of a bracketing method in finding the root, it converged to a root in all cases,
similar to the brute-force method, but with fewer iterations, ranging from 7 to 14. In terms of number of
iterations, Brent’s method outperformed the secant method for every function except f7. In terms of
overall runtime cost, Brent’s method had the least cost for all the functions tested; for f1, for example, its
overall runtime is about half that of the secant and brute-force methods. Since Brent’s method is a hybrid
method, a question might arise: where does the performance gain, in terms of number of iterations and
overall runtime cost, come from? Let’s run the brentq function with the debug parameter set to True
(feature discontinued):
import math
from scisuit.roots import brentq

#f1 from Table 4.1 (the call itself is not shown in the original; assumed analogous to bisect)
f1 = lambda x: x**2*(x**2/3 + math.sqrt(2)*math.sin(x)) - math.sqrt(3)/18
result = brentq(f=f1, a=0, b=1.2)
print(result)
Table 4.2: Output of debugTbl variable showing the methods applied for finding the root of f1

Iteration #    Method
1              secant attempted, succeeded!
2              inv quad inter attempted, failed! using bisect
3              secant attempted, succeeded!
4              inv quad inter attempted, failed! using bisect
5              secant attempted, succeeded!
6              secant attempted, succeeded!
7              inv quad inter attempted, succeeded!

It is seen that Brent’s method uses a mixture of brute-force, secant and inverse quadratic interpolation
(IQI). Out of the 8 iterations taken to find the root of f1, the secant method was used 4 times, brute force was
used twice (due to failed attempts to use IQI) and IQI twice.
It is informative to compare the values Xn assumed during each iteration of brentq and secant methods.
4.2. ITP Method
Introduced by Oliveira and Takahashi (2020), ITP is short for Interpolate, Truncate and Project; it retains
the optimal worst-case performance of the bisection method, whose worst-case number of iterations can be
computed as follows:

n_(1/2) = ⌈ log2( (b − a) / (2ε) ) ⌉    (4.3)
Similar to the bisection method, the ITP method initially requires the root to be bracketed in the interval [a, b].
Unlike bisection, ITP applies 3 distinct steps: a) Interpolation, b) Truncation, c) Projection:

A) Interpolation: A candidate point, x_f, is computed by regula falsi [Eq. (2.1)] between (a, f(a)) and (b, f(b)):

x_f = (b·f(a) − a·f(b)) / (f(a) − f(b))    (4.4)
B) Truncation: Interpolation alone may place x_f too close to the bounds, which can slow down the
convergence, so the ITP method applies a truncation step:

i) A mid-point is computed:

x_(1/2) = (a + b) / 2    (4.5)

ii) Parameters k1 and k2 are used to control the distance from the mid-point:

x_t = x_f + σ·δ    (4.6)

where δ = k1·(b − a)^k2 and σ is the sign of x_(1/2) − x_f, ensuring that x_t is moved towards the mid-point.
C) Projection: By defining a dynamic range, the projection step ensures the next point is neither too close to
nor too far from the mid-point, therefore keeping the process stable and efficient. Without the projection step,
the interpolation or truncation points could occasionally jump too far from the root, especially if the
function behavior is irregular. By keeping the estimate close to the mid-point, the projection step ensures
that the ITP method retains a balance, avoiding oscillations and stabilizing the convergence. The dynamic
range is defined as:

r = TOL × 2^(n_max − k) − (b − a)/2    (4.7)

where n_max = n0 + n_(1/2) is the estimated maximum number of iterations needed, k is the current iteration
count and b − a is the current interval width. Note that Eq. (4.7) ensures that the range gets smaller with
each iteration, making the steps more conservative as we get closer to the root.
Now we measure the distance of the truncated point (x_t) to the mid-point (x_(1/2)). If x_t lies within the
dynamic range of x_(1/2), then x_t is kept for the next iteration; otherwise, it is projected to be closer to the
mid-point, i.e. x_ITP = x_(1/2) − σ·r.
Note that the projection step acts as a "safety net" to prevent the guesses from overshooting or
undershooting due to aggressive steps.
D) Update: The final step is updating the values of the bounds, namely a and b, so that the initial bracket gets
narrower with each iteration. The update logic is the same as in the bisection method: if f(x_ITP)·f(a) < 0, then
b = x_ITP; otherwise a = x_ITP.
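Putting steps A)-D) together, a minimal sketch might look as follows; the parameter defaults follow the
common choices quoted later in this section, and this is an illustration rather than scisuit's implementation:

import math

def itp_sketch(f, a, b, eps=1E-5, k1=0.1, k2=2.0, n0=1):
    ya, yb = f(a), f(b)
    assert ya*yb < 0, "root must be bracketed"
    if ya > 0:   #ensure f(a) < 0 < f(b)
        a, b, ya, yb = b, a, yb, ya
    n_half = math.ceil(math.log2(abs(b - a)/(2*eps)))   #Eq. (4.3)
    n_max = n0 + n_half
    k = 0
    while abs(b - a) > 2*eps:
        x_half = 0.5*(a + b)                       #Eq. (4.5)
        r = eps*2**(n_max - k) - 0.5*abs(b - a)    #Eq. (4.7)
        x_f = (b*ya - a*yb)/(ya - yb)              #interpolation, Eq. (4.4)
        sigma = math.copysign(1.0, x_half - x_f)
        delta = k1*abs(b - a)**k2
        x_t = x_f + sigma*delta if delta <= abs(x_half - x_f) else x_half   #truncation
        x_itp = x_t if abs(x_t - x_half) <= r else x_half - sigma*r         #projection
        y = f(x_itp)                               #update
        if y > 0:
            b, yb = x_itp, y
        elif y < 0:
            a, ya = x_itp, y
        else:
            return x_itp
        k += 1
    return 0.5*(a + b)

print(itp_sketch(lambda x: x**2 - 5, 0, 4))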
Example 4.1: Find the root of f(x) = x² − 5 = 0.

from scisuit.roots import bisect, brentq, itp

f = lambda x: x**2 - 5

#assumed call (the original listing omits it)
result = itp(f=f, a=0, b=4)
print(result)

ITP method
Root=2.23607, Error=3.43e-07 (7 iterations).
It is seen that, in terms of number of iterations, the hybrid methods brentq and ITP performed significantly
better than the brute-force method. Although employing regula falsi decreased the number of
iterations down to 12 [see Example (2.2)], the hybrid methods considerably outperform both approaches.
Oliveira & Takahashi (2020) performed numerical experiments on 24 functions and found that the ITP
method required the least number of function evaluations, followed by Ridders, Matlab, Illinois, and
Regula Falsi.
The iteration count can be influenced by several factors related to parameter choices and function
behavior. The parameters k1 and k2 control the size of the truncation step, and overly small or ill-fitting values
can lead to slower convergence, since the method takes smaller steps than necessary, which results in more
iterations. A common choice for these parameters is k1=0.1 and k2=2.0 (Oliveira & Takahashi, 2020).
Notice that under the same conditions it now takes 12 iterations to locate the root. In the selection of k1 and
k2, the scisuit package uses an approach similar to R’s ITP package¹².
12 https://github.com/paulnorthrop/itp/blob/b51384f3c4514afa80797bb0b6d040f8ba38584a/src/itp_c.cpp#L61
5. Polynomials
Polynomials frequently arise in many applications in engineering and science. For example, when writing
energy balances involving radiation heat transfer or when calculating the pressure drop through a packed bed
of particles, the roots of polynomials must be found.

f_n(x) = a0 + a1·x + a2·x² + ... + an·xⁿ    (5.1)

where n is the order of the polynomial and the a’s are constant coefficients. For n=1, finding the single
root is straightforward, and for n<5 there are well-defined equations to find all the roots. However, although
there are equations¹³ to solve 4th degree polynomials, they are not as straightforward as the equations for
2nd or 3rd degree polynomials.
When n≥5, there are no general closed-form equations for the roots, and the methodology to find them is as
follows: (1) normalize the polynomial to its monic form, (2) form the companion matrix of the monic
polynomial, and (3) compute the eigenvalues of the companion matrix, which are the roots.

Example 5.1: Find all the roots of f(x) = 2x⁴ + x² − 1 = 0.

Solution:
13 https://en.wikipedia.org/wiki/Quartic_equation
Note that in its current form the polynomial is not monic (an=2); therefore, divide by an:

1. x⁴ + 1/2·x² − 1/2 = 0 (monic polynomial)

2. Form the companion matrix:

    | 0     1     0     0 |
    | 0     0     1     0 |
    | 0     0     0     1 |
    | 1/2   0   −1/2    0 |
3. Find the eigenvalues:

import numpy as np

m = np.array([
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
    [0.5, 0, -0.5, 0]])

#the call below is implied by the text; the eigenvalues of the companion matrix are the roots
print(np.linalg.eigvals(m))   #±0.70711 and ±1j, since x⁴ + x²/2 − 1/2 = (x² + 1)·(x² − 1/2)
Notice that the above output is exactly the same as calling np’s Polynomial.roots function. At this point, it
is instructive to see how the companion matrix is formed:

import numpy as np

#start with the 3x3 identity matrix
m = np.eye(3)

#Insert zero-vector to the identity matrix as first column (we padded the matrix)
zeros = np.zeros(3)
m = np.insert(m, 0, zeros, axis=1)

#append the negated coefficients of the monic polynomial as the last row
#(assumed step; the original listing only shows the insertion)
m = np.vstack([m, [0.5, 0, -0.5, 0]])
print(m)

[[ 0.   1.   0.   0. ]
 [ 0.   0.   1.   0. ]
 [ 0.   0.   0.   1. ]
 [ 0.5  0.  -0.5  0. ]]
6. Set of Equations
f1(x1, x2, ..., xn) = 0
f2(x1, x2, ..., xn) = 0    (6.1)
...
fn(x1, x2, ..., xn) = 0
and we are seeking x1, x2, …, xn satisfying all of the equations given in Eq. 6.1.
For simplicity, let’s focus on functions with two variables (derivation for functions with n variables
follows the same logic). Taylor’s theorem for a function of two variables, namely f(x, y), near (a, b):
f(x, y) ≈ f(a, b) + ∂f/∂x(a, b)·(x − a) + ∂f/∂y(a, b)·(y − b)
        + (1/2)·∂²f/∂x²(a, b)·(x − a)² + (1/2)·∂²f/∂y²(a, b)·(y − b)²
        + ∂²f/∂x∂y(a, b)·(x − a)·(y − b)    (6.2)
If there are two functions, u(x, y)=0 and v(x, y)=0, then applying Taylor’s theorem (to first order):

u_(i+1) = u_i + (x_(i+1) − x_i)·∂u_i/∂x + (y_(i+1) − y_i)·∂u_i/∂y
v_(i+1) = v_i + (x_(i+1) − x_i)·∂v_i/∂x + (y_(i+1) − y_i)·∂v_i/∂y    (6.3)

Setting u_(i+1) = v_(i+1) = 0 (the sought root) yields two equations in two unknowns:
∂u_i/∂x·x_(i+1) + ∂u_i/∂y·y_(i+1) = −u_i + x_i·∂u_i/∂x + y_i·∂u_i/∂y
∂v_i/∂x·x_(i+1) + ∂v_i/∂y·y_(i+1) = −v_i + x_i·∂v_i/∂x + y_i·∂v_i/∂y    (6.4)

where the unknowns are on the left-hand side.
Applying Cramer’s rule:

x_(i+1) = x_i − (u_i·∂v_i/∂y − v_i·∂u_i/∂y) / (∂u_i/∂x·∂v_i/∂y − ∂u_i/∂y·∂v_i/∂x)
y_(i+1) = y_i − (v_i·∂u_i/∂x − u_i·∂v_i/∂x) / (∂u_i/∂x·∂v_i/∂y − ∂u_i/∂y·∂v_i/∂x)    (6.5)
Notice that:
1. The denominator of the equations for x_(i+1) and y_(i+1) is the determinant of the Jacobian matrix.
2. The solution starts with an estimated solution vector, v=(x0, y0), corresponding to the function
vector f=(f1, f2). A worked sketch follows.
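A minimal sketch of Eq. (6.5) for two equations and two unknowns (function and parameter names are
illustrative):

def newton2(u, v, ux, uy, vx, vy, x, y, tol=1E-9, maxiter=50):
    for i in range(1, maxiter + 1):
        J = ux(x, y)*vy(x, y) - uy(x, y)*vx(x, y)   #Jacobian determinant
        dx = (u(x, y)*vy(x, y) - v(x, y)*uy(x, y))/J
        dy = (v(x, y)*ux(x, y) - u(x, y)*vx(x, y))/J
        x, y = x - dx, y - dy                       #Eq. (6.5)
        if abs(dx) < tol and abs(dy) < tol:
            return x, y, i
    raise RuntimeError("Max iterations exceeded")

#u = x² + y² − 5, v = x² − y² − 1, starting from (1, 1)
print(newton2(lambda x, y: x**2 + y**2 - 5, lambda x, y: x**2 - y**2 - 1,
              lambda x, y: 2*x, lambda x, y: 2*y,
              lambda x, y: 2*x, lambda x, y: -2*y, 1.0, 1.0))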
Example 6.1: Solve the following set of equations: x² + y² = 5 and x² − y² = 1.

Solution:
Define the equations as Python functions (instead of the variables x and y, we use a single notation t, where x=t0 and
y=t1, which brings the flexibility to work with n variables easily):
"""
Note that the Python functions are defined
in the form of f(x, y) = 0
"""
def f1(t):
return t[0]**2 + t[1]**2 -5
def f2(t):
return t[0]**2 - t[1]**2 -1
#function vector
f = [f1, f2]
result = fsolve( f, v )
print(result)
#check if the estimated roots satisfy (close enough to 0.0) equations
for func in f:
print(isclose(func(result.roots), 0.0, abs_tol=1E-5), end=", ")
Solving Set of Equations
Converged to roots after 4 iterations.
Root #0=1.7321
Root #1=1.4142
True, True
Finally, note that changing the initial solution vector could change the roots.
#a different initial solution vector (assumed value)
v = [-1, -1]
result = fsolve(f, v)
print(result)
Solving Set of Equations
Converged to roots after 4 iterations.
Root #0=-1.7321
Root #1=-1.4142
Finally, note that in the math.isclose function the abs_tol parameter has been set to 10⁻⁵: its default value is
0.0, and the default rel_tol of 10⁻⁹ is of no use when comparing against 0.0, whereas the default tolerance
level for the fsolve function is 10⁻⁵.
6.2. Set of linear equations
Assume we have the following set of linear equations:

a00·x0 + a01·x1 + ... + a0n·xn = b0
a10·x0 + a11·x1 + ... + a1n·xn = b1
...
an0·x0 + an1·x1 + ... + ann·xn = bn

where the a’s are the coefficients, the x’s are the unknowns and the b’s are constants.
In order to solve the set of linear equations, it is more convenient to work with matrix notation, i.e. A as
the coefficient matrix, x as the unknown or solution vector and b as the constant vector. Therefore, the set
of equations will be compacted into:
Ax=b (6.7)
where
    | a00  a01  ...  a0n |        | b0 |        | x0 |
A = | a10  a11  ...  a1n | ,  b = | b1 | ,  x = | x1 |
    |  ⋮    ⋮         ⋮  |        |  ⋮ |        |  ⋮ |
    | an0  an1  ...  ann |        | bn |        | xn |
There are 3 scenarios that can happen in our search for the solution:
1. No solution
2. Exactly one solution
3. Infinitely many solutions.
Example 6.2: Given the following 3 pairs of points (1,3), (2, 7), (3, 15) find the equation for the trendline
(the “best” line).
Solution:
The equation for a line is y = ax + b, where there are two unknowns, namely a and b. However, 3 points
are given, which yields 3 equations (more equations than unknowns):

3 = a + b
7 = 2a + b
15 = 3a + b

in matrix notation:

| 1  1 |   | a |   |  3 |
| 2  1 | · | b | = |  7 |
| 3  1 |           | 15 |
Let's first visualize the points:

import scisuit.plot as plt   #import assumed, as in the later listing

x = [1, 2, 3]
y = [3, 7, 15]
plt.scatter(x=x, y=y)
plt.show()
We can, however, obtain a solution that represents the best line:

import numpy as np
import scisuit.plot as plt

x = [1, 2, 3]
y = [3, 7, 15]

#least-squares solution of the overdetermined system (assumed approach;
#the original listing does not show how sol was computed)
A = np.array([[1, 1], [2, 1], [3, 1]])
rhs = np.array([3, 7, 15])
sol = np.linalg.lstsq(A, rhs, rcond=None)[0]

#equation of the best line using the coefficients from the solution
f = lambda x: x*sol[0] + sol[1]

plt.scatter(x=x, y=y)

#draw a line (endpoints assumed)
x0 = [1, 3]
y0 = [f(v) for v in x0]
plt.plot(x=x0, y=y0, lw=3, ls="--")
plt.show()
Figure 6.2: Scatter chart with linear trendline (the best line)
Applications
A. Energy Balance
Problem: In a manufacturing process, a spherical piece of metal is subjected to a radiative heater operating
at 850 K and to air flowing at 350 K. The surface emissivity (ε) of the metal is 0.6 and the convective heat
transfer coefficient (h) is 40 W/(m²K). Find the temperature of the metal after a long enough time
(Adapted from Jaluria Y, 2020).
Solution:
We assume that the Biot number (Bi) is smaller than 0.1 and therefore there is no temperature gradient
within the metal. After a long enough time, steady-state conditions will prevail; therefore, the heat losses will
be equal to the heat gains:

ε·σ·(850⁴ − T⁴) = h·(T − 350)

i.e., f(T) = 0.6×5.67×10⁻⁸×(850⁴ − T⁴) − 40×(T − 350) = 0

1) We know that the temperature must be in [350, 850] (already bracketed):
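A sketch of the bisection solution (the original listing is not shown; the call mirrors Script 2.1):

from scisuit.roots import bisect

sigma, emis, h = 5.67E-8, 0.6, 40
f = lambda T: emis*sigma*(850**4 - T**4) - h*(T - 350)

result = bisect(f=f, a=350, b=850)
print(result)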
2) The application of the secant method requires two starting points, x0 and x1. Choose x0=350 and x1=850.
If you were to choose x1 close to the root, such as 650 or 700, then the number of iterations would have been 6
or 7, respectively.
3) Similar to the bisection method, Brent’s method requires that the root is bracketed initially; therefore, we
could also have used Brent’s method:
Notice that in terms of the number of iterations required, Brent’s method outperformed both the bisection and
secant methods.
4) Another advanced method that initially requires the root to be bracketed is the ITP method:
Alternatively, noting that the energy balance is a 4th degree polynomial in T, f(T) = a·T⁴ + b·T³ + c·T² + d·T + e = 0,
where

a = 0.6×5.67×10⁻⁸
b = c = 0
d = 40
e = −40×350 − 0.6×5.67×10⁻⁸×850⁴
import numpy as np

a = 0.6*5.67*1E-8
b = c = 0
d = 40
e = -40*350 - a*850**4

poly = np.polynomial.Polynomial([e, d, c, b, a])
print(poly.roots().tolist())

[(-1244.20+0j), (299.14+1035.43j), (299.14-1035.43j), (645.92+0j)]

Of the four roots, only 645.92 K lies in the physically meaningful interval [350, 850] and is therefore the
sought temperature.
B. Particle Technology
Particulate materials such as powders or bulk solids are widely used in process industries, for example in
the food processing, pharmaceutical, biotechnology, oil, chemical, mineral processing, metallurgical,
detergent, power generation, paint, plastics and cosmetics industries (Rhodes 2008).
1. Fluidization Velocity
Background: Fluidization is a process where a particulate material is converted from a static solid-like
state to a dynamic fluid-like state (Wikipedia 2022 14). Fluidization provides excellent heat and mass
transfer and therefore has several applications.
When a fluid is passed upwards through a bed of particles the pressure loss in the fluid due to frictional
resistance increases with increasing fluid flow. A point is reached when the upward drag force exerted by
the fluid on the particles is equal to the apparent weight of particles in the bed (Rhodes 2008). At this
point the bed of particles assumes the characteristics of a boiling liquid, hence the term fluidization. The
fluid responsible for fluidization may be a gas or a liquid (Shilton & Niranjan, 1993).
At the point of incipient fluidization, the pressure drop across the bed balances the apparent weight of the
particles:

ΔP/H = (1 − ε)·g·(ρp − ρf)

where H is the height of the particles before fluidization, A is the cross-sectional area of the container
(bed), ε is the voidage, and ρp and ρf are the densities of the particle and fluid, respectively.
The pressure drop in the presence of multiple particles can be found using Ergun’s equation:

ΔP/H = [150·(1 − ε)²/ε³]·[μ·Umf/xsv²] + [1.75·ρf·Umf²/xsv]·[(1 − ε)/ε³]
14 https://en.wikipedia.org/wiki/Fluidization
where Umf is the minimum fluidization velocity, xsv is the Sauter-mean diameter and μ is the viscosity.
Equating the last two equations gives the final form of the equation:

(1 − ε)·g·(ρp − ρf) = [150·(1 − ε)²/ε³]·[μ·Umf/xsv²] + [1.75·ρf·Umf²/xsv]·[(1 − ε)/ε³]    (B.1)
Problem: A packed bed of solids of density 1475 kg/m 3 occupies a depth of 0.5 m in a cylindrical vessel
of inside diameter 25 cm. The mass of solids in the bed is 15 kg and the surface-volume mean diameter of
the particles is 3 mm. What is the minimum flow rate of air at 60°C to fluidize the particles? At 60°C:
ρair=1.0585 kg/m3 and μair=1.998·10-5 Pa·s
Solution:

The particle volume is Vp = 15/1475 = 0.01016 m³ and the bed volume is Vbed = (π/4)·(0.25)²·(0.5) = 0.02454 m³;
therefore,

ε = 1 − 0.01016/0.02454 = 0.586

ΔP/H = (1 − 0.586)·9.81 m/s²·(1475 − 1.0585) kg/m³ = 5991.12 Pa/m

Observe that Eq. (B.1) is quadratic with respect to Umf and can be rewritten as follows:

A·Umf² + B·Umf + C = 0

where (reading the coefficients off Eq. B.1)

A = 1.75·ρf·(1 − ε)/(xsv·ε³)
B = 150·(1 − ε)²·μ/(ε³·xsv²)
C = −(1 − ε)·g·(ρp − ρf)
Noting that all the terms in A, B and C are known, the positive root (we are interested in the speed and not the
velocity, since the flow direction is known) gives the minimum fluidization velocity as Umf = 2.06 m/s.
It is known that the temperature of the gas can affect Umf, since increasing Tgas decreases the density and
increases the viscosity. A comparatively short script was written to investigate the effect of temperature on
Umf. For the investigation, the temperatures were arbitrarily chosen as 20, 40, 60 and 80°C.
import math
import numpy as np
import scisuit.plot as plt
from scisuit.eng import Air

#knowns (partly reconstructed from the problem statement; the original listing is incomplete)
Rho_p = 1475   #kg/m3
x_sv = 3E-3    #m, surface-volume (Sauter) mean diameter
g = 9.81       #m/s2
D, Height = 0.25, 0.5   #m, bed diameter and depth

mp = 15 #kg
Vp = mp/Rho_p
Vbed = math.pi/4*D**2*Height
Voidage = 1 - Vp/Vbed

def umf(T):
    air = Air(T=T+273.15)
    Rho_f, Mu_f = air.rho(), air.mu()
    #coefficients of A*Umf**2 + B*Umf + C = 0 from Eq. (B.1)
    A = 1.75*Rho_f*(1-Voidage)/(x_sv*Voidage**3)
    B = 150*(1-Voidage)**2*Mu_f/(Voidage**3*x_sv**2)
    C = -(1-Voidage)*g*(Rho_p - Rho_f)
    poly = np.polynomial.Polynomial([C, B, A])
    return [x for x in poly.roots() if x>0][0]

T = [20, 40, 60, 80]
Umf = [umf(t) for t in T]

plt.scatter(x=T, y=Umf)
plt.xlabel("T(°C)")
plt.ylabel("Umf (m/s)")
plt.show()
It is seen that increasing the gas temperature increased Umf. One can also notice the nearly perfect
relationship between Umf and temperature.
The strong relationship between the air temperature and Umf could be due to the fact that the above-given
script did not take into account the following points:
1. Particle type [Geldart (1973) A, B, C or D]. For example, Botterill et al. (1982)¹⁵ reported the
effect of temperature on Umf for some Group B and D particles. They observed a decrease of Umf
with increasing temperature for Group B particles, whereas for Group D powders an increase in
Umf was observed.
15 Botterill JSM, Teoman Y, Yüregir KR (1982). The effect of operating temperature on the velocity of minimum
fluidization, bed voidage and general behaviour, Powder Technology, 31(1).
2. Terminal Velocity
Background: A particle falling from rest in a fluid will initially experience a high acceleration as the
shear stress drag will be small. However, as the particle accelerates the drag force increases, causing the
acceleration to reduce. Eventually a force balance is achieved when the acceleration is zero and a
maximum or terminal relative velocity is reached by the single particle (Rhodes 2008). Stokes' law is key
to understanding a wide variety of physical processes such as swimming of microorganisms and
sedimentation of tiny particles in air and water (Dey et al. 2019).
Problem: Estimate the terminal velocity of 80-to-100-mesh particles of limestone (ρ = 2800 kg/m³)
falling in water at 30°C.
Dp for 100 mesh = 0.147 mm and for 80 mesh = 0.175 mm; at 30°C: μ = 0.801 cP, ρ = 995.7 kg/m³.
Solution:

The drag coefficient (CD) and the Reynolds number (Re) are related through the Archimedes number (Ar):

CD·Re² = (4/3)·Ar = 153.05

CD = (24/Re)·(1 + 0.15·Re^0.687)

Combining the last two equations:

(24/Re)·(1 + 0.15·Re^0.687)·Re² − 153.05 = 0
Expressing the last equation as Python function:
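#a direct transcription of the equation above (the original listing is not shown)
f = lambda Re: (24/Re)*(1 + 0.15*Re**0.687)*Re**2 - 153.05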
At this stage, we need some intuition to be able to solve the equation with one of the root finding methods.
Recalling that the Reynolds numbers for the different regimes are Stokes (~ < 1), Intermediate (1 < Re < 1000)
and Newton’s (1000 < Re < 2·10⁵) gives the lower and upper bounds of the Reynolds number, and therefore
the bisection method seems a convenient choice.
The brute-force approach of the bisection method yielded the Reynolds number after 27 iterations. Let’s see
if the regula falsi approach would yield faster convergence:
How about modified regula falsi? Can it alleviate the problem of very slow convergence?
Finally, since we have all the insight we need, let’s “cheat” and choose the interval as [1, 10] and see the
results: brute force takes 20 iterations and regula falsi 9 iterations.
This was expected, as in the given interval the function is strictly monotonically increasing and therefore
was a good candidate for the regula falsi method.
C. Thermodynamics
An equation of state (EoS) establishes a relationship between the pressure, temperature, and specific volume (P,
v, T) of a substance (Çengel et al. 2019). EoS are important in the modeling of a wide range of industrial
and natural processes, and it is desired that an EoS be accurate, consistent, easy to compute and robust
(Wilhelmsen et al. 2017¹⁶).
The simplest and best-known EoS for substances in the gas phase is the ideal-gas equation. However, it is
rather simple and its range of applicability is limited in real gases. Therefore, it is desirable to have
equations that can represent the behavior of substances accurately over a larger region with no limitations.
Cubic equations of state offer a compromise between generality and simplicity that is applicable to many
process engineering operations. As a matter of fact, cubic equations are the simplest equations capable of
representing both liquid and vapor behavior (Smith et al. 2017).
In this section, 3 cubic equations of state were chosen to solve the problem proposed below:

Problem:
Given that the vapor pressure of water at 100°C is 101.3 kPa, find the molar volume of the saturated vapor
using the different equations of state (Question adapted from Smith et al. 2017).
16 Wilhelmsen Ø, Aasen A, Skaugen G, Aursand P, Austegard A, Aursand E, Gjennestad MA, Lund H, Linga G,
Hammer M (2017). Thermodynamic Modeling with Equations of State: Present Challenges with Established Methods.
Industrial & Engineering Chemistry Research, 56, 3503−3515.
1. Van der Waals Equation
Background: The equation is based on the work of the 19th-century Dutch physicist Johannes Diderik van der
Waals (Wikipedia 2022¹⁷). It is given by the following equation:

(P + a/v²)·(v − b) = RT

The equation improves the ideal-gas equation by including two effects: (I) the intermolecular attraction
forces (a/v²) and (II) the volume occupied (b) by the molecules themselves (Çengel et al. 2019). Rewriting in
the form f(v) = 0:

f(v) = (P + a/v²)·(v − b) − RT    (C.1)
a, b = 553.6, 0.03049 #kPa·L²/mol² and L/mol (Van der Waals constants for water)
P = 101.3 #kPa
R = 8.314462
T = 100 + 273.15 #K
The bracketing, open and hybrid methods all require at least one starting point. At this point, all we know is
that at standard temperature and pressure 1 mol of an ideal gas occupies 22.4 L. Although water vapor is not
an ideal gas (see Çengel et al. (2019) for exceptions to this claim), 22.4 L seems a reasonable initial guess. At
this point, we can employ Newton’s method to solve Eq. (C.1), whose derivative is:
f'(v) = P − a/v² + 2ab/v³
17 https://en.wikipedia.org/wiki/Van_der_Waals_equation
Now that we have Eq. (C.1), its derivative, and an initial starting point (x0), we can proceed with the solution:
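A minimal sketch, assuming scisuit's newton as in Script 3.2 (the original listing is not shown):

from scisuit.roots import newton

f = lambda v: (P + a/v**2)*(v - b) - R*T
df = lambda v: P - a/v**2 + 2*a*b/v**3

res = newton(f=f, x0=22.4, fprime=df)
print(res)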
In conclusion, we found that the Van der Waals equation overestimated the true value by approximately
1% error.
2. Redlich / Kwong Equation
Background: The Redlich/Kwong equation was formulated by Otto Redlich and Joseph Neng Shun Kwong in
1949. The equation is generally more accurate than the Van der Waals and ideal-gas equations at
temperatures above the critical temperature (Wikipedia 2022¹⁸). The equation is as follows (Smith et al.
2017):

Z = 1 + β − q·β·(Z − β) / (Z·(Z + β))

q = (Ψ/Ω)·Tr^(−3/2),  β = Ω·Pr/Tr

where Ω and Ψ are pure numbers, independent of substance but specific to a particular equation of state.
For the RK equation, Ω=0.08664 and Ψ=0.42748.
Solution:

Using the critical temperature (Tc) and critical pressure (Pc), we first calculate the reduced temperature (Tr)
and reduced pressure (Pr):

Tr = T/Tc = (100 + 273.15)/647.14 = 0.576614 and Pr = P/Pc = 101.3/22120 = 0.004579

q = (Ψ/Ω)·Tr^(−3/2) = (0.42748/0.08664)·(0.576614)^(−3/2) = 11.268589

β = Ω·Pr/Tr = 0.08664·0.004579/0.576614 = 0.00068810952

f(Z) = Z − 1.00068810952 + 0.00775402362·(Z − 0.00068810952) / (Z·(Z + 0.00068810952)) = 0    (C.2)

Solution of Eq. (C.2) will yield the compressibility factor. Translating it to a Python function:
Solution of Eq. (C.2) will yield the compressibility factor. Translating it to a Python function:
beta = 0.00068810952
f = lambda z: (z - 1) - beta + 0.00775402362 * ( (z-beta) / ( z*(z+beta) ) )
Since we are trying to find the compressibility factor, which is roughly in the range (0, 1) for water vapor,
the lower and upper bounds of the solution are pretty much known. However, before delving into choosing
a method, let’s remember that all methods will evaluate the function first at the given initial
points. Looking at Eq. (C.2), it is clear that when Z=0, f(Z)=∞. Therefore, we must avoid a
starting value of Z=0; however, it is possible to choose a value in the neighborhood of 0.0.
Since we know the lower and upper bounds, it is possible to employ Brent’s method; let’s also try the ITP
method since we know the bracket that the root lies in:
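A sketch of both calls (the original listings are not shown; the lower bound is offset slightly from zero as
discussed above):

from scisuit.roots import brentq, itp

res_brent = brentq(f=f, a=1E-6, b=1.0)
res_itp = itp(f=f, a=1E-6, b=1.0)
print(res_brent)
print(res_itp)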
Notice the considerable differences between the hybrid methods (Brent and ITP) and bisection methods!!
Does it make sense that Z=0.9928, which is rather close to 1.0?
According to Çengel et al. (2019), “At very low pressures (PR << 1), gases behave as ideal gases
regardless of temperature”. Since we have already calculated PR as 0.004579, it makes sense that Z is
close to 1.
Let’s calculate the specific volume and compare it with the experimental value of 1.67290 m³/kg:

v = Z·R·T/P = 0.9928 × 8.314462 (Pa·m³)/(mol·K) × 373.15 K / 101325 Pa = 0.030399242 m³/mol = 1.68885 m³/kg
Is it acceptable?
In conclusion, we find that the RK equation overestimated the true value by less than 1% error. Notice that
the RK equation gave a slightly better estimate than the Van der Waals equation.
3. Peng-Robinson Equation
Background: It was developed in 1976 at The University of Alberta by Ding-Yu Peng and Donald
Robinson (Wikipedia 2022¹⁹). It is widely used in commercial process simulators, such as Aspen
Plus™. The equation is:

P = RT/(v − b) − α·a/(v² + 2bv − b²)

a = 0.45723553·R²·Tc²/Pc,  b = 0.07779607·R·Tc/Pc

α = [1 + Κ·(1 − √Tr)]²

Κ = 0.37464 + 1.5422·ω − 0.26993·ω²
The acentric factor (ω) is evaluated at Tr = 0.7; since 0.7 × 647.14 K = 453.0 K = 179.85°C, we read from the
steam tables the saturated vapor pressure Pws(179.85°C) = 998.753 kPa. Now that this pressure and the
critical pressure are both known, the reduced pressure can easily be computed:
19 https://en.wikipedia.org/wiki/Cubic_equations_of_state
Pr = Pws/Pc = 998.75296 kPa / 22120 kPa = 0.045151580

ω = −1.0 − log10(0.045151580) = 0.345327

Knowing the value of the acentric factor allows the computation of Κ = 0.875014. Once Κ and the reduced
temperature (Tr = 0.576614) are known, α can be computed as 1.465482.
At this point, all we need is to express the PR equation in the form f(x)=0. Simply rewriting:

f(v) = P − RT/(v − b) + α·a/(v² + 2bv − b²) = 0    (C.3)
#Givens
T, P = 373.15, 101.3 #K and kPa
#Knowns
Tc, Pc= 647.14, 22120 #K and kPa
R= 8.314462
a = 0.45723553*(R**2*Tc**2) / Pc
b = 0.07779607*(R*Tc) / Pc
alpha = 1.465482
One of the root finding methods needs to be selected to find the root of f(v). The derivative of Eq. (C.3) can
be conveniently taken if we would like to use the Newton-Raphson method; alternatively, with the
“experience” gained from the previous sections (C1 & C2), we could employ a bracketing method.
Here, we approach the problem with the same intuition we used in Section C1: “... at standard temperature and
pressure 1 mol of ideal gas occupies 22.4 L.” This gives us a single starting point, which leaves us with the
choice of an open method that takes only a single parameter, i.e. the Newton-Raphson method, with the
derivative:
f'(v) = RT/(v − b)² − α·a·(2v + 2b)/(v² + 2bv − b²)²
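A minimal sketch, assuming scisuit's newton as before (the original listing is not shown):

from scisuit.roots import newton

f = lambda v: P - R*T/(v - b) + alpha*a/(v**2 + 2*b*v - b**2)
df = lambda v: R*T/(v - b)**2 - alpha*a*(2*v + 2*b)/(v**2 + 2*b*v - b**2)**2

res = newton(f=f, x0=22.4, fprime=df)
print(res)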
The molar volume is 30.36157 L/mol, exactly the same value that we found using the bisection
method; however, with only 5 iterations.
A question might arise: similar to the RK equation, can the PR equation also be expressed in terms of the
compressibility factor, Z? Luckily, the answer is yes! If we define two variables A and B as follows (for
reference, please see the footnote at page 61):

A = α·a·P/(R²·T²),  B = b·P/(R·T)

Z³ − (1 − B)·Z² + (A − 2B − 3B²)·Z − (A·B − B² − B³) = 0
Now all we need to do is to find the roots of a 3rd degree polynomial using a short script:
import numpy as np

A = alpha*a*P / (R**2*T**2)
B = b*P / (R*T)

#coefficients of the cubic in Z, from highest to lowest degree (from the equation above)
coeffs = [1, -(1 - B), A - 2*B - 3*B**2, -(A*B - B**2 - B**3)]

#np's Polynomial expects the coefficients from lowest to highest degree
poly = np.polynomial.Polynomial(coeffs[::-1])
print(poly.roots())

[7.3e-04 7.3e-03 9.9132e-01]
As expected, the 3rd degree polynomial returned 3 roots, 2 of which are close to zero. Remembering that
the roots represent the compressibility factor, those 2 roots do not make physical sense. Therefore, we chose
Z=0.99132.
Z = max(poly.roots())
V = Z*R*T / P
print(V)
30.36157
All our approaches yielded the same value, v=30.36157 L/mol. After a straightforward unit conversion one finds v=1.68675 m³/kg.

Similar to the Van der Waals and Redlich-Kwong equations, we conclude that the Peng-Robinson equation overestimated the true value with less than 1% error.
D. Fluid Dynamics
Background: In 1939, Cyril F. Colebrook (1910–1997) combined the available data (experimental results
from the measurements of the flow rate and the pressure drop) for transition and turbulent flow in smooth
as well as rough pipes into the implicit relation known as the Colebrook equation.
The equation establishes a relationship between the friction factor (f) and the Reynolds number (Re), the pipe roughness (ε), and the inside diameter (D) of the pipe:
$$\frac{1}{\sqrt{f}}=-2.0\log\!\left(\frac{\epsilon/D}{3.7}+\frac{2.51}{Re\,\sqrt{f}}\right)$$
Problem: Water at 16°C (ρ=998.922 kg/m³ and μ=0.0011 Pa·s) is flowing steadily in a 5 cm diameter horizontal pipe made of stainless steel at a rate of 0.006 m³/s. Determine the pressure drop for flow over a 60 m-long section of the pipe (Adapted from Çengel and Cimbala 2006).
Solution:
Given the volumetric flow rate ($\dot{V}$) and the diameter of the pipe (D), it is possible to calculate the average velocity (V) of the fluid:

$$V=\frac{\dot{V}}{A_c}=\frac{0.006\ \mathrm{m^3/s}}{\pi\,(5\cdot10^{-2})^2/4}=0.191\ \mathrm{m/s}$$
Since the density and viscosity of the fluid, the average velocity and the diameter of the pipe are known, the Reynolds number (Re) can be calculated:

$$Re=\frac{\rho V D}{\mu}=\frac{998.2\times0.191\times5\cdot10^{-2}}{0.0011}=8666>4000$$

Since Re > 4000, the flow is turbulent.
$$\frac{\epsilon}{D}=\frac{0.002\cdot10^{-3}\ \mathrm{m}}{5\cdot10^{-2}\ \mathrm{m}}=4\cdot10^{-5}$$
Putting these values into Colebrook's equation and rearranging, we seek the root of:

$$g(f)=\frac{1}{\sqrt{f}}+2.0\log\!\left(1.081\cdot10^{-5}+\frac{2.51}{8666\sqrt{f}}\right)=0 \tag{D.1}$$
In order to use one of the root-finding methods (bracketing, open or hybrid), we need at least one initial guess. We already know that the friction factor must be greater than 0; therefore we might be tempted to choose 0. However, the denominator of Eq. (D.1) tells us that this is not possible, so we could choose a number slightly greater than zero and apply Newton's method.

However, Eq. (D.1) is not very calculus-friendly! Even if we take the derivative, we know that open methods can diverge… Can we instead establish a reliable bracket? The answer is yes! Let's first plot the Moody chart to get some insights:
plt.moody()
plt.show()
If one inspects the Moody chart (Fig. D.1), it can be seen that the friction factor ranges from 0.01 to 0.1. Knowing this range allows us to use bracketing methods and be sure to find a root:
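A sketch, with SciPy's brentq as a stand-in for the bracketing method (g is Eq. (D.1), and the bracket is read off the Moody chart):

import math
from scipy.optimize import brentq

g = lambda f: 1/math.sqrt(f) + 2.0*math.log10(1.081E-5 + 2.51/(8666*math.sqrt(f)))

root = brentq(g, a=0.01, b=0.1)
print(root)   # ~0.03214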
We find f=0.03214. The knowledge gained from the Moody chart proved to be very useful; however, the bracket it suggests is wider than needed.
Figure D.1: Moody chart (generated by scisuit Python package).
Can we narrow down the initial estimate and therefore obtain higher performance in finding the root? Luckily, the answer is still yes! The paper published by Chen (1984) gives a simple explicit equation for estimating the friction factor. Chen's equation is as follows:
$$f=c\left[\frac{1}{Re^{a}}+K\left(\frac{\epsilon}{D}\right)^{b}\right]^{0.3}$$
For the constants a, b and K, Chen (1984) states that for the turbulent region c=0.3164, a=0.83, and when b=1.0 then K=0.11. Writing Chen's equation as a Python function:
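A sketch of Chen's estimate with the constants quoted above (the function and argument names are illustrative):

def chen(Re, rel_roughness, c=0.3164, a=0.83, K=0.11, b=1.0):
    return c*(1/Re**a + K*rel_roughness**b)**0.3

f_est = chen(Re=8666, rel_roughness=4E-5)
print(f_est)   # ~0.033, indeed close to the root 0.03214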
The estimate of the friction factor with Chen's equation is indeed very close to the actual root, which we calculated as 0.03214. However, since Chen's equation yields only a single starting point, our choice would again be limited to Newton's method.
For Reynolds numbers of 4000 and 10000, Chen (1984) gives the estimated percentage deviations from the Colebrook equation for different relative roughnesses. Let's choose the maximum percentage deviations for over- and under-estimation, [-13.2%, +4.5%], and expand the estimated value by these deviations (we expect the true root to lie in between):
a = f_est*(1 - 0.132)   # lower bound: 13.2% below Chen's estimate
b = f_est*(1 + 0.045)   # upper bound: 4.5% above Chen's estimate
Now that we have two starting points, a and b, we can use Brent's or the ITP method.
result_bq = brentq(g, a = a, b = b)
print(result_bq)
result_itp = itp(g, a = a, b = b)
print(result_itp)
Brent's method (inverse quadratic interpolation)
Root=0.03214, (4 iterations).
ITP method
Root=0.03214, Error=5.16e-05 (4 iterations).
This equals the result of our first attempt, but with roughly 43% fewer iterations, since our estimate from Chen's equation was very close to the root. Finally, the pressure drop over the pipe section can be computed:
$$\Delta P=f\,\frac{L}{D}\,\frac{\rho V^2}{2}=0.03214\times\frac{60\ \mathrm{m}}{5\cdot10^{-2}\ \mathrm{m}}\times\frac{998.2\times0.191^2}{2}=702.23\ \mathrm{Pa}$$
E. Heat Transfer
Background: Consider the following temperature profile in a counter-flow heat exchanger, where the hot fluid enters from one end and the cold fluid enters from the opposite end.

In order to quantify the heat transfer, a suitable mean temperature difference (MTD) across the heat exchanger needs to be known. The suitable MTD in this case is called the logarithmic mean temperature difference (LMTD) and is mathematically expressed as follows:

$$\Delta T_m=\frac{\Delta T_1-\Delta T_2}{\ln\left(\Delta T_1/\Delta T_2\right)}$$

where ΔT1 and ΔT2 are the temperature differences between the two streams at the two ends of the exchanger.
Problem: A hot process stream of heat capacity flow rate 100 kW/°C has inlet temperature T1 = 125°C and outlet temperature T2 = 50°C. It is known that the overall heat transfer coefficient is virtually independent of the cooling water flow rate, and at the relevant process stream flow rate it has a value corresponding to UA = 175 kW/°C. If cooling water enters at 30°C, what is the exit temperature of the cooling water stream and what is its flow rate? (Adapted from Paterson 1984)
Solution:
Had we known the flow rate of the water, the problem would have been a trivial one. However, note that
both the flow rate of the water and its exit temperature are unknown.
Since we know the heat capacity flow rate of the hot stream and its inlet and exit temperatures, the heat
transfer rate can be calculated:
$$Q=\dot{m}\,c_p\,\Delta T=100\,\frac{\mathrm{kW}}{\mathrm{°C}}\times(125-50)\,\mathrm{°C}=7500\ \mathrm{kW}$$

$$Q=UA\,\Delta T_m\ \rightarrow\ 7500\ \mathrm{kW}=175\,\frac{\mathrm{kW}}{\mathrm{°C}}\times\Delta T_m\ \rightarrow\ \Delta T_m=42.857\ \mathrm{°C}$$
In the LMTD equation, ΔT2 = 50−30 = 20°C and ΔTm = 42.857°C; therefore, to find the outlet temperature of the water, ΔT1 should be computed. Denoting ΔT1 by X and placing the knowns (ΔT2 and ΔTm), the LMTD equation can be rewritten as follows:

$$f(X)=42.857-\frac{20-X}{\ln\!\left(\frac{20}{X}\right)}=0$$
The root of the above equation yields ΔT1 and hence the exit temperature of the water; therefore the equation needs to be written as a Python function:
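A minimal sketch of such a function:

import math

def f(X):   # X stands for ΔT1
    return 42.857 - (20 - X)/math.log(20/X)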
Since we will be using a bracketing method, two estimates of the temperature are needed. By looking at Fig. (E.1), it can be seen that Tc1 cannot be greater than Th1 and cannot be smaller than or equal to Tc2; therefore Tc2 < Tc1 < Th1. Thus the interval is [Tc2 + δ, Th1 − δ], where δ has been arbitrarily chosen as 1.0. Now that a range has been established, bracketing methods can be employed:
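A sketch using SciPy's brentq as the bracketing method, with the interval from the text:

from scipy.optimize import brentq

delta = 1.0
X = brentq(f, a=30 + delta, b=125 - delta)
print(X)   # ΔT1 ≈ 78.7°C, so the water exits at about 125 − 78.7 ≈ 46.3°C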
Paterson (1984) also proposed a non-iterative approximation to the LMTD which can be solved directly for ΔT1:

$$\Delta T_1=\left(7\,\Delta T_2+6\,\frac{Q}{UA}\right)-4\sqrt{3\,\Delta T_2\left(\Delta T_2+2\,\frac{Q}{UA}\right)}$$

Note that in Paterson's equation Q/UA is equal to the LMTD, ΔTm=42.86°C. Placing the knowns (ΔT2 and ΔTm) gives ΔT1=78.57°C. A simpler approach to the non-iterative solution has been proposed by Chen (1987). Chen's equation is as follows:
$$\Delta T_m=\left(\frac{\Delta T_1^2\,\Delta T_2+\Delta T_1\,\Delta T_2^2}{2}\right)^{1/3}$$
Note that Chen's equation can also be expressed as a polynomial AX² + BX + C = 0 with X = ΔT1, where A = ΔT2, B = ΔT2², and C = −2ΔTm³. Therefore the positive root of the polynomial yields ΔT1.
import numpy as np

dT2, dTm = 20, 42.857
# coefficients of ΔT2·X² + ΔT2²·X − 2ΔTm³, in increasing order of power
poly = np.polynomial.Polynomial([-2*dTm**3, dT2**2, dT2])
print(poly.roots())
[-99.2844 79.284]
Chen (1987) also proposed the following direct solution, with which finding the roots of the polynomial can be avoided:

$$\Delta T_1=-\frac{\Delta T_2}{2}+\sqrt{\frac{\Delta T_2^2}{4}+\frac{2}{\Delta T_2}\left(\frac{Q}{UA}\right)^3}$$
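A quick check of the direct solution with ΔT2 = 20°C and Q/UA = 42.857°C:

import math

dT2, dTm = 20, 42.857
dT1 = -dT2/2 + math.sqrt(dT2**2/4 + (2/dT2)*dTm**3)
print(dT1)   # ~79.28°C, matching the positive root of the polynomial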
2. Transient Heat Conduction - A pathological case
Background: Many heat transfer problems are time dependent, and such transient problems arise when there is a change in the boundary conditions. When such conditions occur, in order to obtain the exact spatial solution, the following equation also needs to be solved for different Biot (Bi) numbers and its positive roots obtained:

$$f(x)=x\tan(x)-Bi \tag{E.1}$$
Problem:
Solution:
To narrow down the solution, let's arbitrarily choose a Bi number and focus on it, assuming different Bi numbers will follow a similar rationale. Let's choose Bi as 100.

Bracketing the root here is not an easy task (see Fig. E.2). Taking the derivative of the function is straightforward, so our intuition tells us to use Newton's method, as it requires only a single initial estimate:
import math

Bi = 100
f = lambda x: x*math.tan(x) - Bi
df = lambda x: math.tan(x) + x/math.cos(x)**2   # derivative of x·tan(x) − Bi

x0 = [0.1, 1, 2]
roots = [newton(f=f, x0=v, fprime=df) for v in x0]   # newton: the routine used earlier
print(*roots, sep="\n")
f_v = [f(r) for r in roots]   # residuals at the returned roots
print(*f_v, sep="\n")
Newton method (Newton-Raphson)
Root=496.57036, Error=9.16e-06 (4 iterations).
8.829331932247442e-09
5.002220859751105e-12
5.070631914350088e-09
It is seen that although the starting points were fairly close to each other, they yielded completely different roots. Furthermore, it can be seen from Fig. (E.2) that there is actually a root between 1 and 2, yet none of the starting points yielded it.
Since we “know” the lower and upper bounds, let’s try Brent’s method:
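A sketch with SciPy's brentq as a stand-in: on [1, 2], where Fig. (E.2) suggests a root, f(1) and f(2) are in fact both negative, so a bracketing method cannot even start:

from scipy.optimize import brentq

try:
    brentq(f, a=1, b=2)
except ValueError as err:
    print(err)   # no sign change over the bracket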
As already mentioned, bracketing the root is a challenge for this problem. Let's arbitrarily choose a=1.57 and b=1.575; then we have negative and positive values of the function. Let's use these powerful and reliable methods to search for the root:
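A sketch, using SciPy's ridder for Ridders' method together with the itp root finder used earlier (its signature assumed as before):

from scipy.optimize import ridder

a, b = 1.57, 1.575
print(ridder(f, a=a, b=b))
print(itp(f, a=a, b=b))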
Ridder's method
Root=1.57079, (6 iterations).
ITP method
Root=1.57080, Error=4.28e+05 (10 iterations).
All of the chosen methods returned the root as ~1.5708, which is completely incorrect! [f(x) = -427737.433]. Before explaining why this happened, let's zoom in on the function in the chosen interval:
Although this was an extreme case, it should be noted that if there are suspected discontinuities in the chosen interval, it is always best to check the end result before accepting it as a root.
Let’s have another attempt and choose a=1 and b=1.57.
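A sketch of this attempt (SciPy's brentq as a stand-in; f(1) < 0 and f(1.57) > 0, so the bracket is valid):

from scipy.optimize import brentq

root = brentq(f, a=1, b=1.57)
print(root)   # ~1.555246953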
Verifying the result, f(1.555246953) = 0.0118. Although we have found the correct root, we were "lucky" with our choice of the interval. Therefore, for such pathological cases, it is a good idea to create look-up tables for future use.
F. Evaporation (Unit Operations)
Background: Evaporation is employed to remove water from dilute liquid foods to obtain a concentrated
liquid product. Removal of water from foods provides microbiological and chemical stability. In order to
remove the water, heat is supplied via steam.
Problem: We are required to find the steam requirements of a double-effect forward-feed evaporator to concentrate a liquid food material from 11% to 50% total solids. The feed rate is 10000 kg/h at 20°C. Inside the second effect the boiling of the liquid takes place under vacuum at 70°C. The saturated steam is supplied to the first effect at 198.5 kPa. The condensate from the first effect exits at 120°C and from the second effect at 95°C.
The overall heat-transfer coefficients in the first and second effects are 1000 and 800 W/(m² °C), respectively. The specific heats of the liquid food are CpF=3.8, CpI=3.0, and CpP=2.5 kJ/(kg °C) at the initial, intermediate, and final concentrations. Assume the heat-exchanger areas and temperature gradients are equal in each effect. (Adapted from Singh and Heldman 2008).
Solution:
Mass balance on solids (with the feed rate of 10000 kg/h = 2.78 kg/s):

$$m_F\,x_F=m_P\,x_P\ \rightarrow\ 2.78\times0.11=m_P\times0.5\ \rightarrow\ m_P=0.61\ \mathrm{kg/s}$$

Overall mass balance:

$$m_F=m_{v1}+m_{v2}+m_P\ \rightarrow\ 2.78=m_{v1}+m_{v2}+0.61\ \rightarrow\ m_{v1}+m_{v2}=2.168\ \mathrm{kg/s}$$

Enthalpy balance around the second effect:

$$m_I\,c_{pI}\,T_I+m_{v1}\,H_{g@95°C}=m_{v1}\,H_{f@95°C}+m_{v2}\,H_{g@70°C}+m_P\,c_{pP}\,T_P$$

Heat transfer in each effect, $Q=UA\,\Delta T$, which with the equal-area, equal-gradient assumption gives:

$$\frac{m_S}{m_{v1}}=1.288$$
Putting All Equations Together
$$m_{v1}+m_{v2}=2.168$$

$$2668.1\,m_{v1}-2202.59\,m_S+285\,m_I=211.28$$

$$m_S-1.288\,m_{v1}=0$$

In matrix form, $Ax=b$:

$$A=\begin{pmatrix}1&1&0&0\\2668.1&0&-2202.59&285\\2270.5&-2626.8&0&285\\-1.28&0&1&0\end{pmatrix},\quad x=\begin{pmatrix}m_{v1}\\m_{v2}\\m_S\\m_I\end{pmatrix},\quad b=\begin{pmatrix}2.168\\211.28\\106.75\\0\end{pmatrix}$$
import numpy as np

A = np.array([
    [1, 1, 0, 0],
    [2668.1, 0, -2202.59, 285],
    [2270.5, -2626.8, 0, 285],
    [-1.28, 0, 1, 0]])

b = np.array([2.168, 211.28, 106.75, 0])   # right-hand side vector

x = np.linalg.solve(A, b)
print(x)
[1.10733 1.06067 1.41738 1.32886]
Therefore mv1=1.10733 kg/s, mv2=1.06067 kg/s, mS=1.41738 kg/s and mI=1.32886 kg/s.
References
Chapra SC, Canale RP (2013). Numerical methods for engineers, seventh edition. McGraw Hill
Education.
Chen JJJ (1984). A simple explicit formula for the estimation of pipe friction factor. Proceedings of the Institution of Civil Engineers, 77, 49-55.
Chen JJJ (1987). Comments on improvements on a replacement for the logarithmic mean. Chemical
Engineering Science, 42(10), 2488-2489.
Çengel YA, Boles MA, Kanoglu M (2019). Thermodynamics: an engineering approach, 9th edition. McGraw-Hill Education.
Çengel YA, Cimbala JM (2006). Fluid mechanics: fundamentals and applications. McGraw-Hill
Education.
Dey S, Zeeshan Ali SK, Padhi E (2019). Terminal fall velocity: the legacy of Stokes from the perspective
of fluvial hydraulics. Available at: https://royalsocietypublishing.org/doi/10.1098/rspa.2019.0277
Gupta RK (2019). Numerical Methods Fundamentals and Applications. Cambridge University Press.
Holman JP (2008). Heat Transfer 10th Edition, McGraw-Hill series in mechanical engineering.
Incropera FP, DeWitt DP (1996). Fundamentals of Heat and Mass Transfer, Fourth Edition. John Wiley
& Sons Ltd.
Jaluria Y (2020). Design and optimization of thermal systems, 3rd Edition, CRC Press/Taylor & Francis
Group.
Oliveira IFD, Takahashi RHC (2020). An Enhancement of the Bisection Method Average Performance Preserving Minmax Optimality. ACM Transactions on Mathematical Software, 47(1), Article 5.
Paterson WR (1984). A replacement for the logarithmic mean. Chemical Engineering Science, 39(11), 1635-1636.
Press WH, Teukolsky SA, Vetterling WT, Flannery BP (2007). Numerical Recipes: The Art of Scientific Computing. Cambridge University Press.
Rhodes M. (2008). Introduction to Particle Technology, 2nd edition. John Wiley & Sons Ltd.
Ridders CJF (1979). A New Algorithm for Computing a Single Root of a Real Continuous Function. IEEE Transactions on Circuits and Systems, 26(11).
Shilton NC, Niranjan K (1993). Fluidization and Its Applications to Food Processing. Food Structure,
12, 199-215.
Singh RP, Heldman D. (2008). Introduction to Food Engineering 4th Edition. Academic Press.
Smith JM, Van Ness HC, Abbott MM, Swihart MT (2017). Introduction to chemical engineering
thermodynamics, 8th edition. McGraw-Hill Education.
Appendix
Test Functions
To test and compare the above-mentioned methods, the following functions were used. The derivative of each function is presented in the second column.
Function                              Derivative                            Root      Reference
f1 = x^2·(x^2/3 + √2·sin x) − √3/18   4x^3/3 + √2·x^2·cos x + 2√2·x·sin x   0.39942   Anon (2021a)
f2 = 11·x^11 − 1                      121·x^10                              0.80413   Anon (2021a)
f3 = 35·x^35 − 1                      1225·x^34                             0.90341   Anon (2021a)
f4 = 2·(x·e^(−9) − e^(−9x)) + 1       2·e^(−9) + 18·e^(−9x)                 0.07701   Anon (2021a)
f5 = x^2 − (1 − x)^9                  2x + 9·(1 − x)^8                      0.25921   Anon (2021a)
f6 = (x − 1)·e^(−9x) + x^9            e^(−9x)·(10 − 9x) + 9·x^8             0.53674   Anon (2021a)
f7 = x^2 + sin(x/9) − 1/4             2x + (1/9)·cos(x/9)                   0.44754   Anon (2021a)
f8 = (1/8)·(9 − 1/x)                  1/(8·x^2)                             0.11111   Anon (2021a)
It is insightful to plot some of the functions in order to better understand the presented methods; therefore 4 functions were selected for this purpose. All plots were generated using Wolfram Alpha.
[Plots of f = 11·x^11 − 1, f = (x − 1)·e^(−9x) + x^9, f = x^10 − 1, and f = x^2 − 5]