CHAPTER 2 Solution of Nonlinear Equations
Introduction
Numerical methods are scientific in the sense that they represent systematic techniques for
solving mathematical problems. However, there is a certain degree of art, subjective
judgment, and compromise associated with their effective use in engineering practice. For
each problem, you may be confronted with several alternative numerical methods and many
different types of computers. Thus, the elegance and efficiency of different approaches to
problems is highly individualistic and correlated with your ability to choose wisely among
options. Unfortunately, as with any intuitive process, the factors influencing this choice are
difficult to communicate. Only by experience can these skills be fully comprehended and
honed. However, because these skills play such a prominent role in the effective
implementation of the methods, this section is included as an introduction to some of the
trade-offs that you must consider when selecting a numerical method and the tools for
implementing the method. It is hoped that the discussion that follows will influence your
orientation when approaching subsequent material. Also, it is hoped that you will refer back
to this material when you are confronted with choices and trade-offs in the remainder of the
course.
You will probably be introduced to the applied aspects of numerical methods by confronting
a problem in one of the above areas. Numerical methods will be required because the
problem cannot be solved efficiently using analytical techniques. You should be cognizant of
the fact that your professional activities will eventually involve problems in all the above
areas. Thus, the study of numerical methods and the selection of automatic computation
equipment should, at the minimum, consider these basic types of problems. More advanced
problems may require capabilities of handling areas such as functional approximation,
integral equations, etc.
Program Development Cost versus Software Cost versus Run-Time Cost.
Once the types of mathematical problems to be solved have been identified and the computer
system has been selected, it is appropriate to consider software and run-time costs. Software
development may represent a substantial effort in many engineering projects and may
therefore be a significant cost. In this regard, it is particularly important that you be very well
acquainted with the theoretical and practical aspects of the relevant numerical methods. In
addition, you should be familiar with professionally developed software. Low-cost software
is widely available to implement numerical methods that may be readily adapted to a broad
variety of problems.
o Breadth of Application.
Some numerical methods can be applied to only a limited class of problems or to problems
that satisfy certain mathematical restrictions. Other methods are not affected by such
limitations. You must evaluate whether it is worth your effort to develop programs that
employ techniques that are appropriate for only a limited number of problems. The fact that
such techniques may be widely used suggests that they have advantages that will often
outweigh their disadvantages. Obviously, trade-offs are occurring.
o Special Requirements.
Some numerical techniques attempt to increase accuracy and rate of convergence using
additional or special information. An example would be to use estimated or theoretical values
of errors to improve accuracy. However, these improvements are generally not achieved
without some inconvenience in terms of added computing costs or increased program
complexity.
o Programming Effort Required.
Efforts to improve rates of convergence, stability, and accuracy can be creative and
ingenious. When improvements can be made without increasing the programming
complexity, they may be considered elegant and will probably find immediate use in the
engineering profession. However, if they require more complicated programs, you are again
faced with a trade-off situation that may or may not favor the new method. It is clear that the
above discussion concerning a choice of numerical methods reduces to one of cost and
accuracy. The costs are those involved with computer time and program development.
Appropriate accuracy is a question of professional judgment and ethics.
Mathematical Behavior of the Function, Equation, or Data.
In selecting a particular
numerical method, type of computer, and type of software, you must consider the complexity
of your functions, equations, or data. Simple equations and smooth data may be appropriately
handled by simple numerical algorithms and inexpensive computers. The opposite is true for
complicated equations and data exhibiting discontinuities.
Ease of Application (User-Friendly?).
Some numerical methods are easy to apply; others are difficult. This may be a consideration
when choosing one method over another. This same idea applies to decisions regarding
program development costs versus professionally developed software. It may take
considerable effort to convert a difficult program to one that is user-friendly.
Maintenance.
Programs for solving engineering problems require maintenance because during application,
difficulties invariably occur. Maintenance may require changing the program code or
expanding the documentation. Simple programs and numerical algorithms are easier to
maintain.
The chapters that follow involve the development of various types of numerical methods for
various types of mathematical problems. Several alternative methods will be given in each
chapter. These various methods are presented because there is no single “best” method.
There is no best method because there are many trade-offs that must be considered when
applying the methods to practical problems.
Methods used in Root Finding:
1) Graphical method
A simple method for obtaining an estimate of the root of the equation f(x) = 0 is to make a plot of
the function and observe where it crosses the x axis. This point, which represents the x value for
which f(x) = 0, provides a rough approximation of the root.
Graphical techniques are of limited practical value because they are not precise. However, graphical
methods can be utilized to obtain rough estimates of roots. These estimates can be employed as
starting guesses for numerical methods discussed in this and the next chapter. Aside from providing
rough estimates of the root, graphical interpretations are important tools for understanding the
properties of the functions and anticipating the pitfalls of the numerical methods.
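As an illustrative sketch (the sample function, the interval [0, 1], and the plotting details are arbitrary choices for demonstration), the following Python code plots a function so that the crossing of the x axis can be read off the graph:

import numpy as np
import matplotlib.pyplot as plt

def f(x):
    return np.exp(-x) - x      # sample function; the root is where the curve crosses y = 0

x = np.linspace(0.0, 1.0, 200) # interval chosen so the sign change is visible
plt.plot(x, f(x))
plt.axhline(0.0, color="gray") # the x axis
plt.xlabel("x")
plt.ylabel("f(x)")
plt.title("Rough root location by inspection")
plt.show()

The crossing point read off such a plot serves as the starting guess for the numerical methods that follow.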
In general, if f(x) is real and continuous in the interval from xl to xu, and f(xl) and f(xu) have opposite
signs, that is, f(xl) f(xu) < 0, then there is at least one real root between xl and xu.
2) Bisection method
The bisection method, which is alternatively called binary chopping, interval halving, or Bolzano's
method, is one type of incremental search method in which the interval is always divided in half. If a
function changes sign over an interval, the function value at the midpoint is evaluated. The location of
the root is then determined as lying at the midpoint of the subinterval within which the sign change
occurs. The process is repeated to obtain refined estimates.
The first step in bisection is to guess two values of the unknown (in the present problem, c) that give
values for f(c) with different signs. From the graphical method above, we can see that the function
changes sign between values of 12 and 16. Therefore, the initial estimate of the root xr lies at the
midpoint of the interval:

xr = (xl + xu) / 2 = (12 + 16) / 2 = 14
Thus, after six iterations εa finally falls below εs = 0.5%, and the computation can be terminated.
Although it is always dangerous to draw general conclusions from a single example, it can be
demonstrated that εa will always be greater than εt for the bisection method.
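The loop just described translates directly into a short program. The following Python sketch is illustrative (the test function and the εs = 0.5% tolerance are borrowed from the examples in this chapter); it assumes a bracketing pair with f(xl) f(xu) < 0:

import math

def bisect(f, xl, xu, es=0.5, max_it=50):
    # Bisection: repeatedly halve [xl, xu], keeping the half with the sign change.
    # es is the stopping criterion in percent (approximate relative error, ea).
    if f(xl) * f(xu) > 0:
        raise ValueError("root is not bracketed: f(xl), f(xu) have the same sign")
    xr = xl
    for _ in range(max_it):
        xr_old = xr
        xr = (xl + xu) / 2                       # midpoint estimate of the root
        if xr != 0 and abs((xr - xr_old) / xr) * 100 < es:
            break                                # ea has fallen below es
        test = f(xl) * f(xr)
        if test < 0:
            xu = xr                              # sign change in lower subinterval
        elif test > 0:
            xl = xr                              # sign change in upper subinterval
        else:
            break                                # f(xr) == 0: exact root found
    return xr

print(bisect(lambda x: math.exp(-x) - x, 0.0, 1.0))   # approx 0.5671

Here the test function e^(−x) − x is the one used in the fixed-point example later in this chapter; any continuous function with a sign change on [xl, xu] works.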
3) False-Position method
A shortcoming of the bisection method is that, in dividing the interval from xl to xu into equal halves,
no account is taken of the magnitudes of f(xl) and f(xu). For example, if f(xl) is much closer to zero
than f(xu), it is likely that the root is closer to xl than to xu. An alternative method that exploits this
graphical insight is to join f(xl) and f(xu) by a straight line. The intersection of this line with the x axis
represents an improved estimate of the root. The fact that the replacement of the curve by a straight
line gives a “false position” of the root is the origin of the name, method of false position, or in Latin,
regula falsi. It is also called the linear interpolation method.
Using similar triangles (Figure below), the intersection of the straight line with the x axis can be
estimated as:

xr = xu − f(xu)(xl − xu) / [f(xl) − f(xu)]

Derivation: by similar triangles, f(xl) / (xr − xl) = f(xu) / (xr − xu); cross-multiplying gives
xr [f(xl) − f(xu)] = xu f(xl) − xl f(xu), which rearranges to the expression above.
Example:
Use the false-position method to determine the root of the same equation
investigated in graphical methods with initial guesses of xl = 12 and xu =
16.
Although the false-position method would seem to always be the bracketing method of preference,
there are cases where it performs poorly. In fact, as in the following example, there are certain cases
where bisection yields superior results.
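A Python sketch of the false-position loop (illustrative; it assumes a bracketing pair as in bisection and reuses the same percent-error stopping test):

def false_position(f, xl, xu, es=0.5, max_it=50):
    # Regula falsi: replace the curve on [xl, xu] by its chord and take the
    # chord's x-intercept as the improved root estimate.
    if f(xl) * f(xu) > 0:
        raise ValueError("root is not bracketed")
    xr = xl
    for _ in range(max_it):
        xr_old = xr
        xr = xu - f(xu) * (xl - xu) / (f(xl) - f(xu))   # chord x-intercept
        if xr != 0 and abs((xr - xr_old) / xr) * 100 < es:
            break
        test = f(xl) * f(xr)
        if test < 0:
            xu = xr
        elif test > 0:
            xl = xr
        else:
            break
    return xr

Note that, unlike bisection, one end of the interval can stay fixed for many iterations; this is exactly the behavior discussed in the secant-method comparison later in this chapter.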
4) Simple fixed point iteration
Open methods employ a formula to predict the root. Such a formula can be developed for simple
fixed-point iteration (or, as it is also called, one-point iteration or successive substitution) by
rearranging the function f(x) = 0 so that x is on the left-hand side of the equation:
x = g(x)
This transformation can be accomplished either by algebraic manipulation or by simply adding x to
both sides of the original equation. For example, x^2 − 2x + 3 = 0 can be manipulated to yield
x = (x^2 + 3)/2, whereas sin x = 0 could be put into this form by adding x to both sides to yield
x = sin x + x.
The utility of x = g(x) is that it provides a formula to predict a new value of x as a function of an old
value of x. Thus, given an initial guess at the root xi, x = g(x) can be used to compute a new estimate
xi+1 as expressed by the iterative formula:

xi+1 = g(xi)
Example:
Use simple fixed-point iteration to locate the root of f(x) = e^(−x) − x.
Solution. The function can be separated directly and expressed in the form:

xi+1 = e^(−xi)
Notice that the true percent relative error for each iteration of the example is roughly proportional
(by a factor of about 0.5 to 0.6) to the error from the previous iteration. This property, called
linear convergence, is characteristic of fixed-point iteration.
Consequently, if |g’(x)|< 1, the errors decrease with each iteration.
For |g’(x)|> 1, the errors grow.
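A Python sketch of simple fixed-point iteration (illustrative; the rearrangement g(x) = e^(−x) is the one from the example above):

import math

def fixed_point(g, x0, es=0.5, max_it=50):
    # Iterate x_{i+1} = g(x_i); converges linearly when |g'(x)| < 1 near the root.
    x = x0
    for _ in range(max_it):
        x_new = g(x)
        if x_new != 0 and abs((x_new - x) / x_new) * 100 < es:
            return x_new
        x = x_new
    return x

print(fixed_point(lambda x: math.exp(-x), 1.0))   # approx 0.567

For this g, |g'(x)| = e^(−x) is about 0.57 near the root, which is less than 1 and is why the iteration converges.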
5) Newton-Raphson method
Perhaps the most widely used of all root-locating formulas is the Newton-Raphson equation. If the
initial guess at the root is xi, a tangent can be extended from the point [xi, f(xi)]. The point where
this tangent crosses the x axis usually represents an improved estimate of the root.
The Newton-Raphson method can be derived on the basis of this (following figure) geometrical
interpretation (an alternative method based on the Taylor series is also described). As in the figure,
the first derivative at xi is equivalent to the slope:

f'(xi) = [f(xi) − 0] / (xi − xi+1)

which can be rearranged to yield the Newton-Raphson formula:

xi+1 = xi − f(xi) / f'(xi)

Taylor's Theorem: If the function f and its first n + 1 derivatives are continuous on an interval
containing a and x, then the value of the function at x is given by:

f(x) = f(a) + f'(a)(x − a) + [f''(a)/2!](x − a)^2 + ... + [f^(n)(a)/n!](x − a)^n + Rn

where Rn is the remainder term.
Example:
Although the Newton-Raphson method is often very efficient, there are situations where it
performs poorly. A special case, multiple roots, will be addressed later in this chapter.
However, even when dealing with simple roots, difficulties can also arise, as in the following
example.
There is no general convergence criterion for Newton-Raphson. Its convergence depends on the
nature of the function and on the accuracy of the initial guess. The only remedy is to have an
initial guess that is “sufficiently” close to the root. And for some functions, no guess will work!
Good guesses are usually predicated on knowledge of the physical problem setting or on
devices such as graphs that provide insight into the behavior of the solution. The lack of a
general convergence criterion also suggests that good computer software should be designed to
recognize slow convergence or divergence.
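A Python sketch of the Newton-Raphson iteration (illustrative; it guards against the horizontal-tangent case, in which the formula would divide by zero):

import math

def newton_raphson(f, df, x0, es=0.5, max_it=50):
    # Follow the tangent at (x_i, f(x_i)) down to the x axis.
    x = x0
    for _ in range(max_it):
        dfx = df(x)
        if dfx == 0:
            raise ZeroDivisionError("zero derivative: tangent never crosses the x axis")
        x_new = x - f(x) / dfx             # x_{i+1} = x_i - f(x_i)/f'(x_i)
        if x_new != 0 and abs((x_new - x) / x_new) * 100 < es:
            return x_new
        x = x_new
    return x

# f(x) = e^(-x) - x, with f'(x) = -e^(-x) - 1:
print(newton_raphson(lambda x: math.exp(-x) - x,
                     lambda x: -math.exp(-x) - 1, 0.0))   # approx 0.5671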
6) Secant method
A potential problem in implementing the Newton-Raphson method is that some functions have
derivatives that are difficult or inconvenient to evaluate. For such cases, the derivative can be
approximated by a backward finite divided difference:

f'(xi) ≅ [f(xi−1) − f(xi)] / (xi−1 − xi)

This approximation can be substituted into the Newton-Raphson formula to yield the following
iterative equation:

xi+1 = xi − f(xi)(xi−1 − xi) / [f(xi−1) − f(xi)]

which is the formula for the secant method. Notice that the approach requires two initial
estimates of x. However, because f(x) is not required to change signs between the estimates, it
is not classified as a bracketing method.
Although the secant method may be divergent, when it converges it usually does so at a
quicker rate than the false-position method. The inferiority of the false-position method is due
to one end staying fixed to maintain the bracketing of the root. This property, which is an
advantage in that it prevents divergence, is a shortcoming with regard to the rate of
convergence; it makes the finite-difference estimate a less-accurate approximation of the
derivative.
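A Python sketch of the secant iteration (illustrative; note that the two starting estimates x0 and x1 need not bracket the root):

def secant(f, x0, x1, es=0.5, max_it=50):
    # Newton-Raphson with f'(x_i) replaced by the backward finite divided
    # difference through the last two iterates.
    for _ in range(max_it):
        denom = f(x0) - f(x1)
        if denom == 0:
            break                          # flat secant line: cannot proceed
        x2 = x1 - f(x1) * (x0 - x1) / denom
        if x2 != 0 and abs((x2 - x1) / x2) * 100 < es:
            return x2
        x0, x1 = x1, x2                    # slide both estimates forward
    return x1

Because both estimates move at every step, the finite-difference slope tracks the local derivative more closely than the one-sided false-position chord does.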
[Figure: comparison of the methods for f(x) = e^(−x) − x]
7) Brent’s method
Wouldn’t it be nice to have a hybrid approach that combined the reliability of bracketing with
the speed of the open methods? Brent’s root-location method is a clever algorithm that does just
that by applying a speedy open method wherever possible, but reverting to a reliable bracketing
method if necessary. The approach was developed by Richard Brent (1973) based on an earlier
algorithm of Theodorus Dekker (1969).
The bracketing technique is the trusty bisection method, whereas two different open methods are
employed: the secant method and inverse quadratic interpolation.
Inverse quadratic interpolation is similar in spirit to the secant method. The secant method is
based on computing a straight line that goes through two guesses. The intersection of this
straight line with the x axis represents the new root estimate. For this reason, it is sometimes
referred to as a linear interpolation method.
Now suppose that we had three points. In that case, we could determine a quadratic function of x
that goes through the three points. Just as with the linear secant method, the intersection of this
parabola with the x axis would represent the new root estimate. Using a curve rather than a
straight line often yields a better estimate. Although this would seem to represent a great
improvement, the approach has a fundamental flaw: It is possible that the parabola might not
intersect the x-axis! Such would be the case when the resulting parabola had complex roots. The
difficulty can be rectified by employing inverse quadratic interpolation. That is, rather than
using a parabola in x, we can fit the points with a parabola in y. This amounts to reversing the
axes and creating a “sideways” parabola (the curve, x = f (y)).
Comparison of (a) the secant method and (b) inverse quadratic interpolation.
If the three points are designated as (xi−2, yi−2), (xi−1, yi−1), and (xi, yi), a quadratic function
of y that passes through the points can be generated as:

g(y) = [(y − yi−1)(y − yi)] / [(yi−2 − yi−1)(yi−2 − yi)] xi−2
     + [(y − yi−2)(y − yi)] / [(yi−1 − yi−2)(yi−1 − yi)] xi−1
     + [(y − yi−2)(y − yi−1)] / [(yi − yi−2)(yi − yi−1)] xi

Setting y = 0 in this equation yields the new root estimate xi+1.
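A single inverse-quadratic-interpolation step can be written compactly by evaluating the Lagrange form above at y = 0 (an illustrative sketch; a production root finder such as scipy.optimize.brentq wraps this step inside the full Dekker-Brent combination of bisection, secant, and inverse quadratic moves):

def inverse_quadratic_step(x0, x1, x2, f):
    # Fit a "sideways" parabola x = g(y) through three points and return g(0),
    # the new root estimate. Requires the three y values to be distinct.
    y0, y1, y2 = f(x0), f(x1), f(x2)
    return (x0 * y1 * y2 / ((y0 - y1) * (y0 - y2))
            + x1 * y0 * y2 / ((y1 - y0) * (y1 - y2))
            + x2 * y0 * y1 / ((y2 - y0) * (y2 - y1)))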
8) Multiple roots
A multiple root corresponds to a point where a function is tangent to the x axis. For example, a
double root results from
f(x) = (x − 3)(x − 1)(x − 1)    (6.11)

or, multiplying terms, f(x) = x^3 − 5x^2 + 7x − 3. The equation has a double root because one
value of x makes two terms in Eq. (6.11) equal to zero. Graphically, this corresponds to the
curve touching the x axis tangentially at the double root. A triple root corresponds to the case
where one x value makes three terms in an equation equal to zero, as in

f(x) = (x − 3)(x − 1)(x − 1)(x − 1)

or, multiplying terms, f(x) = x^4 − 6x^3 + 12x^2 − 10x + 3.
In general, odd multiple roots cross the axis, whereas even ones do not.
Multiple roots pose some difficulties for many of the numerical methods:
a) The fact that the function does not change sign at even multiple roots precludes the use
of the reliable bracketing methods. Thus, of the methods covered in this book, you are
limited to the open methods that may diverge.
b) Another possible problem is related to the fact that not only f (x) but also f ‘(x) goes to
zero at the root. This poses problems for both the Newton-Raphson and secant methods,
which both contain the derivative (or its estimate) in the denominator of their respective
formulas. This could result in division by zero when the solution converges very close
to the root. A simple way to circumvent these problems is based on the fact that it can
be demonstrated theoretically (Ralston and Rabinowitz, 1978) that f (x) will always
reach zero before f ‘(x). Therefore, if a zero check for f (x) is incorporated into the
computer program, the computation can be terminated before f ‘(x) reaches zero.
c) It can be demonstrated that the Newton-Raphson and secant methods are linearly, rather
than quadratically, convergent for multiple roots (Ralston and Rabinowitz, 1978).
Modifications have been proposed to alleviate this problem. Ralston and Rabinowitz
(1978) have indicated that a slight change in the formulation returns it to quadratic
convergence, as in

xi+1 = xi − m f(xi) / f'(xi)

where m is the multiplicity of the root (m = 2 for a double root, m = 3 for a triple root, and so
on). An alternative, also suggested by Ralston and Rabinowitz (1978), is to define a new
function u(x) = f(x)/f'(x). It can be shown that this function has roots at all the same locations
as the original function. Therefore, substituting u(x) into the Newton-Raphson formula gives
the modified Newton-Raphson method:

xi+1 = xi − f(xi) f'(xi) / ( [f'(xi)]^2 − f(xi) f''(xi) )
Example: Use both the standard and modified Newton-Raphson methods to evaluate the
multiple root of f(x) = (x − 3)(x − 1)(x − 1), with an initial guess of x0 = 0.
The preceding example illustrates the trade-offs involved in opting for the modified Newton-
Raphson method. Although it is preferable for multiple roots, it is somewhat less efficient and
requires more computational effort than the standard method for simple roots. It should be
noted that a modified version of the secant method suited for multiple roots can also be
developed. The resulting formula is (Ralston and Rabinowitz, 1978):

xi+1 = xi − u(xi)(xi−1 − xi) / [u(xi−1) − u(xi)]
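A Python sketch of the modified Newton-Raphson method based on u(x) = f(x)/f'(x) (illustrative; it requires the second derivative, which is part of the extra computational effort noted above):

def modified_newton(f, df, d2f, x0, es=0.5, max_it=50):
    # Newton-Raphson applied to u(x) = f(x)/f'(x); retains quadratic
    # convergence at multiple roots.
    x = x0
    for _ in range(max_it):
        fx, dfx, d2fx = f(x), df(x), d2f(x)
        denom = dfx * dfx - fx * d2fx
        if denom == 0:
            break
        x_new = x - fx * dfx / denom
        if x_new != 0 and abs((x_new - x) / x_new) * 100 < es:
            return x_new
        x = x_new
    return x

# Double root of f(x) = (x - 3)(x - 1)^2 at x = 1, starting from x0 = 0:
print(modified_newton(lambda x: x**3 - 5*x**2 + 7*x - 3,
                      lambda x: 3*x**2 - 10*x + 7,
                      lambda x: 6*x - 10, 0.0))   # approx 1.0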
9) Roots of Polynomials
We now apply the discussion of the previous section to the special case where f (x) is a polynomial of
degree n. We want to find the roots (real or complex) of the polynomial equation

f(x) = an x^n + an−1 x^(n−1) + ... + a1 x + a0 = 0, with an ≠ 0    (Equ.1)
We note the following facts regarding Equ.1 with real or complex coefficients:
a) An equation of degree n has exactly n roots, counted with their multiplicities.
b) If the coefficients are real, any complex roots occur in conjugate pairs.
c) If the coefficients are real and n is odd, the equation has at least one real root.
To use the Newton-Raphson method for computing a root of an equation, we need to evaluate f(xi) and
f'(xi). We describe a method, known as Horner's method, for calculating f(x) and f'(x) where f is a
polynomial. Recall that for any polynomial as given in Equ.1 we can write

f(x) = (x − x0) q(x) + R    (Equ.2)

where

q(x) = bn−1 x^(n−1) + bn−2 x^(n−2) + ... + b1 x + b0    (Equ.3)
and R is a constant. Equations Equ.1, Equ.2, and Equ.3 are compatible provided:

bn−1 = an
bk = ak+1 + x0 bk+1, for k = n−2, ..., 1
b0 = a1 + x0 b1
R = a0 + x0 b0
Setting x = x0 in Equ.2 gives f(x0) = R = a0 + x0 b0; in particular, (x − x0) is a factor of f(x)
precisely when R = 0. Therefore, starting from bn−1 = an, we successively compute
bn−2, ..., b0 and then determine f(x0). To compute f'(x0), we first differentiate Equ.2 and
obtain f'(x) = q(x) + (x − x0) q'(x). Thus f'(x0) = q(x0).
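A Python sketch of Horner's method that returns both f(x0) and f'(x0) in a single pass (illustrative; coefficients are given highest degree first, and the polynomial is assumed to have degree at least 1):

def horner(a, x0):
    # a = [a_n, a_{n-1}, ..., a_1, a_0], the coefficients of Equ.1.
    # b accumulates the b_k of Equ.3; c accumulates q(x0) = f'(x0).
    b = a[0]                   # b_{n-1} = a_n
    c = a[0]
    for coef in a[1:-1]:
        b = coef + x0 * b      # b_k = a_{k+1} + x0 * b_{k+1}
        c = b + x0 * c         # Horner evaluation of q at x0
    r = a[-1] + x0 * b         # R = a_0 + x0 * b_0 = f(x0)
    return r, c

# f(x) = x^3 - 5x^2 + 7x - 3 at x0 = 2: f(2) = -1, f'(2) = -1
print(horner([1, -5, 7, -3], 2.0))

A Newton-Raphson step for the polynomial is then simply x1 = x0 − r/c, which connects Horner's method to the successive iterates x3, x4, ... mentioned below.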
We can improve the accuracy of the root to any desired number of decimal places by proceeding to
compute x3, x4, ...