
APM2613/LNS2/0/2023

Tutorial letter LNS2/0/2023

Numerical Methods I
APM2613

Year module

Department of Mathematical Sciences

IMPORTANT INFORMATION:
Nonlinear equations:
This tutorial letter contains Lesson 2, additional notes on
the solution of nonlinear equations. Please read these notes
along with Chapter 2 and Section 10.2 to supplement the
textbook. The contents of this topic constitute a substantial
portion of Assignment 1, so you should read them while
attempting that assignment.

1 Lesson 2: Solutions of Non-linear Equations
Reading
Chapter 2 and Section 10.2 of the textbook

1.1 Objectives and Outcomes


1.1.1 Objectives
The objectives of this lesson are to highlight aspects of the study of solutions of equations in one or more
variables, to supplement Chapter 2 and Section 10.2 of the syllabus:

• To highlight the key points of the fixed point method;

• To highlight the key points of the Bisection method;

• To highlight the key points of Newton’s method for one equation, as well as a system of equations;

• To highlight the key points of the Secant method and its variations;

• In studying each method, to highlight key aspects of investigating the convergence and error of that
method;

• To highlight the key points of other methods used for specific purposes in the study of solutions of
nonlinear equations;

1.1.2 Learning Outcomes


At the end of this lesson you should be able to:

• Understand how the following methods/algorithms/schemes are formulated (the theory):

– Bisection method;
– Fixed point method;
– Newton’s method and its extensions: Secant method; Regula falsi method;
– Horner’s method and Muller’s method
– Newton’s method for systems of two equations in two unknowns.

• Illustrate the difference between the various methods geometrically.

• Implement the associated algorithms to solve nonlinear equations.

• Analyse the error and convergence of these methods.

• Identify the strengths and weaknesses of each algorithm.

• Identify and apply methods used to accelerate convergence.


1.2 Introduction
In this part of the course we address the problem of solving non-linear equations, which arise in many
situations where analytic methods cannot be used, by presenting a number of numerical schemes/algorithms
for this purpose. In most cases there are no exact analytical formulas for solving such equations. Various
schemes have been formulated to generate approximate solutions, starting with simple ones and proceeding
to methods formulated for their advantages in convergence and computational efficiency.

This chapter focuses on solving the root equation of the form

f(x) = 0

where f(x) is a function defined on an interval [a, b]. The root of this equation, also known as the zero of
the function, is the x-intercept of the graph of f(x). Assuming some 'nice' characteristics of the function on
the interval [a, b], some foundational properties of calculus are used as the basis for the existence of a root
in this interval.

Apart from the bisection method, the numerical search for a root of f(x) = 0 involves finding solutions of
an equation of the form
x = g(x).
Solutions of the latter are generally called fixed points, in the sense that you seek a value x (a fixed point)
such that g(x) gives you back the same value, x. Hence this form of the root-finding problem is also referred
to as the fixed-point formulation. The numerical methods discussed and used here seek to find expressions of
g(x) based on various systematic mathematical arguments. Thus all the methods, starting with the simple
fixed-point method, Newton's method and its variants, and other methods, can be viewed as different ways
of formulating the function g(x) of the fixed-point equation.

Hence it is important to be aware of the form of the root-finding problem: whether it is in the root equation
form or the fixed-point form. Every non-linear equation can be written in the form f(x) = 0 so as to identify
the function f(x).
While the methods discussed in Chapter 2 are for equations in one variable, it is useful to view Section 10.2
as an extension of Chapter 2, since it deals with extending Newton's method to solve a system of non-linear
equations instead of a single equation. While other methods can also be extended to functions of more than
one variable, the focus of this module is on extending Newton's method only.

It is very important to note that the notes presented here do not in any way replace the need to read the
relevant chapter/section in the textbook. Similar summaries are actually given on the companion website
of the textbook.

1.3 Nature of the problem


At the start of this section of study it is of paramount importance to understand the terminology used for the
discussion at hand. Note the distinction between the two root-finding problem formulations:

f (x) = 0 (1)

and
x = g(x) (2)

also respectively referred to as the root problem and the fixed-point problem. The root problem involves
rearranging the given equation so that one side of the equation is 0. A solution of this problem is the usual
x-intercept of the graph of f(x). It is also worth noting the use of the terms root, zero and fixed point in the
context of these two equations.

For the fixed-point problem, one side of the equation (obtainable in a number of ways) is x. Many of the
techniques involve finding an expression g(x) whose iteration 'hopefully' converges to a fixed point x (or p in
the terminology of the textbook). The solution process involves iterative computations of g(x) to find
successive approximations to the solution.

Geometrically, this would be the point of intersection of the graph of y = g(x) and the line y = x. The
iteration formula g(x) may converge to the required solution or diverge away from it, sometimes to another
solution if one exists, or simply away in an unbounded manner. The conditions for convergence become part
of the analysis. Such convergence sometimes depends on the choice of the initial value of the approximation.

So in discussing each method the elements of the process are in focus: formulation; implementation and
analysis of error and convergence.

1.3.1 Other resources


The textbook authors have created a companion website that, in particular, has good animations that demon-
strate how the various methods of this chapter work. It is worth visiting this website for an appreciation
of the progression of each method from an initial point to the required solution. The address of the
companion website is given in Section 1.4 (p38) of the textbook:

https://sites.google.com/site/numericalanalysis1burden/

In what follows, a quick overview of the various methods for solving the root equation f (x) = 0 is given.

1.4 Bisection Method


This method solves the problem in the form of equation (1) and only requires that the chosen initial
interval (a, b) contains the root, which is confirmed via the Intermediate Value Theorem. Calculation of
the root p is carried out by simply halving the current interval. The calculation is then followed by a
decision of which endpoint, a or b, to discard, by reapplying the Intermediate Value Theorem. The simple
test of the Intermediate Value Theorem for an interval (a, b) is that the product f(a) · f(b) < 0.

So the Bisection method starts by identifying a suitable initial interval by applying the Intermediate Value
Theorem to a pair of values. It proceeds by finding the next iterate as the midpoint of the current interval,
followed by checking the sign of its functional value and discarding the endpoint whose functional value has
the same sign as that of the new iterate. This procedure is repeated successively; subsequent intervals get
smaller and always enclose the desired root. Example 1 on p50 illustrates the implementation of this method.
Note that the function f(x) is only used in the intermediate value test.
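
To make the procedure concrete, here is a minimal Python sketch of the Bisection method; the test function, interval and tolerance are illustrative choices, not prescribed by the textbook.

def bisection(f, a, b, tol=1e-6, max_iter=100):
    """Find a root of f in [a, b], assuming f(a)*f(b) < 0."""
    if f(a) * f(b) >= 0:
        raise ValueError("Intermediate Value Theorem test fails on [a, b]")
    for _ in range(max_iter):
        p = (a + b) / 2                   # midpoint of the current interval
        if f(p) == 0 or (b - a) / 2 < tol:
            return p
        if f(a) * f(p) < 0:               # root lies in [a, p]: discard b
            b = p
        else:                             # root lies in [p, b]: discard a
            a = p
    return (a + b) / 2

# Illustrative use: a root of 2x^3 - 7x + 2 = 0 in [1, 2]
print(bisection(lambda x: 2*x**3 - 7*x + 2, 1, 2))   # about 1.707107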

1.4.1 Convergence
The nice thing about the Bisection method is that the repeated application of the intermediate value test
after each calculation of the midpoint ensures that the iterations converge to the root, because the root
remains bracketed by the endpoints at every step.

Furthermore, the convergence rate of the method is simple to establish and is given in Theorem 2.1.

1.4.2 Termination criterion


Since the next iterate is obtained simply as the midpoint of the current interval, it is possible to determine the
number of iterations n required to attain a specified tolerance ϵ by solving the inequality

(b − a)/2ⁿ < ϵ,  b > a,   (3)

for n.

Example 2 on p 52 illustrates this calculation.


Note that this calculation does not involve the function f (x).
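
For instance, with the hypothetical values a = 1, b = 2 and ϵ = 10⁻³ (chosen only for illustration), inequality (3) gives

2ⁿ > (b − a)/ϵ = 10³, i.e. n > log₂(10³) ≈ 9.97,

so n = 10 iterations are guaranteed to achieve the tolerance.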

1.5 Fixed-point Iteration


Fixed-point iteration is the core of the discussion of the methods presented in Chapter 2. In simple terms, a
fixed-point iteration formula involves rearranging equation (1) in various ways to obtain equation (2). The
different ways of rearranging (1) lead to different expressions of g(x) in (2).

Theorem 2.3 states a sufficient condition for the existence and uniqueness of a fixed point:

(i) Existence: g ∈ C[a, b] and g(x) ∈ [a, b] for all x ∈ [a, b];
i.e. g is continuous on [a, b] and maps that interval into itself.

(ii) Uniqueness: |g′(x)| ≤ k < 1 for all x ∈ (a, b); i.e. the slope of g is uniformly less than 1 in magnitude.

Note that both hypotheses of Theorem 2.3 must be satisfied.

Example
The root equation f(x) = 2x³ − 7x + 2 = 0 can be rearranged into various expressions
of x = g(x), i.e. of (2), as shown in the table below. The initial value and the result
of repeated iterations are also given.

Iteration formula                        Initial value   Result
x_{i+1} = 2(x_i³ + 1)/7                  1               Converges to 0.292893
                                         2               Diverges
x_{i+1} = (7x_i − 2)/(2x_i²)             1               Converges to 1.707106
                                         2               Converges to 1.707106
x_{i+1} = ((7x_i − 2)/(2x_i))^(1/2)      1               Converges to 1.707106
                                         2               Converges to 1.707106
x_{i+1} = ((7x_i − 2)/2)^(1/3)           1               Converges to 1.707106
                                         2               Converges to 1.707106

Since the given function f(x) is a polynomial of degree 3, it has 3 roots. The initial values chosen led to two
of the 3 roots; other choices might have led to the third root. Also worth noting is that
you can think of other iteration formulae (2) and try them using different initial values.

In a sense, the rest of the methods discussed for solving nonlinear equations entail finding different
expressions of g(x).
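
As an illustration, here is a minimal Python sketch of fixed-point iteration, tried on two of the rearrangements from the table above; the tolerance and iteration cap are arbitrary choices.

def fixed_point(g, x0, tol=1e-6, max_iter=100):
    x = x0
    for _ in range(max_iter):
        x_new = g(x)                       # one application of the formula
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    raise RuntimeError("no convergence within max_iter iterations")

g1 = lambda x: 2 * (x**3 + 1) / 7          # first formula in the table
g4 = lambda x: ((7*x - 2) / 2) ** (1/3)    # last formula in the table
print(fixed_point(g1, 1))                  # about 0.292893
print(fixed_point(g4, 1))                  # about 1.707106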

1.5.1 Convergence Criterion


Whether a given fixed-point formula converges may depend both on the expression of g(x) itself and on the
initial value chosen for the iterations, as the above example illustrates. Thus we need a common way of
testing whether a given formula converges in a given interval.

The Fixed-point Theorem (Theorem 2.4) gives a simple test for convergence that can be applied to an iterative
formula. Simply stated, for a function g(x) that is continuous on [a, b], maps the given interval into itself,
and is differentiable on that interval, it is sufficient for the iteration to converge that

|g′(x)| ≤ k < 1 for all x ∈ (a, b). (4)

Note that the test is performed on a given interval (a, b).

Order of Convergence

The speed or rate of convergence of an iterative scheme is the next question.


Let ϵ_i be the error in the i-th iteration; i.e. ϵ_i = p_i − p.

If g(x) and its derivatives are known at p, then g(p_i) can be expanded in a Taylor series about p to obtain

g(p_i) = g(p) + g′(p)(p_i − p) + (g′′(p)/2!)(p_i − p)² + (g′′′(p)/3!)(p_i − p)³ + …   (5)
       = g(p) + g′(p)ϵ_i + g′′(p)ϵ_i²/2! + …   (6)

or equivalently

p_{i+1} − p = g′(p)ϵ_i + g′′(p)ϵ_i²/2! + g′′′(p)ϵ_i³/3! + … = ϵ_{i+1},

since g(p_i) = p_{i+1}, g(p) = p and p_i − p = ϵ_i.

We observe that the error in the (i + 1)-th term has been expressed in terms of the i-th error. Hence if ϵ_i is
fairly small, ϵ_iⁿ gets smaller as the power n increases, so the first non-zero term dominates the other
terms. However, if ϵ_i is large, little can be said about convergence.

If g′(p) ≠ 0 then ϵ_{i+1} ∝ ϵ_i. If on the other hand g′(p) = 0 and g′′(p) ≠ 0, then ϵ_{i+1} ∝ ϵ_i².

Definition: The order of convergence is the order of the lowest non-zero derivative of g(x) at x = p.

The order of convergence of an iterative method is not unique but depends on the iterative formula being
used. Thus the speed of convergence varies with the choice of the formula g(x). The higher the order, the
faster the convergence.

It therefore seems that the choice of the iteration function g(x) is crucial in finding the root of equation (1).

Example

For
f(x) = x² − 5x + 4 = 0, with roots p = 1 and p = 4,
one rearrangement is
x = g(x) = (x² + 4)/5, with g′(x) = 2x/5,
so that
g′(1) = 2/5 ≠ 0 and g′(4) = 8/5 ≠ 0,
hence the process is first order convergent.
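
As a rough numerical check of this example, the ratio of successive errors of a first order (linearly convergent) scheme should approach |g′(p)|; the starting value 0.5 below is an arbitrary choice.

g = lambda x: (x**2 + 4) / 5   # fixed-point form of x^2 - 5x + 4 = 0
p, x = 1.0, 0.5                # known root and an illustrative starting value
errors = []
for _ in range(6):
    x = g(x)
    errors.append(abs(x - p))
ratios = [errors[i+1] / errors[i] for i in range(len(errors) - 1)]
print(ratios)                  # tends to g'(1) = 2/5 = 0.4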

A geometric test for convergence

We recall that geometrically, the solution of x = g(x) is essentially the value of x at the intersection of the
curve y = g(x) and the line y = x.

Geometrically, on the superimposed graphs of y = x and y = g(x), the test involves starting at the point x0
and first moving vertically until the curve y = g(x) is reached. From this point one moves horizontally until
the line y = x is reached, then vertically to g(x) again, and so on, as shown in the figures on p59 of the textbook.

The relationship between convergence and the slope g ′ (x) of g(x) is illustrated in Figure 2.6(b) (monotonic)
and Figure 2.6(a) (oscillating) in the textbook.
(see also the animation in the textbook companion website)

The methods discussed starting from Section 2.3 are different expressions of the iterative formula x = g(x)
where the right-hand side g(x) is obtained in a systematic way.

1.6 Newton’s Method


Newton's method (also known simply as the Newton-Raphson method) is popular for its fast convergence.
It makes use of the function f(x) from equation (1) and the iteration function g(x) = x − f(x)/f′(x), leading to
the iterative formula

x_{i+1} = x_i − f(x_i)/f′(x_i).   (7)
Geometrical Application

Geometrically, the procedure involves the following steps on the graph of f(x):
1. Picking an initial guess, x0 = p0;
2. Obtaining a new value x1 = p1 by drawing the tangent to the curve at x = x0, y = f(x0), and
extending it to the x-axis;
3. Repeating step 2 to obtain another value x2, using x1 as the new guess. Subsequent approximations
are obtained in the same way.
The result of carrying out these steps is illustrated in Figure 2.7 of the textbook.
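
A minimal Python sketch of iteration (7) follows; the derivative is supplied analytically, and the test function and starting value are illustrative choices.

def newton(f, fprime, x0, tol=1e-6, max_iter=50):
    x = x0
    for _ in range(max_iter):
        x_new = x - f(x) / fprime(x)   # x_{i+1} = x_i - f(x_i)/f'(x_i)
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    raise RuntimeError("no convergence within max_iter iterations")

# Root of 2x^3 - 7x + 2 = 0 near x0 = 2 (the example used earlier)
print(newton(lambda x: 2*x**3 - 7*x + 2, lambda x: 6*x**2 - 7, 2.0))

Starting from x0 = 2 this reaches the root near 1.707107 in a handful of iterations, far fewer than the Bisection sketch above, consistent with the second order convergence discussed below.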

Another formulation of Newton’s method is as follows.

Let ∆x = x0 − x1, so that x1 = x0 − ∆x.

The slope of the tangent to the curve at the point (x0, f(x0)) is

f′(x0) = f(x0)/∆x,

and hence

x1 = x0 − f(x0)/f′(x0).
In general the equation of a line through (xk , f (xk )) with slope m is given by

f (x) = f (xk ) + m(x − xk ).

Using this, the approximation of the root of f (x) = 0 satisfies

0 = f (xk ) + m(x − xk ) = f (xk ) + m(xk+1 − xk ).


Solving for x_{k+1} yields

x_{k+1} = x_k − f(x_k)/m = x_k − f(x_k)/f′(x_k) = g(x_k),

as in (7) above.

1.6.1 Convergence
Note the special expression of the iterative formula x = g(x) in equation (7): the choice of g(x) is such that
its first derivative vanishes at the root x̄ (or p). Hence the method is a second order method.

(See a more formal discussion of convergence using Newton’s Method on p69 of the textbook.)

1.6.2 Pitfalls of Newton’s Method


While Newton's method converges fast, it may sometimes fail to converge, for instance by oscillating between
two values, for one of the following reasons:

1. There is no real root.

2. An iterate corresponds to a turning point (or lies very close to one), as illustrated in Figure 1.

3. f(x) is symmetrical about the root, as shown in Figure 2.

4. x0 is not close enough to p, so that some other part of the function "catches" the iteration. This is
illustrated in Figure 3.

5. The roots are too close to each other, in which case the slope may be nearly horizontal and the tangent
then fails to meet the x-axis near the current iterate.

[Figure 1: Flat tangent. The tangent at p0 is horizontal and never meets the x-axis.]

[Figure 2: Symmetric iterations. The iterates oscillate about the root p, with p0 = p2.]

[Figure 3: Runaway iterations. The iterates p0, p1, p2 move away from the root.]

If any of the above situations occurs, it may be necessary to use another criterion to stop the iterations. One
alternative is to stop after a predetermined number of iterations. Another criterion is to stop when
|x_{i+1} − x_i| < ϵ, where ϵ is some tolerance value, and f(x_i) = 0, or at least |f(x_i)| < ϵ.

1.7 Secant Method


This method is a modification of Newton's method obtained by replacing the derivative in equation (7) by
the difference quotient from the limit definition of f′, evaluated at x_{k−1} and x_k, which are assumed close
enough that the limit can be dropped. The method involves two initial approximations x0 and x1 (or p0 and p1).

The iterative formula of the Secant method therefore involves the two previous iterates
x_{k−1} and x_k.
Geometrically, instead of using the tangent at a guess point xk , the method uses the secant line through two
current guess points, (xk , f (xk )) and (xk−1 , f (xk−1 )). Otherwise the idea of using the intersection of this
line with the x-axis as the next approximation is the same as for Newton’s method.
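
A minimal Python sketch follows, assuming the standard secant formula that results from the derivative replacement described above; the test function and starting values are illustrative.

def secant(f, x0, x1, tol=1e-6, max_iter=50):
    for _ in range(max_iter):
        slope = (f(x1) - f(x0)) / (x1 - x0)   # secant slope replaces f'(x)
        x2 = x1 - f(x1) / slope               # x-intercept of the secant line
        if abs(x2 - x1) < tol:
            return x2
        x0, x1 = x1, x2                       # keep the two most recent iterates
    raise RuntimeError("no convergence within max_iter iterations")

print(secant(lambda x: 2*x**3 - 7*x + 2, 1.5, 2.0))   # about 1.707107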

1.7.1 Convergence
Although this method is efficient, convergence to the required root is not guaranteed. The next method
addresses this shortfall of the Secant method.


1.8 Method of False Position


To improve on the uncertainty of the Secant method, the method of false position (also called the Regula
Falsi method) uses the Intermediate Value Theorem to ensure that the next choice of the endpoints of the
interval brackets (or contains) the root being sought.

Geometrically, the application process involves the following steps:

1. Choosing two initial guesses, say x0 = a and x1 = b, such that f(a) and f(b) have opposite signs; i.e.
f(a)f(b) < 0;

2. Joining the points (a, f(a)) and (b, f(b)) with a straight line (the secant line);

3. Identifying the point x_M at the x-intercept of the secant line in 2 above;

4. Examining the value of f(x_M) and comparing it with f(a) and f(b): if

(a) f(x_M)f(a) > 0, set a = x_M;

(b) f(x_M)f(b) > 0, set b = x_M;

5. Repeating the process using x_M and either a or b as the new endpoint values.

Note that the Secant and Regula Falsi methods have the same iterative formula, but the latter has the
bracketing test as an extra step.
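
A minimal Python sketch of the method follows: the same secant-line x-intercept as above, with the bracketing test of step 4 as the extra step. The test function and interval are illustrative.

def false_position(f, a, b, tol=1e-6, max_iter=100):
    if f(a) * f(b) >= 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    x_old = a
    for _ in range(max_iter):
        # x-intercept of the secant line through (a, f(a)) and (b, f(b))
        x_m = b - f(b) * (b - a) / (f(b) - f(a))
        if abs(x_m - x_old) < tol:
            return x_m
        x_old = x_m
        if f(x_m) * f(a) > 0:   # f(x_M) and f(a) have the same sign
            a = x_m
        else:
            b = x_m
    return x_old

print(false_position(lambda x: 2*x**3 - 7*x + 2, 1.0, 2.0))   # about 1.707107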

The Bisection and Regula Falsi methods have the following disadvantages to be aware of when using them:

(i) Calculation of the intermediate point becomes tricky when the other two approximations on either side
are close together;

(ii) Round-off error may cause problems as the intermediate point approaches the root: since f(x_M)
may be slightly in error, it may come out positive when it should be negative, or vice versa.

(iii) When the three current points are close to the root, their functional values may become very small;
i.e. the evaluations may result in an underflow when testing for signs by multiplication of two small
numbers, hence the test may fail.

Usually, Newton’s method may be used to refine (or speed up) an answer obtained by other methods such as
the Bisection method.

1.9 A Note on Convergence


While a brief discussion of convergence has been included for some of the methods above, Section 2.4
of the textbook discusses the notion of order of convergence for iterative methods in general. A definition
of order of convergence is given in terms of a general sequence. Two orders are highlighted, linear (order
1) and quadratic (order 2), as perhaps the most common orders of convergence, with quadratic convergence
giving satisfactorily fast convergence where it holds. A few theorems are also given for testing convergence
of the iterative formula p_n = g(p_{n−1}) and for the existence of a root p of f(x) in an interval (a, b). This
section is worth reading in detail and can be useful for determining the existence of roots and analysing the
convergence of iterative methods whose convergence analysis is not specifically covered.

1.9.1 Speeding up Convergence
Having introduced the notion of convergence in the above section, Section 2.5 presents two methods, namely
Aitken's and Steffensen's methods, that can be used to speed up the convergence of methods known to converge
only linearly. Of particular note is that the formulae given refine the n-th iterate p_n, and hence are
in terms of p_n, p_{n+1} and p_{n+2}, n ≥ 0. Thus their application necessitates starting with three values,
as the sketch below illustrates.
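
As an illustration, here is a minimal Python sketch of Aitken's ∆² formula, applied to the linearly convergent fixed-point formula from the earlier example; the update formula is the standard one from Section 2.5, and the starting value is arbitrary.

def aitken(p0, p1, p2):
    """Accelerated estimate from three successive iterates."""
    return p0 - (p1 - p0)**2 / (p2 - 2*p1 + p0)

g = lambda x: ((7*x - 2) / 2) ** (1/3)   # converges linearly to 1.707106
p0 = 1.0
p1, p2 = g(p0), g(g(p0))
print(p2, aitken(p0, p1, p2))            # the accelerated value is closer to the root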

1.10 Methods for Polynomial Functions


Horner’s method and Muller’s method have been presented as useful techniques for handling polynomial
functions, with specific mention of possible existence of complex roots for Muller’s method.

1.10.1 Horner’s Method


A major strength of Horner’s method is the computational economy embedded in it. The main objective for
using this method is to economise on computations by using synthetic division to evaluate P (xi ) and P ′ (xi )
which uses only the coefficients of P (x).

It is used to evaluate the polynomial values when Newton's method is applied to a polynomial
equation P(x) = 0, for which Newton's iterative formula is

x_{i+1} = x_i − P(x_i)/P′(x_i).

Example 2 on p93 illustrates the use of this method.
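
A minimal Python sketch of the synthetic-division evaluation follows: a single pass over the coefficients accumulates P(x0), while a second running value accumulates P′(x0). The polynomial and point are illustrative choices.

def horner(coeffs, x0):
    """Return (P(x0), P'(x0)); coeffs are ordered from the highest power down."""
    y = coeffs[0]              # running value of P(x0)
    z = coeffs[0]              # running value of P'(x0)
    for a in coeffs[1:-1]:
        y = x0 * y + a
        z = x0 * z + y         # the derivative accumulates alongside
    y = x0 * y + coeffs[-1]    # the constant term contributes to P only
    return y, z

# P(x) = 2x^4 - 3x^2 + 3x - 4 at x0 = -2
print(horner([2, 0, -3, 3, -4], -2))   # (10, -49)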

1.10.2 Muller’s Method


Although Muller's method is particularly useful in handling complex roots where the root equation is a
polynomial equation, it can be used for any root-finding problem.

Unlike the other methods, which are based on using the intersection of a line with the x-axis to find the next
approximation, Muller's method uses the intersection of a parabola with the x-axis. Intuitively, a
quadratic approximation of a polynomial is better than a linear approximation, so Muller's method is con-
sidered a stronger choice for approximating the next iterate.

Its formulation is based on using a quadratic polynomial of the form

P(x) = a(x − p2)² + b(x − p2) + c

that passes through the points corresponding to three approximations at a time (x_{i−2}, x_{i−1} and x_i,
i = 2, 3, …). This requirement is used to obtain the values of the coefficients a, b and c. The iterative formula
uses these values, and has the form (see top of p97)

p3 = p2 − 2c/(b + sgn(b)√(b² − 4ac)).

The iterative process continues by discarding the oldest point and using the new iterate together with the
other two to repeat the computation of new values of a, b and c.
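
A minimal Python sketch follows, with the parabola coefficients computed from divided differences as in the standard presentation; cmath.sqrt is used because the discriminant may be negative, so the returned value is complex (with imaginary part near 0 for a real root). The test function and starting points are illustrative.

import cmath

def muller(f, p0, p1, p2, tol=1e-6, max_iter=50):
    for _ in range(max_iter):
        h1, h2 = p1 - p0, p2 - p1
        d1 = (f(p1) - f(p0)) / h1
        d2 = (f(p2) - f(p1)) / h2
        a = (d2 - d1) / (h2 + h1)          # coefficients of the parabola
        b = d2 + h2 * a                    # a(x - p2)^2 + b(x - p2) + c
        c = f(p2)
        root = cmath.sqrt(b * b - 4 * a * c)
        denom = b + root if abs(b + root) > abs(b - root) else b - root
        p3 = p2 - 2 * c / denom            # the sgn(b) choice maximises |denom|
        if abs(p3 - p2) < tol:
            return p3
        p0, p1, p2 = p1, p2, p3            # discard the oldest point
    raise RuntimeError("no convergence within max_iter iterations")

print(muller(lambda x: 2*x**3 - 7*x + 2, 0.5, 1.0, 1.5))   # about 1.707107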


1.11 Newton’s Method for Multi-variable Functions


In Chapter 2 the focus of solving nonlinear equations is on functions of a single variable. While Newton’s
method is still fresh, it is worth looking at how it is applied in the case where the function whose root is
sought involves several variables. Section 10.2 addresses this issue.

We highlight the similarities and differences in Newton's formula in the two cases, using a function

F(x, y) or F(x1, x2) or F(x),

where x = (x1, x2)ᵀ is a two-component vector.
While Newton's method for f(x) = 0 uses the iteration (fixed-point) formula

x = g(x) = x − ϕ(x)f(x), with ϕ(x) = 1/f′(x),

Newton's method for a function F(x1, x2) = 0 of two variables,

F(x1, x2) = (F1(x1, x2), F2(x1, x2))ᵀ,

uses

x = G(x) = x − J_F(x)⁻¹ F(x)   (8)

where J_F(x)⁻¹ is now the inverse of a matrix of partial derivatives with respect to each of the variables
x1 and x2. This matrix of derivatives is called the Jacobian of the function F(x1, x2), and is defined by

J_F(x1, x2) = [ ∂F1/∂x1   ∂F1/∂x2 ]
              [ ∂F2/∂x1   ∂F2/∂x2 ],

with J_F(x)⁻¹ its inverse. Finding the inverse of this matrix follows standard methods used in linear algebra.

Briefly, for a function of two variables this is a 2 × 2 matrix, whose inverse is relatively easy to find using
the techniques for inverting a general 2 × 2 matrix

A = [ a   b ]
    [ c   d ].

For such a matrix,

A⁻¹ = (1/|A|) [ d   −b ]  =  (1/(ad − bc)) [ d   −b ]
              [ −c   a ]                   [ −c   a ].

For a discussion of the iterative equation (8) and the Jacobian of a general n-variable function
F(x1, x2, …, xn), see p652. The inverse of such a matrix can be computed using the usual methods for
inverting an n × n matrix in linear algebra. A sketch for the 2 × 2 case follows.
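
Here is a minimal Python sketch of iteration (8) for two equations in two unknowns, using the 2 × 2 inverse formula above; the system, Jacobian and starting point are illustrative choices, not taken from the textbook.

def newton_system(F, J, x1, x2, tol=1e-6, max_iter=50):
    for _ in range(max_iter):
        f1, f2 = F(x1, x2)
        a, b, c, d = J(x1, x2)            # Jacobian entries, row by row
        det = a * d - b * c
        # J^{-1} F, using J^{-1} = (1/det) [[d, -b], [-c, a]]
        dx1 = (d * f1 - b * f2) / det
        dx2 = (-c * f1 + a * f2) / det
        x1, x2 = x1 - dx1, x2 - dx2
        if abs(dx1) + abs(dx2) < tol:
            return x1, x2
    raise RuntimeError("no convergence within max_iter iterations")

# Illustrative system: F1 = x1^2 + x2^2 - 4 = 0, F2 = x1 - x2 = 0
F = lambda x1, x2: (x1**2 + x2**2 - 4, x1 - x2)
J = lambda x1, x2: (2*x1, 2*x2, 1, -1)    # dF1/dx1, dF1/dx2, dF2/dx1, dF2/dx2
print(newton_system(F, J, 1.0, 2.0))      # about (1.414214, 1.414214)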

1.12 Notations and Terminology
Each chapter concludes with a section on numerical software and a chapter review. The notion of an 'algo-
rithm' has been alluded to in previous tutorial letters. It is worth emphasising that the given algorithms are a
general guide to implementing a technique using any software; they are to be translated accordingly to suit
the language of the software of your choice.

Finally, it is worth noting some notation and terminology used in this chapter that may recur in subsequent
chapters. An understanding of these will make subsequent reading and study more fluent.

• {p_n}_{n=1}^∞ denotes a sequence whose terms are p_n for n starting from 1 to ∞.

• TOL is the textbook notation for 'tolerance', the desired error.

• O(·), usually used in discussing convergence, denotes the order (or rate) of convergence.

• sgn(x) is the sign of x (see p52).

• ∆p_n denotes the forward difference: the current value subtracted from the next, i.e. ∆p_n = p_{n+1} − p_n.

• p_i^{(k)} denotes the value of the iterate p_i in the k-th iteration/step.

• x = (x1, x2, · · · , xn) denotes an n-component vector.

• |A| = det(A) is the determinant of the n × n matrix A.

• ∥x∥_m denotes the m-norm of the vector x (for example, m = 2 gives the Euclidean norm).

This list of terms and notation may not be exhaustive, but it is included here to emphasise the importance of
understanding the notation used in the text (or any text for that matter) so that it does not get in your way
when studying.
