Numerical Methods

Robert W. Hornbeck, Ph.D.
Associate Professor of Mechanical Engineering, Carnegie-Mellon University

QUANTUM PUBLISHERS, INC., 257 Park Avenue South, New York, N.Y. 10010

Copyright © 1975 by Quantum Publishers, Inc. All rights reserved. No part of this book may be reproduced in any form or by any means without written permission from the publishers. Printed in the United States of America. Illustrated by K&S Graphics.

To the memory of my Father

Preface

The purpose of this book is to present, in as clear a fashion as possible, a logically structured collection of the fundamental tools of numerical methods. The methods presented are all well suited to the digital computer solution of problems in many areas of science and engineering. Every effort has been made to include the most modern and efficient techniques available in the rapidly developing field of numerical methods, without neglecting the broad base of those older, well-established techniques which are still in widespread use.

Since the emphasis of the book is on the understanding and use of the various methods, proofs have been included only where they might enhance understanding or provide motivation for the study of a particular method. A number of illustrative example problems have been integrated into the main body of the text at points where the presentation of a method can best be reinforced by the immediate use of the method. In addition, an extensive assortment of detailed solved problems has been included at the end of each chapter, illustrating virtually every topic considered in that chapter and illuminating the fine points and potential difficulties of the various methods. The only mathematical background required of the reader is the usual introductory calculus sequence. This overall approach makes the book suitable not only as a text in a structured classroom situation, but also for self-study and as a supplement to all other texts in the subject.

This book does not follow the currently popular practice of providing a complete computer program for each method discussed. Based on the author's wide experience, this practice tends to encourage the student to simply reproduce and run the programs, rather than to actually attempt to understand the method in depth. In addition, such programs tend to restrict the scope of the book to a single computer language (usually FORTRAN), while for various reasons it may be desirable or convenient for the reader to employ other languages, such as PL/I, APL, or BASIC.

Although many of the examples and illustrative problems given in the text actually represent the results of the mathematical modeling of physical situations, they are usually presented in mathematical terms. Since the text is not cast in the rigid mold of any single discipline in engineering or science, this permits an instructor to show the relevance of the methods and problems to any desired area. However, the author would caution against the use of complex problems involving much physical insight until the basic numerical techniques have been mastered, since a student may become lost or misled in the physics and hence not gain the desired experience in numerical methods.

Chapter 1 provides an introduction to the power (and limitations) of numerical methods, some motivation for the engineer or scientist to study these methods, and a short discussion of digital computing from the user's point of view.
The basic building blocks ‘of numerical methods, the Taylor series and the finite difference calculus, are presented in Chapters 2 and 3. Interpolation is the subject of Chapter 4. Despite the fact that this topic continues to be of great practical importance, and also forms the theoretical basis for much of numerical analysis, many current texts for some reason either ignore it completely, or do not give it adequate attention. Chapter 5 is devoted to finding the roots of equations. Many methods of root solving can be developed directly from the Taylor series, but the concept of inverse interpolation is also useful Both direct and iterative methods are presented in Chapter 6 for solving sets of simultaneous linear algebraic equations. Particular attention is given to the problem of ill-conditioning and to the solution of very large sets of equations. In Chapter 7, the concepts of interpolation are extended to functional approximation, and using the tools gained in Chapter 6, least squares data fitting can be examined in an effective manner. ‘Numerical integration is considered in Chapter 8. Several highly accurate and efficient techniques are included, and methods of dealing with singularities are examined in some detail. In Chapter 9, a wide variety of numerical techniques is presented for solving ordinary differential equations. Approaches are considered for solving both initial value and boundary value problems. The accuracy and efficiency of each of the methods is carefully considered. ‘The subject of Chapter 10 is the algebraic eigenvalue problem. Much emphasis is given to the selection of the most efficient technique to deal with the particular problem at hand. in Chapter 11, the book is concluded with an introduction to the numerical solution of partial differential equations, particularly of the parabolic and elliptic types. The power and potential of finite element methods are also discussed briefly This material is more than ample for a one-semester course at the junior or senior any of the engineering or science disciplines, If sufficient time is not available to cover all of the subject matter, itis suggested that Section 7.2, the more advanced portions of Chapter 10, and Chapter I1/be considered as possible omissions which would stil leave the logical structure of the book intact. ‘The author wishes to express his gratitude to all of his students and colleagues who, through their encouragement and suggestions, have made this book possible. Particular thanks are due to Jean Stiles, who deciphered the author's handwriting and typed the manuscript in her usual expert manner, and to Earl Feldman, who independently verified the flow charts by writing and running computer programs from them. Finally, the author would like to thank Nicolas Monti and Michae! Schaum of Quantum Publishers for their continuing interest, advice, and encouragement, level i Ronert W. Horwneck Contents Chapter 1 Introductory Topics 1.0 Introduction 44 What Are Numerical Methods? 1.2 Are There Limits to the Capability of Numerical Methods? 43° Why Study Numerical Methods? ‘Computer Languages ‘The Verification Problem Do Computers Make Mistakes? ‘The Need to Get Involved Chapter 2 The Taylor Series Chapter 3. 
The Finite Difference Calculus 3.0 Introduction 3A Forward and Backward Differences 3.2 Higher Order Forward and Backward Difference Expressions 33° Central Differences 34 Differences and Polynomials Chapter 4 Interpolation and Extrapolation Introduction Generation of Difference Tables Gregory-Newton Interpolation Formulas Interpolation with Central Differences Interpolation with Nonequally Spaced Data; Lagrange Polynomials Chebyshev Interpolation; Chebyshev Polynomials 46 Interpolation with Cubic Spline Functions 16 6 16 20 at 23 35 35 37 at 45 a 50 Chapter 5 Chapter 6 Chapter 7 Chapter 8 Chapter 9 CONTENTS Roots of Equations 5.0 Introduction 8A Bisection 52 Newton's Method (Newton-Raphson) 5.3 Modified Newton’s Method 5.4 The Secant Method 5.5 Root Solving as Inverse Interpolation 5.6 A Brief Note on Special Methods for Finding Roots of Polynomials ‘The Solution of Simultaneous Linear Algebraic Equations and Matrix Inversion 6.0 Introduction 6:1 Basic Matrix Terminology and Operations 6.2 Matrix Representation and Formal Solution of Simultaneous Linear Equations 6.3 An Overview of Equation Solving 6.44 Gauss Elimination and Gauss-ordan Elimination 6S Matrix Inversion by Gauss-Jordan Elimination 6.6 _Ill-Conditioned Matrices and Sets of Equations 6.7 Gauss-Siede! Iteration and Concepts of Relaxation Least-Squares Curve Fitting and Functional Approximation 7.9 Introduction TA Least-Squares Fitting of Discrete Points 7.2 The Approximation of Continuous Functions Numerical Integré Introduction The Trapezoidal Rule ‘Simpson’s Rule Romberg Integration Gauss Quadrature Multiple Integrals Integrals with Infinite Limits Dealing with Singularities Numerical Integration Methods in Perspective ‘The Numerical Solution of Ordinary Differential Equations Introduction ‘The General Initial Value Problem ‘The Euler Method ‘Truncation Error 64 64 65 66 69 70 n 3 85 85 85 38 1 98 100 101 121 121 122 125 144 144 148 148 150 154 159 160 162 165 185, 185 186 189 190 CONTENTS Chapter 10 Chapter 11 Convergence and Stability Runge-Kutta Type Formulas ‘The Adams Formulas—A Class of Multistep Formulas Predictor-Corrector Methods ‘The Solution of Sets of Simultaneous First-Order Differential Equs- 9.9 Boundary Value Problems 9.10 The State of the Art Matrix Eigenvalue Problems 10.0 Introduction 40.1 The General Problem 10.2 Reduction of the Problem AX = ABX to HX = AX: The Choleski Decomposition 10.3 The Power Method 40.4 Similarity and Orthogonal Transformations 10.5 The Jacobi Method 10.8 Householder’s Method 10.7 The LR and QR Algorithms 10.8 The QL Algorithm 10.9 A Review of Methods for Symmetric Matrices 10.10 Eigenvalues of Unsymmetric Matrices 10-11. 
Algorithms Available as ALGOL Procedures Introduction to Partial Differential Equations 41.0 Introduction 11.4 Classification of Second-Order Partial Differential Equations 14.2 Numerical Methods for the Solution of Parabolic Equations 14.3 Numerical Methods for the Solution of Elliptic Equations 14.4 Numerical Methods for the Solution of Hyperbolic Equations 14.5 Finite Flement Methods ‘Appendix Interpretation of Flow Charts Tables of Weights and Zeros for Gauss Quadrature A FORTRAN IV Subroutine for Matrix Inversion References Answers to Problems 192 198 196 199 202 203 208 227 227 227 229 23 235 236 2a 244 246 250 250 251 269 269 269 270 276 281 282 201 201 293 294 296 298 309 Chapter 1 Introductory Topics 1.0 INTRODUCTION We will begin this introductory chapter by briefly discussing the purpose and power of ‘numerical methods as well as their limitations, and then presenting a justification for the detailed study of these methods. 4.1. WHAT ARE NUMERICAL METHODS? Numerical methods are a class of methods for solving a wide variety of mathemat- ical problems. These problems can, of course, have their origins as mathematical models of physical situations. This class of methods is unusual in that only arithmetic ‘operations and logic are employed, thus the methods can be employed directly on digital ‘computers. Although in the strictest sense of the term, anything from the fingers to an abacus can be considered as a digital computer, we will use the term here to refer to electronic stored program computers which have been in reasonably widespread use since the middle 1950"s. | Numerical methods actually predate electronic computers by many years, and in fact, many of the currently used methods date in some form from virtually the beginnings of modern mathematics. However, the use of these methods was relatively limited until the advent of the mechanical desk calculator and then increased dramatically as, ina real sense, the methods came of age with the introduction of the electronic digital ‘computer. ‘The combination of numerical methods and digital computers has created a tool of immense power in mathematical analysis. For example, numerical methods are capable arities, complex geometries, and large systems of coupled equa- tions which are necessary for the accurate simulation of many real physical situations. Classical mathematics, even in the hands of the most ingenious applied mathematician, cannot cope with many of these problems at the level required by today’s technology. As a result, numerical methods have displaced classical mathematical analysis in many in- dustrial and research applications to the extent that (for better or for worse) classical analytical approaches are seldom considered, even for problems where analytical solu- tions could be obtained, since the numerical methods are so easy and inexpensive to employ and are often available as prepackaged programs. 2 NUMERICAL METHODS 4.2 ARE THERE LIMITS TO THE CAPABILITY OF NUMERICAL METHODS? ‘The answer to this question is an emphatic “yes.” It is the view of many laymen and of, entirely too many scientists and engineers who should know better, that if a problem can- not be solved in any other way, all one has to do is “put it on the computer.” This state of affairs is undoubtedly due to the enormous power of numerical methods which we have discussed in the preceding section. 
However, it is unfortunately true that there are many problems which are still impossible (in some cases we should use the word impractical”) to solve using numerical methods. For some of these problems no accurate and complete mathematical nodel has yet been found, so obviously it is impossible to consider a ‘numerical solution. Other problems are simply so enormous that their solution is beyond practical limits in terms of current computer technology. For example, it has been estimated that to obtain a detailed time-dependent solution to turbulent fluid problems, in- cluding the effects of the smallest eddies, would require on the order of 30 years. This estimate was based on 1968 technology and is probably off by no more than a factor of S or so based on today's technology. Of course, the entire question of practicality is strongly. dependent upon how much one is willing to pay to obtain an answer. Some problems are so important that industry or government is willing to spend many millions of dollars to obtain the necessary computing capacity and speed to make it practical to solve problems. which had previously been considered impractical to solve. In any case, although the boundaries are constantly being pushed back, there remain many problems which are beyond the reach of present technology, either in the formulation of the mathematical ‘model or in terms of actual computing capability. 1.3 WHY STUDY NUMERICAL METHODS? It may seem strange, in view of their widespread use in virtually every facet of science, technology, and government, that the author should feel an obligation to justify the study of numerical methods. For present and prospective numerical analysts and computer scientists, certainly no justification is necessary. For engineers and scientists, however, the justification might appear to some to be less apparent. In recent years many large computer programs, each requiring several man-years of work, have been developed to simulate complex physical problems. These programs are usually designed to be used by those without extensive knowledge of their inner workings. In addition, there are ever-expanding libraries of subprograms to per- form a wide variety of mathematical tasks using sophisticated numerical methods. In the face of these facts, one might indeed wonder whether there is a need for engineers and scientists to acquire a working knowledge of numerical methods. However, the engineer or scientist who expects to be able to locate a prepackaged program or library subprogram to perform every desired task will be sadly disappointed. The selection and application of a numerical method in any specific situation is still more of an art than a science, and the computer user who does not have the ability and knowledge to select and tailor a numerical method for the specific problem at hand, and to carry out the actual program- ‘ming of the method, will find severe limitations on the range of problems which can be handled. Obviously, when proven prepackaged programs or subprograms are available which are suited to the task at hand, itis by far the most efficient course to employ them. A. working knowledge of numerical methods is highly valuable even in these cases, however, CHAPTER 1. INTRODUCTORY TOPICS 3 since the user of such programs and library subprograms wil difficulties. These difficulties can stem from many causes, including the following: (a) No complex physical situation can be exactly simulated by a mathematical model. 
(This is an extremely crucial point, but is outside the scope of the present discussion.) (6) No numerical method is completely trouble-free in all situations. (c) Nonumerical method is completely error-free. (4) No numerical method is optimal for all situations. (There can be considerable overlap among (b), (c), and (d)._ We will not be concerned with precise definitions here, only broad concepts.) The difficulties with the numerical methods can result ina prepackaged program or library subprogram yielding erroneous re- sults, or no results at all. In addition, the user searching for a library subprogram to perform a certain task may find an overwhelming variety and number of subprograms which appear generally applicable, but the descriptive material will seldom give any cation of the efficiency of the subprogram or its suitability for solving the specific problem at hand. ‘The user with any of these problems, but no knowledge of numerical methods, must then seek out someone with the necessary information (perhaps a numerical analyst), if indeed such a consultant is available. In this situation, however, it may be difficult for the user to ask the right questions and for the consultant to give useable answers, since the background of the two may be vastly different. ‘We can thus see that there is a strong justification for the scientist or engineer to acquire a working knowledge of numerical methods. This knowledge enables the compu- ter user to select, modify, and program a method suitable for any specific task, aids in the selection and use of prepackaged programs and library subprograms, and makes it possible for the user to communicate with a specialist in an efficient and intelligent way when seeking help for a particularly difficult problem. Finally, it should be recognized that the bulk of what has come to be known as “methods development” (which is to all intents and purposes the writing of large programs to simulate complex physical problems) is done by engineers and scientists, and not by numerical analysts. Obviously, however, the most efficient and accurate numerical techniques must be employed in such work, and 1 thorough knowledge of numerical methods is essential for the engineers and scientists involved in such a project. ‘We now turn briefly to a discussion of several computer-related topics which are not, ‘numerical methods in themselves, but which are of considerable interest to anyone who must actually implement numerical methods on a digital computer. 1.4 COMPUTER LANGUAGES Most of the readers of this book will have had some experience in programming in a “high level” computer language such as FORTRAN, ALGOL, or BASIC. These languages allow the user to write programs in a form which includes algebraic formulas and Engli like logical and input-output statements. Such high level languages are virtually indepen- dent of the machine on which the programs will be run. Through the use of a computer program called a compiler (or translator), the high level program can then be converted into the fundamental machine code of the particular machine on which the program will actually be executed. 4 NUMERICAL METHODS By far the most widely used algebraic language for scientific purposes is FORTRAN IV or minor modifications thereof. With a few exceptions, ALGOL is seldom use for scientific computation today, but is widely used as a universal international language for describing algorithms. 
BASIC is quite popular as a language for use on time-sharing sys- tems and is usually used for relatively simple programming tasks. Other high level lan- guages which the scientific user may encounter are APL (also reasonably widely used on time-sharing systems and suitable for tasks ranging from the very simple to the highly sophisticated), MAD (an obsolescent ALGOL-like language), and PL-1 (a powerful lan- guage currently of interest primarily to computer scientists). The appearance of each new language is greeted with some trepidation by the average user since it means a new set of rules which may have to be learned, and possible confusion with other languages. However, any reasonably flexible person will find little difficulty in adapting to a new language if necessary. A much more important issue is the economic one, since the development of large computer programs is very expensive, and the conversion of large programs from one language to another can be a major task involving many months of work. This is one of the primary reasons why FORTRAN IV. is the current scientific “standard” and is unlikely to be displaced in the near future. 1.8 THE VERIFICATION PROBLEM ‘One of the most vital and yet most dificult tasks which must be carried out in obtaining a numerical solution to any problem is to verify that the computer program and the final solution are correct. First it must be established that the program is working as the programmer intended, ie. thatthe coding is correct. This can usualy be established by generous printing of intermediate results, and if necessary by making spot checks by hand for desk calculator computations. The second part of the verification procedure is to establish that the algorithm being employed will yield the correct solution. Since the correct solution to the problem is presumably not known beforehand (or we would not bother to obtain a numerical solution), this portion of the verification procedure will ssually be indirect. Thi + approach could consist, for example, of looking at various limiting cases of the problem for which known solutions are available. These limiting cases might be simulated withthe program under consideration by setting certain terms to zero, letting certain constants or conditions become very large or very small, or by bypassing temporarily certain sections of the program and/or inserting other small t_m- porary sections. In many cases the verification procedure can actually be more expensive and time consuming than obtaining the final desired answer. However, the confidence which one cean place in the final results is directly related to the time and care which are invested in the verification process. In estimating the time and cost required to obtain a numerical solution, itis essential that allowance be made for the verification process. Up to this point we have been concerned with the verification of a program written by the average user to solve a specific problem. The process of verification for a general program or library subprogram, which would be employed by many users to solve a wi variety of problems, would be similar but necessarily even more extensive and pains- taking, and would include a series of “worst case” trials to test the ability of the program to cope with known difficult problems. CHAPTER 1 INTRODUCTORY TOPICS 5 1.6 DO COMPUTERS MAKE MISTAKES? f course, in one sense or another, computers can and do make mistakes. 
We should note, however, that the vast majority of errors encountered in the course of computation fare the user's own errors. It is sometimes very difficult to accept that a hard-to-locate error is one’s own. Nevertheless, the most efficient procedure in tracking down errors is invariably to assume that this is the case, until the possiblity has essentially been elimi nated. If computer errors are encountered, they can be characterized as either hardware or software errors, True hardware errors are relatively rare, and we are not in a position to discuss them here. Software errors (which are really just some other programmer's errors) are more common, and could typically include errors in the computer executive system, errors in the compiler which result in incorrect object (machine language) code, and errors in library subprograms. Errors in the computer executive system (also variously called the exec, monitor, operating system, supervisor, and other names) can be quite confusing to the user. Modern systems usually incorporate the capability of handling several programs at ‘once in order to most effectively utilize the hardware (this is often called multiprocessing) or of allowing a number of users to compute and “converse” with the system from re- mote terminals (often called time-sharing). Some systems even combine these ‘capabilities. Most of the difficulties which the user encounters from the executive sys- tem come from unforeseen interactions of one program with another. These can result in anything from total system failure (a “crash”) to erratic behavior and incorrect results from the individual user’s program. ‘These errors are seldom repeatable, and simply re- running the program will usually rectify matters. Errors in compilers are particularly frustrating to the user, since a high level program which is perfectly correct can produce incorrect machine code and hence incorrect results. Fortunately, due to extensive verification, serious compiler errors are usually not encountered (and a list of known minor errors is usually available from the computer systems personnel). However, with those compilers which incorporate optimization, serious and almost unpredictable errors can occur. Optimization can be interpreted in this context as an effort to generate the most efficient possible machine code from a given program written in the high level language. This optimization effort includes changing the order of operations from that specified in the high level code in order to (hopefully) ‘obtain the same answers in less computer time. Optimization can result in remarkable savings in computing time in many cases, but the better the optimizer (in the efficiency sense) the more likely itis to result in incorrect machine code. In most cases itis possible to “turn off” the optimization facility of the compiler or to find a similar compiler without optimization or with relatively simple and error-free optimization.* It is recommended that this course be followed if possible in debugging and for the initial runs of a program, and that a compiler with a very high degree of optimization be employed only for production runs where efficiency is all-important. The highly optimized version should ‘of course be verified by comparison with results obtained without optimization. “Tris sometimes remarkably difficult to obtain information about whether a given compiler isan optimizing com- piler and whether or not i is possible to “turnoff” the optimization. 
However, diligent search through the manufacturer's manuals will usually yield the information in some form.

Errors in library subprograms are generally the product of ineffective verification procedures, and usually cannot be dealt with by the user except by reporting the faulty routine to the responsible systems personnel.

Finally, we should note that, although the errors are not mistakes in the usual sense, any computation carried out with a finite number of decimal places will result in roundoff error, and any numerical method has error which is inherent in the application of the method. These errors are best discussed with the presentation of each method, and we will not consider them further at this point.

1.7 THE NEED TO GET INVOLVED

Numerical methods cannot simply be read about; they must be used in order to be understood. Accordingly, it is vital that the reader actually solve problems using the numerical methods described in this book. In closing this introductory chapter, the author would like to point out from personal experience that the best test of whether one understands a method is not to carry out a hand calculation (although this can be useful in the early stages of attempting to understand the logic) but to write a computer program, and thus to relinquish personal decision making to the impersonal computer. It is remarkable how hazy concepts can become clear under the resulting pressure to be completely precise and unambiguous.

Chapter 2 The Taylor Series

2.0 INTRODUCTION

The Taylor series is the foundation of numerical methods. Many of the numerical techniques are derived directly from the Taylor series, as are the estimates of the errors involved in employing these techniques. The reader should be acquainted with the Taylor series from his earlier studies, but we shall make a brief presentation of the subject here since our emphasis will be somewhat different from that of the conventional calculus.

If the value of a function f(x) can be expressed in a region of x close to x = a by the infinite power series

f(x) = f(a) + (x - a)f'(a) + \frac{(x - a)^2}{2!} f''(a) + \frac{(x - a)^3}{3!} f'''(a) + \cdots + \frac{(x - a)^n}{n!} f^{(n)}(a) + \cdots    (2.1)

then f(x) is said to be analytic in the region near x = a, and the series (2.1) is unique and is called the Taylor series expansion of f(x) in the neighborhood of x = a. It is difficult to specify general conditions under which the series (2.1) will exist and be convergent, but it is evident that all derivatives of f(x) at x = a must exist and be finite. If the Taylor series exists, then knowing f(a) and all of the derivatives of f at x = a, we can find the value of f(x) at some x different from a, as long as we remain "sufficiently close" to x = a. If, as |x - a| is increased, a point is reached where the power series (2.1) is no longer convergent, then we are no longer "sufficiently close" to x = a and are outside the radius of convergence of the power series. Some series will converge for all |x - a| (have an infinite radius of convergence), while others will converge only for values of |x - a| below a certain limit.

If the series is convergent, then the value of f(x) will be exact if an infinite number of terms are taken in the series. It is much more interesting and useful to us, however, to find out how well we can approximate f(x) near x = a by taking only a few terms of (2.1). This can be graphically illustrated by examining Fig. 2.1. Suppose we wish to find f(b).
From (2.1),

f(b) = f(a) + (b - a)f'(a) + \frac{(b - a)^2}{2!} f''(a) + \frac{(b - a)^3}{3!} f'''(a) + \cdots    (2.2)

If only the first term of (2.1) is used, then the function is assumed to be a constant, f(a), as shown in Fig. 2.1a. If the first two terms are used, then the slope of f at x = a is taken into account by using a straight line from f(a) with a slope f'(a), as shown in Fig. 2.1b. Considering three terms allows the use of the curvature due to f''(a), as shown in Fig. 2.1c, etc. Each additional term improves the accuracy of the approximation for f(b).

Fig. 2.1 Approximations resulting from truncating the Taylor series: (a) one term, f(b) ≈ f(a); (b) two terms, f(b) ≈ f(a) + (b - a)f'(a); (c) three terms.

It will be useful for us to adopt some standard terminology and conventions about the truncation of a Taylor series. We begin by examining the error made in truncating a Taylor series. The error in the Taylor series (2.1) for f(x), when the series is truncated after the term containing (x - a)^n, is not greater than

\frac{|f^{(n+1)}|_{max}}{(n + 1)!} |x - a|^{n+1}    (2.3)

where the subscript "max" denotes the maximum magnitude of the derivative on the interval from a to x.

It would seem that nothing can be gained from this error bound, since if the value of the (n + 1)th derivative of f on the entire interval must be known to evaluate the error, then the value of f(x) on this interval would also be known, and there would be no need to carry out the expansion in the first place. This frustrating state of affairs is fairly common in numerical analysis, however, and much useful information can be gained from (2.3). We have no control over the behavior of f (or any of its derivatives), nor over the constant (n + 1)!. We do have control over how close x is chosen to be to a, or in other words on the quantity (x - a)^{n+1}. Thus we use the terminology that the error (2.3) is of the order of (x - a)^{n+1}, or is O(x - a)^{n+1}. If the series expression for f(x) is truncated after the first three terms, we say that f(x) is accurate to O(x - a)^3, since

f(x) = f(a) + (x - a)f'(a) + \frac{(x - a)^2}{2!} f''(a) + O(x - a)^3    (2.4)

It should be noted that the use of the notation O(x - a)^3 implies nothing about the constants or derivatives multiplying (x - a)^3; for example, 7(x - a)^3 is of O(x - a)^3. Thus the quantity O(x - a)^3 might equally well be described as a quantity varying as (x - a)^3. If we take a fourth term in the series (2.1) for f(x) before truncating, we obtain

f(x) = f(a) + (x - a)f'(a) + \frac{(x - a)^2}{2!} f''(a) + \frac{(x - a)^3}{3!} f'''(a) + O(x - a)^4    (2.5)

In general, the four-term representation (2.5) gives a more accurate approximation to f(x) for a given value of x than does the three-term representation (2.4). Thus one would expect that for this function, the error term O(x - a)^4 would be less than the error term O(x - a)^3 in the three-term series (2.4). We can generalize this result to the statement that if we confine ourselves to a Taylor series for a given function, then the following relationship holds between the error terms of a series truncated at n terms and the same series truncated at n + 1 terms:

O(x - a)^{n+1} < O(x - a)^n    (2.6)

Note that this relationship will hold whether or not |x - a| < 1, as long as we are within the radius of convergence of the series. (Strictly speaking, (2.6) may not hold for the first few terms of some series, particularly if (x - a) is large in magnitude. However, the relationship will be true for large n, and for our purposes (2.6) should be considered as the general trend.) It should be noted that for certain series expansions some terms may vanish.
For example, a truncated series representation of f(x) composed of the first four terms of (2.1) ‘might have exactly the same error as a truncated series composed of the first five terms of (2.1). See Problem 2.2 for an illustration of such a case. 10 NUMERICAL METHODS, IMlustrative Problems 2.1 Find the Taylor series expansion for sinx near x =0 using the series (2.1). 0, a@=0. The series (2.1) becomes poo. Since we are expanding about sings n()~% cos) +5 sin 0) and since sin(0)=0 and cos (0) = 2.2 Truncate the Taylor series for the sine found in Problem 2.1 to give a representa- tion of O(x)'. Show that this representation is in fact of O(x)" From Problem 2.1, (0)+x cos )-¥ sin (0) sin(x)= eos) + 0) =2-H 4009 But if we carry one more term in the series, sin) = sin O42 co-Fia Lem O+ sin (Q)+ (8 ai wwe see that this additional term is exactly zero since sin(0)=0. Therefore sncsyax E0007 This two term representation is thus actually of 6(3)' rather than O(x)' 2.9 Using the Taylor series expansion for e* about x =0, find e** to 6(0.5)'. Bound the error using the error expression (2.3) and compare your result with the actual From equation (2.1), a1 095405 , 05! , 05) 140540705 05 or, 10 600.57, Sey Os): 1+05+0.9'= 1625 Now according to (2.3), the error in this quantity should not be greater than CHAPTER 2 THE TAYLOR SERIES " ae | os) oo ea (0.020833) ‘where max denotes the maximum magnitude on 0% x=0.5. leq. =e" = 16487213, $0 the error is no greater in magnitude than (1.6487213)(0.0208333) = 0.0343831 ‘The actual error is 1.625 = 1,6487213 ~ 16250000 = 0,0037213 ‘hich les within the error bound. Notice that in this case the error which was 6(0.5)° was ‘actually 0.0237213 or about 0.190.5). 24 Find the Taylor series expansion about x log, (x). What is the radius of convergence of this series? for fx ‘The function and its derivatives are foo log. (I=) -1 reo re=qap noyeaz2 £O=qayp ra “a= ‘Using the series (2.1) with a = 0, we obtain toe 1-2)=-( To find the radius of convergence of the series, we apply the ratio test. We take the limit ‘The ratio test states that the series converges absolutely (even if all terms have the same sign) if this ratio is less than one. Thus the radius of convergence of the series is |x| <1. The ratio test tells us nothing if] = 1, ut we note that ifx = +1, then we have the negative of in | 2] = i xe Tins sss Which is the familiar divergent harmonic series. If x =—1, we have the series 118+ which is convergent. The series is thus convergent for —1= x <1 2 NUMERICAL METHODS 2.5 Using the results of Problem 2.4, obtain the geometric series for 1/(—x), |x| <1 This series is of great value in much numerical and approximation work. From Problem 2.4, log. (i Differentiating this series with respect to x, we obtain 3 a 2.6 Find e*" sin (0.1) to 6(0.1)" by using the Taylor series expansion for each function and multiplying them. From Problems 2.1 and 2.3, we have oy sino. 01-9 soy, or = 1401+ 22°. oy We have taken sufficient terms in each series so that when the two series are multiplied together, the largest error term will be O(0.1). (This will come from the product ofthe first term inthe sine series and the error term 0(0.1)*inthe e" series. Allother error terms willbe smaller.) Then sino [1901402 ooon'fon-22s ees] 1+ (0.174 300)" + 000." 110333 + 000.1)" 2.7 Show that f(x) =e" cannot be expanded in a Taylor series about x =0. 
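Truncation arithmetic of the kind used in Problem 2.3 is easy to check by machine. The short Python sketch below is only an illustration (the book itself provides flow charts rather than programs, and the function names here are this sketch's own): it sums the first three terms of the series for e^x at x = 0.5, evaluates the bound (2.3), and compares with the true value.

```python
import math

def taylor_exp(x, n_terms):
    """Sum the first n_terms of the Taylor series for e**x about a = 0."""
    return sum(x**k / math.factorial(k) for k in range(n_terms))

x, n_terms = 0.5, 3                      # three terms, as in Problem 2.3
approx = taylor_exp(x, n_terms)          # 1 + 0.5 + 0.5**2/2! = 1.625
exact = math.exp(x)

# Error bound (2.3): |f^(n+1)|_max / (n+1)! * |x - a|**(n+1), with n = 2 here
# (the series is truncated after the (x - a)**2 term).  For e**x the (n+1)th
# derivative is again e**x, whose maximum magnitude on [0, 0.5] is e**0.5.
n = n_terms - 1
bound = math.exp(x) / math.factorial(n + 1) * abs(x) ** (n + 1)

print(f"3-term value : {approx:.7f}")            # 1.6250000
print(f"true value   : {exact:.7f}")             # 1.6487213
print(f"actual error : {exact - approx:.7f}")    # 0.0237213
print(f"bound (2.3)  : {bound:.7f}")             # about 0.0343
```

As the bound predicts, the actual error (about 0.024) lies comfortably inside the computed bound (about 0.034).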
We have fe £6) recente) ‘The function f(x) is bounded at x = 0, but all of the derivatives of f involve negative powers of x which result in those derivatives becoming unbounded at x=0. Thus f(x) ‘does not satisfy the conditions at x =0 for an expansion in a Taylor series about x = 0. CHAPTER 2. THE TAYLOR SERIES 13 2.8 Given the function f(x) x=0: *, consider the following Taylor series expansions about $6) =fO)+3FO+ 510+ 008)" $0) =fO+F'O+ OC? Let x= 2. Show that the error term (x)' < O(x)' despite the fact that (2)° > (2. ‘The two series are ete te z f= 14 x + OG? owxy If we let x=2, fa en l424t + 0ay = 142460) From tubles, e*=7.3891, so 73891 =5+60)) 73891 =3+-60y and thus the error terms are ccay = 2.3891 012) = 4.3891 [Note that for this problem we have been able to evaluste exactly the error terms since wwe knew e7, In general, this would not be possible 29 Using the Taylor series expansions about x =0, evaluate e™** to O(x)*. First, we use the Taylor series for the exponential: Now we employ the Taylor series expansion for sin x, taking enough terms each time to ensure that the result is accurate to at least O(3) beso? 6 +60)" ay] eget +0004) cat ~2 soy +! PF OW, ogy =tts-Ss 00 oo 6 slex+E+oay 4 NUMERICAL METHODS 2.10 Show that f(x) =(x 1)" cannot be expanded in a Taylor series about x =0 or x=, but can be expanded about x=2. Carry out the expansion about x fo)=( = feo=fo-7" 2 Ley ays Pay=-fo-0 =3@-1" rea=to-1 For an expansion about x =0, the quantities f(0), (0), 0), ete. are needed. These in- volve noninteger powers of (~1),e.g.(~1)"". These cannot be evaluated to give real values and thus this expansion is impossible. For an expansion about x = 1, we need quantities such as £0) =" Loy £0) =30y ray~-jor” While (1) is bounded, all of the derivatives /°(), "1, ete. are unbounded. ‘Thus the expansion about x = | is impossible. ‘The Taylor series expansion about x =? is fo) a-nrse-afbeor] & faa)=1+ ‘This series is convergent for [x ~2| <1 2.11 By a technique entirely different from the Taylor series expansion, we find the following series: oe eee tans ax 4242 TE Is this the Taylor series expansion about x = 0? Yes, since the Taylor series expansion about x =0 (ie. in powers of x) is unique. CHAPTER 2 THE TAYLOR SERIES 18 Problems 242 Find the Taylor series expansion for sinh x about x =0. 243 Find sinh0.9 to 0(0.9)' by using the Taylor series expansion from Problem 2.12. 2.14 Bound the error on your answer to Problem 2.13 by using the error expression (2.3) and ‘compare your result with the actual error. (Note: sinh0.9= 1,0265167, cosh0.9 = 14330864.) 248 Find the Taylor series expansion of sinx about x = m4. 2.46 Obtain the Taylor series expansion for 1/(1~ x) by using the series for 1/(1~ x) from Prob- lem 25. 27 Is it possible to expand log.x in a Taylor series about x = 0? Discuss. 2.8 Find an expression for sin x cos x accurate to 6(x)’ by using the Taylor series for the individual functions about x =0. 249 Evaluate cos (sin x) to O(x) by using the known Taylor series for sin x and cos x. 2.20 Examine the Taylor series about x =0 for e" when x=4. The fourth term in the series, x73} is larger than the third term, x'/21. Does this mean the series is divergent? Explain this apparent anomaly. 221° Consider two functions g(x) and h(x), related in such a way that g'(x)= h(x) and WG) = g(x) and that g(0) = O and 0) the Taylor series expansions for g(x) and h(x) using only this information. 2.22 Show that the Taylor series expansion of f(x)= x" about x = 1 simply reproduces 2 223. 
The error function erf(x) is defined as erf(x) = (2/\sqrt{\pi}) \int_0^x e^{-t^2} dt. Find erf(1) to O(1)^4 by expanding erf(x) in a Taylor series about x = 0. (The value of erf(1) = 0.84270079 to eight decimal places.)

Chapter 3 The Finite Difference Calculus

3.0 INTRODUCTION

In conventional calculus the operation of differentiation of a function is a well-defined formal procedure, with the operations highly dependent on the form of the function involved. Many different types of rules are needed for different functions. In numerical methods a digital computer is employed which can only perform the standard arithmetic operations of addition, subtraction, multiplication, and division, and certain logical operations.* Thus we need a technique for differentiating functions by employing only arithmetic operations. The finite difference calculus satisfies this need.

3.1 FORWARD AND BACKWARD DIFFERENCES

Consider a function f(x) which is analytic (can be expanded in a Taylor series) in the neighborhood of a point x, as shown in Fig. 3.1. We find f(x + h) by expanding f(x) in a Taylor series about x:

f(x + h) = f(x) + hf'(x) + \frac{h^2}{2!} f''(x) + \frac{h^3}{3!} f'''(x) + \cdots    (3.1)

Solving equation (3.1) for f'(x) yields

f'(x) = \frac{f(x + h) - f(x)}{h} - \frac{h}{2!} f''(x) - \frac{h^2}{3!} f'''(x) - \cdots    (3.2)

Fig. 3.1

*Certain sophisticated digital computer programs have been written which can perform formal, analytical differentiation of a rather wide variety of functions, but this topic is beyond the scope of this book.
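Equation (3.2) shows that the simple forward difference [f(x + h) - f(x)]/h approximates f'(x) with a leading error proportional to h. A minimal Python sketch (the test function, the point, and the step sizes are arbitrary illustrative choices, not taken from the text) displays this first-order behavior:

```python
import math

def forward_diff(f, x, h):
    """Forward-difference approximation to f'(x), as in equation (3.2)."""
    return (f(x + h) - f(x)) / h

def backward_diff(f, x, h):
    """Backward-difference approximation to f'(x)."""
    return (f(x) - f(x - h)) / h

f, x = math.sin, 1.0        # illustrative choice: f(x) = sin x, so f'(1) = cos 1
exact = math.cos(x)

for h in (0.1, 0.05, 0.025):
    err_fwd = forward_diff(f, x, h) - exact
    err_bwd = backward_diff(f, x, h) - exact
    # Halving h roughly halves each error, consistent with the leading O(h)
    # term (h/2!) f''(x) dropped when (3.2) is truncated after its first term.
    print(f"h = {h:<6} forward error = {err_fwd:+.6f}  backward error = {err_bwd:+.6f}")
```

Running the loop shows the errors shrinking by roughly a factor of two each time h is halved, which is exactly the O(h) behavior implied by the first neglected term of (3.2).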
If the function is tabulated on an interval a ≤ x ≤ b and a value is required at some x > b, then extrapolation is required. Even under the best of circumstances, extrapolation contains a strong element of uncertainty. Unlike interpolation, where the function is firmly anchored on both sides of the point where a value is to be obtained, in extrapolation the function is fixed on only one side and is relatively free to wander on the other side. If the function is known at discrete, evenly-spaced points, then the Gregory-Newton forward or backward polynomial interpolation formulas are commonly employed for extrapolation, with the last known point used as the base line. (The choice of a forward or backward formula will of course depend on whether x > b or x < a.)

[Corrected difference table for the example, with f(4) revised to 0.960.]

The remaining second difference (encircled) is given by the difference of the two adjacent first differences, 0.710 − 0.48… . Since this is essentially the same as the assumed second differences, we can be reasonably confident we have found the error, and that f(4) = 0.960 is more nearly "correct" than the original value of 0.893. The function f(x) now closely resembles a second degree polynomial, since Δ²f is virtually constant. It should be noted that such "correction" can be very dangerous unless one is very certain that the point is actually in error.

It was stated in Sec. 4.4 that in Lagrange interpolation, if the spacing between any two points was large compared with the spacing of all other points in the table, then the interpolating polynomial could "wander" in that widely spaced region, resulting in excessive error in approximating the actual function. Consider the function f(x) = sin x, 0 ≤ x ≤ …

Fig. 4.5

Problems

4.11 Given the following tabulated function:

x    | …
f(x) | …

This tabulated function is a polynomial. Find the degree of the polynomial and the coefficient of the highest power of x.

4.12 Prepare a forward difference table for the following function:

x    | 1 | 2 | 3 | 4 | 5
f(x) | 6 | 10 | … | … | 430

Now, assuming the function is a polynomial, fill out all blanks in the table and interpolate for f(4.31) using forward difference interpolation with x = 4 as a base line.

4.13
Write a computer program to perform Lagrange interpolation, The input should include an arbitrary number ( or less) of arbtrarly-spaced values of x, the number of such values, the corresponding values of f(x), and the value of x at which f(x) is desired, (If you are using FORTRAN, remember that zero subscripts are not allowed, and some suitable adjustments will have to be made to the formulation in this chapter.) Employ the program written for Problem 4.15 to find f(1.3) for the following function: x 0 1 fey | 4 | -250 | ser | 1087 | 2357 [ 2493 | —1295 | 2450 Using the program written for Problem 4.16, find f(6.) for the following function: x [o 12 [17 | 28 | 44 [ss | 20 | 80 fey | 1000 | 0671 | 0398 | -0.185 | -0342 | 0.092 | 0300 | 0.72 Resolve Problem 4.13 by using the Lagrange interpolation program written for Problem 4.16. Should the results agree? How closely? Discuss the relative advantages and disad- ‘vantages of using polynomial interpolation based on difference tables as compared with Lagrange interpolation, assuming equally-spaced dat Using the principles of Chebyshev interpolation, construct a sixth-degree polynomial ap- proximation to f(x)= e**" sin x on the interval 0x <4z. Plot the polynomial and the original function on the same graph and comment on the approximation CHAPTER 4 aan 422 428 Using a natural cubic spline, interpolate for /(3.4) given the follo INTERPOLATION AND EXTRAPOLATION 63 cequally-spaced func- ton oe |. r—sSSs Aesin sing natural ub spline, nd 0) ven the following unequal spaced abused cas >) D7 LD ] [| |. 2 Given te folowing function Lr——=s fx) 2.014 3.221 4.701 770 13.594 23.580 Find £60 Given the following tabulated anton a -. 7 | [2s | ve | ow Find (5.0) Chapter 5 Roots of Equations 5.0 INTRODUCTION Root solving typically consists of finding the values of x which satisfy relationships such Ax’+Bxt=Cr+D tan Kx ‘These are not truly equations in the sense that they are only satisfied for certain values of x. Depending on the problem, these values of x may be real or complex and may be either finite or infinite in. number. ‘The procedure for finding the roots will always be to collect all terms on one side of the equal sign; for example, Ax?+Bx’-Cr-D=0 tan Kx — For any values of x other than the roots, these equalities will not be satisfied, so that in general Ax? + Bx?— Cx —D = f(x) tan Kx —x = g(x) Finding the roots of these equations is now equivalent to finding the values of x for which (0) or g(x) is zero. For this reason the roats of equations are often called the zeros of the equations. We now examine methods of finding the roots of a general function f(x). Unless otherwise stated, we shall deal only with finding the real roots of equations with real coeflicients 64 CHAPTER 5 ROOTS OF EQUATIONS 65 5.1 BISECTION Bisection is a “brute force” technique for root solving which is too inefficient for hand computation but is ideally suited to machine computation. Consider first the simplest possible case: a function f(x) which is known to have one and only one real root in the interval a 0, then there are an even number of roots in the subinterval (or none). Bisection will always find a root if the subinterval ‘chosen for the next bisection is one in which f(s.)*f(%)<0, However, if there are several roots, several bisections may be necessary initially in order to find a subinterval with this behavior. No generalized algorithm will be presented for bisection on an interval with an arbitrary number of roots. 
I is seldom useful to find only one arbitrary member of a set of roots in an interval, and the complexity of a bisection algorithm to avoid finding the same root more than once would be very great. As we shall see in our discussions of all of the root-solving methods, there is no substitute for a prior rough knowledge of the behavior of the function and the approxi- ‘mate location of the roots. This makes it possible to use a small enough initial subinter- val for bisection to isolate any desired root. The approximate behavior of the function 66 NUMERICAL METHODS : Kee Ny (x1, fn) < ts ey <2. Pee <> oe ) I IN ae ue Oe ted can be determined from a graph plotted on a computer, either on a plotter or the printer, or from a computer tabulation of the function at reasonably fine intervals. Often even simple hand computations or plotting wll be suficient to avoid such frustrating ex- periences as finding an unwanted root or missing the desired root. An additional advan- tage of a rough plot of the function is that such a plot often makes it possible to identify the presence of troublesome tangent points, where the function touches the x axis, but does not cross it, resulting in a multiple root. Bisection will nether locate these tangent points nor indicate their presence. Bisection will, however, find any multiple root at Which the function crosses the axi 5.2 NEWTON'S METHOD (NEWTON-RAPHSON) Consider a point x. which is not a root of the function f(x), but is “reasonably close" toa root. We expand f(x) in a Taylor series about 6: Fla) = fo) += x9 (r+ ESE™ Foo on If f(x) is set equal to zero, then x must be a root and the right-hand side of (5.1) constitutes an equation for the root x. Unfortunately, the equation is a polynomial of degree infinity. However, an approximate value of the root x can be obtained by setting f(x) to zero and taking only the first wo terms of the right-hand side of (5.1) to yield = flo) +(x — xf") (52) Solving for x gives Load, Fad (53) CHAPTER 5 ROOTS OF EQUATIONS a x= 3 = L290 xomen d= fey (a) Now x represents an improved estimate of the root, and can replace xo in (5.3) to yield fan even better estimate of the root on the next iteration. ‘The general expression for ‘Newton's method can thus be written as ov £0") Fo™) where the superscript n denotes values obtained on the nth iteration and n + 1 indicates values to be found on the (n + Ith iteration. This iterative procedure will converge to a +001 for most functions, andif it does converge, it will usualy do so extremely rapidly. A flow chart of the algorithm is shown in Fig. 53. 6s) INPUT 4 4 ~-f2 — jice —| ROOT eco) = rene Fig. 6.3 Newton's method. ‘The algorithm is terminated when the magnitude of the computed change in the value of the root, 6, is less than some predetermined quantity «. This does not guarantee an accuracy of € in the root. Although more sophisticated convergence analyses are pos- sible, a useful and conservative rule of thumb is to choose € as one-tenth of the permis sible error in the root. An additional point should be made concerning Fig. 5.3, and in fact all flow charts given in this chapter. No error exits have been provided in case the method diverges or does not find a root in a reasonable number of iterations. A computer program written from this flow chart should include such exits as the programmer feels necessary, but it should be noted that these exits require logic which will increase the running time of the program. 
If enough is known about the character of the function, such exits may not be necessary. Despite its rapid convergence, Newton's method has some difficulties with certain types of functions. These difficulties can best be examined, and the most intelligent use 68 NUMERICAL METHODS made of this powerful method. by considering the graphical interpretation of the algorithm. Figure 5.4 shows the first iteration for a typical function, ‘The next guess for the root, x", is the intersection with the x axis of a straight line tangent to the function at Xo. The value x" is much closer to the root than the original guess xo, and itis clear that succeeding iterations will converge rapidly to the root. Fig. 5.4 Consider next the simple oscillatory function shown in Fig. 5.5. The first guess, ts, is reasonably close to the root A. However, the tangent line strikes the axis at x°°, which is closer to the root B. The next iteration yields x®, and it becomes clear that the procedure will converge to the root B. lustrates one of the possible difficulties of ‘Newton's method: an initial guess which is close to one root may result in convergence to a different more distant root. There is no simple method for avoiding this type of behavior with certain functions. However, the rough plots or tabulations of the function discussed earlier will usually be sufficient to permit first guesses from which the method will eventually yield the desired roots. In any case, these plots will ensure that the programmer is aware of the presence of any roots which the method may have missed. Newton's method also has a tendency to home-in on a local minimum or maximum in a function (not a root) and then as the zero slope region is approached to be thrown far from any region of interest. The algorithm can also occasionally oscillate back and forth between two regions containing roots for a fairly large number of iterations before finding either root. These difficulties can be readily avoided with some prior knowledge of the behavior of the function. It should be noted that some difficulty will be encountered in attempting to use ‘Newton's method to find multiple roots. For smooth functions, these multiple roots cor- respond to points where the function becomes tangent to the x axis and then may or may not cross the axis. This behavior means that as f(x) approaches zero, so does $x). While Newton's method can be shown to be formally convergent for such roots, the rate of convergence is slow. and in practice can make the computation of multiple roots difficult and expensive. A modified Newton's method, which is very well suited to multiple roots, will be discussed in the next section CHAPTER 5 ROOTS OF EQUATIONS 69 fe) Fig. 85 In the illustrative problems at the end of this chapter, specific examples are pre- sented which demonstrate some of the possible difficulties in the use of Newton's method for general root solving, 5.3 MODIFIED NEWTON'S METHOD The difficulty of Newton's method in dealing with multiple roots leads us to consider a ‘modification of the method discussed by Ralston[3]. As before, we wish to find the roots of a function f(x). Define a new function u(x), given by (66) ‘The function w(x) has the same roots as does f(x), since w(x) becomes zero everywhere that f(x) is zero. Suppose now that f(x) has a multiple root at x= of multiplicity x (This could ‘occur, for example, if f(x) contained a factor (x ~c).) Then u(x) may be readily shown to have a root at x=. of multiplicity r, or a simple root. 
Since Newton's method is effective for simple roots, we can apply Newton's method to u(x) instead of #02). Applying equation (5.5) gives nt) xO) = gins uc") OO 2 8 wey (7) Equation (5.6) gives u(x), and this can be differentiated to yield (x) = LOW = fOOF) Woy SESS wy a1 L200 8) For 7” NUMERICAL METHODS: The algorithm may be written in flow chart form as shown by Fig. 5.6. This al- gorithm is somewhat more expensive than the conventional Newton's method in the sense that it requires the computation of f"(x), but the algorithm retains the same convergence rate as the conventional Newton's method regardless of the multiplicity of the root. INPUT fre cor vaet Fig. 5.6 Modified Newton's method. The advantage of this method over the conventional Newton's method in finding multiple roots is illustrated in Problem 5.6, See also Problem 5.7. 8.4 THE SECANT METHOD ‘The secant method is essentially a modification of the conventional Newton's method. the derivative replaced by a difference expression. This is advantageous if the function is difficult to differentiate, and is also convenient to program in the sense that it is only necessary to supply a function subprogram to the method rather than subprograms for both the function and its derivative. Replacing the derivative in (5.5) by a simple differ- ence representation yields CHAPTER 5 ROOTS OF EQUATIONS m ine fe) 2 8 0 a agg 9) To use this method, f(x"*"") must be saved. This is the value of f from two iterations previous tothe present one. Since no such vale wil be availabe forthe fs iteration, {vo diferent intial guesses for the Toot, x4 and Xi, must be supplied iniily 0 the algorithm. This algorithm is shown in Fig. 57. INPUT been e “Cw tote Fig. 5.7 The secant method For most functions, the secant method will not be as rapidly convergent as the conventional Newton's method, but its advantages may outweigh this somewhat de- creased convergence rate. If f"(x) is very time consuming to evaluate, then the secant method may actually require less computer time than Newton's method 5.5 ROOT SOLVING AS INVERSE INTERPOLATION Suppose that in the neighborhood of a root of f(x) we tabulate f(x) at intervals of x (not necessarily evenly spaced). Interpolation, as we have seen in Chapter 4, consists of finding a value for f(x) at some predetermined x between the tabulated points. Root rR NUMERICAL METHODS: solving, on the other hand, consists of finding the x at which f(x) takes on a predetermined value Zero). Root solving thus may be thought of as inverse interpolation. However, in order to actually use interpolation methods, itis necessary that f(x) be an invertible func- tion of x. This means that in the region of interest, to every value of f(x) there must correspond one and only one x. (This must be true for all points inthe region, not just the ‘tabulated ones.) The function f(x) may be shown to be invertible if, in the region, f(x) is continuous and differentiable and f"(x) does not pass through zero. Under these condi- tions we may write x as x(f) Consider as an example the function f(x) = tan x ~2x tabulated at intervals of 0.05 near a root (Table 5.1). This function satisfies the conditions of invertibility on the interval shown. The table may thus be considered as an unevenly spaced tabulation of, X(P) vs. f- In order to find the root, we must find (0). Table 5.1 x fa) tos | —0.3s66s46 tio | 0.235240 11s | 000655030 120 | +0.721s13 It may be shown (see Ref. 
3 for a complete discussion) that many of the iterative root-solving techniques (including Newton's method) may be interpreted as inverse interpolation. However, we shall consider here only the simplest inverse interpolation technique, polynomial interpolation on x(f). This method will find an approximate value for the root, with the accuracy of the approximation depending on the spacing of the tabulated points and the behavior of f(x) in the neighborhood of the root. Returning to the Table 5.1, we employ Lagrange polynomial interpolation to find 20). It is convenient to rewrite Table 5.1 in the form of Table 5.2 Table 5.2 i f x0 0 | 035668 | 10s 1 | ~023524 | 10 2 | -o06sso | is 3 | sors | 120 Now, (0) = ps0) = 1.05P40) + 1.10P (0) + 1.15P0) + 1.20P,(0) where a (0 +0.23524)(0 +0.06550)(0 ~ 0.17215) ye (— 0.35668 + 0.23524)(— 0.35668 + 0.06550)(— 0.35668 — 0.17215) CHAPTER § ROOTS OF EQUATIONS 8 and similarly PO) = —0.47895 P40) = 1.22976 P40) = 0.10734 which yields (0) ~ ps0) = 1.16514 ‘The exact root is 1.16556, so this answer is in error by about 4.2 x 10™ 5.6 A BRIEF NOTE ON SPECIAL METHODS FOR FINDING ROOTS OF POLYNOMIALS All of the methods which have been discussed in this chapter will find most of the real roots of polynomials with real coefficients. (As has been mentioned, many of the ‘methods do have difficulty in finding multiple roots corresponding to points where the function is tangent to the x axis.) However, there exist techniques specifically suited to finding all of the roots, single or multiple, real or complex, of polynomials with real coefficients. We shall not give the computational details for any of these methods, since the algorithms are quite complicated, particularly if provision is made to find all multiple and complex roots. Ralston[3] gives complete descriptions of many of these methods with discussions of convergence rates and applicability to digital computation. These tech- niques include Graeffe’s root-squaring method, the Lehmer-Schur method, and various methods based on synthetic division. Illustrative Problems 8.1 The function f(x) =x°-0.9x ~8.5 has one real root in the interval 2< x =3 How many bisections would be required to locate this root to an accuracy of <= 10"? Since the root is orginally known to be in an interval 3-2 = 1 unit wide, after one bisection the root will be isolated to an interval 1/2 unit wide. After two bisections the interval wil be 1/2 unit wide, and aterm bisectons the interval willbe 12" unit wide. If the algorithm in Fig. 5.2 is used, then the root is assumed to be a the center of the last {interval found, and the error inthe root will be no more than one-half of that interval. Thus the error eriteron wil be satisfied if HB)-ph ~1 as x. Itis also apparent that there are no negative roots. We will employ Newton's ‘method (Fig. 5.3) tofind the root. 
It would appear that x = 1 would be a good frst guess, but to show the strong convergence rate of Newton’s method, we shall pick a poorer guess, ‘This gives 10) 1.472366, f(G)=~0.354275 which yields an estimate for x of KEE +B = 3.0~4.155999 =~ 1.155999 ‘The next iteration yields L 1.155999) ___ 3.213548 1.155999) ~~ 7.388477 = 1.155999 + 1.348437 48437 189438 ‘The next four iterations give 714063, 8 = 052461 1782542, 8 = 0.06850 =0.783595, 5 =0.00105 783596, 8 = 0.890 10 ‘The last value of x is the root accurate to six decimal places, CHAPTER 5 ROOTS OF EQUATIONS 6 ‘The rapid convergence rate with such a poor guess might tend to inspire such con- fidence in the method that one could question the need for the rough plot of f(x). This false confidence can be dispelled by guessing x=8. This gives f8)=-1812011, (8) = 0.067668 and ‘The next iteration yields x= 869.1519 along with a computer underflow in the exponential routine. What has happened is that the guess was beyond the local minimum around x = 6, and the method is now proceeding vainly toward x = +e attempting to find 2 root as the function asymptotically approaches f@)=-1. 5.3 Find V7 by using root-solvi ‘This problem may be restated as finding the roots of x*~7'=0. The roots occur as positive and negative pairs. We shall seek only the positive root. Newton’s method will be used: 1g methods. fo f(a)=2x VB, %4=3 should be a reasonable guess. The first iteration is toda fod" 2 xy+8 = 303333333 ‘The second iteration is (2.666667) 7 __ 2 cease —1 - -0.0208333 {6666667 ~ 0.0208333 = 2.6458334 Since 0.3333333 x (666661 8 xexts ‘The thind iteration is 8 = ~0.0000820, 26457514 All eight digits of this value of x are correct! We have carried seven decimal places in this problem to illustrate the power of Newton's method in taking square root, In fact. New- ton’s method is used for virtually all square root routines for digital computers as weil as on those desk calculators which include a square root capability 5.4 Find the smallest positive root of the function fe) Asin x|—4 ‘A rough sketch of this function is shown in Fig. 5.9. ‘The function closely approaches the x axis near x =2.4, buta closely spaced tabulation of the function inthis region indicates no root. The smallest positive root is thus between x=3 and x=4. This function has a NUMERICAL METHODS fe Fig. 5.9 discontinuous derivative at x=3 (actually at x = =) and at various larger values of x as Well. Root-solving methods based on the use ofthe derivative or difference approximations ‘must be used with care for this function, not only because of the discontinuities but also because of the local maximum near x = 2.4. Bisection would thus seem tobe the safest and simplest approach, since all of these factors can simply be ignored. With initia values of x, = 3:2 and x» = 3.6, the bisection algorithm (Fis. 5.2) produces a root of x = 3.478508 accurate to € = 10° in 19 bisection. For purposes of illustration, we will also apply the secant method (Fig. 5.7) to this problem, To begin this algorithm, we need two closely spaced initial puesses forthe root: Wwe choose xj=36 and x» =37. Now, 10.6) = 1.735085, 8.) 253452 ‘The difference estimate of the derivative is £8.9-{G.D _ ~ 1.51897 =O. T 5.18397 Then {G9 __ 1.735055 __ 9 san69 =~ 5.1397 15.18897 and x = x45 = 3.6~0.114269 = 3.485730. 
For the next iteration, the derivative is esti ‘mated by 1.485730) ~ {8.6 _-1.635724 SARSTIO=3.6 ~~ 0.114279 — 431455 and #1348530) __ 0.099331 ass ~~ 1431455 X45 = 3.485730 0.006939 = 3.478790 CHAPTER 5 ROOTS OF EQUATIONS n 85 Two more iterations give x =3.478508 with 8 = 0.347 10° ‘identical with that obtained from bisection Suppose now that we had not bothered to find out the behavior of the function, but simply picked two values of x, say x= 2.8 and xw=29, to use in applying the secant method. These values seem reasonably close to the root, but are actually far enough away to result in disaster, since they are to the left side of the discontinuity in the derivative at x=, The firs iteration produces x =2.576, Many more iterations result in values of scattered rather randomly from x= 1.007 to x =2.643 and it becomes obvious that the ‘method is attempting to find the nonexistent root atthe local maximum near x= 24. This, process will never converge to anything unless the method accidently hits @ point on the function with a sufficiently small slope to throw the next guess far from the local maximum and into a region where a root exists ‘This value of the root is Find the smallest positive root (other than zero) of, foe 1 cos x cosh x A rough sketch of this function inthe range of the first positive root is shown in Fig, ‘We choose the secant method and pick two first guesses of x= 4.4 and xa = 4.5. Six iterations yield a value for the root of =473004, 6 0.442 10 This root is correct to the five decimal places shown and was easily obtained. Much more interesting is the failure of the secant method which occurs if we are Slightly less accurate with our frst two guesses. Suppose we guess x=3.8 and m= 39. Four iterations with the secant method produce 0.407 10° 361997, 8 (Our first impression would be that we have found a negative root (tis not unusual for a root-solving algorithm to produce a root rather distant from the initial guess) and that we hhave located this root extremely accurately since 8 is so small. In fact, this value of x is not @ root of f(x) at all! We will follow the steps of the secant method algorithm to find out 78 NUMERICAL METHODS exactly what happened. We first note from Fig. 5.10 that the guesses x)= 3.8 and x» =3. are slightly to the left ofthe local minimum at x~4, The difference approximation to the derivative is £G.9-109) =O Or 2.5131 Since the points are so close to the local minimum, this is a relatively small slope, and corresponds to the dotted line shown in Fig. 5.10. ‘The value of 6 is eo) Z 151 ets eae and +B = 38-743558 = ~3.63598 Graphically, this isthe point st which the dotted line intersects the x axis. The function (2) is an even function, so f(—x)=flx). We would thus expect f(~3.63598) to be reasonably close in value to (3.8), and we find (3.63598) = ~17.70966 ‘The next difference approximation to the derivative becomes £{-3.63598)~ f(.8) _ ~17.70966 ~~ 18.68744) _ = 3.635983. = 83598 Ose ‘This isa very small slope, and corresponds to the virtually horizontal line joining (3.8) and (3.63598). 
The intersection of this line with the x axis will obviously be very far out the ‘To find the location, we compute and thus X-+8 = ~3.63598 — 134.68077 = —138.31674 Due to the character of cosh x this could result in a very large value of f(x), and in fact, we find J(-13831674) 5655 * 10°C) ‘The next value of 8 is - f(- 13831674) 0.5655 10" [ESI 9D] === 7 a) 13831674 =~ 3.63598) = 13468077 ‘The quantity 0.5655 10 is so overpowering that this yields simply 34.6807 +8 =~ 13831674 + 134.6807 63598 which puts us back to where the previous iteration started. The next 6 is (= 3.63598) (-17:70366) (S83. em- [PE tsar) 3.63598 —(- 13831674) 13468077 0.4074 10 CHAPTER 5 ROOTS OF EQUATIONS ~ 56 "The method thus appears to have converged even though it obviously has not. What has happened is that by beginning near 2 local minimum, we have been thrown far from the region of our initial guess. Subsequent difference approximations to the derivatives, span such large ranges of x that they do not in any way approximate local derivatives. Eventually, we reach such a large value of f(x) relative to the value from the preceding iteration that the difference approximation to the slope is essentially infinite, and the method appears to have converged regardless of the value of x which is finally reached. ‘Note that all of these meaningless computations could be avoided simply by initially sketching the function, and recognizing the problems which can be caused by a local minimum below the axis. Find the positive real roots of the function (2) = x" ~8.6x° ~35.51x" + 464.4x — 998.46 AA sketch of this function is shown in Fig. 5.11. This sketch is based on a tabulation of 4(2) on the interval 0.x = 10 using a very coarse interval of 1 unit. Fig. 6.11 1. appears that there may be a multiple root (and a tangent point) near x =4 and a simple root between x=7 and x=8. The root(s) near x =4 could also be two closely spaced real roots if the function crosses the axis, or perhaps no real root(s) at all if the unetion does not touch the axis (the tabulation is too coarse to tell for sure). ‘We begin by finding the simple root between x= 7 and x =8. Newton's method (in Fig. 5.3) should be suitable for this root. We choose x, = 7.0 as the initial guess and ask that the final magnitude of 8 be less than 10. We need the frst derivative £3 ix? 25.8x°~T1.02x +4644 NUMERICAL METHODS Now applying the algorithm of Newton's method, (7.0) = ~36.4800, 36.4500 75.0600 Ke x48 7040485612 = 7.485612 0.485612 ‘The second iteration is $(7A8S612)= 20.6451, f°(7-485612) = 164.891 0.125205 +8 = 7485612 0.125205 = 7.36041 ‘The next three iterations yield x=734857, 3 = 0.118107 x=734847, 6 =—0.102« 10" xe73MB7, 8 0.75810" The root has now been located accurately. Note that the value of 8 decreases very rapidly. For most simple roots, the value of 8 on any iteration is of the order of [37 from the previous iteration. This holds for the present problem. Newton's method is thus said to have a convergence rate which is quadratic for most functions. ‘We now turn to the possible multiple root. A finer tabulation of f(x) near x = reveals that f(x) becomes very small but never changes sign. Since the function is simple to differentiate, we choose the modified Newton's method discussed in Sec. 5.3 in anticipation ‘of « multiple root. We must supply subprograms to compute f(x) and the derivatives PU) 4P-258e°- 71.028 +4644, GR) = 120? 
51.64 — 71.02 We choose x, 4.0 as the intial guess, and ask that |6| on the final iteration be less in ‘magnitude than € = 10° fAO)=-342, — f40)=2352,— fAO=—B542 Now £4) = 3.482 4) = £9) 0 38 = 0.145408 y= 1 LOLA | CSA 85.40 6 Gay Gasz ~ 0471906 and WC) __ =0.145408 _ 9 396139 0.471906, +8 =4.0+0.308129= 4.308129 ‘Three more iterations yield 4300001, 8 =—0.812* 107 8 =~0.807% 107 8 = 0.660% 107 CHAPTER 5 ROOTS OF EQUATIONS 8 87 ‘Thus we have found the multiple root very accurately in four iterations. Note that the convergence rate of the modified Newton's method for this multiple root is quadratic. For comparison, we can also find the multiple root by using the conventional Newton's ‘method, with the same initial guess of x =4.0 and the same € of 10% J40=-342, [A0)=2.52 £40) __-340 0) 7350 ~ 045408 $8 =4.0+0,145408 145408 ‘The next four iterations produce 22138, 8 = 0.075974 26033, 8 = 0.038952 28007, 8 =0.019740 20001, 8 =0.009939 The method is obviously converging very slowly, with each 5 only about one-half of the ‘magnitude of the preceding 6 ‘This convergence rate is termed linear. In all, 19 iterations are necessary to obtain x= 4.300000, § = 061210" ‘The advantages of the modified Newton's method for multiple roots is obvious. ‘One very significant practical computational point should be mentioned. For the frst attempt at finding this root, the function subprograms for f. "and f" used the standard in- teger exponentiation capability of the FORTRAN IV compiler to compute x‘and x. Ttwas ‘not possible to obtain convergence to the desired accuracy even with the modified Newton's ‘method, and in fact the estimates of the root varied from x = 4.2777 to x = 4.5233 with [| never smaller than 0.001. It was then determined that if x* was computed as (x)(x)(3X2), convergence could be readily obtained. ‘The small amount of error involved in the com- piler’s use of the logarithm routine for exponentiation apparently resulted in the function never touching the axis at all. (Most compilers use products of x with itself for low integer Powers of x, but shift to the log routine for high powers. The compiler used here shifted at x4) The solution for multiple roots can obviously be very sensitive to small errors in the ‘computation of the function, Find all roots of f(x) = 3° = 12.423" + 50.444 — 66,552 in the interval 4= x <6, ‘A rough sketch of the function is shown in Fig. 5.12. If there are any roots of f(x) in this interval, they are near x = 4.8 and consist of either ‘4 multiple rotor two closely-spaced simple roots. A finely-spaced tabulation reveals that the function does go slightly negative, so there are apparently two closely-spaced simple roots. We apply the conventional Newton's method with a first guess of )=45 and «= 10% ‘The method requires eight iterations to reach convergence, yielding a NUMERICAL METHODS Fig. 5.12 11263 04966 02313 8 0.01032 5 = 0.00364 611 107 184% 10" 0.168% 10° [Note that the convergence rate is virtually linear, rather than quadratic, until the root is approached very closely. This is because the function approaches the axis with a very small and slowly varying slope, clearly very similar to the behavior near a multiple root. In forder to find the other root, we use Newton's method with an initial guess of x = 48 and €= 10%, Seven iterations produce 0.250% 10" 4.72000, with essentially the same convergence rate as before. 
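For readers who wish to reproduce comparisons of this kind, the modified iteration of Sec. 5.3 can be sketched in a few lines. The version below applies the correction δ = −u/u′ with u = f/f′ and u′ given by (5.8); the tolerance test, the iteration limit, and the illustrative function with a double root are choices made for this sketch only.

```python
def modified_newton(f, fp, fpp, x0, eps, max_iter=50):
    """Newton's method applied to u(x) = f(x)/f'(x), Eqs. (5.6)-(5.8).

    Since u'(x) = [f'(x)**2 - f(x) f''(x)] / f'(x)**2, the correction is
    delta = -u/u' = -f(x) f'(x) / (f'(x)**2 - f(x) f''(x))."""
    x = x0
    for _ in range(max_iter):
        fx, fpx, fppx = f(x), fp(x), fpp(x)
        delta = -fx * fpx / (fpx**2 - fx * fppx)
        x = x + delta
        if abs(delta) < eps:
            return x
    raise RuntimeError("modified Newton's method failed to converge")

# Illustration with a contrived double root at x = 2 (not from the text):
# f(x) = (x - 2)**2 (x + 1) = x**3 - 3x**2 + 4
root = modified_newton(lambda x: x**3 - 3.0*x**2 + 4.0,
                       lambda x: 3.0*x**2 - 6.0*x,
                       lambda x: 6.0*x - 6.0,
                       x0=3.0, eps=1.0e-6)
```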
‘Since the convergence rate of the conventional Newton’s method corresponds more closely for this problem tothe rate which usually results from multiple roots, it is reasonable to.consider the suitability of the modified Newton's method (Fig. 5.3) inthis case. Using an initial euess of xs 4.5 with the modified Newton's method yields 6258, 19258 70134, 0.00876 x=4,70010, 0.00123 0.100% 10" 0.07% 10 CHAPTER 5 ROOTS OF EQUATIONS 83 This method needed five iterations to reach a [6] of 10%, and while the convergence rate clearly greater than that of the conventional Newton's method, note that the second iteration overshot the root. This overshoot is due to the presence of the second derivative in the method. Incontrast, while the conventional method required three more iterations, the Foot was approached monotonically from below. If we now attempt to find the larger root using the modified Newton's method with x)=48 and e = 10° we obtain on the first iter r=471014, 8 = ~0.08986 ‘The fist estimate has overshot the root to essentially the midpoint of the interval between the two roots. The next four iterations yield 71024, 8 =0.106x 10" x=471045, 8 =0212%10" x= 471088, 8 =0.472x10" xe47II71, 6 =0.834%107 [Note thatthe magnitude of 8 is actually increasing on each iteration, as the method attempts to correct the overshoot. It takes three more iterations until |8| begins to decrease and a total of eleven iterations to converge to 3 392% 10% ‘Since the conventional method took only seven iterations to converge to the same root with ‘the same intial guess, the conventional method is clearly superior for this root. Asa matter of interest, ifa frst guess of x» = 4.9 used for the modified Newton's method, the method ‘overshoots so far that it converges to the smaller root (x = 4.70000) in only five iterations! ‘The conclusion must be thatthe conventional Newton's method is superior for simple roots, evenif they are very closely spaced, if only because of its geometrically easily predict- able behavior, and less costly computation per iteration. ‘The modified method may con- ‘verge more rapidly than the conventional method in some cases and more slowly in others, but the presence of the second derivative makes it dificult to predict when it may be best, or even to which root it may converge. Problems "5.8 Write a computer program to use bisection to find a root of a general function f(x) on the interval a 3 lel (647) for each value of i (each row). However, in practice, convergence can be obtained with much weaker diagonal dominance than this. In many cases convergence can even be obtained if a few of the equations have diagonal elements smaller than some other ele~ meats in those equations. Examples of Gauss-Siedel iteration are shown in several prob- lems at the end of the chapter. We will defer the presentation of the flow chart until after the discussion of relaxation. Relaxation originally evolved as a very sophisticated hand computation technique for solving large sets of simultaneous linear equations iteratively. The overall approach is not well suited to digital computer use because of the extensive logic required. How- ‘ever, some of the original concepts are embodied in the simple but powerful computer oriented method which we will now discuss briefly. ‘The method basically consists of calculating the value of each unknown by Gauss- Siedel iteration and then modifying the value before itis stored. ‘The fundamental opera- tional equation for this so-called “point” relaxation is af FAG a?) 
(648) 104 NUMERICAL METHODS As before, we will consider iteration (1 + 1) as the current iteration and iteration (1) as the preceding iteration. The quantity x!'"!" is the value of the unknown obtained on the current iteration by using Gauss-Siedel iteration. The quantity A is a pure number in the range 02 the process diverges.) We might note that the term "Gauss-Siedel value” is a slight misnomer here (unless A= 1), since the values from the current iteration which are utilized in its computation are not Gauss-Siedel values themselves, but have been modified by the relaxation formula (6.48) before they were stored. ‘Although definite exceptions can be shown, overrelaxation is usually employed to accelerate an already convergent iterative process, while underrelaxation is usually em- ployed to make a nonconvergent iterative process converge. The same relaxation factor usually applies for all of the equations in a set, although it may occasionally be worthwhile to use different factors for blocks of equations within a set which are drastically different in character. ‘The choice of an optimum value of A is a rather complex task which is beyond the scope of this text. See Ref. 6 for a discussion of this topic. In most circumstances itis practical to choose the value of A by trial and error. The use of relaxation factors is particularly important and useful in solving iteratively the very large sets of equations which result from the numerical solution of partial differential equations. Several exam- ple problems at the end of this chapter illustrate the choice and use of relaxation factors. A flow chart for Gauss-Siedel iteration including a relaxation capability will now be presented (Fig. 6.4). To obtain Gauss-Siedel iteration with this algorithm, set A= 1, An absolute convergence criterion has been used in Fig. 6.4. A relative criterion can be substituted if desired. In Sec. 6.3 it was stated that iterative techniques are used to solve the very largest sets of equations, perhaps sets as large as on the order of one hundred thousand equations. Very large sets of equations invariably have sparse coefficient matrices, and it is essential from a storage space standpoint that any solution technique to be used for such sets take full advantage of this sparseness, and only require the storage of those elements which are nonzero. Iterative techniques can easily satisfy this requirement in all cases, while direct techniques only satisfy the requirement if the coefficient matrix is banded, or in some cases if special programming techniques are used. However, the primary reason that such large sets can be handled with iterative techniques but not with direct techniques is that the roundoff characteristics of iterative techniques are much better. With direct, techniques, roundoff error can be incurred with each mathematical operation, and simply accumulates until the final answers are obtained. When iterative techniques are used, the presence of roundoff error in the unknowns at the end of any given iteration simply results in those unknowns being somewhat poorer estimates for the next iteration, For practical purposes, the roundoff error in the final converged values is only that accumulated in the final iteration. (isthe inital guess for the solution vector) TEMP — TEMP © TEMP —TEMPIe z TEMP ~x|>« Y moms en SACEMP— =) fete » -) (Solution vector is Kyo) Fig. 
6.4 Gauss-Siedel iteration 108 NUMERICAL METHODS, In closing this section on iterative techniques, as well as the chapter on linear equa- tions, we should note that Gauss-Siede! iteration (including relaxation) is the most widely- used technique for solving nonlinear sets of simultaneous equations. Problem 6.10 is an illustration of a possible approach to such problems. Illustrative Problems 41 Show the detailed solution by Gauss-Jordan elimination of the following set: [+ ENG All operations will be rounded to four decimal places for compactness, When we refer to an operation on a row of the matrix, this operation will be performed not only on the coefficient matrix but also on the element of the right-hand side vector inthe corresponding, row location. (Some texts employ the concept of an “augmented” matrix, which is a onsquare matrix of n rows and m+! columns, with the right-hand side vector as the (x+2th column. All row operations then automatically include the righthand side vector.) We begin with the frst row as the pivot row and 3 asthe pivot element. The frst row is now divided by the pivot element: 103333 0.3339] 0.6667 1 4 1 2 2 1 2 10 ‘The first row is next multiplied by 1 (the second element in the fist column) and subtracted from the second row: 1 03333 -0.333397 0.666; 0 3.6667 1.3333 || 11.3333, 204 2 10 ‘The first row is now multiplied by 2 (the third element in the first column) and subtracted from the third row to give 1 03333-03339 0.6667 0 3.6667 1.3333 |] 11.3333 0 03333 2.66671 8.6667. ‘The first column has now been cleared. Recall that our goal is to transform the coefficient matrix into the identity matrix, at which point the right-hand side vector will become the solution vector. The second row now becomes the pivot row, and 3.6667 the pivot clement, Dividing the second row by the pivot element yields 1 03333-03333 70.6667 0 103636 || 3.0909 0 03333 2.6667] L8.6667, CHAPTER 6 SIMULTANEOUS LINEAR EQUATIONS AND MATRIX INVERSION 107 ‘The top element in the second column can be cleared to zero by multiplying the second (pivot) row by 0.3333 and subtracting the second row from the first 10 ~04545 77 -03636 © 103636 |) 3.0900 0 03333 2.6667!L 8.6667. “The bottom element in the second column can be cleared next by multiplying the second row by 0.3333 and subtracting the second row from the third 10 ~0454577-03635 0 10.3686 |] 3.0909 0 0 2sassJL 7.6368, ‘The bottom row is now the pivot row and 2.5455 the pivot element. Dividing the bottom row by this element gives Pi 0 ~0.4885)7-0.3635 ie 1 03636|} 3.0909 oo 1 3,000. “Multiplying the bottom row by ~0.4S4S and subtracting it from the fist row gives 100-77 1.00007 © 1 03636] 3.0909 0 0 1 JSL3.0000. Finally, the second element in the third column can be cleared by multiplying the third row by 0.3636 and subtracting it from the second: Yo cpr olf 2 0 Lior ‘The solution vector is thus 10000 x [2am i Te eat stn witout round ears 1 “ (i 3 6.2 Given the follor 11348 3.8326 1.1681 3.40179] 9.5342 0.5301 1.7875 2.5330 1.5435 || x. |_| 6.3941 3.4129 4.9317 8.7643 1.3142 || x [| 18.4231 12371 49998 so6721 o.0147JLx.} L16.9237 Solve this set by using Gauss-Jordan elimination with and without maximization of pivot elements and compare the result. 
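The comparison asked for here requires a Gauss-Jordan routine of some sort. The sketch below, in Python, shows one possible arrangement; it maximizes pivot elements by interchanging rows (partial pivoting) rather than by the column shifting used in some of the worked solutions, and it makes no claim to being the most efficient organization.

```python
def gauss_jordan(C, R):
    """Solve C x = R by Gauss-Jordan elimination with row pivoting.

    C is an n-by-n list of lists and R a list of length n; both are
    copied into an augmented matrix so the caller's data are untouched."""
    n = len(R)
    a = [list(C[i]) + [R[i]] for i in range(n)]
    for k in range(n):
        # bring the largest available pivot into row k (row interchange)
        p = max(range(k, n), key=lambda i: abs(a[i][k]))
        a[k], a[p] = a[p], a[k]
        piv = a[k][k]
        a[k] = [v / piv for v in a[k]]            # normalize the pivot row
        for i in range(n):                        # clear column k elsewhere
            if i != k and a[i][k] != 0.0:
                m = a[i][k]
                a[i] = [a[i][j] - m * a[k][j] for j in range(n + 1)]
    return [a[i][n] for i in range(n)]

# e.g. gauss_jordan([[4.0, 3.0], [6.0, 3.0]], [10.0, 12.0]) returns [1.0, 2.0]
```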
ig set of equations: 108 63 NUMERICAL METHODS ‘The exact answers to this set are ‘The answers obtained by using Gauss-Jordan elimination in single precision on the IBM 360/67 are: Without Maximization With Maximization of Pivot Elements of Pivot Elements 21 = 09991369 x1 = 1.000006) ‘000077 = 1,000008 2 = 1.000001 = 1.000000 x. 1.000076 x. 1.000001 ‘The gain in accuracy when the pivot elements are maximized (by column shifting) should be apparent. This small set of equations has been deliberately formulated to produce rather poor results unless maximization of pivot elements is used. It should be clear that maximization of pivot elements may he necessary for small sets of equations as well as large sets, and that it is virtually impossible to identify the need for maximization by simply examining the coefficient matrix. The best practice is simply to employ maximization of pivot elements routinely for all equation solving unless there is some reason for not doing so (as is the case with sets having banded coefficient matrices). Develop an error correction technique for simultaneous linear equations and demonstrate its use. ‘The matrix representation of a set of simultaneous equations is CX=R If we calculate the vector X by any method, it will naturally contain some error, even if only ‘due toroundoff. Call this calculated vector X'. Now multiply this vector by C to yield cx'=R’ where R'is a newly-calculated vector, somewhat different from R. If we now subtract this equation from the original equation, we obtain CX X)=R-R If (X—X') is denoted as E, then CE=R-R’ Any standard technique can now be used to solve this equation for E. ‘Then XeX'4E If E could be determined exactly, then the error would be completely corrected and we would now have the exact value of X. In fact, F will be in error for the same reasons, Goundoff, etc.) that X’ was, so that we have only accomplished a partial correction. If necessary, the process can be repeated as many times as necessary to obtain an accurate solution vector. CHAPTER 6 SIMULTANEOUS LINEAR EQUATIONS AND MATRIX INVERSION 109 ‘To illustrate the method, we reconsider Problem 6.2. The calculated solution vector 1X’ obtained for that problem when maximization of pivot elements was not used was 0,5991369 1.000077 1.900001 1.000063, as compared to the exact answer ofall 1's. We multiply this vector by the original coefi- ‘cient matrix to form CX" 11348 3.8326 1.1651 3.4017')-0.9991369" 0.5301 1.7875 2.5330 1.5438! 1.000077 34129 49317 8.7643 1.3142] 1.000001 12371 49998 10.6721 0.0147]| 1.000063, “The result of this multiplication is the modifed right-hand side vector R’ 9.537769) __| 6.3939008 18.4206830, 16.9230290, x's Now 9.5342000-9.8337769 7] [0.004231 6.3941000~6.3939005 | _| 0.000199 18.4231000— 18.4206430 | ~ | 0.0024570 169237000 ~ 169220290 0.0006710, Using this vector as the right-hand side, we can solve the set RoR CE=R-R for E by Gauss-lordan elimination to yield 6.000863 =| ~S.oous772 0.000009 —0.0000763, ‘The improved estimate of X is given by 0.9991369-+0.0008632) [1.000001 ea go) 19000772 —0 000072 (ae 1.000010 00000008 || 10000001 10000753 0.0000763} [0000000 J Te errors in the elements of this corrected unknown vector are no more than 10. The ‘vast improvement is apparent, and no further correction is necessary. ‘In order (o obtain the best results from this correction technique, the calculation of R” and RR’ should be done using double precision arithmetic. The remaining calculations ean be done in single precision. 
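The correction step just described is easily automated. The following sketch assumes that some direct solver is available (for instance, a Gauss-Jordan routine); it performs a single correction pass by forming the residual R − R′ and solving C E = R − R′. As noted above, the residual should ideally be accumulated in double precision, a refinement omitted here for brevity.

```python
def refine(C, R, x, solve):
    """One pass of the error-correction scheme above.

    C and R define the original set, x is the computed solution X',
    and solve(C, rhs) is any direct solver (e.g. a Gauss-Jordan routine).
    Returns X' + E, where C E = R - C X'."""
    n = len(R)
    residual = [R[i] - sum(C[i][j] * x[j] for j in range(n)) for i in range(n)]
    e = solve(C, residual)
    return [x[i] + e[i] for i in range(n)]
```

Repeated calls give repeated corrections, and when several corrections (or several right-hand sides) are anticipated, computing the inverse once is more economical, as the following remark explains.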
If itis known beforehand that error correction will be required, then the most efficient ‘method is to calculate C™ rather than to solve the set, since once Cis known, it can be used to calculate both X’ and E. 110 64 NUMERICAL METHODS Solve the following set of equations: 403 -1Yx [6 7-2 3\)xf=|9 s - Bilal Ls, For purposes of illustration, we begin by using a Gauss-lordan program waitten from the algorithm in Fig. 62. This program does not employ maximization of pivot clements. The resulting solution vector is, 1.586206 X=] -0.4482759 000000 In order to check the accuracy of these answers, we substitute them into the first equation: 41.586206) + 3(~0.4482759) ~ 1(— 1.000000) = 5.999963 which is very close to the correct value of 6. Substitution into the other two equations shows that the solution vector also satisfies them to a high degree of accuracy. From all indications, we should feel secure that we have obtained an accurate solution. However, consider the following vector: 0.6206896° X=| 2.1724137 3.000000 ‘This is obviously very different from the original solution vector. If we substitute this ‘vector into the first equation, we find (0.6206896) + 3(2.1728137) ~1(3.000000) = 5.999995 ‘which is again very close tothe correct value of 6. Substitution in the other two equations Yields similar accuracy. Something is obviously very wrong when two entirely different sets of unknowns satisfy (Virtually exactly) the same set of equations. We might speculate that the set is very il-conditioned, and perhaps in a sense that is true; however, the problem. would seem to be more basic. In an effort to trace the diffculty, we employ the subroutine siven in the Appendix which maximizes pivot elements, and also supplies the magnitude of the determinant of the coefficient matrix. This subroutine yields ase x-[ cana | sare and det C|= 2.7657 10°. This solution vector does not even come close to satisfying any of the three equations! The most informative piece of information, however, is the mag- nitude of the determinant. Recall that the determinant is defined as the sum of various products of elements in the matrix. No divisions are involved. For the present problem the coefficient matrix is composed entirely of integer elements, so the determinant must be an integer, or, practically speaking, as near to an integer as can be calculated using floating. point arithmetic and allowing for @ reasonable amount of roundoff error. ‘The computed ‘value for |det C| of 2.7657 10" can thus only mean that the determinant, if computed with- out roundoff error, is exactly zero. The set is thus singular, and no unique solution exists, which explains our earlier problem with multiple solutions. The present problem was contrived by forming the thd equation asa linear combina tion of the first two equations. In practice, itis quite common in certain physical problems CHAPTER 6 SIMULTANEOUS LINEAR EQUATIONS AND MATRIX INVERSION m1 to.accidently form a singular set by applying a physical principle which is not independent of the other physical principles used to construct the set. This singular character ofthe set can sometimes be difficult to detect, particularly if the results of the solution happen to be physically reasonable. Maximization of pivot elements can often help, since the resulting solution vector" will usually not satisfy the equations. Evaluation of the determinant of the coefficient matrix can also be helpful, as we have shown in the present problem. 
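A check of the kind used in this problem, examining the magnitude of the determinant produced during elimination, can be scripted directly. In the sketch below the determinant is accumulated as the product of the pivots, with a sign change for each row interchange; what counts as "suspiciously small" is left to the user, since it depends on the scaling of the coefficients.

```python
def determinant(C):
    """Determinant of C as the product of the pivots in Gauss elimination
    with row interchanges (each interchange reverses the sign)."""
    a = [list(row) for row in C]
    n = len(a)
    det = 1.0
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(a[i][k]))
        if a[p][k] == 0.0:
            return 0.0                      # exactly singular
        if p != k:
            a[k], a[p] = a[p], a[k]
            det = -det
        det *= a[k][k]
        for i in range(k + 1, n):
            m = a[i][k] / a[k][k]
            for j in range(k, n):
                a[i][j] -= m * a[k][j]
    return det
```

For a matrix of integer elements such as the one in the preceding problem, a computed determinant of order 10⁻⁵ rather than a whole number is precisely the signal, discussed above, that the exact determinant is zero and the set is singular.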
6.5 A classical example of an ill-conditioned matrix is the so-called Hilbert ‘matrix. This is a symmetric matrix of the form 1 12 us ie wn 1B ua WS + in) 1s a us We ++ Aa +2) Yn An +1) Wn #2) Min +3) 12n Formulate and solve several problems involving the inverse of this matrix and the solution of sets having it as the coefficient matrix, in order to examine the signs and effects of ill-conditioning. Consider several values of n. Consider the following set of three equations: ban msc) pas fiz im tls] [ie Btw ells! Lia ‘The coefficient matrix of this set is the Hillert matrix H with m=3, The right-hand side vector has been chosen so that the exact solution vector is “This permits easy observation of the effects of ill-conditioning on the solution vector as well as on the coefficient matrix. We begin by inverting the coefficient matrix, using the Gauss- Jordan subroutine given in the Appendix. The computed inverse is aoret —36005s 0% p=[ sees! pre sme | Soowee —teoanse oan cal haf te mum ens in cach ow faa ae of rr 1 ich i of Manta cs) ten hs fan conned ats wil ase ish siete Gen mgt. The icone nhs ese ens fore Sedu man Anthems aft condone to eve roma dete Meet emaric anf compare hen hn toon war avenncaion 2) sags Mesut lw bn nvr a spas hoot sd we tau laet #11 VEE 001319 112 NUMERICAL METHODS: which is another definite indication of the presence of il-conditioning. In order to examine the effects of ill-conditioning, we will use the same subroutine to reinvert the matrix and thus find (H™'y'. If the effects of il-conditioning are significant, then this reiaverted matrix will show significant variations from the original matrix H. The reinverted matrix is, 109999908 0,4990933 0,3333282 Cary" =| 04999988 0.3333294 0.2499971, 03333294 0,2499973 0.199907, which shows some error, but S decimal places of the original matrix have been retained, The solution vector which results from multiplying H" by the right-hand side is 1.000031 X=! 0.959980 1.000046 ‘The elements of this vector are in error in the fifth decimal place as compared to the correct solution vector of all I's. Thus for this case we have dtected the presence of ill- conditioning, but have not found its effets to be very serious. Consider next a similar set of 5 equations: 1 42 1B We USTTs] 13760 1B 14 ys 6) x: 87/60 WB v4 ys 6 U7 |! x, |=) 459/400 wa us 16 17 U8] xe} | 743/40 us us 7 us u9tis,} Lisrsaszo, For reasons of space we will not reproduce the entie inverse; however, the largest element inthe inverse is 177807.6, or six orders of magnitude larger than the elements in the original matrix. The normalized determinant is BELL - 6.2638 10" “To assess the effects, we examine VaR ‘These are indications of sever illconitionin. 10099 0.034 03361 05037 03360 02521 03363 02501 0.2016 02525 02018 0.1680 2031 01682 0.1440 0.25% 0.2018 0.2017 0.1681 0.1680 0.1440 0.1440 0.1259 0.1260 0.1119. ‘The errors have clearly become significant for this case, since the third decimal place in almost all of the elements of this reinverted matrix are incorrect. which should be all 1's, is found to be 0.9992676 1.015625 ‘The solution vector, CHAPTER 6 SIMULTANEOUS LINEAR EQUATIONS AND MATRIX INVERSION 113 ‘The effects of illconditioning ate apparent, Clearly, sets any larger than this will cause considerable trouble. I is interesting to examine the solution vector (which should be all Vs) for n=6: 099462897) 1.285156 04375000, 8.000000 6.000000, L 3.187500 ‘This result is complete nonsense and the solution has been clearly overwhelmed by roundof! error. 
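The reinversion experiment described in this problem is easy to repeat. The short script below builds the n × n Hilbert matrix and compares H with the reinverted matrix, using NumPy's inversion routine purely for convenience; since modern machines normally work in double precision, the breakdown will generally appear at somewhat larger n than in the single-precision results quoted above.

```python
import numpy as np

def hilbert(n):
    """The n-by-n Hilbert matrix: element (i, j) is 1/(i + j + 1) for i, j = 0..n-1."""
    return np.array([[1.0 / (i + j + 1) for j in range(n)] for i in range(n)])

for n in (3, 5, 6):
    H = hilbert(n)
    H_back = np.linalg.inv(np.linalg.inv(H))      # the reinverted matrix
    print(n, np.abs(H - H_back).max())            # worst element-by-element error
```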
As might be expected, (H'’)" for n = 6 looks quite different from H. If te solution vector is obtained directly by Gauss-Jordan elimination without finding Has an inter- ‘mediate step, then the number of arithmetic operations is smaller and the solution vector does not become complete nonsense until n = 8. ‘These results were obtained on the IBM 360/67 in single precision arithmetic (which uses essentially a seven digit word). Larger Hilbert matrices can be inverted with meaning- ful results if double precision arithmetic is employed, or if other machines with larger word sizes are used. x 6.6 Carry out the first three iterations of the Gauss-S of equations: jedel method for the following set 8x) +2x:43x5=30 Ay -9H 42 = 2x; 43x, + 6x5 =31 ‘The set is strongly diagonally dominant, so no rearrangement is needed and Gauss- Siedel iteration should converge. Solving each equation for the unknown which has the largest coefficient (in magnitude) gives =3x, xin2e 3-26-38 6 We use an initial guess of 1 x=|1 1s “The first iteration is 30. “we 1250 3.1250-241) =9 = 31=26.1250)— 10.4583) 6 4583 3.8959 Note that the most current estimate for x, of 3.1250 was used in solving for x; and that the most current estimates of x, and x; were used to solve for x. The second iteration is 114 67 NUMERICAL METHODS so H=HOASED= 30.099) 9145 2.1745 ~28 °959) _ 5 5565 ae aS a “The third iteration gives 222209955090 39 =20M 205079 31=20.0220 30.98) _ 57) ‘The iteration is clearly converging. After three iterations, the estimate ofthe solution vee~ tor is 2.0220 x =| 0.9999 3.9977, Solve the following set by Gauss-Siedel iteration: 30-5 47 ITH] P18 n 16 17 10} x, |_] 26 560022 =18]] x [] 34 i oa iba i At first glance the set does not seem to be suitable for an iterative solution, since the ‘main diagonal elements are not the largest elements in each row. However, by simply reor- dering the equations this can be partially remedied oo Ad re 7-2 7] xn] ) 2 3 -s 47 lls |") 18 nw 17 tolled Los. ‘The main diagonal elements are now the largest elements in magnitude in each row except for the last row. The diagonal dominance in the first three equations is suficiently strong hat the small diagonal element inthe fourth equation may not cause divergence. We try an itil guess of CHAPTER 6 SIMULTANEOUS LINEAR EQUATIONS AND MATRIX INVERSION 118 with an absolute convergence criterion of € = 0.0001. The first iteration gives 0339286 1.230789 0.066725 0.144090, After 10 iterations, xm 0.930569 1901519 1359500 1.729954. ‘The process satisfies the convergence criterion after 35 iterations and gives 1.076888 1.990028 x") param = 1.906078. en ‘The Gauss-Siedel procedure clearly converges with no problems for this set with Cw. 10. However, itis interesting to note that if Cu is 9 or smaller then the procedure is divergent. Clearly the presence of any small main diagonal elements can pose a significant threat to the convergence of Gauss-Siedel iteration, However, if iterative techniques are indicated for other reasons, they are defintely worth trying even in the presence of a few small main diagonal elements. In some cases, underrelaxation of the offending equation(s) ‘can turn a divergent procedure into a convergent one. Given the following set of equations: $0404 4IPNT PIT 36 5 Sijm{_[20 6 6 7 6|lx| | 25 77:7 8dlad La. 
Using an initial guess of zero for all unknowns, and an absolute convergence criter- ion of € = 0.0001, carry out iterative solutions with various relaxation factors and compare the number of iterations necessary to attain convergence. ‘The exact solution to this set is v ' mali 1 For all iterative solutions considered here, the answers obtained were accurate to at least 3 decimal places. The relaxation factors employed and the corresponding number of itera- tions necessary to attain convergence are tabulated below: 116 NUMERICAL METHODS A | Herations to Convergence os 4 10 6 2 33 us 84 17 M7 It is apparent that Gauss-Siedel iteration (A = 1) is best for this set. In fact, if an iterative scheme is to be employed for almost any problem, Gauss-Siedel iteration should be tried first. The exception is for problems such as Problem 6.9, where previous experience makes it quite clear that overrelaxation will accelerate convergence in almost all cases. Searching for a nearly optimum A is clearly advantageous only if virtually the same set (pechaps with a slightly different coefficient matrix or right-hand side) is to be solved many times. 6.9 Given the following set of 10 equations: a x] [-0s 1-201 x] |-1s ce % 15 1-2 Ls 1 lies 1-2 | % 1s 1 -2JL xl L+os, ‘This set (although quite small) is typical of the sets of equations which arise from the ‘numerical solution of partial differential equations. The matrix is tridiagonal, and in this cease is clearly small enough that direct methods should be used However, similar (but iuch larger) sets are often solved by iterative methods, so we will explore the use of these techniques as well, "The most effective direct method for such a set is the Gauss elimination algorithm for tridiagonal sets (Fig. 6.3), which yields the following results: y= 6.4091 x: 123183, x= 16.7274 xe 19.6365 xs= 21.0856 Explore various solution methods. We now turn tothe iterative methods. Gauss-Siedel iteration with an initial guess of 0 for all x and with an absolute convergence criterion of € =0.001 yields x)= 6.4054 0.9839 = 123113 193543, x)= 16.7181 xy = 16.2653 xe 19.6258, x = 11.6167 x= 210044 5.5883 CHAPTER 6 SIMULTANEOUS LINEAR EQUATIONS AND MATRIX INVERSION 17 in 80 iterations. These values are reasonably accurate (to 2 decimal places in most cases) land the accuracy could be improved as much as desired by using a smaller convergence criterion (at the expense of additional iterations). Sets of this type are often well suited to overrelaxation, Using a relaxation factor of ‘A = 1.7 and the same initial guess and convergence criterion as before yields essentially the ‘same answers as does Gauss-Siedel iteration, but requires only 27 iterations. Some addi- tional searching might yield a more nearly optimal value of A, which would mean that even fewer iterations would be required. However, the advantage of overrelaxation as com~ pared with Gauss-Siede iteration for this set should be apparent from this example. The Feason that matrices of this type are encountered so often and their characteristics so well known will become more obvious when the numerical solutions of ordinary and partial differential equations are discussed in Chapters 9 and 11 Given the following nonlinear set of algebraic equations: axty? 1 xtay +2218 ty tdre IS Solve this set by using Gauss-Siedel iteration. Nonlinear algebraic sets can have multiple solutions and this coud be true of the present problem. 
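Comparisons such as those made in the two preceding problems, and the nonlinear iteration considered here, are easy to carry out once a relaxed sweep is available. The sketch below implements the point-relaxation update (6.48) for a linear set stored as a full array; λ = 1 recovers ordinary Gauss-Seidel iteration, the convergence test on the largest change corresponds to the absolute criterion of Fig. 6.4, and the iteration limit is an arbitrary safeguard added for this sketch.

```python
def gauss_seidel(C, R, x0, eps, lam=1.0, max_iter=500):
    """Gauss-Seidel iteration with point relaxation, Eq. (6.48).

    Each unknown is computed from its own equation using the newest
    available values and is then relaxed:
        x[i] <- lam * x_gs + (1 - lam) * x[i]
    lam = 1 gives plain Gauss-Seidel; 1 < lam < 2 over-relaxes and
    0 < lam < 1 under-relaxes.  Returns the solution estimate and the
    number of iterations used."""
    n = len(R)
    x = list(x0)
    for it in range(1, max_iter + 1):
        largest_change = 0.0
        for i in range(n):
            s = sum(C[i][j] * x[j] for j in range(n) if j != i)
            x_gs = (R[i] - s) / C[i][i]
            x_new = lam * x_gs + (1.0 - lam) * x[i]
            largest_change = max(largest_change, abs(x_new - x[i]))
            x[i] = x_new
        if largest_change < eps:
            break
    return x, it
```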
However, the fortunate state of affairs with most sets of nonlinear equa: tions is that the solution which is most easily obtained is the solution which is “wanted.” For example, ifthe set of equations comes from a physical problem, then the solution which is easiest to obtain is usually the only physically realizable solution. The remaining solutions are often complex (have imaginary components) or are physically impossible or unlikely. ‘A solution tothe present problem can be obtained by applying standard Gauss-Siedel iteration to the set which is already arranged in diagonally dominant form. Thus Meyiezy 18oxoe 4 ya We use an initial guess of x= y=z=1 and an absolute convergence criterion of € 0.0001. The first iteration yields say 4 oy Maay—t x « 2s 18=225-ay 182 2.25- 0 «56675 3.6875 2.25) a 1.3625 After 67 iterations, the convergence criterion is satisfied, and we find = 1.000112, y= 1.999962, = 2.999053, In fact, for this contrived problem, an exact solution is 3 yD, 1 must be emphasized that we had no guarantee that this iteration would converge despite the diagonal dominance of the set, since a general theory for the iterative solution of nonlinear equations is not availabe. 18 NUMERICAL METHODS Its interesting to examine the effects of relaxation on this iterative process. Using the same convergence criterion and intial guess as before, we find the following for various relaxation factors: A] Rerations to Convergence 06 4 08 38 19 6 1.2 | 300 (not yet converged) We find that underrelaxation accelerates convergence, with the optimum relaxation factor apparently near A=0.8. Even slight overrelaxation (A = 1.2) slows the convergence enormously. No general conclusion should be drawn from these results (in fact, general conclusions can hardly ever be drawn from nonlinear problems), but the reversal of overall, behavior in this case, as compared to most linear problems, is interesting and illustrates the degree of art rather than science which often must be exercised for the most effective solu- tion of nonlinear problems. Problems *6.11 Write a computer program to solve a set of simultaneous linear equations by Gauss-Jordan climination. Assume that no maximization of pivot elements is required. The program should be capable of solving sets of equations of arbitrary size, but no larger than 20 20. "642 Write a computer program to solve a tridiagonal set of simultaneous equations by Gauss elimination. The three diagonals of the coefficient matrix should each be stored in one dimensional arrays to minimize storage and to eliminate unnecessary operations on zero elements. The program should be capable of solving sets of equations to arbitrary size, up 10 n= 100. ‘Write a computer program to solve a set of simultaneous linear equations by Gauss-Siede! iteration or by point relaxation. The program should be capable of solving sets of arbitrary size, but no larger than 2020. Input should include the initial guess forthe unknowns, the convergence criterion (which may be absolute or relative, as you prefer), and the relaxation factor. 
Solve the following sets by using Gauss-Jordan elimination: 2 Bg » EEG CHAPTER 6 SIMULTANEOUS LINEAR EQUATIONS AND MATRIX INVERSION 19 6.15 Solve the following sets by using the Gauss-Jordan program written for Problem 6.11: 5 -30 gypsy par 11-9 1-9 2fx] | a7 2-107 5 -1 6 uiix| | 2 @ Pon 32 7 at m2} x 8 43 1-7 2 1 tial | oe 209 -8 Wont 4 -t]ix} | -10 Bt alles Po-r020 8-7 ~89PHq par 3-9 1-1 8 tfxl{ s -1 1 9 -9 2 ailal | 2 & ee ela a 71-2 4 1-1] | os ee Tel Lo Repeat Problem 6.15 by using the Gauss-Jordan subroutine in the Appendix to solve the sets. ms by using the Gauss elimination program Solve the following tridiagonal set of 10 equs written for Problem 6.12: at a] f=27 tt a | |-1s 1-44 x | |-15 a -4 a ffay | as 1 -athee} L-ts, 6:18 Construct a natural cubic spline which fits all of the following points, and interpolate for $0.3): TPP Phys 7) [om [oom [oss [ oat | om [ons [ors [ er [oom [om (Note: This problem is not out of place. The matrix of coeficients of the set of equations Which arises inthe course of solving this problem is tridiagonal, and the program written for Problem 6.12 should be used.) "6.19 Solve the following set by using the Gauss-Jordan subroutine in the Appendix. Examine the solution and the magnitude of the determinant, and comment. 83-9 7 4K] fio Ba ow Ae 43.-7 1 6f[s]-fi0 2-1 6 4 2i)x] [2 iat 9 rolls) [oe 120 NUMERICAL METHODS 6.20 Solve the following sets by Gauss-Siedel iteration: tos mile) " 1-02 2 314 a ® LP : “1207 ” mol 4 3 2 8 aren 2-1 7 2 1 1 waif. | | ee ee ee ee | “) 24 ren on a 4 atin [oa roa 4 tas 1 2 afer -2 bor toa on tall |e 304 5 1 2 8 walls] fon sort nat gat dled Lon dt ato storey pe r-r4o3 7 1 sifal [os stoner rn -t alls} | *(d) 112 1 -8 4-3 T |) xe]=] 91 4 ose to sia | a —.—rt—sSEe tor 2a a alle "6.21 Solve the tridiagonal set given in Problem 6.17 by using Gauss-Siede! iteration. "6.22 Repeat Problem 6.21 by using relaxation with relaxation factors of 13, 1.6, and 1.8. Compare the number of iterations required to the number needed for Gauss-Siede! iteration. Which is best for this problem? 6.23 Carry out the error correction technique described in Problem 6.3 o improve the answers obtained in Problem 6.5 for n = 3. Use the computed inverse of HT given in Problem 6.5 to obtain the error vector E. *6.24 Solve the following nonlinear set by Gauss-Siedel iteration’ Swit ytre87 wx 29-2273 woxtay t2= 1729 wax ty'etz 347 Chapter 7 Least-Squares Curve Fitting and Functional Approximation 7.0 INTRODUCTION In this chapter we will consider briefly the approximation of functions. The subject is, much too long and complex to cover in detail here. Entire books devoted to the subject, include Rice(7] and Meinardus{8]. However, we will attempt to provide enough infor- mation to enable the reader to construct some simple approximating functions, and to recognize and use the most common and effective approximating functions for digital computers. Methods will be examined for the approximation of both continuous func- tions and functions available only at discrete points. In the case of functions available only at discrete points, we will consider approxi- ‘mation by simple continuous functions, such as polynomials. Actually, we have already introduced one variety of such approximations in Chapter 4. 
In that chapter, we con- structed polynomial approximations to functions available at discrete points, and termed these polynomial approximations “interpolating polynomials.” In the present chapter we will show how simple approximations can be constructed which can be used to smooth noisy experimental or numerical data, and to provide a simple analytical expression in- stead of a collection of scattered points. In approximating continuous functions, the objective is usually to provide a “sim- pler” form than the original function. The approximation should be simpler than the ‘original function either in the sense that it is easier to handle analytically, or (more to the point of this text) that itis easier andjor faster to evaluate on a digital computer. As was, the case with functions available at discrete points, Chapter 4 also provided an introdue- tion to the approximation of continuous functions. Polynomial approximations to con- tinuous functions were constructed by using the concept of Chebyshev interpolation, in which the continuous function to be approximated was sampled at specific points (the zeros of the Chebyshev polynomials) and a polynomial was generated by using Lagrange interpolation. ‘This approximating (or interpolating) polynomial tended to have ‘minimum-maximum error in approximating the original function. ‘We reopen the discussion of functional approximation by again considering a function available only at discrete points. 12 122 NUMERICAL METHODS: 7.1 LEAST-SQUARES FITTING OF DISCRETE POINTS In constructing the interpolating polynomials of Chapter 4, the primary purpose was to provide information between tabulated points, and, as accurately as possible, to force the interpolating polynomial to assume exactly the value of the tabulated function at each of the points where the function was supplied. Consider, however, the nature of much ex- perimental data. Typically, such data include noise due to many different effects. (Hopefully, if the experiment is well designed, the data do not include systematic error which would tend to shift all of the data in one direction.)* The noisy data from an experiment might appear as shown in Fig. 7.1. (We assume the x values are accurate.) Fo Fig. 7.1 Using our knowledge of interpolation, it would be possible to construct (perhaps using Lagrange interpolation) a polynomial which fits the data exactly at each of the points, However, such a polynomial would not only include all of the noise in the data, it would necessarily be of a very high degree, and would oscillate wildly, perhaps straying far from the immediate region which contains the data. Such a high degree polynomial would also be very unwieldy to use as a continuous function which is representative of the data, ‘A better functional approximation to the data would be one which is simple in form (perhaps a polynomial of relatively low degree) and which tends to “smooth” the data (reduce the noise). If the noise in the data is assumed to be essentially random in character, then a reasonable smoothing functional approximation to the data in Fig. 7.1 might be the straight line which we have drawn “by eye” through the points. We must. of course, have a more precise and automatic way of constructing such approximating func- tions than by eye. 
If the function to be approximated is f(x), and the approximating function is denoted as g(x), then a measure of the accuracy of the approximation is the magnitude of the local distance between the two functions, given by

$$d(x) = |f(x) - g(x)| \tag{7.1}$$

The approximating function g(x) should now be chosen such that, in some sense, d(x) is minimized over the entire region of x where the approximation is to apply.

The sense in which d(x) is minimized is clearly a vital factor in determining the character of the approximation. We have previously encountered minimization of d(x) in the Chebyshev sense: the minimization of the maximum value of d(x) over the interval. This criterion is usually not an effective one to use in selecting a continuous functional approximation to noisy data, simply because it permits individual points which may be badly in error to exert an overpowering influence on the approximating function. A single point, such as the point labeled "A" in Fig. 7.1, can force the approximating function to shift drastically toward it in order to minimize the maximum error which would tend to occur at that point. A much more favorable sense in which to minimize d(x) for this type of approximation is the least-squares sense. If we denote the x coordinates at which data are available as the x_i, and if there are n such coordinates, then d(x) is minimized in the least-squares sense if

$$E = \sum_{i=1}^{n} d^2(x_i) \tag{7.2}$$

is minimized.

For the approximation of functions known at discrete points, the most commonly chosen form for g(x) is the polynomial. Thus if g(x) is of degree l,

$$g(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_l x^l \tag{7.3}$$

and from (7.2),

$$E = \sum_{i=1}^{n} d^2(x_i) = \sum_{i=1}^{n} \bigl[ g(x_i) - f(x_i) \bigr]^2 \tag{7.4}$$

Using (7.3), equation (7.4) becomes

$$E = \sum_{i=1}^{n} \bigl[ a_0 + a_1 x_i + a_2 x_i^2 + \cdots + a_l x_i^l - f(x_i) \bigr]^2 \tag{7.5}$$

The parameters which can be varied in order to minimize E are the (l + 1) coefficients of g(x). The minimization can be accomplished by setting equal to zero the partial derivatives of E with respect to each of these coefficients:

$$\frac{\partial E}{\partial a_j} = 0, \qquad j = 0, 1, 2, \ldots, l \tag{7.6}$$

The proof that (7.6) indeed does provide a minimum can be found in many references, including [5]. The set of equations (7.6) provides (l + 1) equations in the (l + 1) unknowns a_0, a_1, a_2, ..., a_l. To illustrate the form of these equations, we will carry out the details of the differentiation for the first equation:

$$\frac{\partial E}{\partial a_0} = \frac{\partial}{\partial a_0} \sum_{i=1}^{n} \bigl[ a_0 + a_1 x_i + a_2 x_i^2 + \cdots + a_l x_i^l - f(x_i) \bigr]^2 = 2 \sum_{i=1}^{n} \bigl[ a_0 + a_1 x_i + a_2 x_i^2 + \cdots + a_l x_i^l - f(x_i) \bigr] = 0$$

or

$$n a_0 + \Bigl[ \sum x_i \Bigr] a_1 + \Bigl[ \sum x_i^2 \Bigr] a_2 + \cdots + \Bigl[ \sum x_i^l \Bigr] a_l = \sum f(x_i) \tag{7.7}$$

Similarly, the second equation is

$$\Bigl[ \sum x_i \Bigr] a_0 + \Bigl[ \sum x_i^2 \Bigr] a_1 + \Bigl[ \sum x_i^3 \Bigr] a_2 + \cdots + \Bigl[ \sum x_i^{l+1} \Bigr] a_l = \sum x_i f(x_i) \tag{7.8}$$

It can be readily inferred from (7.7) and (7.8) that the complete set of simultaneous linear equations in the coefficients of the polynomial is

$$
\begin{bmatrix}
n & \sum x_i & \sum x_i^2 & \cdots & \sum x_i^l \\
\sum x_i & \sum x_i^2 & \sum x_i^3 & \cdots & \sum x_i^{l+1} \\
\sum x_i^2 & \sum x_i^3 & \sum x_i^4 & \cdots & \sum x_i^{l+2} \\
\vdots & & & & \vdots \\
\sum x_i^l & \sum x_i^{l+1} & \sum x_i^{l+2} & \cdots & \sum x_i^{2l}
\end{bmatrix}
\begin{bmatrix} a_0 \\ a_1 \\ a_2 \\ \vdots \\ a_l \end{bmatrix}
=
\begin{bmatrix}
\sum f(x_i) \\ \sum x_i f(x_i) \\ \sum x_i^2 f(x_i) \\ \vdots \\ \sum x_i^l f(x_i)
\end{bmatrix}
\tag{7.9}
$$

where $\sum$ signifies $\sum_{i=1}^{n}$.

Standard equation-solving techniques, such as Gauss-Jordan elimination, may be used to solve the set (7.9), but unfortunately the set is very poorly conditioned. The number of equations which can be solved (and thus the degree of the approximating polynomial) is severely limited in most cases by roundoff error, and l = 7 or 8 will usually produce meaningless results on most machines using single precision arithmetic.
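As a concrete illustration of how the set (7.9) is assembled and solved, the sketch below builds the coefficient matrix and right-hand side directly from a collection of data points and solves the resulting system with a simple Gaussian elimination routine. It is only a sketch, written here in Python (whose floating-point arithmetic is double precision by default); the function names, the elimination routine, and the sample data are invented for illustration and are not the programs referred to in the problems of Chapter 6.

```python
# Least-squares polynomial fit of degree l by forming and solving the normal
# equations (7.9).  Minimal illustrative sketch; data and names are invented.

def normal_equations(x, f, l):
    """Build the (l+1) x (l+1) matrix and right-hand side of equation (7.9)."""
    m = l + 1
    A = [[sum(xi ** (r + c) for xi in x) for c in range(m)] for r in range(m)]
    b = [sum(xi ** r * fi for xi, fi in zip(x, f)) for r in range(m)]
    return A, b

def solve(A, b):
    """Solve A a = b by Gaussian elimination with partial pivoting."""
    m = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]          # augmented matrix
    for k in range(m):
        p = max(range(k, m), key=lambda r: abs(M[r][k]))  # pivot row
        M[k], M[p] = M[p], M[k]
        for r in range(k + 1, m):
            factor = M[r][k] / M[k][k]
            for c in range(k, m + 1):
                M[r][c] -= factor * M[k][c]
    a = [0.0] * m
    for k in reversed(range(m)):                          # back substitution
        a[k] = (M[k][m] - sum(M[k][c] * a[c] for c in range(k + 1, m))) / M[k][k]
    return a

def least_squares_poly(x, f, l):
    """Return the coefficients a0, a1, ..., al of the least-squares polynomial."""
    A, b = normal_equations(x, f, l)
    return solve(A, b)

# Fit a quadratic (l = 2) to six invented noisy data points.
x = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5]
f = [1.02, 1.48, 2.55, 4.04, 6.01, 8.47]
a = least_squares_poly(x, f, 2)
g = lambda t: sum(ak * t ** k for k, ak in enumerate(a))
print(a)        # fitted coefficients
print(g(1.25))  # smoothed value of g(x) between two of the data points
```

Forming (7.9) explicitly in this way is the most direct route, but, as noted above, the set becomes increasingly ill-conditioned as l grows.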
One of the problems is the large variation in the magnitude of the coefficients in any given row; $\sum x_i^l$ is obviously much larger (or smaller) in magnitude than $\sum x_i$ for any reasonably large value of l. Double precision arithmetic can be of enormous help in maintaining accuracy and is recommended where available for least-squares work. Fortunately, relatively low order polynomials are usually the most useful for data fitting; higher order polynomials tend to simply reproduce the noise in the data and should not be used without good reason. By far the most widely-used functions for data fitting are straight lines, and data are often replotted on different scales (such as log-log scales) until the data assume such a form that a straight line is a reasonably good approximation [9, 10].

The choice of the degree of polynomial to be used for the fitting of data can be somewhat difficult. The best situation is one in which it is known a priori that the data should fall on a polynomial of a given degree. This degree of polynomial is then the obvious choice. Qualitative judgments can often be made by examining the data.
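Because the straight line is the case used most often in practice, it is worth writing out the solution of (7.9) for l = 1 explicitly; solving the resulting 2 x 2 set by hand gives closed-form expressions for the intercept a0 and the slope a1. The sketch below does this in Python and also illustrates the log-log replotting idea mentioned above for data believed to follow a power law f = c x^m. As before, the function name and the data values are invented for illustration.

```python
# Straight-line least squares (the l = 1 case of (7.9)) in closed form,
# together with a power-law fit obtained by fitting a line to the logarithms.
# Minimal illustrative sketch; data and names are invented.
import math

def fit_line(x, f):
    """Return (a0, a1) such that g(x) = a0 + a1*x is the least-squares line."""
    n = len(x)
    sx = sum(x)
    sf = sum(f)
    sxx = sum(xi * xi for xi in x)
    sxf = sum(xi * fi for xi, fi in zip(x, f))
    a1 = (n * sxf - sx * sf) / (n * sxx - sx * sx)   # slope
    a0 = (sf - a1 * sx) / n                          # intercept
    return a0, a1

# Direct straight-line fit to invented noisy data (roughly f = 2x).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
f = [2.1, 3.9, 6.2, 7.8, 10.1]
print(fit_line(x, f))

# If the data are thought to follow f = c * x**m, fit a straight line to the
# logarithms of the data: the slope is m and the intercept is ln(c).
fp = [3.0, 12.1, 26.8, 48.2, 74.9]                   # roughly 3 * x**2
ln_c, m = fit_line([math.log(xi) for xi in x], [math.log(fi) for fi in fp])
print(math.exp(ln_c), m)                             # c and m recovered from the fit
```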
