
Azərbaycan Dövlət Neft və Sənaye Universiteti

Numerical Methods I
Round-Off Errors
We can now proceed to the two types of error connected directly with numerical methods:
• round-off errors;
• truncation errors.
Recall that
• round-off errors originate from the fact that computers retain only a fixed number of significant figures during a calculation.
Numbers such as π, e, or √7 cannot be expressed by a fixed number of significant figures.

In addition, because computers use a base-2 representation, they cannot precisely represent certain exact base-10 numbers (like 0.1).
The discrepancy introduced by this omission of significant figures is called round-off error.
Numerical round-off errors are directly related to the manner in which numbers are stored in computer memory.
Fractional quantities are typically represented in computers using floating-point form.
In this approach the number is expressed as a fractional part, called a mantissa or significand, and an integer part, called an exponent or characteristic, as in m × b^e, where m is the mantissa, b is the base of the number system, and e is the exponent.
For instance, the number 156.78 could be represented as 0.15678 × 10^3 in a base-10 floating-point system.
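A minimal Python sketch of this decomposition (the helper name to_float_form is chosen here just for illustration; edge cases near exact powers of the base are ignored):

import math

def to_float_form(x, base=10):
    # Split x into (mantissa, exponent) with 1/base <= |mantissa| < 1, so x = mantissa * base**exponent.
    if x == 0:
        return 0.0, 0
    e = math.floor(math.log(abs(x), base)) + 1
    return x / base**e, e

print(to_float_form(156.78))   # approximately (0.15678, 3), i.e. 0.15678 x 10^3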

One way that a floating-point number could be stored in a word is the following:
• the first bit is reserved for the sign;
• the next series of bits for the signed exponent;
• the last bits for the mantissa.
Note that the mantissa is usually normalized if it has leading zero digits.
For example, suppose the quantity 1/34 = 0.029411765... was stored in a base-10 floating-point system that allowed only four decimal places to be stored.

Thus 1/34 would be stored as 0.0294 × 10^0.

However, in the process of doing this, the inclusion of the useless zero to the right of the decimal point forces us to drop the digit in the fifth decimal place.
The number can be normalized to remove the leading zero by multiplying the mantissa by 10 and lowering the exponent by 1 to give 0.2941 × 10^-1.
Thus we retain an additional significant figure when the number is stored.
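The gain from normalization can be checked numerically (a rough Python sketch that mimics four-digit decimal storage; real hardware does not store numbers this way):

x = 1 / 34                                    # 0.029411764705882...
unnormalized = int(x * 10**4) / 10**4         # mantissa 0.0294, exponent 0  -> 0.0294
mantissa = int(x * 10 * 10**4) / 10**4        # shift left once, keep four digits -> 0.2941
normalized = mantissa / 10                    # mantissa 0.2941, exponent -1 -> 0.02941
print(abs(x - unnormalized), abs(x - normalized))   # the normalized form has the smaller error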
The consequence of normalization is that the absolute value of the mantissa m is limited, i.e. 1/b ≤ m < 1, where b is the base.
For example,
• for a base-10 system, m would range between 0.1 and 1;
• for a base-2 system, m would range between 0.5 and 1.
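In Python the normalized binary mantissa is exposed by math.frexp, which can serve as a quick check of the 0.5 ≤ m < 1 bound:

import math

for x in (156.78, 0.03125, 1/34):
    m, e = math.frexp(x)          # x = m * 2**e with 0.5 <= |m| < 1 for nonzero x
    print(x, m, e)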

Floating-point representation allows both fractions and very large numbers to be expressed on the computer.
However, it has some disadvantages:
• For example, floating-point numbers take up more room and take longer to process than integer numbers.
• More significantly, however, their use introduces a source of error because the mantissa holds only a finite number of significant figures.

Thus a round-off error is introduced.
Example. Hypothetical Set of Floating-Point Numbers
Create a hypothetical floating-point number set for a machine that stores information using 7-bit words.

Employ the first bit for the sign of the number, the next three bits for the sign and the magnitude of the exponent, and the last three bits for the magnitude of the mantissa.
Solution
The smallest possible positive number has the bit pattern 0 111 100.
The initial 0 indicates that the quantity is positive.

The 1 in the second place designates that the exponent has a negative sign.
Solution (continued)
The 1s in the third and fourth places give a maximum value to the exponent of 1 × 2^1 + 1 × 2^0 = 3;
therefore the exponent will be -3.

Finally, the mantissa is specified by the 100 in the last three places, which conforms to 1 × 2^-1 + 0 × 2^-2 + 0 × 2^-3 = 0.5.
Solution (continued)
Although a smaller mantissa is possible (e.g. 000, 001, 010, 011), the value 100 is used because of the limit imposed by normalization (recall that 1/b ≤ m < 1, so the leading mantissa bit must be 1).

Thus the smallest possible positive number for this system is 0.5 × 2^-3, which is equal to 0.0625 in the base-10 system.


Solution (continued)
The next highest numbers are developed by increasing the mantissa, as in
0111101 = (1 × 2^-1 + 0 × 2^-2 + 1 × 2^-3) × 2^-3 = 0.078125
0111110 = (1 × 2^-1 + 1 × 2^-2 + 0 × 2^-3) × 2^-3 = 0.093750
0111111 = (1 × 2^-1 + 1 × 2^-2 + 1 × 2^-3) × 2^-3 = 0.109375
Solution (continued)
Notice that the base-10 equivalents are spaced evenly with an interval of 0.015625.

At this point, to continue increasing, we must decrease the exponent to 10, which gives a value of 1 × 2^1 + 0 × 2^0 = 2.

The mantissa is brought (decreased) back to its smallest value of 100.
Solution (continued)
Therefore the next number is
0110100 = (1 × 2^-1 + 0 × 2^-2 + 0 × 2^-3) × 2^-2 = 0.125000

This still represents a gap of 0.125000 - 0.109375 = 0.015625.

However, now when higher numbers are generated by increasing the mantissa, the gap is lengthened to 0.031250.
Solution (continued)
The next highest numbers are developed by increasing the mantissa:
0110101 = (1 × 2^-1 + 0 × 2^-2 + 1 × 2^-3) × 2^-2 = 0.156250
0110110 = (1 × 2^-1 + 1 × 2^-2 + 0 × 2^-3) × 2^-2 = 0.187500
0110111 = (1 × 2^-1 + 1 × 2^-2 + 1 × 2^-3) × 2^-2 = 0.218750
Solution (continued)
This pattern is repeated as each larger quantity is formulated until a maximum number is reached:
0011111 = (1 × 2^-1 + 1 × 2^-2 + 1 × 2^-3) × 2^3 = 7
Solution (continued)
The final number set can be depicted graphically as points on a number line.
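A short Python sketch (variable names chosen freely here) enumerates the positive values of this hypothetical system and reproduces these gaps:

values = set()
for exp_sign in (1, -1):                  # second bit: sign of the exponent
    for exp_mag in range(4):              # two bits of exponent magnitude: 0..3
        for m_bits in range(4, 8):        # normalized 3-bit mantissas 100..111
            mantissa = m_bits / 8         # 0.5, 0.625, 0.75, 0.875
            values.add(mantissa * 2 ** (exp_sign * exp_mag))
values = sorted(values)
print(values[0], values[-1])              # 0.0625 and 7.0
print(values[:4])                         # [0.0625, 0.078125, 0.09375, 0.109375]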
This number set manifests several aspects of floating-point representation that have significance regarding computer round-off errors:

• There is a limited range of real quantities that may be represented.

Just as for the integer case, there are large positive and negative numbers that cannot be represented.
Attempts to employ numbers outside the acceptable range will result in what is called an overflow error.
However, in addition to large quantities, the floating-point representation has the added limitation that very small numbers cannot be represented.
This is illustrated by the underflow "hole" between zero and the smallest representable positive number.
• There are only a finite number of quantities that can be represented within the range; thus, the degree of precision is limited.
Obviously, irrational numbers cannot be represented exactly.

Furthermore, rational numbers that do not exactly match one of the values in the set also cannot be represented exactly.
The errors introduced by approximating both these cases are referred to as quantizing errors.

The actual approximation is accomplished in either of two ways: chopping or rounding.
Suppose the value of π = 3.14159265358... is to be stored on a base-10 number system carrying seven significant figures.

One method of approximation would be to merely omit, or "chop off", the eighth and higher digits, as in π ≈ 3.141592, with the introduction of an associated error of about 0.00000065.

This technique of retaining only the significant terms was originally dubbed "truncation" in computer jargon; below it is referred to as chopping.
Note that for the base-10 number system above, chopping means that any quantity falling within an interval of length Δx will be stored as the quantity at the lower end of the interval.

Thus the upper error bound for chopping is Δx.

Additionally, a bias is introduced because all errors are positive.

The shortcoming of chopping is that rounding yields a lower absolute error.

For instance, in our example of π the first discarded digit is 6. If we round up the last retained digit we get π ≈ 3.141593.
Such rounding reduces the error to about -0.00000035.

Note that for the base-10 number system above, rounding means that any quantity falling within an interval of length Δx will be represented as the nearest allowable number.
Thus the upper error bound for rounding is Δx/2 (as opposed to Δx in the case of chopping).
Additionally, no bias is introduced because some errors are positive and some are negative.
Some computers employ rounding. However, this adds to the computational overhead, and, consequently, many machines use simple chopping.
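The π example can be reproduced with the standard decimal module (a Python sketch; seven significant figures of π correspond to six decimal places here):

import math
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_UP

pi = Decimal(repr(math.pi))                                            # 3.141592653589793
chopped = pi.quantize(Decimal("0.000001"), rounding=ROUND_DOWN)        # 3.141592
rounded = pi.quantize(Decimal("0.000001"), rounding=ROUND_HALF_UP)     # 3.141593
print(chopped, pi - chopped)   # error ~ +0.00000065 (always positive -> bias)
print(rounded, pi - rounded)   # error ~ -0.00000035 (at most half the chopping bound)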
• The interval Δx between representable numbers increases as the numbers grow in magnitude.
It is this characteristic, of course, that allows floating-point representation to preserve significant digits.
However, it also means that quantizing errors will be proportional to the magnitude of the number being represented.
For normalized floating-point numbers, this proportionality can be expressed, for cases where chopping is employed, as
|Δx / x| ≤ ε

and for cases where rounding is employed, as
|Δx / x| ≤ ε / 2
Here ε is referred to as the machine epsilon.

Machine epsilon can be computed as follows:
ε = b^(1 - t)
where b is the number base and t is the number of significant digits in the mantissa.

Example. Determine the machine epsilon and verify its effectiveness in characterizing the errors of the number system from the previous example. Assume that chopping is used.
Solution
The hypothetical floating-point system from the previous example employed values of the base b = 2 and the number of mantissa bits t = 3.

Therefore, the machine epsilon would be ε = 2^(1-3) = 0.25.

Consequently, the relative quantizing error should be bounded by 0.25 for chopping.
The largest relative errors should occur for those quantities that fall just below the upper bound of the first interval between successive equispaced numbers.
Those numbers falling in the succeeding higher intervals would have the same value of Δx but a greater value of x and hence would have a lower relative error.
An example of a maximum error would be a value falling just below the upper bound of the interval between 0.125000 and 0.156250.
For this case the error would be less than
0.03125 / 0.125000 = 0.25

Thus the error is as predicted by the bound |Δx / x| ≤ ε.
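A quick numerical check of this bound (Python sketch; the test value 0.15624 is chosen arbitrarily, just below the top of that interval):

b, t = 2, 3
eps = b ** (1 - t)                       # machine epsilon = 0.25
x = 0.15624                              # just below the upper end of the interval [0.125, 0.15625)
stored = 0.125                           # chopping keeps the lower end of the interval
rel_error = (x - stored) / x
print(eps, rel_error, rel_error <= eps)  # 0.25  ~0.2  True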



The magnitude dependence of quantizing errors has a number of practical applications in numerical methods.

Most of these relate to the commonly employed operation of testing whether two numbers are equal.
This occurs when testing convergence of quantities as well as in the stopping mechanism for iterative processes.
For these cases it should be clear that, rather than test whether the two quantities are equal, it is advisable to test whether their difference is less than an acceptably small tolerance.
In addition, the machine epsilon can be employed in formulating stopping or convergence criteria.
All these precautions ensure that programs are portable, that is, not dependent on the computer on which they are implemented.
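As an illustration, such a tolerance-based test might look like the following Python sketch (the helper name converged and the tolerance value are arbitrary choices):

import sys

def converged(x_new, x_old, rel_tol=1e-8):
    # Stop when the relative difference falls below a tolerance tied to machine epsilon,
    # rather than testing x_new == x_old exactly.
    tol = max(rel_tol, 4 * sys.float_info.epsilon)
    return abs(x_new - x_old) <= tol * max(abs(x_new), abs(x_old))

print(converged(0.1 + 0.2, 0.3))   # True, even though 0.1 + 0.2 != 0.3 exactly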
Write pseudocode to automatically determine the machine epsilon of a binary computer, then implement it in some programming language (the listing below uses MATLAB-style syntax):
epsilon = 1.0;
while (epsilon + 1.0 > 1.0)     % keep halving while epsilon still makes a difference when added to 1
    epsilon = epsilon / 2.0;
end
epsilon = 2.0 * epsilon;        % step back to the last value that did make a difference
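The same loop in Python, checked against the constant the standard library reports (sys.float_info.epsilon):

import sys

epsilon = 1.0
while epsilon + 1.0 > 1.0:       # keep halving while adding epsilon to 1 still changes the sum
    epsilon = epsilon / 2.0
epsilon = 2.0 * epsilon          # undo the last halving

print(epsilon, sys.float_info.epsilon)   # both print 2.220446049250313e-16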
Extended Precision
Commercial computers use much larger words than in the examples above and, consequently, allow numbers to be expressed with more than adequate precision.
For example, computers that employ the IEEE single-precision format use 24 bits for the mantissa, which translates into about seven significant base-10 digits of precision with a range of roughly 10^-38 to 10^38.
With this acknowledged, there are still cases where round-off error becomes critical.
For this reason most computers allow the specification of extended precision, the most common of which is double precision.

It provides about 15 to 16 decimal digits of precision and a range of approximately 10^-308 to 10^308.
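These figures can be read directly from the floating-point type metadata (a Python sketch; NumPy is assumed to be available only to get a single-precision type for comparison):

import sys
import numpy as np

for dtype in (np.float32, np.float64):          # single vs. double precision
    info = np.finfo(dtype)
    print(dtype.__name__, info.precision, info.eps, info.tiny, info.max)

print(sys.float_info.epsilon)                   # plain Python floats are IEEE double precision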
In many cases, the use of double-precision quantities can greatly mitigate the effect of round-off errors.

However, a price is paid for such remedies in that they also require more memory and execution time.

The difference in execution time for a small calculation might seem insignificant.
However, as your programs become larger and more complicated, the added execution time could become considerable and have a negative impact on your effectiveness as a problem solver.
Therefore, extended precision should be employed selectively, where it will yield the maximum benefit at the least cost in terms of execution time.
It should be noted that some of the commonly used software packages (for example, Excel and Mathcad) routinely use double precision to represent numerical quantities.

Others, like MATLAB, allow you to use extended precision if you so desire.
Thank you very much for your attention!
