Arithmetic and Error

Notes for Part 1 of CMSC 460
Dianne P. O'Leary

Preliminaries:
• Mathematical modeling
• Computer arithmetic
• Errors

What we need to know about error:
-- how does error arise
-- how machines do arithmetic
   -- fixed point arithmetic
   -- floating point arithmetic
-- how errors are propagated in calculations
-- how to measure error
How does error arise?

Example: An engineer wants to study the stresses in a bridge.
Measurement error

Step 1: Gather lengths, angles, etc. for girders and wires.

Modeling error

Step 2: Approximate the system by finite elements.
Truncation error

Step 3: A numerical analyst develops an algorithm: the stress can be computed as the limit (as n becomes infinite) of some function G(n):

G(1), G(2), G(3), G(4), G(5), ...

We can't take this limit on a computer, so we decide to use G(150).

Roundoff error

Step 4: The algorithm is programmed and run on a computer. We need π = 3.1415926535..., and approximate it by 3.1415926.
Sources of error

1. Measurement error
2. Modeling error
3. Truncation error
4. Roundoff error

No mistakes!

Note: No mistakes were made:
• the engineer did not misread the ruler,
• the programmer did not make a typo in the definition of π,
• and the computer worked flawlessly.

But the engineer will want to know what the final answer has to do with the stresses on the real bridge!
What does a numerical analyst do?

-- designs algorithms and analyzes them.
-- develops mathematical software.

What does a computational scientist do?

-- works as part of an interdisciplinary team.
-- intelligently uses mathematical software to analyze mathematical models.
-- answers questions about how accurate the final answer is.
How machines do arithmetic

Machine Arithmetic: Fixed Point

How integers are stored in computers: each word (storage location) in a machine contains a fixed number of digits.

Example: A machine with a 6-digit word might represent 1985 as

0 0 1 9 8 5
Fixed Point: Decimal vs. Binary

Most calculators use decimal (base 10) representation. Each digit is an integer between 0 and 9. The value of

0 0 1 9 8 5

is 1 x 10^3 + 9 x 10^2 + 8 x 10^1 + 5 x 10^0.

Most computers use binary (base 2) representation. Each digit is the integer 0 or 1. If the number

0 1 0 1 1 0

is binary, its value is 1 x 2^4 + 0 x 2^3 + 1 x 2^2 + 1 x 2^1 + 0 x 2^0 (or 22 in base 10).
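To make positional notation concrete, here is a minimal Python sketch (not part of the original notes; the function name is invented) that evaluates a digit string in any base by Horner's rule:

    def fixed_point_value(digits, base):
        """Value of a digit string, most significant digit first."""
        value = 0
        for d in digits:
            value = value * base + d  # Horner's rule for positional notation
        return value

    print(fixed_point_value([0, 0, 1, 9, 8, 5], 10))  # 1985
    print(fixed_point_value([0, 1, 0, 1, 1, 0], 2))   # 22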
Example of binary addition

      0 0 0 1 1      (in decimal notation:    3)
    + 0 1 0 1 0                            (+ 10)
    -----------
      0 1 1 0 1                            (= 13)

The rules for each digit:
0 + 0 = 0
0 + 1 = 1
1 + 0 = 1
1 + 1 = 10 (binary) = 2
1 + 1 + 1 = 11 (binary) = 3

Note the "carry" here!
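As an illustration (a sketch, not anything prescribed by the notes), the same ripple-carry process can be written in Python, adding from the rightmost bit just as in the hand computation:

    def binary_add(a, b):
        """Add two equal-length bit lists, most significant bit first."""
        result, carry = [], 0
        for x, y in zip(reversed(a), reversed(b)):
            s = x + y + carry     # 0, 1, 2, or 3 in decimal
            result.append(s % 2)  # digit to write down
            carry = s // 2        # digit to carry
        result.append(carry)
        return list(reversed(result))

    print(binary_add([0, 0, 0, 1, 1], [0, 1, 0, 1, 0]))
    # [0, 0, 1, 1, 0, 1], i.e. 001101 (binary) = 13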
Representing negative numbers

Computers represent negative numbers using "one's complement", "two's complement", or sign-magnitude representation.

Sign-magnitude is easiest, and enough for us: if the first bit is zero, then the number is positive. Otherwise, it is negative.

0 1 0 1 1   denotes +11.
1 1 0 1 1   denotes -11.

Range of fixed point numbers

Largest 5-digit (5-bit) binary number:   0 1 1 1 1 = 15
Smallest:                                1 1 1 1 1 = -15
Smallest positive:                       0 0 0 0 1 = 1

Overflow

If we try to add these numbers:

      0 1 1 1 1 = 15
    + 0 1 0 0 0 =  8

we get

      1 0 1 1 1 = -7.

We call this overflow: the answer is too large to store, since it is outside the range of this number system.
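A small sketch of the overflow above, treating 5-bit sign-magnitude words as raw bit patterns (the names encode and decode are invented for illustration):

    BITS = 5

    def encode(n):
        """Sign-magnitude: sign bit, then BITS-1 magnitude bits."""
        return ((1 << (BITS - 1)) if n < 0 else 0) | abs(n)

    def decode(word):
        mag = word & ((1 << (BITS - 1)) - 1)
        return -mag if word >> (BITS - 1) else mag

    raw = (encode(15) + encode(8)) & ((1 << BITS) - 1)  # add, keep 5 bits
    print(f"{raw:05b} decodes to {decode(raw)}")  # 10111 decodes to -7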
Features of fixed point arithmetic

Easy: we always get an integer answer.
Either we get exactly the right answer for addition, subtraction, or multiplication, or we can detect overflow.
The numbers that we can store are equally spaced.
Disadvantage: very limited range of numbers.

Floating point arithmetic

If we wanted to store 15 x 2^11, we would need 16 bits:

0 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0

Instead, let's agree to code numbers as two fixed point numbers:

z x 2^p, with z = 15 saved as 01111 and p = 11 saved as 01011.

Now we can have fractions, too:

binary .101 = 1 x 2^-1 + 0 x 2^-2 + 1 x 2^-3.
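A one-line check of that fraction, as a hedged Python sketch:

    # Evaluate the binary fraction .101 digit by digit.
    print(sum(int(b) * 2.0 ** -(i + 1) for i, b in enumerate("101")))
    # 0.625, i.e. 1/2 + 1/8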
Floating point arithmetic

Jargon: in ± z x 2^p, z is called the mantissa or significand, and p is called the exponent.

To make the representation unique (since, for example, 2 x 2^1 = 4 x 2^0), we make the rule that 1 ≤ z < 2 (normalization).

We store d digits for the mantissa, and limit the range of the exponent to m ≤ p ≤ M, for some integers m and M.

Floating point representation

Example: Suppose we have a machine with d = 5, m = -15, M = 15.

15 x 2^10 = 1111 (binary) x 2^10 = 1.111 (binary) x 2^13
    mantissa z = +1.1110,  exponent p = +1101 (binary for 13)

15 x 2^-10 = 1111 (binary) x 2^-10 = 1.111 (binary) x 2^-7
    mantissa z = +1.1110,  exponent p = -0111 (binary for 7)
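The normalization step can be sketched in a few lines of Python (the helper name normalize is invented; it assumes a positive value):

    def normalize(f, p):
        """Rewrite f * 2**p so the mantissa z satisfies 1 <= z < 2."""
        assert f > 0
        z = float(f)
        while z >= 2.0:
            z, p = z / 2.0, p + 1
        while z < 1.0:
            z, p = z * 2.0, p - 1
        return z, p

    print(normalize(15, 10))   # (1.875, 13): 1.111 (binary) x 2^13
    print(normalize(15, -10))  # (1.875, -7): 1.111 (binary) x 2^-7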
Floating point standard

Up until the mid-1980s, each computer manufacturer had a different choice for d, m, and M, and even a different way to select answers to arithmetic problems. A program written for one machine often would not compute the same answers on other machines.

The situation improved somewhat with the introduction in 1985 of the IEEE standard for floating point arithmetic.

On most machines today,

single precision: d = 24, m = -126, M = 127
double precision: d = 53, m = -1022, M = 1023.

(Integers, by contrast, are represented in 2's complement, not sign-magnitude, so that the number -|x| is stored as 2^d - |x|, where d is the number of bits allotted for its representation.)
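These double-precision parameters can be read off directly in Python, whose floats are IEEE doubles on conventional hardware:

    import sys

    print(sys.float_info.mant_dig)     # 53, the mantissa digits d
    print(sys.float_info.min_exp - 1)  # -1022, the exponent bound m
    print(sys.float_info.max_exp - 1)  # 1023, the exponent bound M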
Floating point addition

Machine arithmetic is more complicated for floating point.

Example: In fixed point, we added 3 + 10. Here it is in floating point:

3 = 11 (binary) = 1.100 x 2^1, so z = 1.100, p = 1
10 = 1010 (binary) = 1.010 x 2^3, so z = 1.010, p = 11 (binary for 3).

1. Shift the smaller number so that the exponents are equal:
   z = 0.0110, p = 11.
2. Add the mantissas:
   z = 0.0110 + 1.0100 = 1.1010, p = 11.
3. Shift if necessary to normalize (no shift is needed here:
   1.1010 x 2^3 = 1101 (binary) = 13).

Roundoff in floating point addition

Sometimes we cannot store the exact answer.

Example: 1.1001 x 2^0 + 1.0001 x 2^-1

1. Shift the smaller number so that the exponents are equal:
   z = 0.10001, p = 0.
2. Add the mantissas:
     0.10001
   + 1.10010
   ---------
    10.00011,  p = 0.
3. Shift if necessary to normalize: 1.000011 x 2^1.

But we can only store five mantissa digits: 1.0000 x 2^1! The error is called roundoff.
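A hedged simulation of this example: do the sum in exact rational arithmetic, then chop the mantissa to d = 5 digits as the example does (the helper chop is invented for illustration):

    from fractions import Fraction

    D = 5  # mantissa digits, including the leading 1

    def chop(x):
        """Chop a positive exact value to D significant binary digits."""
        p = 0
        while x >= 2:
            x, p = x / 2, p + 1
        while x < 1:
            x, p = x * 2, p - 1
        z = Fraction(int(x * 2 ** (D - 1)), 2 ** (D - 1))  # drop extra digits
        return z * Fraction(2) ** p

    a = Fraction(25, 16)  # 1.1001 (binary) x 2^0
    b = Fraction(17, 32)  # 1.0001 (binary) x 2^-1
    print(a + b, chop(a + b))  # exact 67/32 = 10.00011 (binary); stored: 2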
Underflow, overflow, ...

Convince yourself that roundoff cannot occur in fixed point arithmetic. Other floating point troubles:

Overflow: the exponent grows too large.
Underflow: the exponent grows too small.

Range of floating point

Example: Suppose that d = 5 and exponents range between -15 and 15.

Smallest positive number: 1.0000 (binary) x 2^-15
(since the mantissa needs to be normalized)

Largest positive number: 1.1111 (binary) x 2^15
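Evaluating the two extremes numerically, as a quick sketch:

    d, m, M = 5, -15, 15
    smallest = 1.0 * 2.0 ** m                     # 1.0000 (binary) x 2^-15
    largest = (2.0 - 2.0 ** -(d - 1)) * 2.0 ** M  # 1.1111 (binary) x 2^15
    print(smallest, largest)  # 3.0517578125e-05 and 63488.0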
Rounding

IEEE standard arithmetic uses rounding.

Rounding: Store x as r, where r is the machine number closest to x.

An important number: machine epsilon

Machine epsilon is defined to be the gap between 1 and the next larger number that can be represented exactly on the machine.

Example: Suppose that d = 5 and exponents range between -15 and 15. What is machine epsilon in this case?

Note: Machine epsilon depends on d and on whether rounding or chopping is done, but does not depend on m or M!
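For the toy machine, the next number after 1.0000 (binary) is 1.0001 (binary), so the gap is 2^-4 = 0.0625. The classic halving loop below finds the analogous quantity for Python's own doubles (a sketch, assuming round-to-nearest arithmetic):

    eps = 1.0
    while 1.0 + eps / 2 > 1.0:  # halve until 1 + eps/2 rounds back to 1
        eps /= 2
    print(eps)              # 2.220446049250313e-16 = 2^-52, since d = 53
    print(2.0 ** -(5 - 1))  # 0.0625 = 2^-4 for the d = 5 toy machine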
Features of floating point arithmetic

• The numbers that we can store are not equally spaced. (Try to draw them on a number line.)
• A wide range of variably-spaced numbers can be represented exactly.
• For addition, subtraction, and multiplication, either we get exactly the right answer or a rounded version of it, or we can detect underflow or overflow.

How errors are propagated
Numerical Analysis vs. Analysis

Mathematical analysis works with computations involving real or complex numbers. Computers do not work with these; for instance, they do not have a representation for the numbers π or e or even 0.1. Dealing with the finite approximations called floating point numbers means that we need to understand error and its propagation.

Absolute vs. relative errors

Absolute error in c as an approximation to x:   |x - c|

Relative error in c as an approximation to nonzero x:   |x - c| / |x|
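Both definitions in two lines of Python, applied to the earlier approximation of π (a minimal sketch):

    import math

    x, c = math.pi, 3.1415926
    print(abs(x - c))           # absolute error, about 5.4e-08
    print(abs(x - c) / abs(x))  # relative error, about 1.7e-08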
Error Analysis

Errors can be magnified during computation.

Example:     2.003 x 10^0   (suppose ± .001, or .05%, error)
           - 2.000 x 10^0   (suppose ± .001, or .05%, error)

Result of subtraction: 0.003 x 10^0 (± .002, or 200% error if the true answer is 0.001).

The true answer could be as small as 2.002 - 2.001 = 0.001, or as large as 2.004 - 1.999 = 0.005!

This is catastrophic cancellation, or "loss of significance".
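The interval arithmetic behind this example, written out as a hedged sketch (exact decimals, so the printed bounds are clean):

    from decimal import Decimal

    a, b, err = Decimal("2.003"), Decimal("2.000"), Decimal("0.001")
    print((a - err) - (b + err))  # smallest possible true answer: 0.001
    print(a - b)                  # computed answer: 0.003, off by up to 0.002
    print((a + err) - (b - err))  # largest possible true answer: 0.005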
Error Analysis

We could generalize this example to prove a theorem:

   When adding or subtracting, the bounds on the absolute errors add.

What if we multiply or divide? Suppose x and y are the true values, and X and Y are our approximations to them. If

   X = x (1 - r)   and   Y = y (1 - s),

then r is the relative error in X (as an approximation to x) and s is the relative error in Y. You could show that

   |xy - XY| / |xy|  ≤  |r| + |s| + |rs|.
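A numerical spot-check of this bound on random sample values (the values are not from the notes):

    import random

    for _ in range(5):
        x, y = random.uniform(1, 10), random.uniform(1, 10)
        r, s = random.uniform(-1e-3, 1e-3), random.uniform(-1e-3, 1e-3)
        X, Y = x * (1 - r), y * (1 - s)
        lhs = abs(x * y - X * Y) / abs(x * y)  # exactly |r + s - rs|
        print(lhs <= abs(r) + abs(s) + abs(r * s) + 1e-15)  # True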
Error Analysis

Therefore:

• When adding or subtracting, the bounds on the absolute errors add.
• When multiplying or dividing, the bounds on the relative errors add (approximately).

But we may also have additional error -- for example, from chopping or rounding the answer. Error bounds can be pessimistic.

Avoiding error build-up

Sometimes error can be avoided by clever tricks. As an example, consider the catastrophic cancellation that can arise when solving for the roots of a quadratic polynomial.
Cancellation example

Example: Find the roots of x^2 - 56x + 1 = 0.

Usual algorithm:
x1 = 28 + sqrt(783) = 28 + 27.982 (± .0005) = 55.982 (± .0005)
x2 = 28 - sqrt(783) = 28 - 27.982 (± .0005) = 0.018 (± .0005)

The absolute error bounds are the same, but the relative error bounds are about 10^-5 vs. about .03!

Avoiding cancellation

Three tricks:

1) Use an alternate formula.
The product of the roots equals the low-order term of the polynomial, which is 1. So
x2 = 1 / x1 = .0178629 (± 2 x 10^-7)
by our error propagation formula.
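A sketch of the cancellation and the trick-1 fix, carried out in single precision (numpy.float32, assumed available) so the damage is visible against a double-precision reference:

    import numpy as np

    s = np.float32(np.sqrt(np.float32(783)))
    x1 = np.float32(28) + s        # no cancellation: fine
    x2_naive = np.float32(28) - s  # cancellation: few correct digits
    x2_fixed = np.float32(1) / x1  # trick 1: the product of the roots is 1
    print(x2_naive, x2_fixed, 28 - np.sqrt(783.0))  # reference in double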
Avoiding cancellation

2) Rewrite the formula.

sqrt(x+e) - sqrt(x) = (sqrt(x+e) - sqrt(x)) (sqrt(x+e) + sqrt(x)) / (sqrt(x+e) + sqrt(x))
                    = (x + e - x) / (sqrt(x+e) + sqrt(x))
                    = e / (sqrt(x+e) + sqrt(x)),

so x2 = 28 - sqrt(783) = sqrt(784) - sqrt(783) = 1 / (sqrt(784) + sqrt(783)).

3) Use Taylor series.

Let f(x) = sqrt(x). Then

f(x+a) - f(x) = f'(x) a + (1/2) f''(x) a^2 + ...
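Tricks 2 and 3 in Python, for sqrt(x+e) - sqrt(x) with a tiny e (a sketch; the naive difference loses almost all of its digits):

    import math

    x, e = 783.0, 1e-10
    naive = math.sqrt(x + e) - math.sqrt(x)          # cancellation
    rewrite = e / (math.sqrt(x + e) + math.sqrt(x))  # trick 2
    taylor = e / (2.0 * math.sqrt(x))                # trick 3, first term
    print(naive, rewrite, taylor)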
How errors are measured

Error analysis determines the cumulative effects of error. Two approaches:

• Forward error analysis
• Backward error analysis
1) Forward error analysis

This is the way we have been discussing: find an estimate for the answer, and bounds on the error.

[Diagram, built up over four slides: in the space of problems, the true problem (known) is marked; it maps to the true solution (unknown) in the space of answers. The computed solution is also marked in the space of answers, together with a known region around it that is guaranteed to contain the true solution.]

Report the computed solution and the location of the region to the user.

2) Backward error analysis

Given an answer, determine how close the problem actually solved is to the given problem.

[Diagram: the computed solution, marked in the space of answers, is the exact solution of the problem we solved (unknown), marked in the space of problems, together with a region containing both the true problem and the solved problem.]

Report the computed solution and the location of the region to the user.
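A toy contrast of the two viewpoints, for a deliberately crude square-root iteration (the routine my_sqrt is invented for illustration):

    import math

    def my_sqrt(x, steps=3):
        """A few Newton steps for sqrt(x), stopped early on purpose."""
        r = x
        for _ in range(steps):
            r = 0.5 * (r + x / r)
        return r

    r = my_sqrt(2.0)
    print(abs(r - math.sqrt(2.0)))  # forward error: distance to true answer
    print(abs(r * r - 2.0))         # backward error: r is the exact root
                                    # of the nearby problem x = r*r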
Arithmetic and Error

Summary:
-- how does error arise
-- how machines do arithmetic
   -- fixed point arithmetic
   -- floating point arithmetic
-- how errors are propagated in calculations
-- how to measure error