An Introduction To Floating Point Arithmetic by Example: Pat Quillen
An Introduction To Floating Point Arithmetic by Example: Pat Quillen
Arithmetic by Example
Pat Quillen
21 January 2010
1 − 3 ∗ (4/3 − 1)
according to M ATLAB?
1 − 3 ∗ (4/3 − 1)
according to M ATLAB?
2.220446049250313e-016
1 − 3 ∗ (4/3 − 1)
according to M ATLAB?
2.220446049250313e-016
Why??
1 − 3 ∗ (4/3 − 1)
according to M ATLAB?
2.220446049250313e-016
Why?? Essentially because 4/3 cannot be represented
exactly by a binary number with finitely many terms.
That is,
4 1 1 1
= 1 + 2 + 4 + 6 + ···
3 2 2 2
or, in binary,
4
= 1.010101010101 · · ·
3
which, again, is not exactly representable by finitely many
terms.
(−1)0 20 (1 + 0).
(−1)0 20 (1 + 0).
You can use format hex in M ATLAB to see the bit pattern
of the floating point number in hexadecimal. The first three
hex digits (12 bits) represent the sign bit and the biased ex-
ponent, and the remaining 13 hex digits (52 bits) represent
the mantissa.
1 − 3 ∗ (4/3 − 1) 6= 0.
ǫ = 2−52
is the result.
ǫ is referred to as machine epsilon, or the unit-roundoff, and it
is the distance between 1 and the next closest floating point
number.
Floating Point Arithmetic by Example – p.9/15
Example
A very common example of propogation of round-off comes
in the form of
0.1 + 0.1 + 0.1
Specifically, is the above expression equal to 0.3?
Numerical Disasters
1991: Patriot Missile misses Scud!
1996: Ariane Rocket explodes!
Note: Not all cancellation can be avoided, and not all can-
cellation is bad!