Floating Point Numbers: Do You Have Your Laptop Here?
• https://www.doc.ic.ac.uk/~eedwards/compsys
Eddie Edwards 2008 Floating Point Numbers 7.3 Eddie Edwards 2008 Floating Point Numbers 7.4
31/10/2011
Pi (to 8 decimal places) 3.14159265...
Standard Rate of VAT 17.5
The Pentium includes instructions for writing multi-precision integer routines using Binary Coded Decimal (BCD) arithmetic and ASCII arithmetic.
Errors? Incorrect results are possible, e.g.:

We'll normalise mantissas in the range [ 1 .. R ), where R is the base:
[ 1 .. 10 ) for DECIMAL
[ 1 .. 2 ) for BINARY
Number                    Normalised
–4.01 x 10^-3             –4.01 x 10^-3
343 000 x 10^0            3.43 x 10^5
0.000 000 098 9 x 10^0    9.89 x 10^-8

Binary fraction    Decimal
0.001              0.125
0.11               0.75
0.111              0.875
0.011              0.375
0.101              0.625
Binary .01101 to decimal:
 .   0   1   1   0   1
32  16   8   4   2   1      Sum = 8+4+1 = 13
Answer: 13 / 32 = 0.40625

Decimal 0.6875 to binary:
0.6875 * 2 = 1.3750
0.3750 * 2 = 0.7500
0.7500 * 2 = 1.5000
0.5000 * 2 = 1.0000
0.0000 * 2 = 0
Answer: 0.1011 (binary)

Answer: (32+16+2+1) / 512 = 51 / 512 = 0.099609375
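
The two conversions worked through above can be sketched in Python. This is only an illustrative sketch: the function names `fraction_to_binary` and `binary_to_fraction`, and the `max_bits` cut-off, are assumptions made for this example and do not come from the slides.

```python
def fraction_to_binary(value, max_bits=16):
    """Convert a fraction in [0, 1) to a binary string by repeated doubling."""
    if not 0 <= value < 1:
        raise ValueError("value must be in the range [0, 1)")
    bits = []
    while value != 0 and len(bits) < max_bits:
        value *= 2
        bit = int(value)          # the integer part is the next binary digit
        bits.append(str(bit))
        value -= bit              # carry on with the remaining fraction
    return "0." + ("".join(bits) or "0")


def binary_to_fraction(text):
    """Convert a binary fraction such as '0.01101' to its decimal value."""
    digits = text.split(".")[1]
    # Read the digits as an integer, then divide by 2 to the number of digits.
    return int(digits, 2) / 2 ** len(digits)


print(fraction_to_binary(0.6875))        # the 0.6875 example
print(binary_to_fraction("0.01101"))     # the 13 / 32 example
print(binary_to_fraction("0.000110011")) # the 51 / 512 example
```

The `max_bits` limit matters because many decimal fractions (0.1, for instance) have no finite binary expansion, so the doubling loop would otherwise only stop once the underlying float runs out of bits.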
Example: 20 * 6
= (2.0 x 10^1) x (6.0 x 10^0)
= (2.0 x 6.0) x (10^(1+0))
= 12.0 x 10^1
We must also normalise the result, so the final answer = 1.2 x 10^2

ROUNDING => 5.3 x 10^2 (Unbiased Error)
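
The multiply-then-normalise steps can be sketched as follows. This is a minimal sketch, assuming a number is held as a (mantissa, exponent) pair in base 10. The helper names `normalise` and `multiply` are hypothetical, and Python's `decimal.Decimal` is used only to keep the decimal mantissas exact.

```python
from decimal import Decimal


def normalise(mantissa, exponent):
    """Shift the mantissa into [1, 10), adjusting the exponent to compensate."""
    if mantissa == 0:
        return mantissa, 0
    while abs(mantissa) >= 10:
        mantissa /= 10
        exponent += 1
    while abs(mantissa) < 1:
        mantissa *= 10
        exponent -= 1
    return mantissa, exponent


def multiply(a, b):
    """Multiply the mantissas, add the exponents, then normalise the result."""
    (m1, e1), (m2, e2) = a, b
    return normalise(m1 * m2, e1 + e2)


# 20 * 6 = (2.0 x 10^1) x (6.0 x 10^0)
product = multiply((Decimal("2.0"), 1), (Decimal("6.0"), 0))
print(product)
```

The sketch performs no rounding to a fixed mantissa width; a real floating-point unit would round the mantissa after normalising.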
= 5.17 x 10^3
= 5.2 x 10^3 (rounded)

Example: if Min Exp. is –99 then 10^-99 * 10^-99 = 10^-198 (underflow)
On Underflow => Proceed with zero value or raise an Exception
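
Both underflow policies can be tried out with Python's `decimal` module, which lets the exponent limits be set by hand. This is a sketch under assumptions: the 3-digit precision and the function name `square_tiny` are chosen for illustration, and only the minimum exponent of –99 comes from the example above.

```python
from decimal import Context, Decimal, Underflow


def square_tiny(trap_underflow):
    """Compute 10^-99 * 10^-99 in a context whose minimum exponent is -99."""
    ctx = Context(
        prec=3,
        Emin=-99,
        Emax=99,
        # Trapping Underflow turns the condition into a Python exception;
        # with no traps the operation proceeds with a zero result.
        traps=[Underflow] if trap_underflow else [],
    )
    tiny = Decimal("1E-99")
    return ctx.multiply(tiny, tiny)


print(square_tiny(False))   # proceeds with a zero value

try:
    square_tiny(True)
except Underflow:
    print("underflow raised as an exception")
```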
What is binary 0.1101 in decimal?
Widely adopted => Predictable results independent of architecture
BEC0_0000 = 1011_1110_1100_0000_0000_0000_0000_0000

Sign  Exponent   Significand
1     0111_1101  1000_0000_0000_0000_0000_000

Exponent Field = 0111_1101 = 125
True Binary Exponent = 125 – 127 = –2
Significand Field = 1000_0000_0000_0000_0000_000
Adding Hidden Bit = 1.1000_0000_0000_0000_0000_000
Therefore unsigned value = 1.1 x 2^-2 = 0.011 (binary)
= 0.25 + 0.125 = 0.375 (decimal)
Sign bit = 1 therefore number is –0.375

Number   Sign  Exponent   Significand
42.6875  0     1000_0100  0101_0101_1000_0000_0000_000
0.375    0     0111_1101  1000_0000_0000_0000_0000_000

To add these numbers the exponents of the numbers must be the same => Make the smaller exponent equal to the larger exponent, shifting the mantissa accordingly.
Note: We must restore the Hidden bit when carrying out floating point operations.
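
The field decoding and the exponent alignment for addition can be sketched in Python. All function names here (`decode_single`, `bits_to_single`, `normal_value`, `add_positive_singles`) are hypothetical, and the adder is deliberately simplified: it assumes two positive normalised operands and truncates any bits shifted out instead of rounding.

```python
import struct


def decode_single(bits):
    """Split a 32-bit pattern into (sign, exponent field, significand field)."""
    return bits >> 31, (bits >> 23) & 0xFF, bits & 0x7FFFFF


def bits_to_single(bits):
    """Let the machine interpret the 32-bit pattern as a single-precision value."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]


def normal_value(sign, exponent_field, significand_field):
    """Rebuild a normalised value: restore the hidden bit, remove the excess-127 bias."""
    mantissa = 1 + significand_field / 2 ** 23
    return (-1) ** sign * mantissa * 2.0 ** (exponent_field - 127)


def add_positive_singles(a_bits, b_bits):
    """Add two positive normalised singles by aligning exponents (no rounding)."""
    _, ea, fa = decode_single(a_bits)
    _, eb, fb = decode_single(b_bits)
    ma = fa | (1 << 23)            # restore the hidden bits
    mb = fb | (1 << 23)
    if ea < eb:                    # make (ea, ma) the operand with the larger exponent
        ea, ma, eb, mb = eb, mb, ea, ma
    mb >>= ea - eb                 # shift the smaller operand's mantissa right
    total = ma + mb
    if total >> 24:                # carry out of the mantissa: renormalise
        total >>= 1
        ea += 1
    return (ea << 23) | (total & 0x7FFFFF)


print(decode_single(0xBEC00000))
print(normal_value(*decode_single(0xBEC00000)))
# 42.6875 is 0x422AC000 and +0.375 is 0x3EC00000 in single precision.
print(bits_to_single(add_positive_singles(0x422AC000, 0x3EC00000)))
```

Keeping the mantissas as integers makes the alignment shift explicit: 0.375 has to be shifted right by 132 – 125 = 7 places before the two mantissas can be added.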
Denormalised Numbers
An Exponent of All 0's is used to represent Zero and Denormalised numbers, while All 1's is used to represent Infinities and Not-A-Numbers (NaNs).
This means that the maximum range for normalised numbers is reduced, i.e. for Single Precision the range is –126 .. +127 rather than –127 .. +128 as one might expect for Excess 127.
Denormalised Numbers represent values between the Underflow limits and zero, i.e. for single precision we have:
± 0.F x 2^-126

IEEE 754 floating point numbers - questions
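
The reserved exponent patterns can be illustrated with a small classifier for single-precision bit patterns. The function names `classify_single` and `denormal_value` are assumptions made for this sketch, which works directly on the fields described above.

```python
def classify_single(bits):
    """Classify a 32-bit single-precision pattern by its exponent field."""
    exponent_field = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    if exponent_field == 0:              # all 0's: zero or denormalised
        return "zero" if fraction == 0 else "denormalised"
    if exponent_field == 0xFF:           # all 1's: infinity or NaN
        return "infinity" if fraction == 0 else "NaN"
    return "normalised"


def denormal_value(bits):
    """Value of a denormalised single: +/- 0.F x 2^-126 (no hidden bit)."""
    sign = -1.0 if bits >> 31 else 1.0
    fraction = bits & 0x7FFFFF
    return sign * (fraction / 2 ** 23) * 2.0 ** -126


print(classify_single(0x00000001), denormal_value(0x00000001))
print(classify_single(0x7F800000))
print(classify_single(0x7FC00000))
```

With only the lowest fraction bit set, the value is 2^-23 x 2^-126 = 2^-149, the smallest positive single-precision number.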
Operations with a NaN operand yield either a NaN result (quiet NaN operand)
or an exception (signalling NaN operand)
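
The difference between the two kinds of NaN can be observed in Python, with one caveat: Python's built-in binary floats give no direct way to create a signalling NaN, so this sketch switches to the `decimal` module, which models both quiet (`NaN`) and signalling (`sNaN`) operands. That substitution, and the helper name `add_one`, are assumptions of this example.

```python
import math
from decimal import Decimal, InvalidOperation, localcontext

# Binary floating point: a quiet NaN operand simply yields a NaN result.
quiet_result = float("nan") + 1.0
print(math.isnan(quiet_result))


def add_one(operand):
    """Add 1 to a decimal operand with the invalid-operation trap enabled."""
    with localcontext() as ctx:
        ctx.traps[InvalidOperation] = True
        return Decimal(operand) + 1


print(add_one("NaN"))            # quiet NaN propagates to the result

try:
    add_one("sNaN")              # signalling NaN raises an exception
except InvalidOperation:
    print("signalling NaN raised InvalidOperation")
```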