IEEE 754 Floating Point Standard

Download as pdf or txt
Download as pdf or txt
You are on page 1of 2

CO2103

IEEE 754 Floating Point Standard In lecture slides CO2103 Chapter 03 on Background, we briefly mentioned how computer stores floating point numbers. The format used in the representation of floating point number in the computer is based on the IEEE 754 Floating Point Standard. All floating point numbers will be normalized and the normalized form will be stored in the computer in accordance to IEEE 754 standard. Normalized form: 1.xxxxxx 2yyyy IEEE 754 Floating Point Standard: -1S (1.0 + 0.M) 2E

The Sign (S) bit indicates if the number is positive (S=0) or negative (S=1). With normalized form, only the fractional part of the mantissa needs to be stored. The Mantissa (M) bits are the xxxxxx after the radix point. M is stored in natural binary form. The Exponential (E) bits are the yyyy, which are represented in bias-m to ease comparisons. Using 1. 2. 3. normalized scientific notation Simplifies the exchange (and representation) of data that includes floating-point numbers Simplifies the arithmetic algorithms to know that the numbers will always be in this form Increases the accuracy of the numbers that can be stored in a word, since each unnecessary leading 0 is replaced by another significant digit to the right of the decimal point

Under IEEE 754 standard, floating point numbers can be represented in either of the two precisions: Single-Precision (32-bit) or Double-Precision (64-bit).

Bit No 31 23-30

Size

Field Name

Bit No 63

Size

Field Name

1 bit Sign (S) 8 bits Exponent (E)

1 bit Sign (S)

52-62 11 bits Exponent (E) 0-51 52 bits Mantissa (M) Double-Precision

0-22 23 bits Mantissa (M) Single-Precision

Single-Precision floating point numbers will occupy 32 bits and give approx range of 10-38 1038. The Exponent (E) is represented in bias-127. Double-Precision floating point numbers will occupy 64 bits and give approx range of 10-308 10308. The Exponent (E) is represented in bias-1023. Few examples for Single-Precision:
Number (binary) -10.00111 101101.111011 -0.001111 0.0000101111 = = = = Normalized (binary) -1.000111121 1.0110111101125 -1.1112-3 1.011112-5 S 1 0 1 0 E (8-bit in bias-127) 1+127=128=100000002 10000100 01111100 01111010 M (23-bit) 00001111 001101111011 00111 001111 IEEE 754 Single (32-bit) 1100000111 010000101101111011 101111100111 00111101001111

There are two potential errors in representing a floating numbers in IEEE 754 format: Overflow - the exponent is too large to be represented in the Exponent field Underflow - the number is too small to be represented in the Exponent field

[email protected]

CO2103

To reduce the chances of underflow/overflow, can use 64-bit Double-Precision arithmetic For further reference: http://babbage.cs.qc.edu/IEEE-754/References.xhtml. The above material was prepared with reference to http://www.doc.ic.ac.uk/~ih. Exercises: 1. Determine the normalized binary for the following decimal numbers: a) 234.625 b) -890.375 c) -0.001007080078125 d) 0.000091552734375
(Ans. a) 11101010.101, b) -1101111010.011, c) -0.000000000100001, d) 0.000000000000011)

2.

Represent the above floating point numbers in IEEE 754 Single-Precision format. Write your answers in Hex.
(Ans. a) 436AA000, b) C45E9800, c) BA840000, d) 38C00000)

3.

Determine the decimal Double-Precision number.


(Ans. -1.9726562500000000e-1)

value

of

BFC940000000000016,

which

is

an

IEEE

754

[email protected]

You might also like