Floating Point Representation
• A binary number in NSN (normalized scientific notation) form always has the form 1.F x 2^E
• To store this in a computer, we need to store (i) the signs of the number and of the
exponent, (ii) the fraction (F) and (iii) the exponent (E)
• We do not need to store the ‘1’, the ‘.’ or the base ‘2’, because they are the same for every
floating point number in NSN form.
• If we store (1) the signs, (2) F and (3) E, we can retrieve the complete number
as 1.F x 2^E (see the sketch after this list)
• The IEEE 754 Standard is an international standard for storing floating
point numbers in computer memory
• In ‘single precision’, it uses a 32-bit word.
• In ‘double precision’, it uses a 64-bit word (or two 32-bit words).
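For example, the following Python sketch (decompose and recompose are illustrative names chosen here, not part of any standard) splits a nonzero value into its sign, fraction and exponent and then rebuilds it from those three pieces alone:

```python
import math

def decompose(x):
    # Split a nonzero float into (sign, F, E) so that
    # x == (-1)**sign * (1 + F) * 2**E, i.e. the 1.F x 2^E form.
    sign = 0 if x > 0 else 1
    m, e = math.frexp(abs(x))          # abs(x) == m * 2**e with 0.5 <= m < 1
    significand, E = 2.0 * m, e - 1    # rescale so the significand is 1.F
    return sign, significand - 1.0, E

def recompose(sign, F, E):
    return (-1) ** sign * (1.0 + F) * 2.0 ** E

s, F, E = decompose(-6.5)              # -6.5 = -1.101 (binary) x 2^2
print(s, F, E)                         # 1 0.625 2
print(recompose(s, F, E))              # -6.5
```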
2. IEEE 754 in Single Precision
• The standard specifies a 32-bit word laid out as shown below
Bit 31 (MSB): sign (s) | Bits 30-23: exponent + bias (127) | Bits 22-0: fraction (F), with bit 0 as the LSB
• Bit # 0 is the rightmost bit, also called the least significant bit (LSB)
• Bit # 31 is the leftmost bit, also called the most significant bit (MSB)
• The MSB is used as the sign bit (s) of the number
• Bits 23-30 (eight bits) are used to store the exponent plus the bias (127)
• Bits 0-22 (23 bits) are used to store the fraction (F) (see the sketch after this list)
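As a sketch of how these three fields can be pulled out of the 32-bit word (using Python's standard struct module; the helper name fields is just for this example, and -28.75 is chosen because it appears to match the bit pattern worked through on the next slide):

```python
import struct

def fields(x):
    # Pack x as an IEEE 754 single-precision value, then extract the
    # sign bit (bit 31), biased exponent (bits 30-23) and fraction (bits 22-0).
    (w,) = struct.unpack('>I', struct.pack('>f', x))
    sign     = (w >> 31) & 0x1
    exponent = (w >> 23) & 0xFF        # stored value is E + 127
    fraction = w & 0x7FFFFF            # the 23 fraction bits of F
    return sign, exponent, fraction

s, e, f = fields(-28.75)
print(s, e - 127, format(f, '023b'))   # 1 4 11001100000000000000000
```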
3. Storing a number in IEEE 754 Single Precision
Bit 31 (MSB): sign (s) | Bits 30-23: exponent + bias (127) | Bits 22-0: fraction (F), with bit 0 as the LSB
1 | 1 0 0 0 0 0 1 1 | 1 1 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
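Reading this pattern with the layout above gives sign = 1, biased exponent = 10000011 (binary) = 131 (so E = 131 - 127 = 4) and fraction F = .110011 (binary), i.e. a significand of 1.796875, so the stored value appears to be -1.796875 x 2^4 = -28.75. A minimal Python check of that reading:

```python
import struct

# sign | biased exponent | 23-bit fraction, as shown on the slide
bits = '1' + '10000011' + '11001100' + '0' * 15
word = int(bits, 2).to_bytes(4, 'big')
print(struct.unpack('>f', word)[0])    # -28.75
```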
6. IEEE 754 Single Precision Floating Format (32 bits)
0 | 0 1 1 1 1 1 1 0 | 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
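Reading this second pattern the same way: sign = 0, biased exponent = 01111110 (binary) = 126 (so E = -1) and fraction F = .1 (binary), giving a significand of 1.5; the stored value therefore appears to be 1.5 x 2^-1 = 0.75. (The same struct check as above, with bits = '0' + '01111110' + '1' + '0' * 22, prints 0.75.)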
• IEEE 754 double precision uses one 64-bit register / memory cell, laid out as follows (see the sketch below)
Bit 63 (MSB): sign (s) | Bits 62-52: 11-bit exponent + bias (1023) | Bits 51-0: 52 bits of fraction (F), with bit 0 as the LSB
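The same field extraction can be sketched for double precision (again using Python's standard struct module; fields64 is an illustrative name chosen here):

```python
import struct

def fields64(x):
    # Pack x as an IEEE 754 double-precision value, then extract the
    # sign bit (bit 63), biased exponent (bits 62-52) and fraction (bits 51-0).
    (w,) = struct.unpack('>Q', struct.pack('>d', x))
    sign     = (w >> 63) & 0x1
    exponent = (w >> 52) & 0x7FF        # stored value is E + 1023
    fraction = w & ((1 << 52) - 1)      # the 52 fraction bits of F
    return sign, exponent, fraction

s, e, f = fields64(-28.75)
print(s, e - 1023, format(f, '052b')[:8])   # 1 4 11001100
```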
7. Review Questions
• Question 1: What is the precision of the IEEE 754 standard using single precision?
• Answer: Here, precision means the smallest magnitude that can be stored. For the 32-bit
standard, the 8-bit exponent field uses a bias of 2^7 - 1 = 127; since the all-zero and
all-one exponent patterns are reserved, the smallest normalized magnitude is 2^-126.
• Question 2. Can we store a number smaller than 2^-126 in the single precision IEEE
standard?
• Answer: Not as a normalized number. Values smaller than this underflow toward zero
(subnormal encodings extend the range slightly, at the cost of precision), as illustrated
in the sketch after this answer.
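A rough illustration of this threshold (a sketch only; the helper name to_float32 is just chosen for this example):

```python
import struct

def to_float32(x):
    # Round a Python float (a double) to IEEE 754 single precision and back.
    return struct.unpack('>f', struct.pack('>f', x))[0]

print(to_float32(2.0 ** -126))   # 1.1754943508222875e-38, the smallest normalized single
print(to_float32(2.0 ** -150))   # 0.0: too small, underflows to zero
```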
• Question 3. What is the possible impact of questions 1 & 2?
• Answer: In numerical methods using single precision, we cannot use a step size smaller
than this precision. If we compare the result of a floating point operation with 0, the
equality test may not come out true even when the exact answer is zero. There are also
two zeros (+0 and -0), depending on whether the result underflows from the positive or
the negative side. Where an algorithm must compare a result exactly with zero, we should
work with integer values instead, as sketched below.
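A small illustration of both points in Python (the tolerance 1e-9 is an arbitrary choice for this example):

```python
import math

# 0.1 + 0.2 is not exactly 0.3 in binary floating point,
# so the difference does not compare equal to zero.
x = 0.1 + 0.2 - 0.3
print(x == 0.0)                    # False
print(abs(x) < 1e-9)               # True: compare against a tolerance instead

# Scaling to integers (here, counting in tenths) makes the test exact.
print((1 + 2 - 3) == 0)            # True

# Both signed zeros exist; they compare equal but carry different sign bits.
print(0.0 == -0.0)                 # True
print(math.copysign(1.0, -0.0))    # -1.0
```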