Fixed Point and Floating Point Number Representations

Digital computers represent all data, including numbers, using binary digits. There are two main approaches for representing real numbers in computing: fixed point notation and floating point notation. Fixed point notation assigns a fixed number of bits to the integer and fractional parts, which limits the range of representable values. Floating point notation uses a mantissa and an exponent, which gives a much wider range at the cost of non-uniform spacing between representable values. The IEEE 754 standard specifies floating point formats in half, single, double, and quadruple precision.

Uploaded by

Santhalakshmi Sn


Digital computers use the binary number system to represent all types of information
internally. Alphanumeric characters, like numbers, are stored as patterns of binary digits
(bits), each of which is either 0 or 1. Digital representations are easy to design and
store, and they offer high accuracy and precision.
There are several number systems that can be used to write numbers, for example the
binary, octal, decimal, and hexadecimal number systems. The binary number system is
the most relevant one for representing numbers in a digital computer system.
Storing Real Numbers:

There are two major approaches to store real numbers (i.e., numbers with a fractional
component) in modern computing. These are (i) Fixed Point Notation and (ii) Floating
Point Notation. In fixed point notation, there is a fixed number of digits after the
radix point, whereas floating point notation allows a varying number of digits after
the radix point.
Fixed-Point Representation:
This representation has a fixed number of bits for the integer part and for the fractional
part. For example, if the fixed-point format is IIII.FFFF in decimal digits, then the
smallest nonzero value that can be stored is 0000.0001 and the largest is 9999.9999.
There are three parts of a fixed-point number representation: the sign field, the integer
field, and the fractional field.

We can represent these numbers using:

• Signed magnitude representation: range from $-(2^{k-1}-1)$ to $(2^{k-1}-1)$, for k bits.
• 1's complement representation: range from $-(2^{k-1}-1)$ to $(2^{k-1}-1)$, for k bits.
• 2's complement representation: range from $-2^{k-1}$ to $(2^{k-1}-1)$, for k bits.

2's complement representation is preferred in computer systems because it has a single,
unambiguous representation of zero and makes arithmetic operations easier.
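
The three ranges can be checked numerically. The following is a minimal sketch; the function name `signed_ranges` and the dictionary keys are illustrative choices, not part of the original notes.

```python
def signed_ranges(k: int) -> dict[str, tuple[int, int]]:
    """Return (min, max) for each k-bit signed integer encoding."""
    if k < 2:
        raise ValueError("need at least 2 bits: one sign bit and one magnitude bit")
    largest = 2 ** (k - 1) - 1
    return {
        # Sign magnitude and 1's complement both spend a bit pattern on -0.
        "sign_magnitude": (-largest, largest),
        "ones_complement": (-largest, largest),
        # 2's complement has one zero, so it gains one extra negative value.
        "twos_complement": (-largest - 1, largest),
    }


print(signed_ranges(8))
```

For k = 8 the 2's complement range is one value wider on the negative side than the other two, because it has only one encoding of zero.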
Example: Assume a number uses a 32-bit format which reserves 1 bit for the sign, 15 bits
for the integer part, and 16 bits for the fractional part.
Then -43.625 is represented as follows:

1 000000000101011 1010000000000000

Here the sign bit uses 0 to represent + and 1 to represent -. 000000000101011 is the
15-bit binary value for decimal 43, and 1010000000000000 is the 16-bit binary value for
the fraction 0.625.
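
The encoding in this example can be reproduced with a short sketch. It assumes the sign, integer, and fraction fields described above; the helper names `encode_fixed` and `decode_fixed` are hypothetical, and values that do not fit the fraction field exactly are rounded to the nearest step.

```python
def encode_fixed(value: float, int_bits: int = 15, frac_bits: int = 16) -> str:
    """Encode value as 'sign integer fraction' bit fields (sign magnitude)."""
    sign = "1" if value < 0 else "0"
    # Scale by 2**frac_bits so the fraction becomes part of an integer.
    scaled = round(abs(value) * (1 << frac_bits))
    if scaled >= 1 << (int_bits + frac_bits):
        raise OverflowError("value does not fit in the integer field")
    integer_part = scaled >> frac_bits
    fraction_part = scaled & ((1 << frac_bits) - 1)
    return f"{sign} {integer_part:0{int_bits}b} {fraction_part:0{frac_bits}b}"


def decode_fixed(bits: str, frac_bits: int = 16) -> float:
    """Inverse of encode_fixed for a 'sign integer fraction' string."""
    sign, integer_part, fraction_part = bits.split()
    magnitude = int(integer_part + fraction_part, 2) / (1 << frac_bits)
    return -magnitude if sign == "1" else magnitude


print(encode_fixed(-43.625))  # 1 000000000101011 1010000000000000
print(decode_fixed("1 000000000101011 1010000000000000"))  # -43.625
```

Scaling by a power of two and storing the result as an integer is why fixed-point arithmetic can reuse ordinary integer hardware.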
The advantage of using a fixed-point representation is performance, and the disadvantage
is the relatively limited range of values that it can represent. It is therefore usually
inadequate for numerical analysis, which needs both a wider range and finer accuracy. A
number whose representation exceeds 32 bits would have to be stored inexactly.
The smallest and largest positive numbers that can be stored in the 32-bit format given
above follow directly from the field widths. The smallest positive number is
$2^{-16} \approx 0.000015$, and the largest positive number is
$(2^{15}-1) + (1-2^{-16}) = 2^{15} - 2^{-16} \approx 32768$. The gap between adjacent
representable numbers is $2^{-16}$ everywhere in the range.

The radix point can be placed further left or right by changing how many bits are assigned to the integer field and how many to the fractional field.
Floating-Point Representation:
This representation does not reserve a specific number of bits for the integer part or the
fractional part. Instead it reserves a certain number of bits for the digits of the number
(called the mantissa or significand) and a certain number of bits to say where within
those digits the radix point sits (called the exponent).
The floating-point representation of a number has two parts. The first part is a signed
fixed-point number called the mantissa. The second part designates the position of the
decimal (or binary) point and is called the exponent. The fixed-point mantissa may be a
fraction or an integer. A floating-point number is always interpreted as a number of the
form $m \times r^e$, where r is the radix.
Only the mantissa m and the exponent e are physically represented in the register
(including their signs). A floating-point binary number is represented in the same manner,
except that it uses base 2 for the exponent. A floating-point binary number is said to be
normalized if the most significant digit of the mantissa is 1.

So, the actual number is $(-1)^s (1+m) \times 2^{e-\text{Bias}}$, where s is the sign bit, m is
the stored fractional part of the mantissa, e is the stored exponent value, and Bias is
the bias number.
Note that signed integers and exponents can be represented by sign magnitude
representation, one's complement representation, or two's complement representation.
The floating-point representation is more flexible. Any non-zero number x can be
written in the normalized form $x = \pm(1.b_1 b_2 b_3 \ldots)_2 \times 2^n$.
Example: Suppose a number uses a 32-bit format: 1 sign bit, 8 bits for a signed
exponent, and 23 bits for the fractional part. The leading bit 1 is not stored (as it is
always 1 for a normalized number) and is referred to as a "hidden bit".
Then -53.5 is normalized as $-53.5 = (-110101.1)_2 = (-1.101011)_2 \times 2^5$, which is
represented as follows:

1 00000101 10101100000000000000000

Here 00000101 is the 8-bit binary value of the exponent +5, and the fraction bits 101011
are padded with zeros to fill the 23-bit field.
Note that the 8-bit exponent field is used to store integer exponents $-126 \le n \le 127$.
The smallest normalized positive number that fits into 32 bits is
$(1.00000000000000000000000)_2 \times 2^{-126} = 2^{-126} \approx 1.18 \times 10^{-38}$, and the
largest normalized positive number that fits into 32 bits is
$(1.11111111111111111111111)_2 \times 2^{127} = (2^{24}-1) \times 2^{104} \approx 3.40 \times 10^{38}$.
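
These two limits can be confirmed with Python's standard `struct` module. The sketch below assumes the IEEE 754 single precision layout with a biased exponent (bias 127), which covers the same exponent range of -126 to 127 as the signed exponent in the example. The helper name `bits_to_float32` is illustrative.

```python
import struct


def bits_to_float32(bits: str) -> float:
    """Interpret a 32-character bit string (spaces allowed) as an IEEE 754 float32."""
    raw = int(bits.replace(" ", ""), 2)
    return struct.unpack(">f", raw.to_bytes(4, "big"))[0]


# Exponent field 00000001 means 1 - 127 = -126; fraction all zeros.
smallest_normal = bits_to_float32("0 00000001 " + "0" * 23)
# Exponent field 11111110 means 254 - 127 = 127; fraction all ones.
largest_normal = bits_to_float32("0 11111110 " + "1" * 23)

print(smallest_normal == 2.0 ** -126)                     # True
print(largest_normal == float((2 ** 24 - 1) * 2 ** 104))  # True
print(f"{smallest_normal:.3e}  {largest_normal:.3e}")
```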

The precision of a floating-point format is the number of positions reserved for binary
digits plus one (for the hidden bit). In the examples considered here the precision is
23 + 1 = 24.
The gap between 1 and the next normalized floating-point number is known as machine
epsilon. For the example above the gap is $(1+2^{-23}) - 1 = 2^{-23}$. This is not the same
as the smallest positive floating-point number, because floating-point numbers are
spaced non-uniformly, unlike in the fixed-point scenario.
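
The non-uniform spacing can be observed directly. The following sketch assumes IEEE 754 single precision and steps to the next representable value by adding 1 to the raw bit pattern, which is valid only for positive finite inputs below the largest float. The name `next_float32_after` is hypothetical.

```python
import struct


def next_float32_after(x: float) -> float:
    """Return the next larger float32 after a positive finite float32 value x."""
    (raw,) = struct.unpack(">I", struct.pack(">f", x))
    return struct.unpack(">f", struct.pack(">I", raw + 1))[0]


epsilon = next_float32_after(1.0) - 1.0
tiny = 2.0 ** -126
gap_near_tiny = next_float32_after(tiny) - tiny

print(epsilon == 2.0 ** -23)          # True: gap just above 1
print(gap_near_tiny == 2.0 ** -149)   # True: gap just above the smallest normal
```

The gap scales with the magnitude of the number: it is $2^{-23}$ near 1 but only $2^{-149}$ near $2^{-126}$.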
Note that non-terminating binary numbers cannot be represented exactly in floating-point
representation. For example, $1/3 = (0.010101\ldots)_2$ cannot be a floating-point number
because its binary representation is non-terminating, so it is stored as a nearby value.
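
A small check makes the rounding visible. This sketch assumes IEEE 754 single precision and uses `fractions.Fraction` to recover the exact rational value that is actually stored. The helper `as_float32` is an illustrative name.

```python
import struct
from fractions import Fraction


def as_float32(x: float) -> float:
    """Round a Python float (double precision) to the nearest float32 value."""
    return struct.unpack(">f", struct.pack(">f", x))[0]


stored = Fraction(as_float32(1 / 3))
print(stored)                      # exact value held in the 32-bit word
print(stored == Fraction(1, 3))    # False: 1/3 had to be rounded
print(Fraction(as_float32(0.625)) == Fraction(5, 8))  # True: 0.625 terminates in binary
```

The stored value always has a power of two as its denominator, which is why only fractions that terminate in binary are exact.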
IEEE Floating Point Number Representation:
IEEE (Institute of Electrical and Electronics Engineers) has standardized floating-point
representation in the IEEE 754 standard.

So, the actual number is $(-1)^s (1+m) \times 2^{e-\text{Bias}}$, where s is the sign bit, m is
the stored fractional part of the mantissa, e is the stored exponent value, and Bias is
the bias number. The sign bit is 0 for a positive number and 1 for a negative number.
The exponent is stored in biased form, that is, as the true exponent plus the bias,
rather than in two's complement representation.
According to the IEEE 754 standard, a floating-point number is represented in one of the
following formats:

• Half Precision (16 bit): 1 sign bit, 5 bit exponent, and 10 bit mantissa
• Single Precision (32 bit): 1 sign bit, 8 bit exponent, and 23 bit mantissa
• Double Precision (64 bit): 1 sign bit, 11 bit exponent, and 52 bit mantissa
• Quadruple Precision (128 bit): 1 sign bit, 15 bit exponent, and 112 bit mantissa
Special Value Representation:
Some special values are encoded by particular combinations of the exponent and
mantissa fields in the IEEE 754 standard.

• All exponent bits 0 with all mantissa bits 0 represents 0. If the sign bit is 0, then
+0, else -0.
• All exponent bits 1 with all mantissa bits 0 represents infinity. If the sign bit is 0,
then +∞, else -∞.
• All exponent bits 0 with non-zero mantissa bits represents a denormalized
number.
• All exponent bits 1 with non-zero mantissa bits represents NaN (Not a Number), which
signals an invalid or undefined result.
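
The four rules above can be sketched as a small classifier for single precision bit strings. The function name `classify_float32` and the returned labels are illustrative choices.

```python
def classify_float32(bits: str) -> str:
    """Classify a 32-bit IEEE 754 pattern given as a string of 0s and 1s."""
    bits = bits.replace(" ", "")
    if len(bits) != 32 or set(bits) - {"0", "1"}:
        raise ValueError("expected exactly 32 binary digits")
    sign = "-" if bits[0] == "1" else "+"
    exponent = int(bits[1:9], 2)   # 8-bit exponent field
    mantissa = int(bits[9:], 2)    # 23-bit fraction field
    if exponent == 0:
        return f"{sign}0" if mantissa == 0 else "denormalized"
    if exponent == 255:
        return f"{sign}infinity" if mantissa == 0 else "NaN"
    return "normalized"


print(classify_float32("0 11111111 " + "0" * 23))        # +infinity
print(classify_float32("1 00000000 " + "0" * 23))        # -0
print(classify_float32("0 11111111 1" + "0" * 22))       # NaN
print(classify_float32("0 00000000 " + "0" * 22 + "1"))  # denormalized
```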
Floating Point Representation | Digital Logic
1. To convert a floating point bit pattern into decimal, we read the 3 elements of a
32-bit floating point representation:
i) Sign
ii) Exponent
iii) Mantissa
• Sign bit is the first bit of the binary representation. '1' implies a negative
number and '0' implies a positive number.
Example: 11000001110100000000000000000000 is a negative number, because
its first bit is 1.
• Exponent is decided by the next 8 bits of the binary representation. The stored
value is offset by a constant known as the bias, which is 127 for the 32-bit
floating point representation. The bias is determined by $2^{k-1} - 1$, where 'k'
is the number of bits in the exponent field.
There are 3 exponent bits in the 8-bit representation and 8 exponent bits
in the 32-bit representation.
Thus
bias = 3 for 8 bit conversion ($2^{3-1} - 1 = 4 - 1 = 3$)
bias = 127 for 32 bit conversion ($2^{8-1} - 1 = 128 - 1 = 127$)
Example: 1 10000011 10100000000000000000000
$(10000011)_2 = (131)_{10}$
131 - 127 = 4
Hence the exponent of 2 will be 4, i.e. $2^4 = 16$.
• Mantissa is calculated from the remaining 23 bits of the binary
representation. It consists of an implied '1' and a fractional part, which is
determined by weighting each stored bit by its place value (1/2, 1/4, 1/8, ...).
Example:
1 10000011 10100000000000000000000
The fractional part of the mantissa is given by:
1*(1/2) + 0*(1/4) + 1*(1/8) + 0*(1/16) + ... = 0.625
Thus the mantissa will be 1 + 0.625 = 1.625
The decimal number is hence given as: Sign * $2^{\text{exponent}}$ * Mantissa =
(-1)*(16)*(1.625) = -26
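
The three decoding steps can be sketched as follows. This is a minimal illustration for normalized single precision values only, so it does not handle zero, denormalized numbers, infinity, or NaN. The function name `decode_float32` is hypothetical.

```python
def decode_float32(bits: str) -> float:
    """Decode a normalized 32-bit pattern using sign, biased exponent, and mantissa."""
    bits = bits.replace(" ", "")
    sign = -1 if bits[0] == "1" else 1
    exponent = int(bits[1:9], 2) - 127          # remove the bias of 127
    # Each fraction bit i (counting from 1) contributes bit * 2**-i.
    fraction = sum(int(b) * 2.0 ** -(i + 1) for i, b in enumerate(bits[9:]))
    mantissa = 1 + fraction                     # restore the hidden leading 1
    return sign * mantissa * 2.0 ** exponent


print(decode_float32("11000001110100000000000000000000"))  # -26.0
```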
2. To convert a decimal number into floating point, we fill the 3 elements of a
32-bit floating point representation:
i) Sign (MSB)
ii) Exponent (8 bits after MSB)
iii) Mantissa (Remaining 23 bits)
• Sign bit is the first bit of the binary representation. '1' implies a negative
number and '0' implies a positive number.
Example: To convert -17 into 32-bit floating point representation, the sign
bit = 1.
• Exponent is decided by the largest power of two, $2^n$, that is less than or
equal to the magnitude of the number. For 17, that power is 16. Hence the
exponent of 2 will be 4, since $2^4 = 16$. The stored value is offset by the
bias, which is 127 for the 32-bit floating point representation. The bias is
determined by $2^{k-1} - 1$, where 'k' is the number of bits in the exponent field.
Thus bias = 127 for 32 bit ($2^{8-1} - 1 = 128 - 1 = 127$).
Now, 127 + 4 = 131, i.e. 10000011 in binary representation.
• Mantissa: 17 in binary = 10001.
Move the binary point so that there is only one bit to its left. Adjust
the exponent of 2 so that the value does not change. This is
normalizing the number: $1.0001 \times 2^4$. Now take the fractional part
and extend it to 23 bits by appending zeros:
00010000000000000000000
Thus the floating point representation of -17 is 1 10000011
00010000000000000000000
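
The result can be cross-checked with Python's `struct` module. Note that this sketch does not follow the manual steps above: it lets the platform produce the IEEE 754 single precision encoding and then splits the bits into the three fields. The name `encode_float32` is illustrative.

```python
import struct


def encode_float32(value: float) -> str:
    """Return the float32 encoding of value as 'sign exponent mantissa' bit fields."""
    (raw,) = struct.unpack(">I", struct.pack(">f", value))
    bits = f"{raw:032b}"
    return f"{bits[0]} {bits[1:9]} {bits[9:]}"


print(encode_float32(-17.0))  # 1 10000011 00010000000000000000000
```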
