
3.1 Data representation

3.1.3 Real numbers and normalized floating-point representation

In decimal notation, the number “23.456” can be written as “0.23456 × 10²”. This means that in decimal notation, we only need to store the numbers “0.23456” and “2”. The number “0.23456” is called the mantissa and the number “2” is called the exponent. The same thing happens in binary.

For example, consider the binary number “10111”. This could be represented as “0.10111 × 2⁵” or, with the exponent itself written in binary, “0.10111 × 2¹⁰¹”. Here “0.10111” is the mantissa and “101” is the exponent.

Similarly, in decimal, 0.0000246 can be written as “0.246 × 10⁻⁴”. Now the mantissa is “0.246” and the exponent is “-4”.

Thus, in binary, “0.00010101” can be written as “0.10101 × 2⁻¹¹”. Now the mantissa is “0.10101” and the exponent is “-11”.

By now it should be clear that we need to store two numbers, the mantissa and the exponent. This form of representation is called floating-point form. Numbers that involve a fractional part, like “2.467₁₀” and “101.0101₂”, are called real numbers.


Converting binary floating-point real numbers into denary and vice versa

Convert “2.625” into 8-bit floating point format

Converting the integral part is simple: keep adding powers of 2 until you reach the number you want. In this case:

2 (the integral part) =

128 (2^7)  64 (2^6)  32 (2^5)  16 (2^4)  8 (2^3)  4 (2^2)  2 (2^1)  1 (2^0)
    0          0         0         0        0        0        1        0

Or simply “10”.

As for the fractional part, you must do repeated multiplication by 2 until the fractional remainder is zero. In this case:

0.625 (the fractional part)

- 0.625 × 2 = 1.25    1 (generate “1”, continue with the rest)
- 0.25 × 2 = 0.5      0 (because the whole part is “0”, we continue multiplying by 2)
- 0.5 × 2 = 1.0       1 (generate “1”, and nothing remains in the fractional part)

So 2₁₀ = 10₂ and 0.625₁₀ = 0.101₂

So 2.625₁₀ = 10.101₂ (binary floating-point real number)
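The two steps above (integral part by adding powers of 2, fractional part by repeated multiplication by 2) can be sketched in Python. This is a minimal illustration; the function name and the limit on the number of fraction bits are our own choices, not part of the original text:

```python
def to_binary_fixed_point(value, max_frac_bits=8):
    """Convert a non-negative decimal number to a binary string:
    the integral part by ordinary base conversion, the fractional
    part by repeated multiplication by 2."""
    integral = int(value)
    frac = value - integral
    int_bits = bin(integral)[2:]              # e.g. 2 -> "10"
    frac_bits = ""
    while frac > 0 and len(frac_bits) < max_frac_bits:
        frac *= 2
        bit = int(frac)                       # the generated bit (0 or 1)
        frac_bits += str(bit)
        frac -= bit                           # keep only the remainder
    return int_bits + ("." + frac_bits if frac_bits else "")

print(to_binary_fixed_point(2.625))   # 10.101
```

For values like 14.7, whose fractional part never terminates in binary, the loop stops at max_frac_bits instead of running forever.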


Convert 0.40625 into 8-bit floating point format

- 0.40625 × 2 = 0.8125   0
- 0.8125 × 2 = 1.625     1
- 0.625 × 2 = 1.25       1
- 0.25 × 2 = 0.5         0
- 0.5 × 2 = 1.0          1

So 0.40625₁₀ = 0.01101₂ (binary floating-point real number)

Convert 14.7 into 8-bit floating point format

14 (the integral part) =

128 (2^7)  64 (2^6)  32 (2^5)  16 (2^4)  8 (2^3)  4 (2^2)  2 (2^1)  1 (2^0)
    0          0         0         0        1        1        1        0

Or simply “1110”

0.7 (the fractional part)

- 0.7 × 2 = 1.4   1
- 0.4 × 2 = 0.8   0
- 0.8 × 2 = 1.6   1
- 0.6 × 2 = 1.2   1
- 0.2 × 2 = 0.4   0
- 0.4 × 2 = 0.8   0
- 0.8 × 2 = 1.6   1
- 0.6 × 2 = 1.2   1

This process goes on endlessly. The number “7/10”, which is a perfectly ordinary decimal fraction, is a repeating fraction in binary, just as the fraction “1/3” is a repeating fraction in decimal. We cannot represent this number precisely as a floating-point number. The closest we can get with four bits is “.1011”. Since the integral part “14” already needs four bits (“1110”), the best eight-bit number we can make is “1110.1011”.

So 14.7₁₀ ≈ 1110.1011₂ (binary floating-point real number)

Convert 1101.1100 into denary format

Integral part: 1101


Fractional part: 1100

Converting the integral part is simply converting from binary to decimal:

128 (2^7)  64 (2^6)  32 (2^5)  16 (2^4)  8 (2^3)  4 (2^2)  2 (2^1)  1 (2^0)
    0          0         0         0        1        1        0        1

1101₂ = 8 + 4 + 1 = 13₁₀

Converting the fractional part is similar to converting the integral part, but using negative powers of 2:

1/2 (2^-1)  1/4 (2^-2)  1/8 (2^-3)  1/16 (2^-4)  1/32 (2^-5)  1/64 (2^-6)  1/128 (2^-7)  1/256 (2^-8)
    1           1           0            0            0            0             0             0

0.1100₂ = (1/2) + (1/4) = 0.75₁₀

So 1101.1100₂ = 13.75₁₀
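The place-value method above can be checked with a short Python sketch (the helper name is ours):

```python
def binary_to_denary(s):
    """Convert a binary fixed-point string such as '1101.1100'
    to denary by summing the place values of the 1 bits."""
    int_part, _, frac_part = s.partition(".")
    value = sum(int(b) * 2**i for i, b in enumerate(reversed(int_part)))
    value += sum(int(b) * 2**-(i + 1) for i, b in enumerate(frac_part))
    return value

print(binary_to_denary("1101.1100"))   # 13.75
```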


Normalizing a Real Number

In the above examples, the binary point in the mantissa was always placed immediately before the first non-zero digit. For positive numbers this is always done, because it lets us use the maximum number of significant digits.

Suppose we use 8 bits to hold the mantissa and 8 bits to hold the exponent. The binary number “10.11011” becomes “0.1011011 × 2¹⁰” and can be held as:

Mantissa: 0 1 0 1 1 0 1 1    Exponent: 0 0 0 0 0 0 1 0

Notice that the first digit of the mantissa is 0 and the second digit is 1. The mantissa is said to be “normalized” if the first two digits are different. Thus, for a positive number, the first digit is always 0 and the second digit is always 1. The exponent is always an integer and is held in 2’s complement form.

Now consider the binary number “0.00000101011”, which is “0.1010110 × 2⁻¹⁰¹”. Thus the mantissa is “0.101011” and the exponent is “-101”. Again, using 8 bits for the mantissa and 8 bits for the exponent, we have:

Mantissa: 0 1 0 1 0 1 1 0    Exponent: 1 1 1 1 1 0 1 1

because the 2’s complement of “-101” (that is, -5) using 8 bits is “11111011”.
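The 8-bit 2’s complement pattern for a negative exponent can be verified in Python (a one-off check; the helper name is ours):

```python
def twos_complement(n, bits=8):
    """Bit pattern of n in a fixed width; negative values wrap
    around exactly as in 2's complement hardware."""
    return format(n & ((1 << bits) - 1), f"0{bits}b")

print(twos_complement(-5))   # 11111011  (the exponent -101)
print(twos_complement(2))    # 00000010
```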

The main reason we normalize floating-point numbers is to achieve the highest possible degree of accuracy.

Care must be taken when normalizing negative numbers. The easiest way is to first normalize the positive version of the number. Consider the binary number “-1011”. The positive version is “1011” = “0.1011 × 2¹⁰⁰” and can be represented by:

Mantissa: 0 1 0 1 1 0 0 0    Exponent: 0 0 0 0 0 1 0 0

To represent “-1011”, keep the exponent the same and take the 2’s complement of the mantissa:

Mantissa: 1 0 1 0 1 0 0 0    Exponent: 0 0 0 0 0 1 0 0


As another example, change the fraction “-11/32” into a normalized floating-point binary number.

Ignore the negative sign and solve for just “11/32”:

11/32 = 0.34375

- 0.34375 × 2 = 0.6875   0
- 0.6875 × 2 = 1.375     1
- 0.375 × 2 = 0.75       0
- 0.75 × 2 = 1.50        1
- 0.50 × 2 = 1.00        1

So 11/32₁₀ = 0.01011₂

Now, using an 8-bit mantissa (with the binary point assumed after the first bit) and an 8-bit exponent, 0.01011 × 2⁰ is held as:

Mantissa: 0 0 1 0 1 1 0 0    Exponent: 0 0 0 0 0 0 0 0

But this is not normalized. To normalize it, shift the mantissa one place to the left (removing the “0” in the “1/2” position) and subtract 1 from the exponent: 0 − 1 = −1.

That is:

Mantissa: 0 1 0 1 1 0 0 0    Exponent: 1 1 1 1 1 1 1 1

For “-11/32”, keep the exponent the same and convert the mantissa into its 2’s complement:

Mantissa: 1 0 1 0 1 0 0 0    Exponent: 1 1 1 1 1 1 1 1
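The normalization step itself — shift the mantissa left past the leading zeros and decrease the exponent once per shift — can be sketched in Python (the fraction bits are given as a string; names are ours):

```python
def normalize(frac_bits, exponent):
    """Normalize a positive binary fraction, given as the digits
    after the binary point (e.g. '01011' means 0.01011), so that
    the first digit after the point is 1."""
    while frac_bits and frac_bits[0] == "0":
        frac_bits = frac_bits[1:]   # shift the point one place right...
        exponent -= 1               # ...and compensate in the exponent
    return frac_bits, exponent

print(normalize("01011", 0))   # ('1011', -1), i.e. 0.1011 x 2^-1
```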


There is always a finite number of bits available to represent numbers in a computer. This means that if we allocate more bits to the mantissa, we must allocate fewer bits to the exponent.

Let us start off by using 8 bits for the mantissa and 8 bits for the exponent. The largest positive value we can have for the mantissa is 0.1111111 and the largest positive value we can have for the exponent is 01111111. This means that we can have

0.1111111 × 2⁰¹¹¹¹¹¹¹ = 0.1111111 × 2¹²⁷

This means that the largest positive number is almost 1 × 2¹²⁷.

The smallest positive mantissa is 0.1000000 and the smallest exponent is 10000000. This represents:

0.1000000 × 2¹⁰⁰⁰⁰⁰⁰⁰⁰ = 0.1000000 × 2⁻¹²⁸

which is very close to zero; in fact it is 2⁻¹²⁹.


The largest negative number (i.e. the negative number closest to zero) is

1.0111111 × 2¹⁰⁰⁰⁰⁰⁰⁰⁰ = -0.1000001 × 2⁻¹²⁸

Have you ever noticed that zero cannot be represented in normalized form? This is because “0.0000000” is not normalized: its first two digits are the same. Usually, the computer uses the smallest positive number to represent zero. Keep in mind that when we talk about the “size” of a number, the size increases as you move towards the right on the number line, so “-1” is greater than “-2”; whereas if we talk about the largest “magnitude” negative number, then “-2” is greater than “-1” because its integer value is greater.

Hence, for a positive number n:

2⁻¹²⁹ ≤ n < 2¹²⁷

And for a negative number n:

-2¹²⁷ ≤ n < -2⁻¹²⁹

Now suppose we use 12 of the bits for the mantissa and the other 4 bits for the exponent (12 + 4 = 16). We now have more binary places in the mantissa and hence greater accuracy. However, the range of values in the exponent is only from -8 (1000) to +7 (0111), which produces a small range of numbers. Hence, we have increased the accuracy of the numbers at the expense of the range that can be represented.

Similarly, allowing fewer bits for the mantissa reduces the accuracy, but allows a much greater range of values because the size of the exponent has increased.
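The trade-off can be made concrete with a quick calculation, assuming (as in the text) a normalized positive mantissa of the form 0.1xxx… and a 2’s complement exponent; the function name is ours:

```python
def largest_value(mantissa_bits, exponent_bits):
    """Largest positive number for a given bit split: maximum
    mantissa 0.111...1 times 2 to the maximum exponent."""
    max_mantissa = 1 - 2**-(mantissa_bits - 1)   # 0.1111111 for 8 bits
    max_exponent = 2**(exponent_bits - 1) - 1    # 01111111 for 8 bits
    return max_mantissa * 2**max_exponent

print(largest_value(8, 8))    # just under 2**127
print(largest_value(12, 4))   # just under 2**7 = 128
```

Moving four bits from the exponent to the mantissa shrinks the largest representable number from almost 2¹²⁷ to almost 2⁷, while making each number more precise.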


Integer Overflow

How are integers represented on a computer?

Most computers use the 2’s complement representation.

Assume we have 4 bits to store and operate on an integer; we may end up with the following situation: adding, for example, 0101 (5) and 0110 (6) gives the bit pattern 1011, which in 2’s complement represents -5. (The numbers in parentheses are the decimal values represented by the 2’s complement binary integers.) The result is obviously wrong, because adding positive numbers should never produce a negative number. The reason for this error is that the result exceeds the range of values that 4 bits can store in 2’s complement notation (-8 to +7).

Adding two large negative numbers gives a wrong answer for the same reason: the leading carry bit cannot be stored.
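The wrap-around can be demonstrated in Python by masking results to 4 bits, mimicking what a 4-bit register actually stores (names are ours):

```python
BITS = 4

def store(n):
    """Keep only the low 4 bits, as a 4-bit register would."""
    return n & ((1 << BITS) - 1)

def as_signed(pattern):
    """Read a 4-bit pattern as a 2's complement value."""
    return pattern - (1 << BITS) if pattern & (1 << (BITS - 1)) else pattern

# 5 + 6 = 11 exceeds the 4-bit range (-8..7), so the stored
# pattern 1011 reads back as a negative number:
result = store(5 + 6)
print(f"{result:04b}", as_signed(result))   # 1011 -5
```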


Integer Underflow

For integers, where we are only concerned with exceeding the available magnitude, there is little reason to talk about underflow. Talking about underflow makes sense for floating-point numbers.

For example, using a 4-bit 2’s complement representation for the exponent of a floating-point number would let us represent numbers of the order of 10⁻¹⁶ to 10¹⁵. If a number is smaller than 10⁻¹⁶, there would be no way to represent it, and this would lead to an “underflow”. Underflow in floating-point arithmetic may be thought of as “overflow of the exponent”.

In math, numbers have infinite precision, but numerals (representations of numbers) have finite precision. One third (1/3) cannot be represented precisely in decimal, because that would need an infinite number of digits (which is impossible). Similarly, “0.2” cannot be represented precisely in binary: its binary representation is necessarily an approximation.
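Python’s `decimal` module can reveal the approximation that is actually stored for 0.2 (a quick check, not part of the original text):

```python
from decimal import Decimal

# The double-precision value nearest to 0.2 is slightly larger than 0.2:
print(Decimal(0.2))

# The tiny error surfaces in ordinary arithmetic:
print(0.1 + 0.2 == 0.3)   # False
```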

A floating-point underflow happens when the result of a calculation is too small to be stored. An
underflow could also be caused by the (negative) overflow of the exponent.


Rounding errors

Suppose you wish to represent the number 1/7 in decimal. Its expansion goes on infinitely, repeating the sequence “142857”: 0.142857142857…

How would you represent this with four digits (a three-digit mantissa and a one-digit exponent)? 1.43 × 10⁻¹ is the closest you can get.

Even with 10, 20, or 100 digits, you would need to do some rounding to represent an infinitely long expansion in a finite space. With a lot of digits, your rounding error might seem insignificant. But consider what happens if you add up these rounded numbers repeatedly over a long period of time. If you round 1/7 to 1.42 × 10⁻¹ (0.142) and add up this representation 700 times, you would expect to get 100 (1/7 × 700 = 100), but instead you get 99.4 (0.142 × 700).

Relatively small rounding errors like the example above can have huge impacts. Knowing how these
rounding errors can occur and being conscious of them will help you become a better and more
precise programmer.
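The accumulation effect is easy to reproduce in Python (a direct transcription of the example above):

```python
rounded = 0.142                 # 1/7 rounded to 1.42 x 10^-1
total = sum([rounded] * 700)    # add the rounded value 700 times
print(round(total, 1))          # 99.4, not the exact 100
```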


Consequences of binary representation of a real number

While some rounding errors seem to be very insignificant, small errors can quickly add up. One of the most tragic events caused by rounding errors was the Patriot missile failure in 1991.

Patriot missiles (Phased Array Tracking Radar to Intercept On Target) were originally designed as mobile defenses against enemy aircraft. These missiles were supposed to explode right before encountering an incoming object. However, on the 25th of February 1991, a Patriot missile failed to intercept an incoming Iraqi Scud missile, which then struck an army barracks in Dhahran, killing 28 American soldiers and injuring over 100 other people!

So what went wrong?

The system’s internal clock recorded the passage of time in tenths of seconds. However, as explained earlier, 1/10 has a non-terminating binary representation, which can lead to problems. Let’s look into why this happens.

This is an approximation of what 1/10 looks like in binary.

0.0001100110011001100110011001100...

The internal clock used by the computer system only saved 24 bits. So this is what was saved every
tenth of a second:

0.00011001100110011001100

Chopping off any digits beyond the first 24 bits introduced an error of about:

0.0000000000000000000000011001100


This is about 0.000000095 seconds of error for each tenth of a second.

The missile’s battery had been running for about 100 hours before the incident, with the clock ticking (and the error accumulating) ten times every second. The small error for each tenth of a second was not believed to be a problem, but over those 100 hours it accumulated to about 0.34 seconds!

Given that the missile was supposed to intercept a Scud traveling at 1,624 meters per second, a 0.34-second delay turned out to be a huge problem!
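The arithmetic can be checked in Python, assuming (from the truncated pattern shown above) that 23 binary digits were kept after the point; the variable names are ours:

```python
FRACTION_BITS = 23

# 1/10 truncated to 23 binary digits after the point
truncated = int(0.1 * 2**FRACTION_BITS) / 2**FRACTION_BITS

error_per_tick = 0.1 - truncated    # about 9.5e-8 s per tenth of a second
ticks = 100 * 3600 * 10             # 100 hours at 10 ticks per second
print(error_per_tick * ticks)       # about 0.34 seconds
```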
