CO04

The document discusses signed and floating point numbers, focusing on their representation methods, specifically sign/magnitude and two's complement. It explains how to compute negative values using these representations and highlights the advantages of two's complement, such as simplified arithmetic operations and elimination of negative zero. Additionally, it outlines the valid ranges for unsigned, signed, and two's complement integers based on the number of bits used for storage.

Uploaded by

b23032
Signed and Floating Point Numbers

and
Floating Point Arithmetic Unit

Dr. Shubhajit Roy Chowdhury,


School of Computing and Electrical Engineering,
Indian Institute of Technology Mandi, India
Email: [email protected]

Dr. Shubhajit Roy Chowdhury SCEE, IIT MANDI


Signed Numbers
• Until now, except when we discussed Booth’s Algorithm, we have
concentrated on unsigned numbers. In real life we also need to be able
to represent signed numbers (like -12, -45, +78).
• A signed number MUST have a sign (+/-). A method is
needed to represent the sign as part of the binary
representation.
• Two signed number representation methods are:
– Sign/magnitude representation
– Two’s complement representation

Sign/Magnitude Representation

In sign/magnitude (S/M) representation, the
leftmost bit of a binary code represents the sign
of the value:

• 0 for positive,
• 1 for negative;

The remaining bits represent the numeric value.

Sign/Magnitude Representation

To compute negative values using
Sign/Magnitude (S/M) representation:

1) Begin with the binary representation of the
positive value

2) Then flip the leftmost bit (the sign bit) from 0 to 1.

Sign/Magnitude Representation
Ex 1. Find the S/M representation of $-6_{10}$

Step 1: Find binary representation using 8 bits

$6_{10} = 00000110_2$

Step 2: If the number you want to represent is
negative, flip leftmost bit

10000110

So: $-6_{10} = 10000110_2$
(in 8-bit sign/magnitude form)

Sign/Magnitude Representation
Ex 2. Find the S/M representation of $-36_{10}$

Step 1: Find binary representation of the positive value using 8 bits

$36_{10} = 00100100_2$

Step 2: If the number you want to represent is
negative, flip leftmost bit

10100100

So: $-36_{10} = 10100100_2$
(in 8-bit sign/magnitude form)
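
The two-step S/M recipe can be sketched in Python as follows. This is a minimal illustration, not part of the original slides; the function name `to_sign_magnitude` and its `bits` parameter are assumptions chosen for the example.

```python
def to_sign_magnitude(value: int, bits: int = 8) -> str:
    """Return the sign/magnitude bit string of `value` using `bits` bits.

    Hypothetical helper: one sign bit followed by bits-1 magnitude bits.
    """
    limit = 2 ** (bits - 1) - 1
    if abs(value) > limit:
        raise OverflowError(f"{value} does not fit in {bits}-bit sign/magnitude")
    # Step 1: binary representation of the positive value (magnitude).
    magnitude = format(abs(value), f"0{bits - 1}b")
    # Step 2: the leftmost bit is 1 for negative values, 0 otherwise.
    sign = "1" if value < 0 else "0"
    return sign + magnitude


print(to_sign_magnitude(-6))   # 10000110
print(to_sign_magnitude(-36))  # 10100100
```

The two printed codes match the results of Ex 1 and Ex 2.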

Sign/Magnitude Representation
32-bit example:
0 000 0000 0000 0000 0000 0000 0000 1001   +9
1 000 0000 0000 0000 0000 0000 0000 1001   -9

Sign bit: 0 --> positive, 1 --> negative
31 remaining bits: the magnitude (i.e. the value)

Problems with Sign/Magnitude
[Figure: number wheel of the 4-bit sign/magnitude codes. Inner numbers: binary representation.]

Seven positive numbers and “positive” zero:
0000 = +0, 0001 = +1, 0010 = +2, 0011 = +3, 0100 = +4, 0101 = +5, 0110 = +6, 0111 = +7

Seven negative numbers and “negative” zero:
1000 = -0, 1001 = -1, 1010 = -2, 1011 = -3, 1100 = -4, 1101 = -5, 1110 = -6, 1111 = -7

• Two different representations for 0!
• Two discontinuities

Two’s Complement Representation

• Another method used to represent negative
numbers (used by most modern computers)
is two’s complement.

• The leftmost bit STILL serves as a sign bit:
– 0 for positive numbers,
– 1 for negative numbers.

Two’s Complement Representation
To compute negative values using Two’s
Complement representation:

1) Begin with the binary representation of the
positive value
2) Complement the entire positive number (flip each
bit: if it is 0 make it 1, and vice versa)
3) Then add one.

Two’s Complement Representation

Ex 1. Find the 8-bit two’s complement
representation of $-6_{10}$

Step 1: Find binary representation of the
positive value in 8 bits
$6_{10} = 00000110_2$

Two’s Complement Representation
Ex 1 continued
Step 2: Complement the entire positive
value

Positive Value: 00000110

Complemented: 11111001

Two’s Complement Representation
Ex 1, Step 3: Add one to complemented
value

(complemented) ->   11111001
(add one)      -> +        1
                    11111010
So: $-6_{10} = 11111010_2$
(in 8-bit 2's complement form)

Two’s Complement Representation
Ex 2. Find the 8-bit two’s complement
representation of $20_{10}$

Step 1: Find binary representation of the
positive value in 8 bits
$20_{10} = 00010100_2$

20 is positive, so STOP after step 1!

So: $20_{10} = 00010100_2$
(in 8-bit 2's complement form)

Two’s Complement Representation

Ex 3. Find the 8-bit two’s complement
representation of $-80_{10}$

Step 1: Find binary representation of the
positive value in 8 bits
$80_{10} = 01010000_2$

-80 is negative, so continue…

Two’s Complement Representation

Ex 3
Step 2: Complement the entire positive
value

Positive Value: 01010000

Complemented: 10101111

Two’s Complement Representation
Ex 3, Step 3: Add one to complemented
value

(complemented) ->   10101111
(add one)      -> +        1
                    10110000

So: $-80_{10} = 10110000_2$
(in 8-bit 2's complement form)
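
The three steps (positive value, complement, add one) can be written as a short Python sketch. The name `twos_complement` and the `bits` parameter are hypothetical, introduced only for this illustration.

```python
def twos_complement(value: int, bits: int = 8) -> str:
    """Return the two's complement bit string of `value` in `bits` bits."""
    if not -(2 ** (bits - 1)) <= value <= 2 ** (bits - 1) - 1:
        raise OverflowError(f"{value} does not fit in {bits} bits")
    mask = (1 << bits) - 1
    if value >= 0:
        # Positive values (and zero): stop after step 1.
        return format(value, f"0{bits}b")
    positive = -value                    # Step 1: the positive value
    complemented = ~positive & mask      # Step 2: flip every bit
    result = (complemented + 1) & mask   # Step 3: add one
    return format(result, f"0{bits}b")


print(twos_complement(-6))   # 11111010
print(twos_complement(20))   # 00010100
print(twos_complement(-80))  # 10110000
```

The outputs agree with Ex 1, Ex 2, and Ex 3.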

Two’s Complement Representation
Alternate method -- replaces previous
steps 2-3

Step 2: Scanning the positive binary representation from right to
left, find the first one bit from the low-order (right) end. Leave
that bit and every bit to its right unchanged.

Step 3: Complement (flip) the remaining bits to the left.

                        00000110
(left complemented) --> 11111010

Two’s Complement Representation
Ex 1: Find the Two’s Complement of $-76_{10}$

Step 1: Find the 8-bit binary
representation of the positive value.

$76_{10} = 01001100_2$

Two’s Complement Representation
Step 2: Find first one bit, from low-order
(right) end, and complement the pattern to
the left.
                       01001100
(left complemented) -> 10110100

So: $-76_{10} = 10110100_2$
(in 8-bit 2's complement form)

Two’s Complement Representation

Ex 2: Find the Two’s Complement of $-26_{10}$

Step 1: Find the 8-bit binary
representation of the positive value.

$26_{10} = 00011010_2$

Two’s Complement Representation
Ex 2, Step 2: Find first one bit, from low-order
(right) end, and complement the
pattern to the left.
                       00011010
(left complemented) -> 11100110

So: $-26_{10} = 11100110_2$
(in 8-bit 2's complement form)
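
The alternate (scan from the right) method works directly on the bit string. The sketch below is one way to express it in Python; `negate_by_scan` is a hypothetical name, and the input is assumed to be the bit string of the positive value.

```python
def negate_by_scan(bits: str) -> str:
    """Negate a two's complement bit string using the right-to-left scan."""
    idx = bits.rfind("1")          # position of the first 1 from the right
    if idx == -1:
        return bits                # all zeros: zero is its own negation
    # Flip everything to the left of that 1; keep the 1 and the bits after it.
    flipped = "".join("1" if b == "0" else "0" for b in bits[:idx])
    return flipped + bits[idx:]


print(negate_by_scan("01001100"))  # 10110100  (-76)
print(negate_by_scan("00011010"))  # 11100110  (-26)
```

Applying the function twice returns the original pattern, which is a quick sanity check that it really computes a negation.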

Two’s Complement Representation

32-bit example:
0 000 0000 0000 0000 0000 0000 0000 1001   +9
1 111 1111 1111 1111 1111 1111 1111 0111   -9

Sign bit: 0 --> positive, 1 --> negative
31 remaining bits: the magnitude (for a negative number, the value
is stored in two’s complement form)

Two’s Complement to Decimal

Ex 1: Find the decimal equivalent of the
8-bit 2’s complement value $11001000_2$

Step 1: Determine if number is positive
or negative:

Leftmost bit is 1, so number is negative.

Two’s Complement to Decimal

Ex 1, Step 2: Find first one bit, from low-order
(right) end, and complement the
pattern to the left.
                       11001000
(left complemented) -> 00111000

Two’s Complement to Decimal

Ex 1, Step 3: Determine the numeric
value:
$00111000_2 = 32 + 16 + 8 = 56_{10}$

So: $11001000_2 = -56_{10}$
(8-bit 2's complement form)
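
Decoding follows the same three steps: check the sign bit, recover the magnitude, then convert. A minimal Python sketch, with the hypothetical name `from_twos_complement`:

```python
def from_twos_complement(bits: str) -> int:
    """Return the decimal value of a two's complement bit string."""
    # Step 1: a leading 0 means the value is positive (or zero).
    if bits[0] == "0":
        return int(bits, 2)
    # Step 2: find the first 1 from the right and flip the bits to its left.
    idx = bits.rfind("1")
    flipped = "".join("1" if b == "0" else "0" for b in bits[:idx])
    magnitude_bits = flipped + bits[idx:]
    # Step 3: convert the magnitude and attach the minus sign.
    return -int(magnitude_bits, 2)


print(from_twos_complement("11001000"))  # -56
```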

S/M problems solved with 2s complement
[Figure: number wheel of the 4-bit two’s complement codes, with the negative numbers re-ordered to eliminate one discontinuity. Inner numbers: binary representation.]

Eight positive numbers (counting zero):
0000 = +0, 0001 = +1, 0010 = +2, 0011 = +3, 0100 = +4, 0101 = +5, 0110 = +6, 0111 = +7

Eight negative numbers:
1111 = -1, 1110 = -2, 1101 = -3, 1100 = -4, 1011 = -5, 1010 = -6, 1001 = -7, 1000 = -8

Note: Negative numbers still have 1 for the most significant bit (MSB).

• Only one discontinuity now
• Only one zero
• One extra negative number

Two’s Complement Representation

Biggest reason two’s complement is used in most
systems today?

The binary codes can be added and subtracted
as if they were unsigned binary numbers,
without regard to the signs of the numbers
they actually represent.

Two’s Complement Representation
For example, to add +4 and -3, we simply add the
corresponding binary codes, 0100 and 1101:

  0100 (+4)
+ 1101 (-3)
  0001 (+1)
NOTE: A carry out of the leftmost column has
been ignored.
The result, 0001, is the code for +1, which IS
the sum of +4 and -3.

Two’s Complement Representation

Likewise, to subtract +7 from +3:

  0011 (+3)
- 0111 (+7)
  1100 (-4)
NOTE: A “phantom” 1 was borrowed from
beyond the leftmost position.

The result, 1100, is the code for -4, the result
of subtracting +7 from +3.
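
Ignoring the carry out (or the phantom borrow) is the same as doing unsigned arithmetic modulo 2^n. The Python sketch below illustrates this; `add_codes` and `sub_codes` are hypothetical helper names, and both operands are assumed to have the same width.

```python
def add_codes(a: str, b: str) -> str:
    """Add two equal-width bit strings as unsigned numbers, dropping the carry out."""
    bits = len(a)
    total = (int(a, 2) + int(b, 2)) % (1 << bits)
    return format(total, f"0{bits}b")


def sub_codes(a: str, b: str) -> str:
    """Subtract b from a as unsigned numbers, allowing a borrow from beyond the left."""
    bits = len(a)
    diff = (int(a, 2) - int(b, 2)) % (1 << bits)
    return format(diff, f"0{bits}b")


print(add_codes("0100", "1101"))  # 0001  (+4 + -3 = +1)
print(sub_codes("0011", "0111"))  # 1100  (+3 - +7 = -4)
```

Neither function looks at the sign bit, which is the point of the slide: the hardware adder does not need to know whether the codes are signed.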
Two’s Complement Representation
Summary - Benefits of Two’s Complement:

– Addition and subtraction are simplified
in the two’s-complement system,

– -0 has been eliminated, replaced by one
extra negative value, for which there is
no corresponding positive number.

Valid Ranges

• For any integer data representation,
there is a LIMIT to the size of number
that can be stored.

• The limit depends upon number of bits
available for data storage.

Unsigned Integer Ranges

Range = 0 to $(2^n - 1)$
where n is the number of bits used to store
the unsigned integer.

Numbers with values GREATER than $(2^n - 1)$
would require more bits. If you try to store
too large a value without using more bits,
OVERFLOW will occur.

Unsigned Integer Ranges
Example: On a system that stores
unsigned integers in 16-bit words:
Range = 0 to $(2^{16} - 1)$
= 0 to 65535

Therefore, you cannot store numbers
larger than 65535 in 16 bits.

Signed S/M Integer Ranges
Range = $-(2^{n-1} - 1)$ to $+(2^{n-1} - 1)$
where n is the number of bits used to store the
sign/magnitude integer.

Numbers with values GREATER than $+(2^{n-1} - 1)$
and values LESS than $-(2^{n-1} - 1)$ would
require more bits. If you try to store too
large/too small a value without using more bits,
OVERFLOW will occur.

S/M Integer Ranges
Example: On a system that stores sign/magnitude
integers in 16-bit words:

Range = $-(2^{15} - 1)$ to $+(2^{15} - 1)$
= -32767 to +32767

Therefore, you cannot store numbers larger
than 32767 or smaller than -32767 in 16 bits.

Two’s Complement Ranges
Range = $-2^{n-1}$ to $+(2^{n-1} - 1)$
where n is the number of bits used to store the
two’s complement signed integer.

Numbers with values GREATER than $+(2^{n-1} - 1)$
and values LESS than $-2^{n-1}$ would require
more bits. If you try to store too large/too small
a value without using more bits, OVERFLOW
will occur.

Two’s Complement Ranges
Example: On a system that stores two’s complement
integers in 16-bit words:

Range = $-2^{15}$ to $+(2^{15} - 1)$
= -32768 to +32767

Therefore, you cannot store numbers larger
than 32767 or smaller than -32768 in 16 bits.
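
The three range formulas can be compared side by side with a small Python sketch. The function name `integer_ranges` and the dictionary keys are assumptions made for this illustration.

```python
def integer_ranges(n: int) -> dict:
    """Return the (min, max) range of each n-bit integer representation."""
    return {
        "unsigned": (0, 2 ** n - 1),
        "sign_magnitude": (-(2 ** (n - 1) - 1), 2 ** (n - 1) - 1),
        "twos_complement": (-(2 ** (n - 1)), 2 ** (n - 1) - 1),
    }


for name, (low, high) in integer_ranges(16).items():
    print(f"{name}: {low} to {high}")
# unsigned: 0 to 65535
# sign_magnitude: -32767 to 32767
# twos_complement: -32768 to 32767
```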

Using Ranges for Validity Checking

• Once you know how small/large a value
can be stored in n bits, you can use this
knowledge to check whether your
answers are valid, or cause overflow.
• Overflow can only occur if you are
adding two positive numbers or two
negative numbers

Using Ranges for Validity Checking
Ex 1:
Given the following 2’s complement
equation in 5 bits, is the answer valid?

Range = -16 to +15

  11111 (-1)
+ 11101 (-3)
  11100 (-4)  --> VALID

Using Ranges for Validity Checking
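Ex 2:
Given the following 2’s complement
equation in 5 bits, is the answer valid?

Range = -16 to +15

  10111 (-9)
+ 10101 (-11)
  01100 (reads as +12, but the true sum is -20)  --> INVALID

The true sum, -20, lies outside the range, so the 5-bit result has overflowed.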
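
The range check can be automated. The Python sketch below adds two signed values, reports the n-bit pattern that would be stored, and says whether the true sum fits; `add_with_overflow_check` is a hypothetical name, not something from the slides.

```python
def add_with_overflow_check(a: int, b: int, bits: int = 5):
    """Return (stored bit pattern, is_valid) for a + b in two's complement."""
    low, high = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    true_sum = a + b
    # The hardware keeps only the low `bits` bits of the sum.
    stored = format(true_sum % (1 << bits), f"0{bits}b")
    return stored, low <= true_sum <= high


print(add_with_overflow_check(-1, -3))    # ('11100', True)
print(add_with_overflow_check(-9, -11))   # ('01100', False)
```

In the second case the stored pattern has a 0 sign bit even though both operands were negative, which is the usual hardware symptom of overflow.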

Floating Point Numbers

• Now you've seen unsigned and signed
integers. In real life we also need to be able
to represent numbers with fractional parts (like
-12.5 and 45.39).

• These are called Floating Point numbers.

• You will learn the IEEE 32-bit floating
point representation.

Floating Point Numbers
• In the decimal system, a decimal point
(radix point) separates the whole
numbers from the fractional part
• Examples:
37.25 ( whole = 37, fraction = 25/100)
123.567
10.12345678

Floating Point Numbers

For example, 37.25 can be analyzed as:

$10^1$    $10^0$    $10^{-1}$    $10^{-2}$
Tens      Units     Tenths       Hundredths
3         7         2            5

37.25 = (3 x 10) + (7 x 1) + (2 x 1/10) + (5 x 1/100)

Binary Equivalence
The binary equivalent of a floating point number
can be determined by computing the binary
representation for each part separately.
1) For the whole part:
Use subtraction or division method
previously learned.
2) For the fractional part:
Use the subtraction or multiplication
method (to be shown next)

Fractional Part – Multiplication Method

In the binary representation of a floating point
number the column values will be as follows:

… $2^5$ $2^4$ $2^3$ $2^2$ $2^1$ $2^0$ . $2^{-1}$ $2^{-2}$ $2^{-3}$ $2^{-4}$ …
… 32 16 8 4 2 1 . 1/2 1/4 1/8 1/16 …
… 32 16 8 4 2 1 . .5 .25 .125 .0625 …

Fractional Part – Multiplication Method
Ex 1. Find the binary equivalent of 0.25
Step 1: Multiply the fraction by 2 until the fractional part
becomes 0

  .25
  x 2
  0.5
  x 2
  1.0

Step 2: Collect the whole parts in forward order. Put them after the
radix point

. .5 .25 .125 .0625
. 0  1

So: $0.25_{10} = 0.01_2$

Fractional Part – Multiplication Method
Ex 2. Find the binary equivalent of 0.625
Step 1: Multiply the fraction by 2 until the fractional
part becomes 0

  .625
  x 2
  1.25
  x 2
  0.50
  x 2
  1.0

Step 2: Collect the whole parts in forward order.
Put them after the radix point

. .5 .25 .125 .0625
. 1  0   1

So: $0.625_{10} = 0.101_2$
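
The multiplication method is easy to express as a loop. The Python sketch below uses exact fractions so that decimal inputs such as 0.625 are not disturbed by binary rounding; `fraction_to_binary` and the `max_bits` cut-off are assumptions added for this example (many decimal fractions never reach 0, so the loop needs a limit).

```python
from fractions import Fraction


def fraction_to_binary(fraction: str, max_bits: int = 23) -> str:
    """Return the binary digits after the radix point for a decimal fraction."""
    value = Fraction(fraction)      # exact value, e.g. Fraction("0.625") == 5/8
    digits = []
    while value != 0 and len(digits) < max_bits:
        value *= 2                  # Step 1: multiply the fraction by 2
        whole = int(value)          # the whole part is the next binary digit
        digits.append(str(whole))   # Step 2: collect whole parts in forward order
        value -= whole              # keep only the fractional part
    return "".join(digits)


print(fraction_to_binary("0.25"))    # 01
print(fraction_to_binary("0.625"))   # 101
```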

Problem storing binary form

• We have no way to store the radix point!

• Standards committee came up with a way
to store floating point numbers (that have
a radix point)

IEEE Floating Point Representation

• Floating point numbers can be stored into 32
bits, by dividing the bits into three parts:
the sign, the exponent, and the mantissa.

Bit 1: sign | Bits 2-9: exponent | Bits 10-32: mantissa

IEEE Floating Point Representation

• The first (leftmost) field of our floating
point representation will STILL be the
sign bit:

– 0 for a positive number,
– 1 for a negative number.

Storing the Binary Form
How do we store a radix point?
- All we have are zeros and ones…

Make sure that the radix point is ALWAYS in the
same position within the number.

Use the IEEE 32-bit standard
--> the leftmost digit must be a 1

Solution is Normalization
Every binary number, except the one
corresponding to the number zero, can be
normalized by choosing the exponent so that the
radix point falls to the right of the leftmost 1 bit.

$37.25_{10} = 100101.01_2 = 1.0010101 \times 2^5$

$7.625_{10} = 111.101_2 = 1.11101 \times 2^2$

$0.3125_{10} = 0.0101_2 = 1.01 \times 2^{-2}$

IEEE Floating Point Representation
• The second field of the floating point number
will be the exponent.
• The exponent is stored as an unsigned 8-bit
number, RELATIVE to a bias of 127.
– Exponent 5 is stored as (127 + 5) or 132
• 132 = 10000100
– Exponent -5 is stored as (127 + (-5)) or 122
• 122 = 01111010
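
Biasing is a single addition followed by an 8-bit conversion. A minimal Python sketch, where `biased_exponent` is a hypothetical helper and the allowed stored range 1 to 254 is the range for ordinary (normalized) single-precision numbers:

```python
BIAS = 127  # bias of the IEEE 32-bit format


def biased_exponent(exponent: int) -> str:
    """Return the 8-bit stored field for a true exponent."""
    stored = exponent + BIAS
    if not 1 <= stored <= 254:
        raise OverflowError(f"exponent {exponent} is outside the normalized range")
    return format(stored, "08b")


print(biased_exponent(5))    # 10000100  (132)
print(biased_exponent(-5))   # 01111010  (122)
```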

IEEE Floating Point Representation
• The mantissa is the set of 0’s and 1’s to
the right of the radix point of the
normalized (when the digit to the left of the
radix point is 1) binary number.
Ex: $1.00101 \times 2^3$
(The mantissa is 00101)

• The mantissa is stored in a 23 bit field, so
we add zeros to the right side and store:
00101000000000000000000

Decimal Floating Point to
IEEE standard Conversion

Ex 1: Find the IEEE FP representation of
40.15625

Step 1.
Compute the binary equivalent of the
whole part and the fractional part. (i.e.
convert 40 and .15625 to their binary
equivalents)

Whole part (subtraction method):
  40
- 32
   8
-  8
   0
Result: 101000

Fractional part (subtraction method):
  .15625
- .12500
  .03125
- .03125
  .0
Result: .00101

So: $40.15625_{10} = 101000.00101_2$

Decimal Floating Point to
IEEE standard Conversion

Step 2. Normalize the number by moving the
radix point to the right of the leftmost one.

$101000.00101 = 1.0100000101 \times 2^5$

Decimal Floating Point to
IEEE standard Conversion

Step 3. Convert the exponent to a biased
exponent

127 + 5 = 132

And convert biased exponent to 8-bit unsigned
binary:

$132_{10} = 10000100_2$

Decimal Floating Point to
IEEE standard Conversion

Step 4. Store the results from steps 1-3:

Sign    Exponent         Mantissa
        (from step 3)    (from step 2)
0       10000100         01000001010000000000000

Decimal Floating Point to
IEEE standard Conversion
Ex 2: Find the IEEE FP representation of –24.75
Step 1. Compute the binary equivalent of the whole
part and the fractional part.

Whole part:
  24
- 16
   8
-  8
   0
Result: 11000

Fractional part:
  .75
- .50
  .25
- .25
  .0
Result: .11

So: $-24.75_{10} = -11000.11_2$

Decimal Floating Point to
IEEE standard Conversion

Step 2.
Normalize the number by moving the radix
point to the right of the leftmost one.

$-11000.11 = -1.100011 \times 2^4$

Decimal Floating Point to
IEEE standard Conversion

Step 3. Convert the exponent to a biased
exponent
127 + 4 = 131
==> $131_{10} = 10000011_2$

Step 4. Store the results from steps 1-3

Sign    Exponent    Mantissa
1       10000011    1000110..0
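
Both worked examples can be cross-checked against the machine's own single-precision encoding. The sketch below uses Python's standard `struct` module to pack a value as a big-endian 32-bit float and split the bits into the three fields; the function name `ieee754_fields` is an assumption made for this illustration.

```python
import struct


def ieee754_fields(x: float):
    """Return (sign, exponent, mantissa) bit strings of x as a 32-bit float."""
    # Pack as a big-endian single, then reinterpret the 4 bytes as an integer.
    (raw,) = struct.unpack(">I", struct.pack(">f", x))
    bits = format(raw, "032b")
    return bits[0], bits[1:9], bits[9:]


print(ieee754_fields(40.15625))
print(ieee754_fields(-24.75))
```

For 40.15625 the fields come out as sign 0, exponent 10000100, and a mantissa that starts with 0100000101; for -24.75 they are sign 1, exponent 10000011, and a mantissa that starts with 100011, matching Ex 1 and Ex 2.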

IEEE standard to Decimal
Floating Point Conversion

• Do the steps in reverse order

• In reversing the normalization step move the
radix point the number of digits equal to the
exponent:
– If exponent is positive, move to the right
– If exponent is negative, move to the left

IEEE standard to Decimal
Floating Point Conversion

Ex 1: Convert the following 32-bit binary number
to its decimal floating point equivalent:

Sign    Exponent    Mantissa
1       01111101    010..0

IEEE standard to Decimal
Floating Point Conversion

Step 1: Extract the biased exponent and unbias
it

Biased exponent = $01111101_2 = 125_{10}$

Unbiased Exponent: 125 – 127 = -2

IEEE standard to Decimal
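Floating Point Conversion

Step 2: Write Normalized number in the form:

$\pm 1.\text{(mantissa)} \times 2^{\text{exponent}}$

For our number:

$-1.01 \times 2^{-2}$

Moving the radix point two places to the left gives $-0.0101_2 = -(0.25 + 0.0625) = -0.3125_{10}$.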
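
The reverse conversion can be sketched in Python as follows. The helper `decode_single` is hypothetical and handles only ordinary (normalized) numbers; zero, denormals, infinity, and NaN are deliberately left out.

```python
def decode_single(sign: str, exponent: str, mantissa: str) -> float:
    """Decode the fields of a normalized IEEE 32-bit number into a float."""
    # Step 1: extract the biased exponent and unbias it.
    e = int(exponent, 2) - 127
    # Step 2: rebuild 1.mantissa (pad the mantissa to its full 23 bits first).
    fraction = int(mantissa.ljust(23, "0"), 2) / (1 << 23)
    significand = 1 + fraction
    # Step 3: apply the exponent and the sign.
    value = significand * 2.0 ** e
    return -value if sign == "1" else value


print(decode_single("1", "01111101", "01"))  # -0.3125
```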

Overview of IEEE 754 Standard Formats

Some features of the ANSI/IEEE standard floating-point number representation formats.

––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––
Feature               Single / Short                        Double / Long
––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––
Word width (bits)     32                                    64
Significand bits      23 + 1 hidden                         52 + 1 hidden
Significand range     $[1, 2 - 2^{-23}]$                    $[1, 2 - 2^{-52}]$
Exponent bits         8                                     11
Exponent bias         127                                   1023
Zero ($\pm 0$)        e + bias = 0, f = 0                   e + bias = 0, f = 0
Denormal              e + bias = 0, f ≠ 0                   e + bias = 0, f ≠ 0
                      represents $\pm 0.f \times 2^{-126}$  represents $\pm 0.f \times 2^{-1022}$
Infinity ($\pm\infty$) e + bias = 255, f = 0                e + bias = 2047, f = 0
Not-a-number (NaN)    e + bias = 255, f ≠ 0                 e + bias = 2047, f ≠ 0
Ordinary number       e + bias ∈ [1, 254]                   e + bias ∈ [1, 2046]
                      e ∈ [–126, 127]                       e ∈ [–1022, 1023]
                      represents $1.f \times 2^e$           represents $1.f \times 2^e$
min                   $2^{-126} \approx 1.2 \times 10^{-38}$   $2^{-1022} \approx 2.2 \times 10^{-308}$
max                   $\approx 2^{128} \approx 3.4 \times 10^{38}$   $\approx 2^{1024} \approx 1.8 \times 10^{308}$
––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––
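
The single-precision column of the table amounts to a small decision procedure on the stored exponent and fraction fields. The Python sketch below illustrates it; `classify_single` is a hypothetical name, and the input is assumed to be a 32-character bit string (sign, 8 exponent bits, 23 fraction bits).

```python
def classify_single(bits32: str) -> str:
    """Classify a 32-bit IEEE pattern using the stored exponent and fraction."""
    stored_exponent = int(bits32[1:9], 2)   # this is e + bias
    fraction = int(bits32[9:], 2)           # this is f
    if stored_exponent == 0:
        return "zero" if fraction == 0 else "denormal"
    if stored_exponent == 255:
        return "infinity" if fraction == 0 else "NaN"
    return "ordinary"                       # e + bias in [1, 254]


print(classify_single("0" + "00000000" + "0" * 23))        # zero
print(classify_single("0" + "00000000" + "0" * 22 + "1"))  # denormal
print(classify_single("1" + "11111111" + "0" * 23))        # infinity
print(classify_single("0" + "11111111" + "1" + "0" * 22))  # NaN
print(classify_single("0" + "10000100" + "0" * 23))        # ordinary
```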
Floating-Point Addition and Subtraction Algorithms

Addition

Assume $e_1 \ge e_2$; alignment shift (preshift) is needed if $e_1 > e_2$

$(\pm s_1 \times b^{e_1}) + (\pm s_2 \times b^{e_2}) = (\pm s_1 \times b^{e_1}) + (\pm s_2 / b^{e_1 - e_2}) \times b^{e_1}$
$= (\pm s_1 \pm s_2 / b^{e_1 - e_2}) \times b^{e_1} = \pm s \times b^{e}$

Example: Numbers to be added:
$x = 2^5 \times 1.00101101$
$y = 2^1 \times 1.11101101$   (operand with smaller exponent, to be preshifted)

Operands after alignment shift:
$x = 2^5 \times 1.00101101$
$y = 2^5 \times 0.000111101101$

Result of addition:
$s = 2^5 \times 1.010010111101$   (the extra bits 1101 are to be rounded off)
$s = 2^5 \times 1.01001100$   (rounded sum)
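
The alignment and rounding in this example can be reproduced with integer arithmetic. The Python sketch below is a simplified illustration under several assumptions: both operands are positive, the first has the larger or equal exponent, each significand string has exactly `frac_bits` digits after the radix point, rounding is plain round-half-up rather than the IEEE default, and no renormalization is done if the sum reaches 2. The names `to_fixed` and `add_aligned` are hypothetical.

```python
def to_fixed(value: int, frac_bits: int) -> str:
    """Format an integer as a binary fixed-point string with frac_bits digits."""
    whole = value >> frac_bits
    frac = value & ((1 << frac_bits) - 1)
    return f"{whole:b}.{frac:0{frac_bits}b}"


def add_aligned(sig_x: str, e_x: int, sig_y: str, e_y: int, frac_bits: int = 8):
    """Add sig_x * 2**e_x and sig_y * 2**e_y, assuming e_x >= e_y."""
    shift = e_x - e_y
    # Scale x up instead of shifting y down, so no bits of y are lost yet.
    x = int(sig_x.replace(".", ""), 2) << shift
    y = int(sig_y.replace(".", ""), 2)
    total = x + y                          # has frac_bits + shift fraction bits
    exact = to_fixed(total, frac_bits + shift)
    # Round half up to frac_bits fraction bits (a simplification).
    half = (1 << (shift - 1)) if shift > 0 else 0
    rounded = to_fixed((total + half) >> shift, frac_bits)
    return exact, rounded, e_x


print(add_aligned("1.00101101", 5, "1.11101101", 1))
# ('1.010010111101', '1.01001100', 5)
```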

Floating Point Adder and Subtractor
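[Figure: block diagram of the floating-point adder/subtractor. Labels recovered from the figure, top to bottom: operands x and y; Unpack (signs, exponents, significands); Add/Sub input; Mux and Sub; selective complement and possible swap; align significands; add (with cin and cout); control and sign logic; normalize; round and selective complement; add; normalize; Pack (sign, exponent, significand); output s (sum/difference).]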

Floating-Point Multiplication and Division

Multiplication

$(\pm s_1 \times b^{e_1}) \times (\pm s_2 \times b^{e_2}) = (\pm s_1 \times s_2) \times b^{e_1 + e_2}$

Because $s_1 \times s_2 \in [1, 4)$, postshifting may be needed for normalization

Overflow or underflow can occur during multiplication or normalization

Division

$(\pm s_1 \times b^{e_1}) / (\pm s_2 \times b^{e_2}) = (\pm s_1 / s_2) \times b^{e_1 - e_2}$

Because $s_1 / s_2 \in (0.5, 2)$, postshifting may be needed for normalization

Overflow or underflow can occur during division or normalization
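
The postshift needed after multiplying or dividing significands can be sketched in Python. This is an illustration under stated assumptions: base b = 2, significands in [1, 2) held as exact fractions, signs omitted (the result sign would be the XOR of the operand signs), and no rounding or overflow handling. The names `fp_multiply` and `fp_divide` are hypothetical.

```python
from fractions import Fraction


def fp_multiply(s1: Fraction, e1: int, s2: Fraction, e2: int):
    """Multiply (s1 * 2**e1) by (s2 * 2**e2); significands are in [1, 2)."""
    s, e = s1 * s2, e1 + e2
    if s >= 2:              # product landed in [2, 4): shift right once
        s, e = s / 2, e + 1
    return s, e


def fp_divide(s1: Fraction, e1: int, s2: Fraction, e2: int):
    """Divide (s1 * 2**e1) by (s2 * 2**e2); significands are in [1, 2)."""
    s, e = s1 / s2, e1 - e2
    if s < 1:               # quotient landed in (0.5, 1): shift left once
        s, e = s * 2, e - 1
    return s, e


print(fp_multiply(Fraction(3, 2), 5, Fraction(3, 2), 1))  # (Fraction(9, 8), 7)
print(fp_divide(Fraction(1), 3, Fraction(3, 2), 1))       # (Fraction(4, 3), 1)
```

In both cases the returned significand is back in [1, 2) and the exponent has been adjusted so that the represented value is unchanged.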

Floating Point Multiplier
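[Figure: block diagram of the floating-point multiplier. Labels recovered from the figure, top to bottom: floating-point operands; Unpack; XOR; add exponents; multiply significands; adjust exponent and normalize; round; adjust exponent and normalize; Pack; product.]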

Thank You
