Module 3

Uploaded by

fanofharry53

COMPUTER ORGANIZATION AND ARCHITECTURE
Course Code : CSE 2151
Credits : 04

MODULE 3
ARITHMETIC AND LOGIC UNIT

ADDITION AND SUBTRACTION
▪ OVERFLOW RULE:
▪ If two numbers are added, and they are both positive or both negative, then overflow occurs if and
only if the result has the opposite sign.

▪ SUBTRACTION RULE:
▪ To subtract one number (subtrahend) from another (minuend), take the twos complement (negation)
of the subtrahend and add it to the minuend.
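
The two rules above can be sketched in a few lines of Python. This is an illustrative sketch only: the function names and the default 4-bit word size are assumptions, not part of the slides.

```python
def add_twos_complement(a, b, bits=4):
    """Add two signed integers as `bits`-bit twos complement words.

    Returns (result, overflow), where result is the signed value of the
    `bits`-bit sum and overflow follows the overflow rule.
    """
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    raw = ((a & mask) + (b & mask)) & mask        # carry out of the word is dropped
    # Overflow rule: operands have the same sign, result has the opposite sign.
    overflow = (a & sign) == (b & sign) and (raw & sign) != (a & sign)
    result = raw - (1 << bits) if raw & sign else raw
    return result, overflow


def sub_twos_complement(a, b, bits=4):
    """Subtraction rule: negate the subtrahend, then add.

    The most negative value has no negation in `bits` bits; that corner
    case is not handled in this sketch.
    """
    return add_twos_complement(a, -b, bits)
```

For example, with 4-bit words `add_twos_complement(5, 4)` reports an overflow because two positive operands produce a word whose sign bit is 1.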

ADDITION AND SUBTRACTION: HARDWARE

MULTIPLICATION: UNSIGNED INTEGERS
▪ Add each partial product to the running sum immediately, which eliminates the need for
additional registers to store the partial products
▪ Save time by inspecting each multiplier bit:
▪ Multiplier bit 1: add and shift
▪ Multiplier bit 0: shift only

MULTIPLICATION: UNSIGNED INTEGERS
▪ 45 (101101) × 33 (100001) = 1485, with multiplicand M = 101101 and multiplier Q = 100001

C  A       Q       M
0  000000  100001  101101  Initial values
0  101101  100001  101101  Add (A = A + M)    1st cycle
0  010110  110000  101101  Shift
0  001011  011000  101101  Shift              2nd cycle
0  000101  101100  101101  Shift              3rd cycle
0  000010  110110  101101  Shift              4th cycle
0  000001  011011  101101  Shift              5th cycle
0  101110  011011  101101  Add (A = A + M)    6th cycle
0  010111  001101  101101  Shift
Product (A, Q) = 010111 001101 = 1485
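
The add-and-shift procedure traced above can be sketched in Python as follows. The register names C, A, Q and M follow the table; the function name and its signature are assumptions made for illustration.

```python
def multiply_unsigned(multiplicand, multiplier, bits):
    """Shift-and-add multiplication of two unsigned `bits`-bit integers."""
    mask = (1 << bits) - 1
    m = multiplicand & mask
    q = multiplier & mask
    a = 0
    c = 0
    for _ in range(bits):
        if q & 1:                          # Q0 = 1: add M to A, carry goes to C
            total = a + m
            c, a = total >> bits, total & mask
        # Shift C, A, Q right by one position as a single long register.
        q = ((a & 1) << (bits - 1)) | (q >> 1)
        a = (c << (bits - 1)) | (a >> 1)
        c = 0
    return (a << bits) | q                 # 2n-bit product held in A, Q
```

Calling `multiply_unsigned(45, 33, 6)` goes through the same six cycles as the table and leaves 010111 001101 in A, Q.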
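MULTIPLICATION: SIGNED INTEGERS- 2’S COMPLEMENT
▪ Unsigned multiplication does not work directly on 2’s complement operands: 1011 × 1101 = 10001111 is correct as unsigned 11 × 13 = 143, but read as 2’s complement it says −5 × −3 = −113, whereas the correct product is +15.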

10
MULTIPLICATION: NEGATIVE MULTIPLICAND
▪ -13 (10011) X +11 (01011) = -143 (1101110001)

11
MULTIPLICATION: UNSIGNED V/S SIGNED

12
READ FROM….
▪ Go through examples given in Section 2.12 and 2.15 of the Reference Book (textbook 1)

13
TOPICS COVERED FROM
▪ Textbook 2:
▪ Chapter 10: 10.3

14
MULTIPLICATION: UNSIGNED V/S SIGNED

15
MULTIPLICATION: BOOTH’S ALGORITHM

BOOTH’S ALGORITHM: 13 × (−6)
▪ M = 13 = 01101   Q = −6 = 11010   −M = 10011
A      Q      Q-1  M
00000  11010  0    01101  Initial values
00000  01101  0    01101  Shift right                            1st cycle
10011  01101  0    01101  A = A − M (add 2’s complement of M)    2nd cycle
11001  10110  1    01101  Shift right
00110  10110  1    01101  A = A + M                              3rd cycle
00011  01011  0    01101  Shift right
10110  01011  0    01101  A = A − M (add 2’s complement of M)    4th cycle
11011  00101  1    01101  Shift right
11101  10010  1    01101  Shift right                            5th cycle
▪ A, Q = 1110110010; taking the 2’s complement gives 0001001110 → 78, so Product = −78
BOOTH’S ALGORITHM: 23 × 29
▪ M = 23 = 010111   Q = 29 = 011101   −M = 101001
A       Q       Q-1  M
000000  011101  0    010111  Initial values
101001  011101  0    010111  A = A − M      1st cycle
110100  101110  1    010111  Shift right
001011  101110  1    010111  A = A + M      2nd cycle
000101  110111  0    010111  Shift right
101110  110111  0    010111  A = A − M      3rd cycle
110111  011011  1    010111  Shift right
111011  101101  1    010111  Shift right    4th cycle
111101  110110  1    010111  Shift right    5th cycle
010100  110110  1    010111  A = A + M      6th cycle
001010  011011  0    010111  Shift right
▪ A, Q = 001010011011 → 667, so Product = 667
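
The two traces above follow the same loop, which can be sketched in Python as below. The function name is an assumption; A, Q, Q-1 and M mirror the register names in the tables, and the sketch does not handle a multiplicand equal to the most negative n-bit value.

```python
def booth_multiply(multiplicand, multiplier, bits):
    """Booth's algorithm on `bits`-bit twos complement operands."""
    mask = (1 << bits) - 1
    top = 1 << (bits - 1)
    m = multiplicand & mask
    a, q, q_1 = 0, multiplier & mask, 0
    for _ in range(bits):
        pair = (q & 1, q_1)                # (Q0, Q-1)
        if pair == (1, 0):
            a = (a - m) & mask             # A = A - M
        elif pair == (0, 1):
            a = (a + m) & mask             # A = A + M
        # Arithmetic shift right of A, Q, Q-1 (sign bit of A is preserved).
        q_1 = q & 1
        q = ((a & 1) << (bits - 1)) | (q >> 1)
        a = (a >> 1) | (a & top)
    product = (a << bits) | q
    if product & (1 << (2 * bits - 1)):    # interpret A, Q as a signed 2n-bit value
        product -= 1 << (2 * bits)
    return product
```

`booth_multiply(13, -6, 5)` and `booth_multiply(23, 29, 6)` reproduce the two worked examples.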
HOW BOOTH’S ALGORITHM WORKS: +VE MULTIPLIER
▪ Consider the case of a positive multiplier consisting of one block of 1s surrounded by 0s
$M \times (00011110) = M \times (2^4 + 2^3 + 2^2 + 2^1) = M \times (16 + 8 + 4 + 2) = M \times 30$
$M \times (00011110) = M \times (2^5 - 2^1) = M \times (32 - 2) = M \times 30$
▪ In general, a block of 1s running from bit position $n-K$ up to bit position $n$ satisfies
$2^n + 2^{n-1} + \dots + 2^{n-K} = 2^{n+1} - 2^{n-K}$
▪ so the product can be generated by one addition (adding the multiplicand at the $2^5$ place value) and one
subtraction (subtracting the multiplicand at the $2^1$ place value).
▪ Booth’s algorithm conforms to this scheme by performing a subtraction when the first 1 of the
block is encountered (1–0) and an addition when the end of the block is encountered (0–1).
▪ The same scheme extends to a multiplier with more than one block of 1s:
$M \times (01111010) = M \times (2^6 + 2^5 + 2^4 + 2^3 + 2^1) = M \times (2^7 - 2^3 + 2^2 - 2^1)$
HOW BOOTH’S ALGORITHM WORKS: -VE MULTIPLIER
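▪ Let X be a negative number in twos complement notation: $X = \{1\,x_{n-2}\,x_{n-3} \dots x_1\,x_0\}$
▪ Then the value of X can be expressed as follows:
$X = -2^{n-1} + (x_{n-2} \times 2^{n-2}) + (x_{n-3} \times 2^{n-3}) + \dots + (x_1 \times 2^1) + (x_0 \times 2^0)$
▪ The leftmost bit of X is 1, because X is negative. Assume that the leftmost 0 is in the kth position.
Then X has the form
$X = \{1\,1\,1 \dots 1\,0\,x_{k-1}\,x_{k-2} \dots x_1\,x_0\}$
▪ And the value of X is:
$X = -2^{n-1} + 2^{n-2} + \dots + 2^{k+1} + (x_{k-1} \times 2^{k-1}) + \dots + (x_0 \times 2^0)$   (10.6)
▪ From the identity for a block of consecutive 1s,
▪ we can say that: $2^{n-2} + 2^{n-3} + \dots + 2^{k+1} = 2^{n-1} - 2^{k+1}$
▪ Rearranging: $-2^{n-1} + 2^{n-2} + 2^{n-3} + \dots + 2^{k+1} = -2^{k+1}$   (10.7)
▪ Substituting Equation (10.7) into Equation (10.6), we have
$X = -2^{k+1} + (x_{k-1} \times 2^{k-1}) + \dots + (x_0 \times 2^0)$   (10.8)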
HOW BOOTH’S ALGORITHM WORKS: -VE MULTIPLIER
▪ Consider the multiplication of some multiplicand by (−6). In twos complement representation,
using an 8-bit word, (−6) is represented as 11111010.
▪ $-6 = -2^7 + 2^6 + 2^5 + 2^4 + 2^3 + 2^1$
▪ $M \times (11111010) = M \times (-2^7 + 2^6 + 2^5 + 2^4 + 2^3 + 2^1)$
▪ $M \times (11111010) = M \times (-2^3 + 2^1)$, which is equivalent to
▪ $M \times (11111010) = M \times (-2^3 + 2^2 - 2^1)$ [scanning right to left: the first 0→1 transition gives $-2^1$, the 1→0 transition gives $+2^2$, the next 0→1 transition gives $-2^3$]
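
A small Python sketch can make the block argument concrete by recoding a multiplier into signed digits and checking that they sum back to the original value. The function names are hypothetical and the digit list is indexed by bit weight (index i has weight 2^i).

```python
def booth_recode(multiplier, bits):
    """Return Booth digits in {-1, 0, +1}; index i carries weight 2**i."""
    x = multiplier & ((1 << bits) - 1)
    digits = []
    prev = 0                               # implied bit to the right of bit 0
    for i in range(bits):
        cur = (x >> i) & 1
        # (cur, prev) = (1, 0): start of a block of 1s -> subtract (-1)
        # (cur, prev) = (0, 1): end of a block of 1s   -> add (+1)
        digits.append(prev - cur)
        prev = cur
    return digits


def recoded_value(digits):
    """Sum of digit * 2**i, i.e. the value the recoding represents."""
    return sum(d << i for i, d in enumerate(digits))
```

For 00011110 the recoding has only two nonzero digits (−1 at weight 2^1 and +1 at weight 2^5), and for 11111010 it gives −2^1 + 2^2 − 2^3 = −6.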

DIVISION

DIVISION: ALGORITHM
▪ Assumption: divisor V and the dividend D are positive and that |V|< |D|.
▪ If |V|= |D|, then the quotient =1 and the remainder=0.
▪ If |V|> |D|, then Q=0 and R=D. The algorithm can be summarized as follows:
1. Load the twos complement of the divisor into the M register; that is, the M register contains
the negative of the divisor. Load the dividend into the A, Q registers. The dividend must be
expressed as a 2n-bit positive number. Thus, for example, the 4-bit 0111 becomes
00000111.
2. Shift A, Q left 1 bit position.
3. Perform A=A+M. Because M holds the negative of the divisor, this subtracts the divisor from the contents of A.
4. Test the sign of the result:
a. If the result is nonnegative (most significant bit of A=0), then set Q0=1
b. If the result is negative (most significant bit of A=1), then set Q0=0, and restore the previous value of
A.
5. Repeat steps 2 through 4 as many times as there are bit positions in Q.
6. The remainder is in A and the quotient is in Q.
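
The numbered steps can be sketched in Python as follows. This is a minimal sketch assuming a nonnegative dividend and a positive divisor that both fit in `bits` bits with a 0 sign bit; the function name is an assumption.

```python
def restoring_divide(dividend, divisor, bits):
    """Restoring division; returns (quotient, remainder)."""
    if dividend < 0 or divisor <= 0:
        raise ValueError("this sketch handles positive operands only")
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    m = (-divisor) & mask                  # step 1: M holds the negative of the divisor
    a, q = 0, dividend & mask              # dividend as a 2n-bit value in A, Q
    for _ in range(bits):                  # step 5: one pass per bit of Q
        # Step 2: shift A, Q left one position.
        a = ((a << 1) | (q >> (bits - 1))) & mask
        q = (q << 1) & mask
        # Step 3: adding M subtracts the divisor from A.
        trial = (a + m) & mask
        if trial & sign == 0:              # step 4a: nonnegative, keep it, Q0 = 1
            a = trial
            q |= 1
        # Step 4b: negative, Q0 stays 0 and A keeps its previous value (restore).
    return q, a                            # step 6
```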

27
DIVISION: EXAMPLE
▪ Divide 8 by 3; M = −3 = 11101; Q = 01000
A      Q
00000  01000  Initial values
00000  10000  Shift left                                                   1st cycle
11101  10000  Subtract divisor (add 2’s complement of divisor); negative, so set Q0=0
00000  10000  Restore (add 00011)
00001  00000  Shift left                                                   2nd cycle
11110  00000  Subtract divisor (add 2’s complement of divisor); negative, so set Q0=0
00001  00000  Restore (add 00011)
00010  00000  Shift left                                                   3rd cycle
11111  00000  Subtract divisor (add 2’s complement of divisor); negative, so set Q0=0
00010  00000  Restore (add 00011)
00100  00000  Shift left                                                   4th cycle
00001  00001  Subtract divisor (add 2’s complement of divisor); nonnegative, so set Q0=1
00010  00010  Shift left                                                   5th cycle
11111  00010  Subtract divisor (add 2’s complement of divisor); negative, so set Q0=0
00010  00010  Restore (add 00011)
▪ Quotient Q = 00010 (2) and remainder A = 00010 (2)
DIVISION
▪ Consider the following examples of integer division with all possible combinations of
signs of D and V:
D=7 V=3 ➔ Q=2 R=1
D=7 V = -3 ➔ Q = -2 R = 1
D = -7 V = 3 ➔ Q = -2 R = -1
D = -7 V = -3 ➔ Q=2 R = -1
▪ (-7)/(3) and (7)/(-3) produce different remainders.
▪ The magnitudes of Q and R are unaffected by the input signs
▪ The signs of Q and R are easily derivable from the signs of D and V.
▪ sign(R) = sign(D)
▪ sign(Q) = sign(D) * sign(V).

▪ One way to do twos complement division is to convert the operands into unsigned
values and, at the end, to account for the signs by complementation where needed.
▪ This is the method of choice for the restoring division algorithm.
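
The sign rules can be sketched in Python by dividing the magnitudes and then fixing the signs. The function name is hypothetical, and Python's built-in `divmod` is used here only on nonnegative values, standing in for the unsigned restoring divider.

```python
def signed_divide(d, v):
    """Divide D by V using magnitudes, then apply the sign rules."""
    if v == 0:
        raise ZeroDivisionError("divisor is zero")
    q, r = divmod(abs(d), abs(v))          # unsigned division of the magnitudes
    if (d < 0) != (v < 0):
        q = -q                             # sign(Q) = sign(D) * sign(V)
    if d < 0:
        r = -r                             # sign(R) = sign(D)
    return q, r
```

In every case the results satisfy D = Q × V + R, which is a convenient check on the four combinations listed above.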
EXERCISE
1. Given x and y in twos complement notation i.e., x=0101 and y=1010,compute the product
p=x*y with Booth’s algorithm
2. Use the Booth algorithm to multiply 23 (multiplicand) by 29 (multiplier), where each number
is represented using 6 bits
3. Divide 145 by 13 in binary twos complement notation, using 12-bit words. Use the restoring
division algorithm

30
TOPICS COVERED FROM
▪ Textbook 2:
▪ Chapter 10: 10.3

31
FLOATING-POINT NUMBERS
▪ The basic IEEE format is a 32-bit representation that comprises
▪ a sign bit,
▪ 23 significant bits, and
▪ 8 bits for a signed exponent of the scale factor

▪ IEEE standard also defines a 64-bit representation to accommodate


▪ more significant bits, and
▪ more bits for the signed exponent, resulting in much higher precision and a much larger range of
values
▪ In general, a binary floating-point number can be represented by (2008 version of IEEE
Standard 754):
▪ a sign for the number
▪ some significant bits
▪ a signed scale factor exponent for an implied base of 2

32
TOPICS COVERED FROM
▪ Textbook 1:
▪ Chapter 1: 1.4.1, 1.4.2, 1.5

IEEE STANDARD FLOATING-POINT FORMATS (32 BIT)
▪ Example:
▪ Sign bit: 0, hence +ve number
▪ Mantissa, M: 1.00101000000000000000000
▪ Exponent, $E = E' - 127$, where $E' = 00101000_2 = 40_{10}$
$E = 40 - 127 = -87$,
giving a scale factor of $2^{-87}$

▪ Actual binary number:


▪ 0.0000000000000000000000000000000000000000000000000000000000000000000000000000000000
0000100101
▪ Corresponding decimal value:
▪ 0.0000000000000000000000000074720904942534238208598929703585511674646113533526659011
8408203125
▪ In the value represented, what about the digit to the left of the binary point?
▪ For a normalized number it is always equal to 1
▪ so it can be left out of the IEEE floating-point representation (the hidden bit)
36
STEPS TO CONVERT 32-BIT REPRESENTATION TO DECIMAL.
1. Obtain the mantissa and rewrite the value as V=1.M
2. To obtain E
i. Convert E’ to its equivalent decimal value
ii. E = E’ − 127, giving the scale factor $2^E$
3. Move the binary point in V towards left or right based on E
i. Move right if E is +ve
ii. Move left if E is –ve
iii. The digits before the point form the integral part and the digits after it form the fractional part.
4. Convert the integral part of binary to decimal equivalent
i. Multiply each digit, starting from the digit immediately left of the point and moving left to the first digit, by $2^0, 2^1, 2^2, \dots$ respectively.
ii. Find the sum of all the products obtained in step 4.i.
5. Convert the fractional part of binary to decimal equivalent
i. Divide each digit, starting from the digit immediately right of the point and moving right to the last digit, by $2^1, 2^2, 2^3, \dots$ respectively.
ii. Find the sum of all the quotients obtained in step 5.i.
6. Add both integral and fractional part to obtain decimal number.
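
The conversion steps can be sketched in Python as below. The sketch assumes a normal number (E’ between 1 and 254) given as a bit string; the function name is an assumption, and the arithmetic shortcut (1 + M/2^23) × 2^E replaces the manual digit-by-digit conversion of steps 3 to 6.

```python
def decode_single(bit_string):
    """Decode a 32-bit IEEE 754 pattern such as '0 10000001 0100...' (normal values only)."""
    bits = bit_string.replace(" ", "")
    if len(bits) != 32:
        raise ValueError("expected 32 bits")
    sign = -1.0 if bits[0] == "1" else 1.0
    e_prime = int(bits[1:9], 2)            # stored exponent E'
    if e_prime in (0, 255):
        raise ValueError("special values are not handled in this sketch")
    fraction = int(bits[9:], 2) / 2 ** 23  # M as a fraction, so V = 1 + fraction
    return sign * (1.0 + fraction) * 2.0 ** (e_prime - 127)
```

For instance, `decode_single("0 10000100 01000001010000000000000")` evaluates 1.0100000101 × 2^5.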
37
32-BIT REPRESENTATION TO DECIMAL: EXAMPLE
▪ IEEE754 32-bit format: 0 10000001 01000110011001100110011 (fields S, E’, M)
▪ S=0, hence +ve number
▪ $E' = 10000001_2 = 129_{10}$
▪ $E = E' - 127 = 129 - 127 = 2$
▪ Value represented $= 1.01000110011001100110011 \times 2^2$
$= 101.000110011001100110011$
$\approx 5.1$

38
DECIMAL TO 32-BIT REPRESENTATION
▪ 40.15625
▪ S=0, since it is a positive number
▪ Mantissa, M,
▪ $40 \rightarrow 101000_2$ and $0.15625 \rightarrow .00101_2$
▪ $= 101000.00101_2$ (if the point moves towards the left, the exponent is +ve, else −ve)
▪ $= 1.0100000101 \times 2^5$ (ignore the 1 before the binary point)
▪ Therefore, M=01000001010000000000000 (fill the remaining positions on the right with zeroes)
▪ Exponent, E’, in excess-127 representation:
▪ E’=E+127
▪ =5+127
▪ =132
▪ $E' = 10000100_2$
▪ 32-bit representation:
▪ 0 10000100 01000001010000000000000

39
STEPS TO CONVERT DECIMAL TO 32-BIT REPRESENTATION
A. Sign bit, S:
i. If positive, the first bit will be a 0
ii. If negative, the first bit will be a 1.
B. Mantissa, M:
i. Divide the integral part by 2 to get its binary equivalent, I (remainder in bottom-up)
ii. Multiply the fractional part by 2 to get its binary equivalent, F (value before point in top-down)
iii. Represent it as I.F
iv. Adjust the point to obtain 1.M
v. M must be 23-bit long. Fill the remaining bits in vector with zeroes

40
STEPS TO CONVERT DECIMAL TO 32-BIT REPRESENTATION
C. Exponent, E’, in “Excess 127 form”:
i. Count the number of places the binary point needs to be moved until a single digit of 1 sits by itself on
the left side of the binary point.
a. Point moved towards left count is positive else it is negative.
ii. Add 127 to your result above.
a. 8-bit binary numbers can range from 0 to 255, but E’ = 0 and E’ = 255 are reserved for special values,
b. so exponents in single precision format can range from −126 to +127, that is, scale factors from $2^{-126}$ to $2^{127}$ or,
c. approximately, $10^{-38}$ to $10^{38}$ in size.
d. In “excess 127 form” negative exponents are stored as 1 to 126, and positive exponents as 128 to 254.
e. 127 represents a power of zero.
iii. Translate the sum (which will always be a positive value after adding 127) into binary form.
iv. This should be represented using 8 bits, so zeros may be added to the left side to ensure a string length of
8 bits.
v. This 8-bit string represents the exponent.
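
Steps A to C can be sketched in Python as follows. The sketch assumes a nonzero value whose exponent lies in the normal range, uses exact rational arithmetic from the standard `fractions` module, and simply chops any bits beyond the 23rd; the function name is hypothetical.

```python
from fractions import Fraction


def encode_single(value):
    """Return 'S EEEEEEEE MMM...' for a nonzero value in the normal range."""
    x = Fraction(value)
    if x == 0:
        raise ValueError("zero is a special value, not handled in this sketch")
    sign = "1" if x < 0 else "0"           # step A
    x = abs(x)
    e = 0
    while x >= 2:                          # point moves left: exponent goes up
        x /= 2
        e += 1
    while x < 1:                           # point moves right: exponent goes down
        x *= 2
        e -= 1
    if not -126 <= e <= 127:
        raise ValueError("exponent outside the normal range")
    frac = x - 1                           # drop the hidden 1, leaving 0.M
    mantissa = []
    for _ in range(23):                    # repeated multiplication by 2 (step B.ii)
        frac *= 2
        if frac >= 1:
            mantissa.append("1")
            frac -= 1
        else:
            mantissa.append("0")
    return f"{sign} {e + 127:08b} {''.join(mantissa)}"   # step C: E' = E + 127
```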

41
DECIMAL TO 32-BIT REPRESENTATION
▪ Represent -0.09375 in IEEE754 format
▪ S=1, since it is a negative number
▪ Mantissa, M: 0.09375 to binary
▪ 0.09375 x 2 = 0.1875 0 (remember, read downwards)
▪ 0.1875 x 2 = 0.375 0
▪ 0.375 x 2 = 0.75 0
▪ 0.75 x 2 = 1.50 1
▪ 0.50 x 2 = 1.00 1
▪ $= 0.00011_2$
▪ $= 1.1 \times 2^{-4}$ (if the point moves towards the left, the exponent is +ve, else −ve; ignore the 1 before the point)
▪ Therefore, M=10000000000000000000000 (fill the remaining positions on the right with zeroes)
▪ Exponent, E’, in excess-127 representation:
▪ E’=E+127
▪ =−4+127
▪ =123
▪ $E' = 1111011_2$, written in 8 bits as 01111011
▪ -0.09375 will be represented in IEEE754 format as
▪ 1 01111011 10000000000000000000000
42
IEEE STANDARD FLOATING-POINT FORMATS (64 BIT)
▪ IEEE standard also defines a 64-bit representation to accommodate
▪ more significant bits, and
▪ more bits for the signed exponent, resulting in much higher precision and a much larger range of
values
▪ The 11-bit excess-1023 exponent E’ has the
▪ range 1 ≤ E’ ≤ 2046 for normal values, with 0 and 2047 used to indicate special values,
▪ The actual exponent E is in the range −1022 ≤ E ≤ 1023
▪ providing scale factors of $2^{-1022}$ to $2^{1023}$ (approximately $10^{\pm 308}$)

43
SPECIAL VALUES: 32-BIT REPRESENTATION
Sl.No.  E’     M    Meaning
1.      =0     =0   The value 0 is represented
2.      =255   =0   The value ∞ is represented
3.      =0     ≠0   Denormal numbers are represented. Their value is $\pm 0.M \times 2^{-126}$.
                    There is no implied one to the left of the binary point, and M is
                    any nonzero 23-bit fraction
4.      =255   ≠0   The value represented is called Not a Number (NaN).
                    A NaN represents the result of performing an invalid operation
                    such as 0/0 or $\sqrt{-1}$
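
The four special cases reduce to a test on the E’ and M fields, which can be sketched in Python as below. The function name and the returned labels are assumptions chosen for illustration.

```python
def classify_single(bit_string):
    """Classify a 32-bit pattern by its E' and M fields."""
    bits = bit_string.replace(" ", "")
    if len(bits) != 32:
        raise ValueError("expected 32 bits")
    e_prime = int(bits[1:9], 2)
    m = int(bits[9:], 2)
    if e_prime == 0:
        return "zero" if m == 0 else "denormal"
    if e_prime == 255:
        return "infinity" if m == 0 else "NaN"
    return "normal"
```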

46
EXCEPTIONS
▪ In conforming to the IEEE Standard, a processor must set exception flags if any of the following
conditions arise when performing operations
▪ underflow,
▪ overflow,
▪ divide by zero,
▪ Inexact: name for a result that requires rounding in order to be represented in one of the normal formats
▪ Invalid: exception occurs if operations such as 0/0 or √−1 are attempted

▪ When an exception occurs, the result is set to one of the special values.

47
ADDITION EXAMPLE- BINARY
▪ A=96.625 + B=12.125
▪ Step i: Convert A to binary representation
▪ 1100000.101 ➔ After normalizing we get $1.100000101 \times 2^6$
▪ E’= 6+127=133 =10000101
▪ In IEEE 32-bit format: A= 0 10000101 10000010100000…….

▪ Step ii: Convert B to binary representation
▪ 1100.001 ➔ After normalizing we get $1.100001 \times 2^3$
▪ E’= 3+127=130 = 10000010
▪ In IEEE 32-bit format: B= 0 10000010 10000100000……

▪ Step 1: Choose the number with the smaller exponent and shift its mantissa right a number of steps
equal to the difference in exponents (Shift point to left).
▪ Shifting the mantissa of the smaller number, B, to the right by 3 bits we get
▪ 1.100001000000 ---Original mantissa (with hidden bit considered)
▪ 0.110000100000 ---shifting by 1 bit
▪ 0.011000010000 ---shifting by 2 bits
▪ 0.001100001000 ---shifting by 3 bits
48
ADDITION EXAMPLE- BINARY
▪ Step 2:
▪ Set the exponent of the result equal to the larger exponent
▪ 10000101 (exponent of A)

▪ Step 3:
▪ Perform addition on the mantissas and determine the sign of the result
1.100000101000000 +
0.001100001000000
1.101100110000000
▪ Step 4:
▪ Normalize the resulting value, if necessary
▪ Result is already normalized

▪ In 32-bit format: 0 10000101 101100110000


▪ which is 108.75 in decimal

49
ADD/SUBTRACT RULE
1. Choose the number with the smaller exponent
2. Shift its mantissa right to the number of steps equal to the difference in exponents. (Shift point to
left)
3. Set the exponent of the result equal to the larger exponent.
4. Perform addition/subtraction on the mantissas and determine the sign of the result.
5. Normalize the resulting value, if necessary.
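
The five-step rule can be sketched in Python on (sign, E’, M) triples, where M is the 23-bit fraction field as an integer. This is a simplified sketch: it assumes normal operands, chops the bits shifted out during alignment, and does no exponent range checking; the function name and tuple layout are assumptions.

```python
def fp_add(x, y):
    """Add two single-precision values given as (sign, e_prime, frac23) triples."""
    (sx, ex, fx), (sy, ey, fy) = x, y
    mx = (1 << 23) | fx                    # restore the hidden bit: 24-bit mantissas
    my = (1 << 23) | fy
    # Steps 1-2: shift the mantissa with the smaller exponent right.
    if ex < ey:
        mx >>= ey - ex
    else:
        my >>= ex - ey
    e = max(ex, ey)                        # step 3: result takes the larger exponent
    # Step 4: add or subtract the mantissas and determine the sign.
    total = (-mx if sx else mx) + (-my if sy else my)
    sign = 1 if total < 0 else 0
    mag = abs(total)
    if mag == 0:
        return (0, 0, 0)
    # Step 5: normalize so that the hidden bit is 1 again.
    while mag >= (1 << 24):
        mag >>= 1
        e += 1
    while mag < (1 << 23):
        mag <<= 1
        e -= 1
    return (sign, e, mag & ((1 << 23) - 1))
```

Subtraction is obtained by flipping the sign bit of the second operand before calling `fp_add`.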

50
ADDITION EXAMPLE- BINARY
▪ 0.25 = 0 01111101 00000000000000000000000
▪ 100 = 0 10000101 10010000000000000000000
▪ Step 1: align radix points
▪ shifting the mantissa LEFT by 1 bit DECREASES THE EXPONENT by 1 and RIGHT by 1 INCREASES THE
EXPONENT by 1
▪ we want to shift the mantissa right, because the bits that fall off the end should come from the least significant end
of the mantissa
▪ choose to shift the .25, since we want to increase its exponent.
▪ shift by
 10000101
-01111101
 00001000 (8) places.
00000000000000000000000 (original value)
10000000000000000000000 (shifted 1 place) (note that hidden bit is shifted into msb of mantissa)
01000000000000000000000 (shifted 2 places)
00100000000000000000000 (shifted 3 places)
00010000000000000000000 (shifted 4 places)
00001000000000000000000 (shifted 5 places)
00000100000000000000000 (shifted 6 places)
00000010000000000000000 (shifted 7 places)
00000001000000000000000 (shifted 8 places)
51
ADDITION EXAMPLE- BINARY
▪ Step 2: add (don't forget the hidden bit for the 100)
1.10010000000000000000000 (100) +
0.00000001000000000000000 (.25)
1.10010001000000000000000
▪ Step 3: normalize the result (get the "hidden bit" to be a 1)
▪ Result is already normalized

▪ In 32-bit format: 0 10000101 10010001000000000000000

52
SUBTRACTION EXAMPLE- BINARY
▪ A=96.625 - B=12.125
▪ Convert A to binary representation
▪ 1100000.101 ➔ After normalizing we get $1.100000101 \times 2^6$
▪ E’= 6+127=133 =10000101
▪ In IEEE 32-bit format: 0 10000101 10000010100000…….

▪ Convert B to binary representation
▪ 1100.001 ➔ After normalizing we get $1.100001 \times 2^3$
▪ E’= 3+127=130 = 10000010
▪ In IEEE 32-bit format: 0 10000010 10000100000……

▪ Step 1:
▪ Choose the number with the smaller exponent and shift its mantissa right a number of steps equal to the
difference in exponents.
▪ A=0 10000101 10000010100000……. B=0 10000010 10000100000……
▪ Shifting the mantissa of the smaller number, B, to the right by 3 bits we get
▪ 1.100001000000 ---Original mantissa (with hidden bit considered)
▪ 0.110000100000 ---shifting by 1 bit
▪ 0.011000010000 ---shifting by 2 bits
▪ 0.001100001000 ---shifting by 3 bits
53
SUBTRACTION EXAMPLE- BINARY
▪ Step 2:
▪ Set the exponent of the result equal to the larger exponent
▪ 10000101 (exponent of A)

▪ Step 3:
▪ Perform subtraction on the mantissas and determine the sign of the result
1.10000010100000 -
0.00110000100000
1.01010010000000
▪ Step 4:
▪ Normalize the resulting value, if necessary
▪ Result is already normalized

▪ In 32-bit format: 0 10000101 010100100000000


▪ which is 84.5 in decimal

54
SUBTRACTION EXAMPLE- BINARY
▪ 100 − 0.25, where 100 = 0 10000101 10010000000000000000000 and 0.25 = 0 01111101 00000000000000000000000
▪ Step 1: align radix points
▪ shifting the mantissa LEFT by 1 bit DECREASES THE EXPONENT by 1 and RIGHT by 1 INCREASES THE
EXPONENT by 1
▪ we want to shift the mantissa right, because the bits that fall off the end should come from the least significant end
of the mantissa
▪ choose to shift the .25, since we want to increase its exponent.
▪ shift by
 10000101
-01111101
 00001000 (8) places.
00000000000000000000000 (original value)
10000000000000000000000 (shifted 1 place) (note that hidden bit is shifted into msb of mantissa)
01000000000000000000000 (shifted 2 places)
00100000000000000000000 (shifted 3 places)
00010000000000000000000 (shifted 4 places)
00001000000000000000000 (shifted 5 places)
00000100000000000000000 (shifted 6 places)
00000010000000000000000 (shifted 7 places)
00000001000000000000000 (shifted 8 places)
55
SUBTRACTION EXAMPLE- BINARY
▪ Step 2: subtract (don't forget the hidden bit for the 100)
1.10010000000000000000000 (100) -
0.00000001000000000000000 (.25)
1.10001111000000000000000
▪ Step 3: normalize the result (get the "hidden bit" to be a 1)
▪ Result is already normalized

▪ In 32-bit format: 0 10000101 10001111000000000000000
▪ Check: E’ = 133, so E = 133 − 127 = 6
$1.10001111000000000000000 \times 2^6$
$= 1100011.11000000000000000$
$= 99.75$
56
MULTIPLY AND DIVIDE RULE
▪ Multiply Rule
A. Add the exponents and subtract 127 to maintain the excess-127 representation.
B. Multiply the mantissas and determine the sign of the result.
C. Normalize the resulting value, if necessary.

▪ Divide Rule
A. Subtract the exponents and add 127 to maintain the excess-127 representation.
B. Divide the mantissas and determine the sign of the result.
C. Normalize the resulting value, if necessary.
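
Both rules can be sketched in Python on (sign, E’, M) triples, where M is the 23-bit fraction field as an integer. The sketch assumes normal operands, chops any extra result bits, and does no exponent range checking; the function names and tuple layout are assumptions.

```python
FRAC_MASK = (1 << 23) - 1


def fp_multiply(x, y):
    """Multiply rule on (sign, e_prime, frac23) triples."""
    (sx, ex, fx), (sy, ey, fy) = x, y
    e = ex + ey - 127                              # A: add exponents, subtract 127
    product = ((1 << 23) | fx) * ((1 << 23) | fy)  # B: 48-bit product, 46 fraction bits
    sign = sx ^ sy                                 # signs differ -> negative result
    if product >= (1 << 47):                       # C: product in [2, 4), renormalize
        product >>= 1
        e += 1
    return (sign, e, (product >> 23) & FRAC_MASK)


def fp_divide(x, y):
    """Divide rule on (sign, e_prime, frac23) triples."""
    (sx, ex, fx), (sy, ey, fy) = x, y
    e = ex - ey + 127                              # A: subtract exponents, add 127
    mx, my = (1 << 23) | fx, (1 << 23) | fy
    quotient = (mx << 23) // my                    # B: quotient with 23 fraction bits
    if quotient < (1 << 23):                       # C: quotient in (0.5, 1), renormalize
        quotient = (mx << 24) // my
        e -= 1
    return (sx ^ sy, e, quotient & FRAC_MASK)
```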

57
TOPICS COVERED FROM
▪ Textbook 1:
▪ Chapter 1: 1.4.2,
▪ Chapter 9: 9.7, 9.7.1

58
ADDITION EXAMPLE- BINARY
▪ A=15.5 + B=15.5
▪ Step i: Convert A to binary representation
▪ 1111.1 ➔ After normalizing we get $1.1111 \times 2^3$
▪ E’= 3+127=130 =10000010
▪ In IEEE 32-bit format: A= 0 10000010 11110000000000000000000

▪ Step ii: Convert B to binary representation
▪ 1111.1 ➔ After normalizing we get $1.1111 \times 2^3$
▪ E’= 3+127=130 =10000010
▪ In IEEE 32-bit format: B= 0 10000010 11110000000000000000000

▪ Step 1: Choose the number with the smaller exponent and shift its mantissa right a number of steps
equal to the difference in exponents.
▪ Both are of equal exponent. Hence no shift

▪ Step 2:
▪ Set the exponent of the result equal to the larger exponent
▪ 10000010 (exponent of A or B)
ADDITION EXAMPLE- BINARY
▪ Step 3:
▪ Perform addition on the mantissas and determine the sign of the result
 1.11110000000000000000000 +
 1.11110000000000000000000
11.11100000000000000000000
▪ Step 4:
▪ Normalize the resulting value, if necessary
▪ $1.11110000000000000000000 \times 2^1$
▪ Adjust resultant exponent E’ by adding exponent from the normalized resultant mantissa
10000010 + (if the exponent is –ve, subtract it from resultant E’)
00000001
10000011
▪ In 32-bit format: 0 10000011 11110000000000000000000
▪ which is 31.0 in decimal
MULTIPLICATION EXAMPLE- BINARY
▪ A= 0 10000100 0100 × B= 1 00111100 1100

▪ Step 1: add exponents and subtract 127
▪ 132+60-127=65, and the unsigned representation for 65 is 01000001.
▪ Step 2: Multiply the mantissas. Don't forget the hidden bit
1.0100 × 1.1100
     00000
    00000
   10100
  10100
 10100
1000110000 becomes 10.00110000
normalize the result:
$1.000110000 \times 2^1$
▪ Step 3: Adjust the exponent: 65+1=66 = 01000010
▪ The sign bits of A and B differ, so the result is negative:
1 01000010 000110000
62
MULTIPLICATION EXAMPLE- BINARY
▪ A=96.625 × B=12.125
▪ Step i: Convert A to binary representation
▪ 1100000.101 ➔ After normalizing $1.100000101 \times 2^6$
▪ E’= 6+127=133 =10000101
▪ In IEEE 32-bit format: A= 0 10000101 10000010100000…….
▪ Step ii: Convert B to binary representation
▪ 1100.001 ➔ After normalizing $1.100001 \times 2^3$
▪ E’= 3+127=130 = 10000010
▪ In IEEE 32-bit format: B= 0 10000010 10000100000……
▪ Step 1: add exponents and subtract 127
▪ 133+130-127=136 ➔ 10001000 (unsigned representation)
▪ Step 2: Multiply the mantissa
1.100000101 × 1.100001
       1100000101
      0000000000
     0000000000
    0000000000
   0000000000
  1100000101
 1100000101
10010010011100101
➔ 10.010010011100101
After normalizing: $1.0010010011100101 \times 2^1$
▪ Step 3: Adjust the exponent:
▪ 136+1=137 ➔ 10001001
▪ 32-bit representation:
▪ 0 10001001 0010010011100101
▪ $= 1.0010010011100101 \times 2^{10}$
▪ =10010010011.100101
▪ =1024+128+16+2+1+0.5+0.0625+0.015625
▪ =1171.578125
DIVISION EXAMPLE- BINARY
▪ A= 127.03125 ÷ B= 16.9375
▪ Step i: Convert A to binary representation
▪ 1111111.00001 ➔ After normalizing $1.11111100001 \times 2^6$
▪ E’= 6+127=133 =10000101
▪ In IEEE 32-bit format: A= 0 10000101 11111100001……
▪ Step ii: Convert B to binary representation
▪ 10000.1111 ➔ After normalizing $1.00001111 \times 2^4$
▪ E’= 4+127=131 = 10000011
▪ In IEEE 32-bit format: B= 0 10000011 00001111000……
▪ Step 1: subtract exponents and add 127
▪ 133-131+127=129 ➔ 10000001 (unsigned representation)
▪ Step 2: Divide the mantissa (long division of 1.11111100001 by 1.00001111, binary points dropped)
111111100 − 100001111 = 011101101 → quotient bit 1; bring down 0 → 0111011010
0111011010 − 100001111 = 011001011 → quotient bit 1; bring down 0 → 0110010110
0110010110 − 100001111 = 010000111 → quotient bit 1; bring down 1 → 0100001111
0100001111 − 100001111 = 000000000 → quotient bit 1; remainder 0
1.11111100001 ÷ 1.00001111= 1.111
The result is already normalized.
▪ The result in IEEE 32-bit format:
0 10000001 11100000000…….
64
DIVISION EXAMPLE- BINARY
▪ A= 97.0 ÷ B= 12.125
▪ Step i: Convert A to binary representation
▪ 1100001.0 ➔ After normalizing $1.100001 \times 2^6$
▪ E’= 6+127=133 =10000101
▪ In IEEE 32-bit format: A= 0 10000101 10000100……
▪ Step ii: Convert B to binary representation
▪ 1100.001 ➔ After normalizing $1.100001 \times 2^3$
▪ E’= 3+127=130 = 10000010
▪ In IEEE 32-bit format: B= 0 10000010 10000100000……

▪ Step 1: subtract exponents and add 127
▪ 133-130+127=130 ➔ 10000010 (unsigned representation)

▪ Step 2: Divide the mantissa: 1.100001 ÷ 1.100001= 1.0 (1.100001 − 1.100001 leaves remainder 0)
The result is already normalized.

▪ The result in IEEE 32-bit format:
0 10000010 00000000000…….
i.e., $1.0 \times 2^3 = 1000_2 = 8$ in decimal
GUARD BITS AND TRUNCATION
▪ Mantissas of initial operands and final results are limited to 24 bits, including the
implicit leading 1
▪ To attain maximum accuracy in the final results it is important to retain extra bits,
often called guard bits, during the intermediate steps.
▪ Removing guard bits in generating a final result requires that the extended
mantissa be truncated to create a 24-bit number that approximates the longer
version.

66
WAYS OF TRUNCATION: CHOPPING
▪ Remove the guard bits and make no changes in the retained bits
▪ e.g. To truncate from 6 bits to 3 bits:
▪ all fractions from $0.b_{-1}b_{-2}b_{-3}000$ to $0.b_{-1}b_{-2}b_{-3}111$ are truncated to $0.b_{-1}b_{-2}b_{-3}$
▪ The error in the 3-bit result ranges from 0 to 0.000111
▪ The error in chopping ranges from 0 to almost 1 in the least significant position of
the retained bits.
▪ In the example above, it is the $b_{-3}$ position. The result of chopping is a biased
approximation because the error range is not symmetrical about 0

67
WAYS OF TRUNCATION: VON NEUMANN ROUNDING
▪ If the bits to be removed are all 0s,
▪ they are simply dropped, with no changes to the retained bits.
▪ If any of the bits to be removed are 1,
▪ the least significant bit of the retained bits is set to 1
▪ All 6-bit fractions where $b_{-4}b_{-5}b_{-6} \neq 000$ are truncated to $0.b_{-1}b_{-2}1$
▪ The error in this truncation method ranges between −1 and +1 in the LSB position
of the retained bits
▪ The approximation is unbiased because the error range is symmetrical about 0.
▪ When three guard bits are used, the value 0.001100 is truncated to 0.001

68
WAYS OF TRUNCATION: VON NEUMANN ROUNDING
▪ 0.001 00000 → 0.001 (bits dropped; 0 error)
▪ 0.001 11111 → 0.001 (but the value is near 0.010, hence almost −1 error at the LSB position of the retained bits)
▪ 0.010 11111 → 0.011 (almost the nearest value; almost 0 error)
▪ 0.010 00001 → 0.011 (almost +1 error)
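
Chopping and Von Neumann rounding can be sketched in Python on fractions held as integers (for example, 0.001100 is passed as `0b001100` with 3 guard bits). The function names and this integer encoding are assumptions made for illustration.

```python
def chop(value, guard_bits):
    """Chopping: drop the guard bits, leave the retained bits unchanged."""
    return value >> guard_bits


def von_neumann_round(value, guard_bits):
    """Von Neumann rounding: force the retained LSB to 1 if any removed bit is 1."""
    kept = value >> guard_bits
    removed = value & ((1 << guard_bits) - 1)
    return (kept | 1) if removed else kept
```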

69
WAYS OF TRUNCATION: ROUNDING
▪ Achieves the closest approximation to the number being truncated and is an unbiased
technique.
▪ A 1 is added to the LSB position of the bits to be retained if there is a 1 in the MSB position of
the bits being removed. Thus,
▪ $0.b_{-1}b_{-2}b_{-3}1\ldots$ is rounded to $0.b_{-1}b_{-2}b_{-3} + 0.001$
▪ $0.b_{-1}b_{-2}b_{-3}0\ldots$ is rounded to $0.b_{-1}b_{-2}b_{-3}$.
▪ The exception is the case in which the bits to be removed are 10 . . . 0.
▪ This is a tie situation; the longer value is halfway between the two closest truncated representations.
▪ To break the tie
▪ choose the retained bits to be the nearest even number
▪ the value $0.b_{-1}b_{-2}0100$ is truncated to $0.b_{-1}b_{-2}0$
▪ the value $0.b_{-1}b_{-2}1100$ is truncated to $0.b_{-1}b_{-2}1 + 0.001$.
▪ Also termed as “round to the nearest number or nearest even number in case of a tie”
▪ The error range is approximately −1/2 to +1/2 in the LSB position of the retained bits.
WAYS OF TRUNCATION: ROUNDING
▪ Best method
▪ But most difficult to implement because it requires an addition operation and a possible
renormalization.
▪ This rounding technique is the default mode for truncation specified in the IEEE floating-point
standard
▪ When three guard bits are used, using Rounding procedure, the value 0.001100 is truncated to
0.010
▪ Similarly,
i. 0.111 011 → 0.111
ii. 0.110 011 → 0.110
iii. 0.111 101 → 0.111+0.001=1.000
iv. 0.111 100 → 0.111+0.001=1.000
v. 0.110 100 → 0.110
vi. 0.101 100 → 0.101+0.001=0.110
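
Round-to-nearest with ties to even can be sketched in Python as below, again with fractions held as integers (0.111 100 is passed as `0b111100` with 3 guard bits). The function name and the integer encoding are assumptions.

```python
def round_nearest_even(value, guard_bits):
    """Round to nearest; on a tie (removed bits are 10...0) pick the even result."""
    kept = value >> guard_bits
    removed = value & ((1 << guard_bits) - 1)
    half = 1 << (guard_bits - 1)           # the pattern 10...0
    if removed > half or (removed == half and kept & 1):
        kept += 1                          # may carry out and require renormalization
    return kept
```

The six cases in the list above map to calls such as `round_nearest_even(0b110100, 3)`, which keeps 110 because it is already even.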
IMPLEMENTING ROUNDING
▪ Requires only three guard bits to be carried along during the intermediate steps in performing
an operation.
▪ The first two of these bits are the two most significant bits of the section of the mantissa to be
removed.
▪ The third bit is the logical OR of all bits beyond these first two bits in the full representation of
the mantissa.
▪ It should be initialized to 0.
▪ If a 1 is shifted out through this position while aligning mantissas, the bit becomes 1 and retains
that value; hence, it is usually called the sticky bit.
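
The behaviour of the three guard bits during mantissa alignment can be sketched in Python as follows. The function name and the returned tuple layout (kept bits, first guard bit, second guard bit, sticky bit) are assumptions made for illustration.

```python
def shift_right_with_sticky(mantissa, shift):
    """Shift a mantissa right, tracking two guard bits and a sticky bit."""
    extended = mantissa << 2               # make room for the two guard bits
    sticky = 0                             # initialized to 0
    for _ in range(shift):
        sticky |= extended & 1             # a 1 shifted past the guard bits sticks
        extended >>= 1
    kept = extended >> 2
    guard1 = (extended >> 1) & 1           # MSB of the removed section
    guard2 = extended & 1                  # next bit of the removed section
    return kept, guard1, guard2, sticky
```

For example, shifting 1011 right by 3 places leaves 1 with removed bits 011, so the guard bits are 0 and 1 and the sticky bit is 1.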

72
TOPICS COVERED FROM
▪ Textbook 1:
▪ Chapter 9: 9.7.1, 9.7.2
