Chapter-3 2
Addition and subtraction: -
In addition, digits are added bit by bit from right to left, with carries passed to the next digit to the left, just as you would do by hand.
Subtraction uses addition: the appropriate operand is simply negated before being added.
Adding $6_{ten}$ to $7_{ten}$: the low-order bits give $0110_{two} + 0111_{two} = 1101_{two} = 13_{ten}$ (the upper bits of both 32-bit operands are 0).
In adding two 32-bit numbers, the lack of a 33rd bit means that when overflow occurs, the sign bit is set with the value of the result instead of the proper sign of the result.
Hence, overflow occurs when adding two positive numbers and the sum is negative, or vice versa. This means a carry out occurred into the sign bit.
Overflow occurs in subtraction when we subtract a negative number from a positive number and get a negative result, or when we subtract a positive number from a negative number and get a positive result. This means a borrow occurred from the sign bit.
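The overflow rules above can be checked with a small Python sketch. The helper names `add32` and `sub32` are hypothetical (they are not from the notes), and the sketch assumes both inputs already fit in a signed 32-bit word.

```python
MASK32 = 0xFFFFFFFF
SIGN32 = 0x80000000


def wrap32(value):
    """Keep the low 32 bits and reinterpret them as a two's complement number."""
    raw = value & MASK32
    return raw - (1 << 32) if raw & SIGN32 else raw


def add32(a, b):
    """Return (wrapped sum, overflow flag) for signed 32-bit a + b."""
    result = wrap32(a + b)
    same_sign_inputs = (a >= 0) == (b >= 0)
    sign_changed = (result >= 0) != (a >= 0)
    # Overflow: operands share a sign but the result has the other sign.
    return result, same_sign_inputs and sign_changed


def sub32(a, b):
    """Return (wrapped difference, overflow flag) for signed 32-bit a - b."""
    result = wrap32(a - b)
    different_sign_inputs = (a >= 0) != (b >= 0)
    sign_changed = (result >= 0) != (a >= 0)
    # Overflow: operands differ in sign and the result's sign differs from a's.
    return result, different_sign_inputs and sign_changed


print(add32(6, 7))            # (13, False)
print(add32(2**31 - 1, 1))    # (-2147483648, True)
print(sub32(2**31 - 1, -1))   # (-2147483648, True)
```

Comparing signs of the operands and the result is only one way to detect overflow; hardware typically compares the carry into and out of the sign bit, which gives the same answer.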
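As a subtraction example, subtracting $6_{ten}$ from $7_{ten}$ can be done directly ($0111_{two} - 0110_{two} = 0001_{two}$) or by adding the two's complement of 6.
Multiplication: -
Multiplying $1000_{ten}$ by $1001_{ten}$ gives the product $1001000_{ten}$.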
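In the first version of the multiplication hardware, the 32-bit multiplicand starts in the right half of a 64-bit Multiplicand register. This register is then shifted left 1 bit each step to align the multiplicand with the sum being accumulated in the 64-bit Product register.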
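The shift-and-add procedure described above can be sketched in Python as follows. This is an illustration for unsigned operands only; the function name `shift_add_multiply` and the `bits` parameter are assumptions made for the example, not part of the notes.

```python
def shift_add_multiply(multiplicand, multiplier, bits=32):
    """Unsigned shift-and-add multiply of two `bits`-wide values."""
    mask = (1 << bits) - 1
    mcand = multiplicand & mask    # starts in the right half of a 2*bits register
    mplier = multiplier & mask
    product = 0
    for _ in range(bits):
        if mplier & 1:             # test the least significant multiplier bit
            product += mcand       # add the aligned multiplicand to the product
        mcand <<= 1                # shift the Multiplicand register left 1 bit
        mplier >>= 1               # shift the Multiplier register right 1 bit
    return product & ((1 << (2 * bits)) - 1)


print(shift_add_multiply(0b0010, 0b0011, bits=4))   # 6
```

With `bits=4` the loop runs four times, which matches a 4-bit walk-through of $0010_{two} \times 0011_{two}$.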
In the refined hardware, the Multiplicand register, ALU, and Multiplier register are all 32 bits wide, with only the Product register left at 64 bits. Now the product is shifted right. The separate Multiplier register has also disappeared: the multiplier is placed in the right half of the Product register. The Product register should really be 65 bits to hold the carry out of the adder.
Signed multiplication: -
To perform signed multiplication, first convert the multiplier and multiplicand to positive numbers and remember the original signs. The algorithm should then be run for 31 iterations, leaving the signs out of the calculation. We need to negate the product only if the original signs disagree.
Example: -
Multiply $0010_{two} \times 0011_{two}$ (that is, $2_{ten} \times 3_{ten}$); the product is $0000\ 0110_{two} = 6_{ten}$.
Division requires two operands, called the dividend and divisor, and produces two results, called the quotient and remainder.
The relationship between the components:
Dividend = Quotient × Divisor + Remainder
A division algorithm and hardware: -
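Since the hardware figure is not reproduced here, the following is only a sketch of one common textbook approach, shift-and-subtract long division on unsigned operands. Whether it matches the exact datapath the notes had in mind is an assumption; the name `shift_subtract_divide` is hypothetical.

```python
def shift_subtract_divide(dividend, divisor, bits=32):
    """Unsigned long division; returns (quotient, remainder)."""
    if divisor == 0:
        raise ZeroDivisionError("divisor must be nonzero")
    quotient = 0
    remainder = 0
    for i in range(bits - 1, -1, -1):
        # Bring the next dividend bit down into the remainder.
        remainder = (remainder << 1) | ((dividend >> i) & 1)
        quotient <<= 1
        # The comparison stands in for "subtract, then restore if negative".
        if remainder >= divisor:
            remainder -= divisor
            quotient |= 1
    return quotient, remainder


q, r = shift_subtract_divide(7, 2, bits=4)
print(q, r)                # 3 1
print(7 == q * 2 + r)      # True: Dividend = Quotient x Divisor + Remainder
```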
Signed division: -
Remember the signs of the divisor and dividend and then negate the quotient if
the signs disagree.
Thus the correctly signed division algorithm negates the quotient if the signs of
the operands are opposite and makes the sign of the nonzero remainder match
the dividend.
Example: -
7 ÷ 2 = Quotient 3, Remainder 1
-7 ÷ 2 = Quotient -3, Remainder -1
7 ÷ -2 = Quotient -3, Remainder 1
-7 ÷ -2 = Quotient 3, Remainder -1
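The sign rule above can be written as a short Python sketch. The function name `signed_divide` is hypothetical; it divides the magnitudes and then fixes up the signs as the notes describe.

```python
def signed_divide(dividend, divisor):
    """Return (quotient, remainder) with the remainder's sign matching the dividend."""
    if divisor == 0:
        raise ZeroDivisionError("divisor must be nonzero")
    quotient = abs(dividend) // abs(divisor)
    remainder = abs(dividend) % abs(divisor)
    if (dividend < 0) != (divisor < 0):
        quotient = -quotient       # signs disagree: negate the quotient
    if dividend < 0:
        remainder = -remainder     # nonzero remainder follows the dividend
    return quotient, remainder


for a, b in [(7, 2), (-7, 2), (7, -2), (-7, -2)]:
    print(a, b, signed_divide(a, b))
```

Note that Python's built-in `//` and `%` round toward negative infinity, so `divmod(-7, 2)` returns `(-4, 1)` rather than the `(-3, -1)` given by the rule in these notes. That is why the sketch works on absolute values first.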
IEEE 754 even has a symbol for the result of invalid operations, such as 0/0 or
subtracting infinity from infinity. This symbol is NaN, for Not a Number.
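As a quick illustration, Python floats (IEEE 754 double precision on common platforms) produce NaN for infinity minus infinity. This is only a demonstration of the behaviour, not part of the notes.

```python
import math

inf = float("inf")
result = inf - inf            # invalid operation: infinity minus infinity

print(math.isnan(result))     # True
print(result == result)       # False: NaN compares unequal even to itself
```

Python raises `ZeroDivisionError` for `0.0 / 0.0` instead of returning NaN, so the subtraction case is used here.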
If we use two's complement or any other notation in which negative exponents have a 1 in the most significant bit of the exponent field, a negative exponent will look like a big number.
For example, $1.0_{two} \times 2^{-1}$ would be represented with the exponent field $1111\ 1111_{two}$ (the two's complement pattern for -1), whereas the value $1.0_{two} \times 2^{+1}$, with exponent field $0000\ 0001_{two}$, would look like the smaller binary number.
The desirable notation must therefore represent the most negative exponent as $00 \ldots 00_{two}$ and the most positive as $11 \ldots 11_{two}$. This convention is called biased notation.
IEEE 754 uses a bias of 127 for single precision, so an exponent of -1 is represented by the bit pattern of the value $-1 + 127_{ten}$, or $126_{ten} = 0111\ 1110_{two}$, and +1 is represented by $1 + 127$, or $128_{ten} = 1000\ 0000_{two}$. The exponent bias for double precision is 1023.
Biased exponent means that the value represented by a floating-point number is really
$(-1)^{S} \times (1 + \text{Fraction}) \times 2^{(\text{Exponent} - \text{Bias})}$
The range of single precision numbers is then from as small as $\pm 1.0000\ 0000\ 0000\ 0000\ 0000\ 000_{two} \times 2^{-126}$ to as large as $\pm 1.1111\ 1111\ 1111\ 1111\ 1111\ 111_{two} \times 2^{+127}$.
Example: -
Show the IEEE 754 binary representation of the number $-0.75_{ten}$ in single and double precision.
ANS: -
The number $-0.75_{ten}$ is also $-(3/4)_{ten}$ or $-(3/2^{2})_{ten}$.
It is also represented by the binary fraction $-(11_{two}/2^{2}_{ten})$ or $-0.11_{two}$.
In normalized scientific notation this is $-1.1_{two} \times 2^{-1}$, so the sign bit is 1 and the fraction is $.1000 \ldots 0_{two}$.
The single precision representation is
$(-1)^{1} \times (1 + .1000\ 0000\ 0000\ 0000\ 0000\ 000_{two}) \times 2^{(126-127)}$
The double precision representation is
$(-1)^{1} \times (1 + .1000\ 0000\ 0000\ 0000\ 0000\ 0000\ 0000\ 0000\ 0000\ 0000\ 0000\ 0000\ 0000_{two}) \times 2^{(1022-1023)}$
Example: -
Convert the binary floating point to decimal floating point
ANS: -
The sign bit is 1, the exponent field contains 129, and the fraction field contains $1 \times 2^{-2} = 1/4$, or 0.25. Using the basic equation,
$(-1)^{S} \times (1 + \text{Fraction}) \times 2^{(\text{Exponent} - \text{Bias})} = (-1)^{1} \times (1 + 0.25) \times 2^{(129-127)}$
$= -1 \times 1.25 \times 2^{2}$
$= -1.25 \times 4$
$= -5.0$ (ANS)
Floating point addition: -
Assumption: - Assume that we can store only four decimal digits of the significand and two decimal digits of the exponent.
Example: -
Floating-point addition: $9.999_{ten} \times 10^{1} + 1.610_{ten} \times 10^{-1}$.
Step-1: -
To be able to add these numbers properly, we must align the decimal point of the number that has the smaller exponent.
$1.610_{ten} \times 10^{-1} = 0.01610_{ten} \times 10^{1}$. But we can represent only 4 digits of the significand. Thus, $0.01610_{ten} \times 10^{1}$ is written as $0.016_{ten} \times 10^{1}$.
Step-2: -
Perform the addition of the significands.
$9.999_{ten} + 0.016_{ten} = 10.015_{ten}$
Thus, the sum is $10.015_{ten} \times 10^{1}$.
CHAPTER-3 ARITHMETIC FOR COMPUTERS
Step-3: -
This sum is not in normalized scientific notation, so we need to adjust it:
$10.015_{ten} \times 10^{1} = 1.0015_{ten} \times 10^{2}$
Whenever the exponent is increased or decreased, we must check for overflow or underflow; that is, we must make sure that the exponent still fits in its field.
Step-4: -
Since we assumed that the significand can be only four digits long (excluding the sign), we must round the number. The number $1.0015_{ten} \times 10^{2}$ is rounded to four digits in the significand to $1.002_{ten} \times 10^{2}$.
Flow chart for floating point addition: - [figure not reproduced]
Binary floating point addition: -
Example: -
Perform the addition of the numbers $0.5_{ten}$ and $-0.4375_{ten}$ in binary.
ANS: -
The binary versions of the two numbers in normalized scientific notation, assuming that we keep 4 bits of precision, are as follows.
$0.5_{ten} = 1.000_{two} \times 2^{-1}$
$-0.4375_{ten} = -1.110_{two} \times 2^{-2}$
Step-1: -
The significand of the number with the lesser exponent ($-1.110_{two} \times 2^{-2}$) is shifted right until its exponent matches that of the larger number.
$-1.110_{two} \times 2^{-2} = -0.111_{two} \times 2^{-1}$
Step-2: -
Add the significands.
$1.000_{two} \times 2^{-1} + (-0.111_{two} \times 2^{-1}) = 0.001_{two} \times 2^{-1}$
Step-3: -
Normalize the sum, checking for overflow and underflow.
$0.001_{two} \times 2^{-1} = 0.010_{two} \times 2^{-2} = 0.100_{two} \times 2^{-3} = 1.000_{two} \times 2^{-4}$
Since $127 \ge -4 \ge -126$, there is no overflow or underflow. (The biased exponent would be -4 + 127, or 123, which is between 1 and 254, the smallest and largest unreserved biased exponents.)
Step-4: -
Round the sum $1.000_{two} \times 2^{-4}$.
The sum already fits exactly in 4 bits, so there is no change to the bits due to rounding.
This sum is then
$1.000_{two} \times 2^{-4} = 0.0001000_{two} = 0.0001_{two} = (1/2^{4})_{ten} = (1/16)_{ten} = 0.0625_{ten}$
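The four steps of the binary example can be traced with a small Python sketch that uses exact fractions. It is a simplified model, not the hardware algorithm: the names `normalize`, `fp_add`, and `float32_fields` are hypothetical, bits shifted out during alignment are kept exactly instead of being dropped, and rounding uses Python's round-half-to-even. The last helper reads the real single precision fields of a value, which confirms the biased exponent of 123 mentioned in Step-3.

```python
import struct
from fractions import Fraction


def normalize(value):
    """Return (significand, exponent) with 1 <= |significand| < 2."""
    sig, exp = Fraction(value), 0
    if sig == 0:
        return sig, 0
    while abs(sig) >= 2:
        sig /= 2
        exp += 1
    while abs(sig) < 1:
        sig *= 2
        exp -= 1
    return sig, exp


def fp_add(a, b, bits=4, emin=-126, emax=127):
    """Add a and b, keeping `bits` significand bits; returns (sig, exp)."""
    (sig_a, exp_a), (sig_b, exp_b) = normalize(a), normalize(b)
    # Step 1: shift the operand with the lesser exponent right.
    exp = max(exp_a, exp_b)
    sig_a /= 2 ** (exp - exp_a)
    sig_b /= 2 ** (exp - exp_b)
    # Step 2: add the significands.
    total = sig_a + sig_b
    # Step 3: normalize the sum.
    sig, shift = normalize(total)
    exp += shift
    # Step 4: round to `bits` bits; renormalize if rounding carried out.
    scale = 2 ** (bits - 1)
    sig = Fraction(round(sig * scale), scale)
    if abs(sig) >= 2:
        sig /= 2
        exp += 1
    # Exponent range check (overflow or underflow), done once at the end.
    if sig != 0 and not emin <= exp <= emax:
        raise OverflowError("exponent does not fit in its field")
    return sig, exp


def float32_fields(x):
    """Return (sign, biased exponent, fraction bits) of x as IEEE 754 single."""
    (word,) = struct.unpack(">I", struct.pack(">f", x))
    return word >> 31, (word >> 23) & 0xFF, word & 0x7FFFFF


sig, exp = fp_add(0.5, -0.4375)
print(sig, exp)                     # 1 -4
print(float(sig * Fraction(2) ** exp))   # 0.0625
print(float32_fields(0.0625))       # (0, 123, 0)
```

The same `float32_fields` helper can be used to check the earlier encoding examples: `float32_fields(-0.75)` gives a sign of 1 and a biased exponent of 126, and `float32_fields(-5.0)` gives a sign of 1 and a biased exponent of 129.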
There are three digits to the right of the decimal point for each operand, so the decimal point is placed six digits from the right in the product significand:
$10.212000_{ten}$
Assuming that we can keep only three digits to the right of the decimal point, the product is $10.212_{ten} \times 10^{5}$.
Step-3: -
This product is unnormalized, so we need to normalize it:
$10.212_{ten} \times 10^{5} = 1.0212_{ten} \times 10^{6}$
At this point, we can check for overflow and underflow.
Step-4: -
We assumed that the significand is only four digits long (excluding the sign), so we must round the number. The number $1.0212_{ten} \times 10^{6}$ is rounded to four digits in the significand to $1.021_{ten} \times 10^{6}$.
Step-5: -
The sign of the product depends on the signs of the original operands. If they are both the same, the sign is positive; otherwise, it is negative.
Hence, the product is $+1.021_{ten} \times 10^{6}$.
Flow chart for multiplication: - [figure not reproduced]
Binary floating point multiplication: -
Multiply $1.000_{two} \times 2^{-1}$ by $-1.110_{two} \times 2^{-2}$.
Step-1: -
Adding the exponents without bias:
$-1 + (-2) = -3$
Or, using the biased representation:
$(-1 + 127) + (-2 + 127) - 127 = (-1 - 2) + (127 + 127 - 127) = -3 + 127 = 124$
Step-2: -
Multiply the significands: $1.000_{two} \times 1.110_{two} = 1.110000_{two}$.
Step-3: -
The product is $1.110000_{two} \times 2^{-3}$, but we need to keep it to 4 bits, so it is $1.110_{two} \times 2^{-3}$.
Step-4: -
Now we check the product to make sure it is normalized, and then check the exponent for overflow or underflow. The product is already normalized and, since $127 \ge -3 \ge -126$, there is no overflow or underflow.
Step-5: -
Rounding the product makes no change: $1.110_{two} \times 2^{-3}$
Step-6: -
Since the signs of the original operands differ, make the sign of the product negative. Hence, the product is $-1.110_{two} \times 2^{-3}$.
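The multiplication steps above can be traced with a short Python sketch using exact fractions. It is a simplified model under stated assumptions: the name `fp_multiply` is hypothetical, inputs are already-normalized signed significands with unbiased exponents, rounding uses Python's round-half-to-even, and the exponent range is checked once at the end.

```python
from fractions import Fraction

BIAS = 127  # single precision exponent bias


def fp_multiply(sig_a, exp_a, sig_b, exp_b, bits=4, emin=-126, emax=127):
    """Multiply two values given as (significand, unbiased exponent) pairs."""
    # Add the exponents; in biased form one bias must be subtracted.
    biased = (exp_a + BIAS) + (exp_b + BIAS) - BIAS
    exp = biased - BIAS
    # Multiply the magnitudes of the significands.
    mag = abs(Fraction(sig_a)) * abs(Fraction(sig_b))
    # Normalize: a product of two values in [1, 2) lies in [1, 4).
    if mag >= 2:
        mag /= 2
        exp += 1
    # Round to `bits` bits; renormalize if rounding carried out.
    scale = 2 ** (bits - 1)
    mag = Fraction(round(mag * scale), scale)
    if mag >= 2:
        mag /= 2
        exp += 1
    # Check that the exponent still fits in its field.
    if mag != 0 and not emin <= exp <= emax:
        raise OverflowError("exponent does not fit in its field")
    # Set the sign: negative only if the operand signs differ.
    negative = (sig_a < 0) != (sig_b < 0)
    return (-mag if negative else mag), exp


# 1.000two x 2^-1 times -1.110two x 2^-2 (1.110two is 7/4)
sig, exp = fp_multiply(Fraction(1), -1, Fraction(-7, 4), -2)
print(sig, exp)                          # -7/4 -3
print(float(sig * Fraction(2) ** exp))   # -0.21875
```

The result $-7/4 \times 2^{-3}$ is $-1.110_{two} \times 2^{-3}$, which equals $0.5 \times (-0.4375) = -0.21875$ in decimal.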