
L12 Representation of Numbers

The document discusses different methods for representing integers and floating point numbers in computers. It describes sign-magnitude, one's complement, and two's complement representation of integers. For floating point numbers, it explains scientific notation representation using sign, significand and exponent fields, as well as the IEEE 754 standard.

Uploaded by

Rajdeep Bora

CO 214

Computer Architecture and Organization

Number Representation
Representation of Integers
➢ For purposes of storage and processing, a computer uses
strings of binary digits to represent all forms of
information, including numbers. No special symbols are
available for the minus sign and radix point.
• Only binary digits (0 and 1) may be used to represent
numbers, whether positive or negative, large or small.
➢ If we are limited to nonnegative integers
• an 8-bit word can represent the numbers from 0 to 255
➢ If we are to include negative numbers as well, there are
different conventions such as
• Sign-Magnitude representation
• 1’s complement representation
• 2’s complement representation
• Biased Representation
Representation of Integers
➢ Sign and Magnitude
• The simplest form of representation employs
the MSB as a sign bit
• If the sign bit is 0, the number is positive; if the
sign bit is 1, the number is negative.
• In an n-bit word, the rightmost (n – 1) bits hold
the magnitude of the integer.
• Example (8-bit word):
+18 = 00010010
–18 = 10010010
• Disadvantages:
o Addition and subtraction become complex
o Two representations of 0
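The sign-and-magnitude rules above can be sketched in a few lines of Python. The helper names `to_sign_magnitude` and `from_sign_magnitude` are illustrative only and do not come from the slides; an 8-bit word is assumed by default.

```python
def to_sign_magnitude(value: int, bits: int = 8) -> str:
    """Encode value as a sign-magnitude bit string (MSB = sign bit)."""
    magnitude = abs(value)
    if magnitude >= 1 << (bits - 1):
        raise OverflowError(f"{value} does not fit in {bits} bits")
    sign = "1" if value < 0 else "0"
    # The remaining (bits - 1) positions hold the magnitude.
    return sign + format(magnitude, f"0{bits - 1}b")


def from_sign_magnitude(pattern: str) -> int:
    """Decode a sign-magnitude bit string back to an integer."""
    magnitude = int(pattern[1:], 2)
    return -magnitude if pattern[0] == "1" else magnitude


print(to_sign_magnitude(18))    # 00010010
print(to_sign_magnitude(-18))   # 10010010
# Both patterns below decode to zero, which is the "two zeros" disadvantage.
print(from_sign_magnitude("00000000"), from_sign_magnitude("10000000"))
```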
Representation of Integers
➢ 1’s complement
▪ 1’s complement representation also uses the MSB as
a sign bit – making it easy to test whether an integer
is positive or negative
▪ Negative of a number is obtained by inverting all of
the bits of the positive binary value
▪ Range of values: –(2^(n – 1) – 1) to +(2^(n – 1) – 1)
▪ Example:
+18 = 00010010
– 18 = 11101101
▪ Addition and subtraction use ordinary binary addition, but
a carry out of the MSB must be added back into the LSB
(end-around carry)
▪ Disadvantage:
▪ Two representations for zero: 00000000 is +0 and
11111111 is –0
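A minimal sketch of 8-bit 1's complement encoding and addition follows. The function names are hypothetical, and the end-around carry step (adding a carry out of the MSB back into the LSB) is the standard rule for 1's complement addition rather than something spelled out on the slide.

```python
BITS = 8
MASK = (1 << BITS) - 1  # 0xFF for an 8-bit word


def ones_encode(value: int) -> int:
    """Return the 1's complement bit pattern of value as an unsigned int."""
    return value & MASK if value >= 0 else ~(-value) & MASK


def ones_decode(pattern: int) -> int:
    """Interpret an unsigned bit pattern as a 1's complement integer."""
    if pattern < (1 << (BITS - 1)):
        return pattern
    return -(~pattern & MASK)


def ones_add(a: int, b: int) -> int:
    """Add two 1's complement patterns with end-around carry."""
    total = a + b
    if total > MASK:                 # carry out of the MSB
        total = (total & MASK) + 1   # add it back into the LSB
    return total & MASK


print(format(ones_encode(-18), "08b"))                      # 11101101
print(ones_decode(ones_add(ones_encode(-5), ones_encode(7))))   # 2
# +18 + (-18) produces the pattern 11111111, i.e. negative zero.
print(format(ones_add(ones_encode(18), ones_encode(-18)), "08b"))
```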
Representation of Integers
➢ 2’s complement
▪ 2’s complement representation also uses the
MSB as a sign bit – making it easy to test whether
an integer is positive or negative
▪ Negative of a number is obtained by inverting all
of the bits of the positive binary value and adding
a 1 to the result
▪ Range of values: –(2^(n – 1)) to +(2^(n – 1) – 1)
▪ Example:
+18 = 00010010
–18 = 11101110
▪ Addition and subtraction are both performed as simple
addition of two binary numbers
▪ One representation for zero: 00000000
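As an illustration of the range and the single zero, here is a small sketch that converts between Python integers and 2's complement bit strings. The names `to_twos` and `from_twos` are made up for this example.

```python
def to_twos(value: int, bits: int = 8) -> str:
    """Encode value as a 2's complement bit string of the given width."""
    lowest, highest = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if not lowest <= value <= highest:
        raise OverflowError(f"{value} is outside [{lowest}, {highest}]")
    # Masking a negative Python int yields its 2's complement pattern.
    return format(value & ((1 << bits) - 1), f"0{bits}b")


def from_twos(pattern: str) -> int:
    """Decode a 2's complement bit string; the MSB carries weight -2^(n-1)."""
    raw = int(pattern, 2)
    return raw - (1 << len(pattern)) if pattern[0] == "1" else raw


print(to_twos(18))     # 00010010
print(to_twos(-18))    # 11101110
print(to_twos(-128))   # 10000000 (the extra negative value)
print(from_twos("11101110"))  # -18
```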
Representation of Integers
▪ Most computers use 2’s complement to represent integers
▪ Other signed integer representations may be used:
• Biased representation: add a fixed bias value to the number
and store the result as an unsigned binary value
▪ Fixed Point Representation
• The above-mentioned representations are called fixed point
representations because the radix point is assumed to
have a fixed position, to the right of the rightmost digit (LSB)
Integer Arithmetic – 2’s Complement
Negation
Take the Boolean complement of each bit of the integer, then add 1
to the result

Negative of negative is positive
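The negation rule can be tried out directly on bit patterns. This is a sketch for an 8-bit word; `negate` is an illustrative name, and the two special cases shown (0 and –128) follow from the rule itself.

```python
def negate(pattern: int, bits: int = 8) -> int:
    """2's complement negation: invert every bit, add 1, keep `bits` bits."""
    mask = (1 << bits) - 1
    return ((pattern ^ mask) + 1) & mask


plus_18 = 0b00010010
minus_18 = negate(plus_18)
print(format(minus_18, "08b"))           # 11101110
print(format(negate(minus_18), "08b"))   # 00010010, negating twice restores +18

# Special cases: 0 negates to 0 (the carry out is dropped), and
# -128 (10000000) negates to itself because +128 does not fit in 8 bits.
print(format(negate(0b00000000), "08b"))  # 00000000
print(format(negate(0b10000000), "08b"))  # 10000000
```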


Integer Arithmetic – 2’s Complement
Addition and Subtraction
▪ Addition proceeds as simple addition of two unsigned integers.
▪ In some instances there is a carry bit beyond the end of the word; it is ignored.
▪ If the result is larger than can be held in the word size being used, it is called overflow.
▪ When overflow occurs, the ALU must signal so that the result is not used for processing.
• OVERFLOW RULE: If two numbers are added, and they are both positive or both negative,
then overflow occurs if and only if the result has the opposite sign.
▪ Subtraction is achieved using addition.
• SUBTRACTION RULE: To subtract one number (subtrahend) from another (minuend), take
the twos complement (negation) of the subtrahend and add it to the minuend.
Example 1: 114 + (–58) = 56
(114)₁₀ = (01110010)₂
(58)₁₀ = (00111010)₂
(–58)₁₀ = (11000110)₂
(114)₁₀ + (–58)₁₀ = (01110010)₂ + (11000110)₂ = (00111000)₂ = (56)₁₀ (carry out ignored)
Example 2: (–114) + 58 = –56
(–114)₁₀ = (10001110)₂
(–114)₁₀ + (58)₁₀ = (10001110)₂ + (00111010)₂ = (11001000)₂ = (–56)₁₀
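The overflow and subtraction rules can be sketched for an 8-bit word as follows. The function names are illustrative. One assumption is labeled in the code: the "+1" of the negation is supplied as a carry-in to the adder instead of being added to the subtrahend beforehand, which gives the same sum and keeps the overflow flag correct even when the subtrahend is –128.

```python
MASK = 0xFF   # 8-bit word
SIGN = 0x80   # position of the sign bit


def add(a: int, b: int, carry_in: int = 0):
    """Add two 8-bit patterns; return (result, overflow_flag)."""
    result = (a + b + carry_in) & MASK   # any carry out of bit 7 is ignored
    same_sign = (a & SIGN) == (b & SIGN)
    # OVERFLOW RULE: operands share a sign but the result has the other one.
    overflow = same_sign and (result & SIGN) != (a & SIGN)
    return result, overflow


def subtract(a: int, b: int):
    """a - b as a + (inverted b) + 1; the +1 enters as a carry-in (assumption)."""
    return add(a, ~b & MASK, carry_in=1)


print(add(0b01110010, 0b11000110))   # (56, False): 114 + (-58)
print(add(0b10001110, 0b00111010))   # (200, False): pattern 11001000 = -56
print(add(100, 100))                 # (200, True): 100 + 100 overflows
print(subtract(114, 58))             # (56, False)
```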
Integer Arithmetic – 2’s Complement
Multiplication
▪ Multiplication of positive numbers is straightforward
• Multiplying two 4-bit numbers gives a result of up to 8 bits
▪ For negative numbers,
▪ simple multiplication of the 2’s complement bit patterns will not give
the correct result (e.g. in 4 bits, –5 * –3 is 1011 * 1101 = 10001111,
which reads as –113 instead of +15)
▪ If the multiplicand is –ve, each partial product must be sign-extended,
i.e. its leftmost positions are filled with binary 1s.
▪ If the multiplier is –ve, compute the product using the magnitude of
the multiplier and then take the 2’s complement of the result.
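The –113 result can be reproduced by multiplying the 4-bit patterns as if they were unsigned. The sketch below also shows one simple way to get the correct product; note that it takes the magnitudes of both operands and fixes the sign afterwards, which is a simplification of the slide's method, and all names are illustrative.

```python
def to_pattern(value: int, bits: int) -> int:
    """2's complement bit pattern of value, as an unsigned int."""
    return value & ((1 << bits) - 1)


def to_signed(pattern: int, bits: int) -> int:
    """Interpret an unsigned bit pattern as a 2's complement integer."""
    return pattern - (1 << bits) if pattern & (1 << (bits - 1)) else pattern


a = to_pattern(-5, 4)   # 1011
b = to_pattern(-3, 4)   # 1101
naive = a * b           # 11 * 13 = 143 = 10001111
print(format(naive, "08b"), to_signed(naive, 8))   # 10001111 -113


def signed_multiply(a: int, b: int, bits: int = 4) -> int:
    """Multiply magnitudes, then negate the product if the signs differ."""
    x, y = to_signed(a, bits), to_signed(b, bits)
    product = abs(x) * abs(y)
    if (x < 0) != (y < 0):
        product = -product
    return to_pattern(product, 2 * bits)   # result occupies 2n bits


print(to_signed(signed_multiply(a, b), 8))   # 15
```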
Integer Arithmetic – 2’s Complement
➢ Division
▪ Division of unsigned binary integers is straightforward
▪ 2’s complement division for negative numbers can be done
by converting the operands into unsigned values, performing the
division, and then accounting for the signs of the quotient
and remainder
• D = Q * V + R, where D is the dividend, V the divisor,
Q the quotient and R the remainder
• sign(R) = sign(D)
• sign(Q) = sign(D) * sign(V)
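A short sketch of the sign rules, dividing the magnitudes and then fixing the signs. `signed_divide` is an illustrative name. Python's own `//` and `%` round toward negative infinity, so they would not satisfy sign(R) = sign(D); that is why the magnitudes are divided first.

```python
def signed_divide(d: int, v: int):
    """Return (Q, R) with D = Q*V + R, sign(R) = sign(D), sign(Q) = sign(D)*sign(V)."""
    if v == 0:
        raise ZeroDivisionError("division by zero")
    q, r = divmod(abs(d), abs(v))   # unsigned division of the magnitudes
    if (d < 0) != (v < 0):          # signs differ: quotient is negative
        q = -q
    if d < 0:                       # remainder takes the sign of the dividend
        r = -r
    return q, r


for d, v in [(7, 3), (-7, 3), (7, -3), (-7, -3)]:
    q, r = signed_divide(d, v)
    print(f"{d} / {v}: Q = {q}, R = {r}, check: {q * v + r}")
```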
Questions?
Floating Point Representation
▪ By assuming a fixed binary or radix point, numbers with a
fractional component may be represented - but limited in the
range of numbers
▪ Solution: Use scientific notation (floating point)
• 976,000,000,000,000 = 9.76 * 10^14, and
• 0.0000000000000976 = 9.76 * 10^(-14)
▪ Any number can be represented in the form ±S * B^(±E)
• Sign: + or –; Significand S; Exponent E
• The base B is implicit and is the same for all the quantities
Floating Point Representation
➢ Sign
▪ 0 for positive; 1 for negative numbers
➢ Exponent
▪ Exponent is stored using a biased representation
▪ A fixed value, called the bias, is subtracted from the field to get
the true exponent value
▪ Typically, bias is (2^(k – 1) – 1), where k is the number of bits in
the binary exponent
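▪ For an 8-bit field the range of unsigned numbers is 0 to 255
Bias = 2^7 – 1 = 127,
so actual exponent values are in the range -127 to +128
▪ Advantage of biased representation: comparison is easier, since a
larger stored field always means a larger exponent, and nonnegative
floating-point numbers can be compared as if they were unsigned integers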
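The bias arithmetic can be checked with a small sketch. The function names are illustrative; `k` is the width of the exponent field as on the slide.

```python
def bias_for(k: int) -> int:
    """Typical bias for a k-bit exponent field: 2^(k-1) - 1."""
    return (1 << (k - 1)) - 1


def encode_exponent(true_exponent: int, k: int = 8) -> int:
    """Stored field = true exponent + bias; must fit in k bits."""
    field = true_exponent + bias_for(k)
    if not 0 <= field < (1 << k):
        raise OverflowError(f"exponent {true_exponent} does not fit in {k} bits")
    return field


def decode_exponent(field: int, k: int = 8) -> int:
    """True exponent = stored field - bias."""
    return field - bias_for(k)


print(bias_for(8))                               # 127
print(decode_exponent(0), decode_exponent(255))  # -127 128
print(encode_exponent(4))                        # 131
# Ordering is preserved: a larger true exponent gives a larger stored field.
print(encode_exponent(-3) < encode_exponent(2))  # True
```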
Floating Point Representation
Significand
▪ A floating point number can have many representations, e.g.
binary 10100 can be represented as
0.101 x 2^5 or 101 x 2^2 or
0.0101 x 2^6 or 1.01 x 2^4
▪ To simplify operations, a standard representation is needed:
normalization
▪ For a base 2 representation, a normalized number is one in
which the MSB of the significand is 1, i.e. it has the form
±1.bbb...b x 2^(±E)
▪ In such normalized numbers the MSB is implicit and is not stored
▪ A 23-bit field is therefore used to store a 24-bit significand
▪ The representation as presented will not accommodate a value
of 0
▪ A special bit pattern of all 0s is reserved to represent it
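Normalization to the form 1.bbb x 2^E can be illustrated with Python's `math.frexp`, which returns a fraction in [0.5, 1) and an exponent; doubling the fraction and decrementing the exponent gives the leading-1 form used here. `normalize` is an illustrative helper and handles positive inputs only.

```python
import math


def normalize(x: float):
    """Return (significand, exponent) with x = significand * 2**exponent
    and 1 <= significand < 2."""
    if x <= 0:
        raise ValueError("this sketch handles positive values only")
    m, e = math.frexp(x)    # x = m * 2**e with 0.5 <= m < 1
    return m * 2, e - 1


# Binary 10100 is decimal 20; its normalized form is 1.01 x 2^4,
# and binary 1.01 is decimal 1.25.
print(normalize(20.0))   # (1.25, 4)
```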
Floating Point Representation

Overflow – when an arithmetic operation results in an absolute value greater
than can be expressed with an exponent of 128
Underflow – when the fractional magnitude is too small – less serious problem
because result can generally be approximated by 0
Floating Point Representation

➢ Trade offs
▪ Numbers represented in floating-point notation are not
spaced evenly along the number line, unlike fixed-point
numbers
▪ Range determines the largest and smallest magnitudes
that can be represented
▪ Precision determines how finely spaced the representable
values are, i.e. the smallest difference between two numbers
that can be distinguished
▪ Trade-off between range and precision for a fixed word length:
• more bits in significand => more precision
• more bits in exponent => greater range
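The uneven spacing can be observed on Python floats, which are IEEE 754 double-precision values on common platforms. `math.ulp` (available from Python 3.9) returns the gap between a float and the next larger representable one.

```python
import math

# The gap between neighbouring representable values grows with magnitude.
for x in (1.0, 1024.0, 1e16):
    print(x, math.ulp(x))

# Near 1e16 adjacent doubles are 2.0 apart, so adding 1 is lost entirely.
print(1e16 + 1 == 1e16)   # True
```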
Floating Point – IEEE Standard

▪ IEEE standard 754 for floating point numbers was
developed to facilitate portability of programs from one
processor to another
▪ 3 basic binary formats are defined, with lengths of 32, 64, and
128 bits
• Exponents of length 8, 11, and 15 bits respectively
▪ The standard also defines extended precision formats, which
extend a supported basic format by providing additional
bits in the exponent (extended range) and in the
significand (extended precision)
• The standard specifies only minimum requirements for these;
the exact format is implementation dependent
Floating Point – IEEE 754 Standard
Floating Point Arithmetic
▪ For addition and subtraction – both operands must have the
same exponent value
• This may require shifting the radix point on one of the operands
to achieve alignment
▪ Multiplication and division are more straightforward.
Floating Point Arithmetic
▪ A floating-point operation may produce one of these
conditions:
• Exponent overflow: A positive exponent exceeds the
maximum possible exponent value; may be designated as
+∞ or –∞.
• Exponent underflow: A negative exponent is less than the
minimum possible exponent value; it may be reported as 0.
• Significand underflow: While aligning significands, digits
may flow off the right end of the significand; requires
rounding
• Significand overflow: Addition of two significands of the
same sign may result in a carry out of the most significant
bit; requires realignment
Floating Point Arithmetic
▪ In floating-point arithmetic, addition and
subtraction are more complex than multiplication
and division
• There is need for alignment
▪ There are four basic phases of the algorithm for
addition and subtraction
1. Check for zeros
2. Align the significands
3. Add or subtract the significands
4. Normalize the result.
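The four phases can be sketched on a toy format. Everything here is an assumption made for illustration: numbers are `(sign, exponent, significand)` tuples with a 24-bit integer significand whose top bit is the leading 1, bits shifted out during alignment are simply truncated, and exponent overflow and underflow are not checked.

```python
P = 24  # significand width including the leading 1 (assumed)


def to_float(n):
    """Value of a toy number: (-1)^sign * significand * 2^(exponent - (P-1))."""
    sign, exponent, significand = n
    return (-1) ** sign * significand * 2.0 ** (exponent - (P - 1))


def fp_add(a, b):
    # Phase 1: check for zeros.
    if a[2] == 0:
        return b
    if b[2] == 0:
        return a
    # Phase 2: align the significands; the smaller exponent is shifted right.
    (sa, ea, ma), (sb, eb, mb) = a, b
    if ea < eb:
        (sa, ea, ma), (sb, eb, mb) = (sb, eb, mb), (sa, ea, ma)
    mb >>= ea - eb                      # low-order bits fall off (truncation)
    # Phase 3: add the signed significands.
    total = (-ma if sa else ma) + (-mb if sb else mb)
    if total == 0:
        return (0, 0, 0)
    sign, mag, exp = (1 if total < 0 else 0), abs(total), ea
    # Phase 4: normalize the result.
    while mag >= (1 << P):              # significand overflow: shift right
        mag >>= 1
        exp += 1
    while mag < (1 << (P - 1)):         # leading zeros: shift left
        mag <<= 1
        exp -= 1
    return (sign, exp, mag)


x = (0, 0, 0b11 << (P - 2))    # 1.1 (binary) x 2^0  = 1.5
y = (0, -1, 0b11 << (P - 2))   # 1.1 (binary) x 2^-1 = 0.75
print(to_float(fp_add(x, y)))             # 2.25
print(to_float(fp_add(x, (1, -1, y[2])))) # 0.75, i.e. 1.5 + (-0.75)
```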
Floating Point
Guard Bits
▪ For a floating-point operation, the exponent and significand of the
operands are loaded into ALU registers provided for them
▪ In case of the significand, the length of the register is more than
the length of the significand plus an implied bit
▪ The register contains additional bits, called guard bits, which are
provided to hold the LSBs of the significand in case of shifting of
the bits during alignment with the other operand.
▪ This is required to maintain precision

For example, computing x-y: x = 1.00...00 * 2^1 and y = 1.11...11 * 2^0
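The effect of guard bits on this example can be reproduced with integer significands. The sketch assumes 24-bit significands (a leading 1 plus 23 fraction bits) and an illustrative helper `subtract_aligned`; the exact answer is x – y = 2^-23.

```python
P = 24  # significand width including the leading 1 (assumed)


def subtract_aligned(x_sig, x_exp, y_sig, y_exp, guard_bits):
    """Compute x - y for x > y > 0 with x_exp >= y_exp, keeping
    `guard_bits` extra low-order bits while y is shifted for alignment."""
    shift = x_exp - y_exp
    xs = x_sig << guard_bits
    ys = (y_sig << guard_bits) >> shift   # bits shifted past the guard bits are lost
    diff = xs - ys
    return diff * 2.0 ** (x_exp - (P - 1) - guard_bits)


x = (1 << (P - 1), 1)     # 1.00...00 x 2^1
y = ((1 << P) - 1, 0)     # 1.11...11 x 2^0

without_guard = subtract_aligned(*x, *y, guard_bits=0)
with_guard = subtract_aligned(*x, *y, guard_bits=4)
print(without_guard == 2.0 ** -22)   # True: twice the exact answer
print(with_guard == 2.0 ** -23)      # True: the exact answer
```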


Floating Point
➢ Rounding
▪ The result of any operation on the significands is generally
stored in a longer register
▪ When the result is put back into the floating-point format,
the extra bits must be eliminated in such a way as to
produce a result that is close to the exact result – rounding.
▪ Round to nearest: The result is rounded to the nearest
representable number
▪ Round toward +∞ : The result is rounded up toward plus
infinity.
▪ Round toward –∞: The result is rounded down toward
negative infinity
▪ Round toward 0: The result is rounded toward zero.
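The four rounding directions can be compared using Python's `decimal` module. This is only an analogy: it rounds decimal digits rather than binary significand bits, and it assumes that "round to nearest" resolves ties to the even neighbour, which is the IEEE 754 default. The helper `round_all` is illustrative.

```python
from decimal import (Decimal, ROUND_HALF_EVEN, ROUND_CEILING,
                     ROUND_FLOOR, ROUND_DOWN)

MODES = {
    "nearest": ROUND_HALF_EVEN,   # round to nearest, ties to even
    "toward +inf": ROUND_CEILING,
    "toward -inf": ROUND_FLOOR,
    "toward 0": ROUND_DOWN,
}


def round_all(text: str, places: str = "0.1") -> dict:
    """Round the decimal literal `text` to one fractional digit in every mode."""
    value = Decimal(text)
    return {name: str(value.quantize(Decimal(places), rounding=mode))
            for name, mode in MODES.items()}


print(round_all("1.25"))
print(round_all("-1.25"))
```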
IEEE 754
Special cases:
➢ Denormalized numbers
▪ When exponent is all 0’s, it is considered a special case and
the floating point number thus represented is assumed to
be a de-normalized number – i.e. the implicit integral part of
the significand is assumed to be 0 (not 1)
• To represent 0 (zero): Exponent all 0’s, significand all 0’s

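➢ NaN (not a number)
▪ To signal the result of invalid operations, e.g. 0/0, ∞ – ∞
or the square root of a negative number
• Exponent of all 1’s and non-zero significand
• Exponent of all 1’s with an all-0 significand instead represents ±∞
(e.g. on overflow or division of a non-zero number by 0)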
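The special cases can be inspected by unpacking the bit fields of a 32-bit IEEE 754 value with the standard `struct` module. `decode_float32` is an illustrative helper; the classification follows the IEEE 754 encoding, in which an all-0 exponent field marks zero or a denormalized number and an all-1 exponent field marks infinity or NaN.

```python
import struct


def decode_float32(x: float):
    """Return (sign, exponent_field, fraction_field, kind) for x as binary32."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31
    exponent = (bits >> 23) & 0xFF    # 8-bit biased exponent
    fraction = bits & 0x7FFFFF        # 23-bit stored significand
    if exponent == 0:
        kind = "zero" if fraction == 0 else "denormalized"
    elif exponent == 0xFF:
        kind = "infinity" if fraction == 0 else "NaN"
    else:
        kind = "normal"
    return sign, exponent, fraction, kind


print(decode_float32(1.0))            # (0, 127, 0, 'normal')
print(decode_float32(-0.0))           # (1, 0, 0, 'zero')
print(decode_float32(2.0 ** -149))    # (0, 0, 1, 'denormalized')
print(decode_float32(float("inf")))   # (0, 255, 0, 'infinity')
print(decode_float32(float("nan"))[3])  # NaN
```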
Floating Point – IEEE 754 Standard
Questions?
Thank You
