Module 1: Data Representation
DATA REPRESENTATION
1. Unsigned Numbers:
Unsigned numbers carry no sign; they encode only the magnitude of a number, so every unsigned binary number represents a non-negative value. This mirrors ordinary decimal usage, where a number written without a sign is assumed to be positive by default.
2. Signed Numbers:
Signed numbers include a sign, so the representation distinguishes positive from negative values: it encodes both a sign bit and the magnitude of the number. This is analogous to decimal notation, where a negative number is written with a minus symbol in front of it.
Representation of Signed Binary Numbers:
There are three representations for signed binary numbers: sign-magnitude form, 1's complement form, and 2's complement form, explained below. Because of the extra sign bit, zero has two representations in the first two forms, one with a positive sign (0) and one with a negative sign (1), which makes them ambiguous. The 2's complement representation is unambiguous because zero has a single representation.
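A minimal C sketch can make the double zero visible; the 4-bit width and the helper names here are my own choices for illustration, not from the text:

#include <stdio.h>
#include <stdint.h>

/* Build 4-bit patterns from an explicit (negative?, magnitude) pair so that
 * "-0" can be expressed. Helper names are illustrative, not from the text. */
static uint8_t sign_magnitude4(int neg, unsigned mag)  { return (uint8_t)((neg ? 0x8u : 0u) | (mag & 0x7u)); }
static uint8_t ones_complement4(int neg, unsigned mag) { return (uint8_t)((neg ? ~mag : mag) & 0xFu); }
static uint8_t twos_complement4(int neg, unsigned mag) { return (uint8_t)((neg ? ~mag + 1u : mag) & 0xFu); }

static void print4(uint8_t x) {
    for (int i = 3; i >= 0; --i) putchar(((x >> i) & 1u) ? '1' : '0');
    putchar('\n');
}

int main(void) {
    /* Zero has two encodings in the first two schemes, one in 2's complement. */
    printf("sign-magnitude  +0: "); print4(sign_magnitude4(0, 0));  /* 0000 */
    printf("sign-magnitude  -0: "); print4(sign_magnitude4(1, 0));  /* 1000 */
    printf("1's complement  +0: "); print4(ones_complement4(0, 0)); /* 0000 */
    printf("1's complement  -0: "); print4(ones_complement4(1, 0)); /* 1111 */
    printf("2's complement  +0: "); print4(twos_complement4(0, 0)); /* 0000 */
    printf("2's complement  -0: "); print4(twos_complement4(1, 0)); /* 0000 */
    return 0;
}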
Sign Magnitude
Sign magnitude is a very simple representation of negative numbers. In sign magnitude the first bit is dedicated to representing the sign, and hence is called the sign bit.
Sign bit '1' represents a negative sign.
Sign bit '0' represents a positive sign.
In the sign-magnitude representation of an n-bit number, the first bit represents the sign and the remaining n-1 bits represent the magnitude of the number.
For example,
+25 = 011001
where 11001 = 25 and 0 stands for '+'.
-25 = 111001
where 11001 = 25 and 1 stands for '-'.
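The +/-25 encodings can be reproduced with a short C sketch; the 6-bit helper below is illustrative, not from the text:

#include <stdio.h>
#include <stdint.h>

/* 6-bit sign-magnitude: bit 5 is the sign, bits 4..0 the magnitude. */
static uint8_t sign_magnitude6(int v) {
    uint8_t mag = (uint8_t)(v < 0 ? -v : v) & 0x1Fu;   /* |v| must fit in 5 bits */
    return (uint8_t)(((v < 0) << 5) | mag);
}

int main(void) {
    int vals[2] = { 25, -25 };
    for (int k = 0; k < 2; ++k) {
        uint8_t p = sign_magnitude6(vals[k]);
        printf("%+3d = ", vals[k]);
        for (int i = 5; i >= 0; --i) putchar(((p >> i) & 1u) ? '1' : '0');
        putchar('\n');                                 /* prints 011001 and 111001 */
    }
    return 0;
}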
Fixed-Point Representation:
Before floating point, consider a 32-bit fixed-point format with 1 sign bit, 15 integer bits and 16 fraction bits. The smallest positive number this format can store is 2^-16 ≈ 0.000015, the largest positive number is (2^15 - 1) + (1 - 2^-16) = 2^15 - 2^-16 ≈ 32768, and the gap between any two consecutive numbers is a constant 2^-16.
Floating-Point Representation:
Floating point removes this fixed gap: by adjusting an exponent we can move the radix point either left or right, keeping the integer field equal to 1.
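The constant gap can be demonstrated with a 32-bit fixed-point type; the Q15.16 layout below is an assumption consistent with the 2^-16 and 2^15 figures above:

#include <stdio.h>
#include <stdint.h>

/* Interpret a 32-bit word as sign + 15 integer bits + 16 fraction bits.
 * The layout is assumed from the figures in the text. */
typedef int32_t q15_16;

static double q_to_double(q15_16 x) { return x / 65536.0; }   /* divide by 2^16 */

int main(void) {
    q15_16 smallest = 1;            /* 2^-16 ~= 0.000015           */
    q15_16 largest  = 0x7FFFFFFF;   /* 2^15 - 2^-16 ~= 32768       */
    printf("smallest positive: %.9f\n", q_to_double(smallest));
    printf("largest positive:  %.9f\n", q_to_double(largest));
    printf("gap (constant):    %.9f\n", q_to_double(1));
    return 0;
}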
As mentioned in Table 1, the single precision format has 23 bits for the significand (plus 1 implied bit, detailed below), 8 bits for the exponent and 1 bit for the sign.
For example, the rational number 9÷2 can be converted to single precision float format as follows:
9(10) ÷ 2(10) = 4.5(10) = 100.1(2)
The result is said to be normalized if it is represented with a leading 1 bit, i.e. 1.001(2) x 2^2. (Similarly, when the number 0.000000001101(2) x 2^3 is normalized, it appears as 1.101(2) x 2^-6.) Omitting this implied 1 on the left extreme gives us the mantissa of the float number. A normalized number provides more accuracy than the corresponding de-normalized number. Since the implied most significant bit need not be stored, it can be used to represent an even more accurate significand (23 + 1 = 24 bits); this is called the hidden (implied) bit representation. Floating point numbers are to be represented in normalized form.
The subnormal numbers fall into the category of de-normalized numbers. The subnormal representation slightly reduces the exponent range and cannot be normalized, since that would result in an exponent which doesn't fit in the field. Subnormal numbers are less accurate than normalized numbers, i.e. they have less room for nonzero bits in the fraction field; indeed, the accuracy drops as the size of the subnormal number decreases. However, the subnormal representation is useful for filling the gaps of the floating point scale near zero.
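This gap-filling can be observed in C with the standard FLT_MIN constant and nextafterf, which steps from 0 to the smallest subnormal:

#include <stdio.h>
#include <math.h>
#include <float.h>

int main(void) {
    float smallest_normal    = FLT_MIN;                 /* 2^-126 ~= 1.18e-38 */
    float smallest_subnormal = nextafterf(0.0f, 1.0f);  /* 2^-149 ~= 1.4e-45  */
    printf("smallest normalized: %g\n", smallest_normal);
    printf("smallest subnormal:  %g\n", smallest_subnormal);
    /* Subnormals fill the gap between 0 and FLT_MIN at reduced precision. */
    return 0;
}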
In other words, the above result can be written as (-1)^0 x 1.001(2) x 2^2, which yields the components s = 0, b = 2, significand (m) = 1.001, mantissa = 001 and e = 2. The corresponding single precision floating point number can be represented in binary as shown below:
0 10000001 00100000000000000000000
(sign = 0, biased exponent = 2 + 127 = 129 = 10000001, and the 23-bit mantissa is 001 padded with zeros).
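This layout can be verified in C by copying the float's bytes into an integer and splitting the fields; the sketch assumes the platform stores floats in IEEE 754 single precision, as virtually all do:

#include <stdio.h>
#include <stdint.h>
#include <string.h>

int main(void) {
    float f = 4.5f;                       /* = 1.001(2) x 2^2 */
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);       /* reinterpret the float's bytes */

    uint32_t sign     = bits >> 31;
    uint32_t exponent = (bits >> 23) & 0xFFu;    /* biased exponent */
    uint32_t mantissa = bits & 0x7FFFFFu;        /* 23 fraction bits */

    printf("sign = %u, biased exponent = %u (e = %d), mantissa = 0x%06X\n",
           (unsigned)sign, (unsigned)exponent, (int)exponent - 127,
           (unsigned)mantissa);
    /* Expected: sign = 0, biased exponent = 129 (e = 2), mantissa = 0x100000 */
    return 0;
}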
As mentioned in Table 1, the double precision format has 52 bits for the significand (plus 1 implied bit), 11 bits for the exponent and 1 bit for the sign. All other definitions are the same as for single precision, except for the sizes of the various components.
Precision:
The smallest change that can be represented in a floating point representation is called its precision. The fractional part of a single precision normalized number has exactly 23 bits of resolution (24 bits with the implied bit). This corresponds to log10(2^23) = 6.924 ≈ 7 decimal digits of accuracy. Similarly, in the case of double precision numbers the precision is log10(2^52) = 15.654 ≈ 16 decimal digits.
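The roughly 7-digit limit shows up as soon as a value needs more than 24 significant bits; for example, 2^24 + 1 cannot be stored exactly in a float:

#include <stdio.h>

int main(void) {
    float f  = 16777217.0f;   /* 2^24 + 1 needs 25 significant bits */
    double d = 16777217.0;    /* double has 52+1 bits, so it is exact */
    printf("float : %.1f\n", f);   /* prints 16777216.0: last digit lost */
    printf("double: %.1f\n", d);   /* prints 16777217.0 */
    return 0;
}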
Converting 32-bit floating point to decimal:
Sign bit is the first bit of the binary representation: '1' implies a negative number and '0' implies a positive number.
Example: 11000001110100000000000000000000 is a negative number.
Exponent is decided by the next 8 bits of the binary representation. 127 is the unique number for 32-bit floating point representation, known as the bias. It is determined by 2^(k-1) - 1, where k is the number of bits in the exponent field.
There are 3 exponent bits in an 8-bit representation and 8 exponent bits in a 32-bit representation.
Thus
bias = 3 for the 8-bit format (2^(3-1) - 1 = 4 - 1 = 3)
bias = 127 for the 32-bit format (2^(8-1) - 1 = 128 - 1 = 127)
Example: 01000001110100000000000000000000
The exponent bits are 10000011 = (131)10, and 131 - 127 = 4.
Hence the exponent factor is 2^4 = 16.
Mantissa is calculated from the remaining 23 bits of the binary representation. It consists of an implied 1 plus a fractional part in which bit i (counting from the left, starting at 1) contributes 1/2^i.
Example:
01000001110100000000000000000000
The fractional part of the mantissa is given by:
1 x (1/2) + 0 x (1/4) + 1 x (1/8) + 0 x (1/16) + ... = 0.625
Thus the mantissa will be 1 + 0.625 = 1.625.
The decimal number is hence given by sign x exponent x mantissa = (-1)^0 x 16 x 1.625 = 26.
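The same decoding can be automated. The bit pattern above is 0x41D00000; the sketch below walks the fraction bits exactly as done by hand (assuming IEEE 754 floats):

#include <stdio.h>
#include <stdint.h>
#include <math.h>

int main(void) {
    uint32_t bits = 0x41D00000u;   /* 0 10000011 10100000000000000000000 */

    int sign = (int)(bits >> 31);
    int e    = (int)((bits >> 23) & 0xFFu) - 127;    /* 131 - 127 = 4 */

    double mantissa = 1.0;                           /* implied leading 1 */
    for (int i = 0; i < 23; ++i)
        if ((bits >> (22 - i)) & 1u)
            mantissa += ldexp(1.0, -(i + 1));        /* bit i adds 2^-(i+1) */

    double value = (sign ? -1.0 : 1.0) * ldexp(mantissa, e);
    printf("mantissa = %g, exponent = %d, value = %g\n", mantissa, e, value);
    /* prints: mantissa = 1.625, exponent = 4, value = 26 */
    return 0;
}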
Converting a decimal number to 32-bit floating point:
Example: To convert -17 into 32-bit floating point representation:
Sign bit = 1, since the number is negative.
Exponent is decided by the nearest power of 2 smaller than or equal to the magnitude. For 17, the nearest such power is 16, so the exponent of 2 will be 4, since 2^4 = 16. As before, the bias for 32-bit floating point representation is 127, determined by 2^(k-1) - 1 where k is the number of bits in the exponent field.
Thus bias = 127 for 32 bits (2^(8-1) - 1 = 128 - 1 = 127).
Now, 127 + 4 = 131, i.e. 10000011 in binary representation.
Mantissa: 17 in binary = 10001. Move the binary point so that there is only one bit to the left of it, and adjust the exponent of 2 so that the value does not change; this normalizes the number: 1.0001 x 2^4. Now take the fractional part and pad it to 23 bits with zeros:
00010000000000000000000
Putting the three fields together gives:
1 10000011 00010000000000000000000
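This pattern equals 0xC1880000, which a short C check confirms (again assuming IEEE 754 single precision floats):

#include <stdio.h>
#include <stdint.h>
#include <string.h>

int main(void) {
    float f = -17.0f;
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);      /* reinterpret the float's bytes */

    printf("0x%08X\n", (unsigned)bits);  /* expected: 0xC1880000 */
    printf("sign = %u, exponent = %u, fraction = 0x%06X\n",
           (unsigned)(bits >> 31),            /* 1        */
           (unsigned)((bits >> 23) & 0xFFu),  /* 131      */
           (unsigned)(bits & 0x7FFFFFu));     /* 0x080000 */
    return 0;
}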
Ripple Carry Adder:
Multiple full adder circuits can be cascaded in parallel to add an N-bit number. For an N-bit parallel adder, there must be N full adder circuits. A ripple carry adder is a logic circuit in which the carry-out of each full adder is the carry-in of the succeeding, next most significant full adder. It is called a ripple carry adder because each carry bit ripples into the next stage. In a ripple carry adder, the sum and carry-out bits of any stage are not valid until the carry-in of that stage has arrived. Propagation delays inside the logic circuitry are the reason for this. Propagation delay is the time elapsed between the application of an input and the occurrence of the corresponding output. Consider a NOT gate: when the input is "0" the output will be "1" and vice versa. The time taken for the NOT gate's output to become "0" after logic "1" is applied to its input is the propagation delay here. Similarly, the carry propagation delay is the time elapsed between the application of the carry-in signal and the occurrence of the carry-out (Cout) signal. The circuit diagram of a 4-bit ripple carry adder is shown below.
In ripple carry adders, for each adder block, the two bits that are to be added are available instantly. However, each adder block waits for the carry to arrive from its previous block. So it is not possible to generate the sum and carry of any block until the input carry is known. Block i has to wait for block i-1 to produce its carry, so there is a considerable time delay, called the carry propagation delay.
Consider the above 4-bit ripple carry adder. The sum S4 is produced by the corresponding full adder as soon as the input signals are applied to it. But the carry input C4 does not reach its final steady-state value until carry C3 has reached its steady-state value; similarly, C3 depends on C2, and C2 on C1. Therefore the carry must propagate through all the stages before the output S4 and carry C5 settle to their final steady-state values.
The propagation time is equal to the propagation delay of each adder block multiplied by the number of adder blocks in the circuit. For example, if each full adder stage has a propagation delay of 20 nanoseconds, then S3 will reach its final correct value after 60 (20 x 3) nanoseconds. The situation gets worse if we extend the number of stages to add more bits.
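A bit-level C simulation makes the rippling explicit: each stage consumes the carry produced by the previous one. The full-adder equations used are the standard ones (sum = a XOR b XOR cin, cout = a.b + cin.(a XOR b)); the function names are mine:

#include <stdio.h>

#define N 4

/* One full adder: sum and carry-out from a, b and carry-in. */
static void full_adder(int a, int b, int cin, int *sum, int *cout) {
    *sum  = a ^ b ^ cin;
    *cout = (a & b) | (cin & (a ^ b));
}

/* N-bit ripple carry adder: the carry-out of stage i feeds stage i+1,
 * so stage i cannot settle before stage i-1 has produced its carry. */
static unsigned ripple_add(unsigned a, unsigned b, int *carry_out) {
    unsigned sum = 0;
    int carry = 0;
    for (int i = 0; i < N; ++i) {
        int s;
        full_adder((int)((a >> i) & 1u), (int)((b >> i) & 1u), carry, &s, &carry);
        sum |= (unsigned)s << i;
    }
    *carry_out = carry;
    return sum;
}

int main(void) {
    int cout;
    unsigned s = ripple_add(0xB, 0x6, &cout);   /* 11 + 6 = 17: sum 0001, carry 1 */
    printf("sum = 0x%X, carry out = %d\n", s, cout);
    return 0;
}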
Carry Look-ahead Adder:
A carry look-ahead adder reduces the propagation delay by introducing more
complex hardware. In this design, the ripple carry design is suitably
transformed such that the carry logic over fixed groups of bits of the adder is
reduced to two-level logic. Let us discuss the design in detail.
Consider the full adder circuit shown above with the corresponding truth table. We define two variables, 'carry generate' Gi and 'carry propagate' Pi, as
Gi = Ai.Bi
Pi = Ai XOR Bi
so that the sum and carry-out of stage i can be written as
Si = Pi XOR Ci
Ci+1 = Gi + Pi.Ci
Expanding each carry in terms of C0 gives
C1 = G0 + P0.C0
C2 = G1 + P1.G0 + P1.P0.C0
C3 = G2 + P2.G1 + P2.P1.G0 + P2.P1.P0.C0
C4 = G3 + P3.G2 + P3.P2.G1 + P3.P2.P1.G0 + P3.P2.P1.P0.C0
From the above Boolean equations we can observe that C4 does not have to wait for C3 and C2 to propagate; in fact C4 is produced at the same time as C3 and C2. Since the Boolean expression for each carry output is a sum of products, each can be implemented with one level of AND gates followed by an OR gate.
The implementation of the Boolean functions for each carry output (C2, C3 and C4) of a carry look-ahead carry generator is shown in the figure below.
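The two-level structure can be mirrored in C: all Gi and Pi are computed first, and then every carry follows directly from C0 without waiting for lower carries (an illustrative sketch):

#include <stdio.h>

int main(void) {
    /* 4-bit operands and input carry; index 0 is the least significant bit. */
    int A[4] = {1, 1, 0, 1};   /* A = 1011 = 11 */
    int B[4] = {0, 1, 1, 0};   /* B = 0110 = 6  */
    int C[5];
    C[0] = 0;

    int G[4], P[4];
    for (int i = 0; i < 4; ++i) {
        G[i] = A[i] & B[i];    /* carry generate  Gi = Ai.Bi     */
        P[i] = A[i] ^ B[i];    /* carry propagate Pi = Ai XOR Bi */
    }

    /* Each carry is a sum of products of G, P and C0 only (two-level logic),
     * so C1..C4 are all available at the same time. */
    C[1] = G[0] | (P[0] & C[0]);
    C[2] = G[1] | (P[1] & G[0]) | (P[1] & P[0] & C[0]);
    C[3] = G[2] | (P[2] & G[1]) | (P[2] & P[1] & G[0]) | (P[2] & P[1] & P[0] & C[0]);
    C[4] = G[3] | (P[3] & G[2]) | (P[3] & P[2] & G[1])
                | (P[3] & P[2] & P[1] & G[0]) | (P[3] & P[2] & P[1] & P[0] & C[0]);

    for (int i = 0; i < 4; ++i)
        printf("S%d = %d\n", i, P[i] ^ C[i]);   /* Si = Pi XOR Ci */
    printf("carry out C4 = %d\n", C[4]);        /* 11 + 6 = 17: S = 0001, C4 = 1 */
    return 0;
}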
Time Complexity Analysis: