DATA REPRESENTATION
1.1 INTRODUCTION
We may define a digital computer or a digital system as a machine which accepts a stream
of symbols, stores them, processes them according to precise rules and produces a stream
of symbols at its output. At the simplest level, a digital processor may accept a single
number at its input, perform an operation on it and produce another number at its output.
For example, a processor to find the square of a one-digit number would fall in this
category. At a more complex level a large number of symbols may be processed using
extensive rules. As an example consider a digital system to automatically print a book. Such
a system should accept a large text or the typewritten material. Given the number of letters
which could be accommodated on a line (page width) and the rules for hyphenating a word,
it should determine the space to be left between words on a line so that all lines are aligned
on both the left and right hand sides of a page. The processor should also arrange lines
into paragraphs and pages as directed by commands. Decisions to leave space for figures
should be made. A multitude of such decisions are to be taken before a well laid out book
is obtained. Such complex processing would require extensive special facilities such as a
large amount of storage, electronic circuits to count and manipulate characters, and a
printer which has a complete assortment of various sizes and styles of letters. Regardless
of the complexity of processing, there are some basic features which are common to all
digital processing of information and which enable us to treat the subject in a unified manner.
These features are:
Digital Logic and Computer Organization
1. All streams of input symbols to a digital system are encoded with two distinct
symbols. These symbols, 0 (zero) and 1 (one), are known as binary digits or bits.
Bits can be stored and processed reliably and inexpensively with currently available
electronic circuits.
2. Instructions for manipulating symbols are to be precisely specified such that a
machine can be built to execute each instruction. The instructions for manipulation
are also encoded using binary digits.
3. A digital computer has a storage unit in which the symbols to be manipulated are
stored. The encoded instructions for manipulating the symbols are also stored in
the storage unit.
4. Bit manipulation instructions are realized by electronic circuits. Examples of simple
manipulation instructions are: add two bits, compare two bits and move one bit
from one storage unit to another. Complex manipulation instructions may be built
using simple instructions. A sequence of instructions for accomplishing a complex
task may be stored in the storage unit and is called a program. The idea of building
a complex instruction with a sequence of simple instructions is important in building
digital computers.
The logic design of digital computers and systems consists of implementing the four
basic steps enumerated above keeping in view engineering constraints such as the
availability of processing elements, their cost, reliability, maintainability and ease of fabrication.
At this stage, we should distinguish between the design of a general purpose digital
computer and that of a specialized digital subsystem. Even though the four basic steps in
design are common to both, the constraints which are peculiar to each of these lead to a
difference in the philosophy of design.
A general purpose machine is designed to perform a variety of tasks. Each task
requires the execution of a different sequence of processing rules. The processing rules to
be followed vary widely. At the outset one may not be able to predict all the tasks he may
like to do with a machine. A flexible design is thus required. This flexibility is achieved by
carefully selecting the elementary operations to be implemented through electronic circuits.
These electronic circuits are together called hardware. One may realize a complex operation
by using various sequences of elementary operations. For example, one may realize a
multiplication operation by repeated use of addition operation, which may be thought of as
a macro operation. A set of macros could be used to perform more complex tasks. One can
thus build up a hierarchy of programs, all stored in the computer's memory, which can be
invoked by any user to perform a very complex task. A user need not work only with the
elementary operations available as hardware functions. He can use the hierarchy of programs
which constitute the software of a computer and which is an integral part of a general
purpose digital computer.
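The idea of realizing a multiplication "macro" from a more elementary addition operation can be sketched in a few lines of Python. This is only an illustration: the function name `multiply_by_repeated_addition` is hypothetical, and the sketch assumes non-negative integers.

```python
def multiply_by_repeated_addition(a: int, b: int) -> int:
    """Multiply two non-negative integers using only the elementary add operation."""
    if a < 0 or b < 0:
        raise ValueError("this sketch handles non-negative integers only")
    product = 0
    for _ in range(b):      # add a to the running total b times
        product += a
    return product


print(multiply_by_repeated_addition(6, 7))  # 42
```

A program built this way is slower than a dedicated multiplier circuit but needs no extra hardware, which is one small instance of the hardware-software trade-off.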
It should be observed that it is possible to perform macro operations entirely by
specially designed electronic circuits rather than by using programs. Thus, software can be
replaced by hardware and vice versa. What basic tasks are to be performed by hardware
and what are to be done by combined software and hardware is an engineering design
decision which depends on cost versus speed requirements and other constraints prevailing
at a given time. One of the purposes of this book is to bring out the hardware-software
trade-off, which is important in the design of general purpose computers.
This book deals with two aspects of digital computer design namely, computer logic
and computer organization. There are three layers in computer design as shown in Figure 1.1.
The bottom-most layer deals with digital circuits, which are used for arithmetic and logic
operations. It deals with combining these logic blocks to perform more complex logical
functions. Important topics that computer logic covers are representation of data (numerical,
character, graphics, audio and video) as strings of binary digits, use of Boolean algebra as
a modelling tool, physical realization of Boolean functions using logic gates, combinational
and sequential logic circuits and how to realize a digital processing requirement specification
using combinational and sequential logic circuits.
[Figure 1.1: the three layers of computer design, labelled Top Layer, Middle Layer and Bottom Layer]
Computer organization primarily deals with combining the building blocks described in
computer logic into a programmable computer system. Besides the arithmetic logic unit, it also
covers the design of memory and I/O systems and ensuring their cooperative operation to carry out
a sequence of instructions, namely a program. In this book we will also describe
important hardware-software trade-offs to ensure optimal functioning of a computer.
Computer architecture primarily deals with methods of alleviating the speed mismatch
between CPU, memory and I/O units by a combination of hardware and software methods.
It also deals with the interaction of the hardware with the operating system to ensure easy
and optimal operation of a computer. We will not discuss this aspect of computer design
in this book.
This chapter discusses the representation of data in digital systems. The five main
categories of data are:
1. Numbers
2. Characters
3. Pictures or Images
4. Video
5. Audio
1.2 NUMBERING SYSTEMS
The most widely used number system is the positional system. In this system the position
of a digit indicates the significance to be attached to that digit. For example, the
number 8072.443 is taken to mean

$$8 \times 10^{3} + 0 \times 10^{2} + 7 \times 10^{1} + 2 \times 10^{0} + 4 \times 10^{-1} + 4 \times 10^{-2} + 3 \times 10^{-3}$$

where the successive terms correspond to the 1000th, 100th, tenth, unit, 1/10th, 1/100th
and 1/1000th positions.
In this notation the zero in the number 8072 is significant as it fixes the position and,
consequently, the weights to be attached to 8 and 7. Thus, 872 does not equal 8072.
An example of a non-positional number system is the Roman numeral system. This
number system is quite complicated due to the absence of a symbol for zero.
Positional number systems have a radix or a base. In the decimal system the radix
is 10. A number system with radix r will have r symbols and a number would be written as:

$$a_n \, a_{n-1} \, a_{n-2} \ldots a_1 \, a_0 \, . \, a_{-1} \, a_{-2} \ldots a_{-m}$$

The symbols $a_n, a_{n-1}, \ldots, a_{-m}$ used in the above representation should be one of the
r symbols allowed in the system. In the above representation $a_n$ is called the most significant
digit of the number and $a_{-m}$ (the last digit) is called the least significant digit.
The equivalent number in decimal is thus:

$$a_n r^{n} + a_{n-1} r^{n-1} + \cdots + a_1 r^{1} + a_0 r^{0} + a_{-1} r^{-1} + a_{-2} r^{-2} + \cdots + a_{-m} r^{-m}$$

In digital systems and computers, the number system used has a radix 2 and is called
the binary system. In this system only two symbols, namely 0 and 1, are used. Each symbol
is called a bit, a shortened form of binary digit.
A number in the binary system will be written as a sequence of 1s and 0s. For
example, 1011.101 is a binary number and would mean:

$$1 \times 2^{3} + 0 \times 2^{2} + 1 \times 2^{1} + 1 \times 2^{0} + 1 \times 2^{-1} + 0 \times 2^{-2} + 1 \times 2^{-3}$$

and would be interpreted to mean:

$$8 + 0 + 2 + 1 + 1/2 + 0 + 1/8 = 11.625$$
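The weighted-sum interpretation of a positional numeral can be sketched as a short Python function. The names `positional_value`, `digit_value` and `DIGITS` are hypothetical and introduced only for illustration; exact fractions are used so that the fractional digits are not rounded.

```python
from fractions import Fraction

DIGITS = "0123456789ABCDEF"  # symbols for radices up to 16 (an assumption of this sketch)


def digit_value(symbol: str, radix: int) -> int:
    """Return the numeric value of one symbol, checking it is legal in this radix."""
    value = DIGITS.index(symbol.upper())  # raises ValueError for an unknown symbol
    if value >= radix:
        raise ValueError(f"{symbol!r} is not a valid digit in radix {radix}")
    return value


def positional_value(numeral: str, radix: int) -> Fraction:
    """Evaluate a numeral such as '1011.101' as the sum of digit times radix**position."""
    integer_part, _, fraction_part = numeral.partition(".")
    value = Fraction(0)
    for symbol in integer_part:          # weights radix**n down to radix**0
        value = value * radix + digit_value(symbol, radix)
    weight = Fraction(1, radix)
    for symbol in fraction_part:         # weights radix**-1, radix**-2, ...
        value += digit_value(symbol, radix) * weight
        weight /= radix
    return value


print(float(positional_value("1011.101", 2)))  # 11.625
print(positional_value("8072.443", 10))        # 8072443/1000
```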
TABLE 1.1
Binary Equivalents of Decimal Numbers

Decimal  Binary      Decimal  Binary      Decimal  Binary
0        0           6        110         12       1100
1        1           7        111         13       1101
2        10          8        1000        14       1110
3        11          9        1001        15       1111
4        100         10       1010        16       10000
5        101         11       1011        17       10001

It is seen that binary numbers can become quite long and cumbersome
for human use. The hexadecimal system (base 16) is thus often used to convert binary to a form
requiring fewer digits. The hexadecimal system uses the 16 symbols 0, 1, 2,
..., 9, A, B, C, D, E, F. As its radix 16 is a power of 2, namely $2^4$, each group of four
bits has a hexadecimal equivalent. This is shown in Table 1.2. It is, therefore, fairly simple
to convert binary to hexadecimal and vice versa. (One must contrast this with conversion
of binary to decimal.)

TABLE 1.2
Binary Numbers and Their Hexadecimal and Decimal Equivalents

Binary  Hexadecimal  Decimal      Binary  Hexadecimal  Decimal
0000    0            0            1000    8            8
0001    1            1            1001    9            9
0010    2            2            1010    A            10
0011    3            3            1011    B            11
0100    4            4            1100    C            12
0101    5            5            1101    D            13
0110    6            6            1110    E            14
0111    7            7            1111    F            15

Example 1.1. Convert the following binary number to hexadecimal:
Binary number:  0111  1010  1011
Hexadecimal:    7     A     B

As illustrated in Example 1.1, one may convert a binary number to hexadecimal by
grouping together successive four bits of the binary number starting with its least significant
bit. These four-bit groups are then replaced by their hexadecimal equivalents.
Because of the simplicity of binary to hexadecimal (abbreviated as Hex) conversion,
when converting from binary to decimal, it is often faster to first convert from binary to Hex
and then convert the Hex to decimal.

Example 1.2. Convert the following binary number to hexadecimal:
Binary number:  0011  1011  0101 . 1101  1100
Hex number:     3     B     5    . D     C

Observe that groups of four bits in the integral part of the binary number are formed
starting from the rightmost bit, as leading 0s here are not significant. On the other hand,
bits in the fractional part are grouped from left to right, as trailing 0s in the
fractional part are not significant.
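The grouping rule can be sketched in Python as follows. The function name `binary_to_hex` is hypothetical; the sketch pads the integral part with leading 0s and the fractional part with trailing 0s so that each part splits into complete four-bit groups. The input used is a binary string whose groups correspond to the value (3B5.DC)Hex quoted in the text.

```python
def binary_to_hex(bits: str) -> str:
    """Convert a binary string such as '1110110101.110111' to hexadecimal."""
    integer_part, _, fraction_part = bits.partition(".")
    integer_part = integer_part or "0"

    # Integral part: group from the right, so pad with leading 0s on the left.
    integer_part = "0" * (-len(integer_part) % 4) + integer_part
    # Fractional part: group from the left, so pad with trailing 0s on the right.
    fraction_part = fraction_part + "0" * (-len(fraction_part) % 4)

    def groups(s: str) -> list[str]:
        return [s[i:i + 4] for i in range(0, len(s), 4)]

    hex_integer = "".join(format(int(g, 2), "X") for g in groups(integer_part))
    hex_fraction = "".join(format(int(g, 2), "X") for g in groups(fraction_part))
    return hex_integer + ("." + hex_fraction if hex_fraction else "")


print(binary_to_hex("1110110101.110111"))  # 3B5.DC
```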
The decimal equivalent of (3B5.DC)Hex is (using Table 1.2)

$$3 \times 16^{2} + B \times 16^{1} + 5 \times 16^{0} + D \times 16^{-1} + C \times 16^{-2}$$
$$= 3 \times 256 + 11 \times 16 + 5 \times 1 + 13 \times 16^{-1} + 12 \times 16^{-2}$$
$$= 768 + 176 + 5 + 13/16 + 12/256$$
$$= 949.859375$$

1.3 DECIMAL TO BINARY CONVERSION
In addition to knowing how to convert binary numbers to decimal, it is also necessary to
know the technique of changing a decimal number to a binary number. The method is based
on the fact that a decimal number may be represented by:

$$d = a_n 2^{n} + a_{n-1} 2^{n-1} + \cdots + a_1 2^{1} + a_0 2^{0} \qquad (1.1)$$

If we divide d by 2, we obtain:

$$\text{Quotient } q = d/2 = a_n 2^{n-1} + a_{n-1} 2^{n-2} + \cdots + a_1 2^{0} \qquad (1.2)$$

and the remainder equals $a_0$.
Observe that $a_0$ is the least significant bit of the binary equivalent of d. Dividing the
quotient by 2, we obtain:

$$q/2 = d/(2 \times 2) = a_n 2^{n-2} + a_{n-1} 2^{n-3} + \cdots + a_2 2^{0} \qquad (1.3)$$

and the remainder equals $a_1$.
Thus, successive remainders obtained by division yield the bits of the binary number.
Division is terminated when q = 0. The procedure is illustrated in Example 1.3.
Example 1.3. Convert the decimal number 19 to binary.

Divisor  Quotient  Remainder
2        19
2        9         1          Least significant bit
2        4         1
2        2         0
2        1         0
         0         1          Most significant bit

Thus, 19 = 10011. Check: (10011) in binary = (13)Hex = 1 x 16^1 + 3 x 16^0 = 16 + 3 = 19.
Decimal to Hex conversion is similar. In this case 16 is used as the divisor instead of
2. For example, the decimal number 949 is converted to Hex in Example 1.4.
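The successive-division procedure can be sketched in Python as follows. The function name `to_radix` is hypothetical; the sketch handles non-negative integers and radices from 2 to 16, collecting remainders from least significant to most significant digit and then reversing them.

```python
DIGITS = "0123456789ABCDEF"


def to_radix(d: int, radix: int = 2) -> str:
    """Convert a non-negative integer to the given radix by repeated division."""
    if d < 0:
        raise ValueError("this sketch handles non-negative integers only")
    if not 2 <= radix <= 16:
        raise ValueError("radix must be between 2 and 16")
    if d == 0:
        return "0"
    remainders = []
    q = d
    while q != 0:                  # division terminates when the quotient is 0
        q, r = divmod(q, radix)    # r is the next digit, least significant first
        remainders.append(DIGITS[r])
    return "".join(reversed(remainders))


print(to_radix(19, 2))    # 10011
print(to_radix(949, 16))  # 3B5
```

With radix 16 the same loop divides by 16 instead of 2, which matches the remark that decimal to Hex conversion is similar.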
1.4 BINARY CODED DECIMAL NUMBERS
We considered in the previous sections the methods of converting decimal numbers to
binary form and vice versa. There is another method of representing decimal numbers using
binary digits. This method is called binary coded decimal (BCD) representation.
There are 10 symbols in the decimal system, namely 0, 1, ..., 9. Encoding is the
procedure of representing each one of these 10 symbols by a unique string consisting of
the two symbols of the binary system, namely 0 and 1. It is further assumed that the same
number of bits is used to represent any digit. The number of symbols which could be
represented using n bits is $2^n$. Thus, in order to represent the 10 decimal digits we require
at least four bits, as three bits will allow only $2^3 = 8$ possible distinct three-bit groups.
The method of encoding decimal numbers in binary is to make up a table of 10 unique
four-bit groups and allocate one four-bit group to each decimal digit as shown in Table 1.3.
TABLE 1.3
Encoding Decimal Digits in Binary

Decimal Digit  Binary Code
0              0000
1              0001
2              0010
3              0011
4              0100
5              0101
6              0110
7              0111
8              1000
9              1001
If we want to represent a decimal number, for example, 15, using the code given in
Table 1.3 we look up the table and get the binary code for 1 as 0001 and that for 5 as
0101 and code 15 by the binary code 00010101.
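The table look-up can be sketched in Python as follows. The function name `bcd_encode` is hypothetical; the sketch assumes the natural four-bit code of Table 1.3, in which each digit is simply written as a four-bit binary number.

```python
def bcd_encode(number: int) -> str:
    """Encode a non-negative decimal integer digit by digit with a four-bit code."""
    if number < 0:
        raise ValueError("this sketch handles non-negative integers only")
    # Each decimal digit is replaced by its own four-bit group.
    return "".join(format(int(digit), "04b") for digit in str(number))


print(bcd_encode(15))        # 00010101  (encoding: 0001 for 1, 0101 for 5)
print(format(15, "b"))       # 1111      (conversion of the whole number to binary)
```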
We must at this point distinguish carefully between encoding and conversion. For
example, 15 when converted to binary would be 1111. On the other hand, when it is
encoded each digit must be represented by a four-bit code and its encoding is 00010101.
It should be observed that encoding requires more bits compared to conversion. On the
average $\log_2 10 \approx 3.32$ bits per decimal digit are required when decimal numbers are
converted to binary; as compared with this, 4 bits per digit are needed in encoding. The ratio
4/3.32 (approximately 1.2) is a measure of the extra bits (and consequently extra storage)
required if an encoding is used. On the other hand, conversion of decimal to binary is slower
compared to encoding. This is due to the fact that an algorithm involving successive division
is needed for conversion whereas encoding is by straightforward table look-up. The slowness
of conversion is not a serious problem in computations in which the volume of input/output
is small. In business computers, where input/output dominates, it is necessary to examine
BCD representation. In smaller digital systems such as desk calculators, digital clocks, etc.,
it is uneconomical to incorporate complex electronic circuits to convert decimal to binary and
vice versa. Thus, BCD representation should be considered.
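The storage overhead of encoding relative to conversion can be checked numerically. In this sketch the function names are hypothetical; the bits needed for conversion are taken as the length of the binary form of the largest number with the given count of decimal digits.

```python
def bits_if_converted(num_digits: int) -> int:
    """Bits needed to hold any decimal number of num_digits digits in pure binary."""
    return (10 ** num_digits - 1).bit_length()


def bits_if_encoded(num_digits: int) -> int:
    """Bits needed when every decimal digit is encoded as a four-bit group."""
    return 4 * num_digits


for n in (1, 3, 9):
    converted, encoded = bits_if_converted(n), bits_if_encoded(n)
    print(n, converted, encoded, round(encoded / converted, 2))
```

For nine-digit numbers this gives 30 bits against 36 bits, a ratio of 1.2, in line with the figure quoted above.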
We saw that we need at least 4 bits to represent a decimal digit. There are, however,
16 four-bit groups. We need only 10 of these 16 for encoding decimal digits. There are about
30 billion ways in which we can pick an ordered sequence of 10 out of 16 items (in other words,
there are 16!/6! permutations of 10 out of 16 items). That many codes can thus be
constructed.
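The count quoted above can be checked with a short Python calculation. The variable name `ordered_selections` is introduced only for this illustration.

```python
import math

# Number of ordered selections of 10 four-bit groups out of 16: 16! / 6!
ordered_selections = math.factorial(16) // math.factorial(6)

print(ordered_selections)             # 29059430400
print(f"{ordered_selections:.1e}")    # 2.9e+10, that is, about 30 billion
```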
Fortunately, not all of these $3 \times 10^{10}$ possible codes are useful. Only a small number of
these are used in practice and they are chosen from the viewpoint of ease in arithmetic,
some error detection property, ease in coding, and any other property useful in a given
application. The useful codes may be divided broadly into four classes:
1. Weighted codes
2. Self-complementing codes
3. Cyclic, reflected or Gray codes and
4. Error detecting and correcting codes.
1.4.1 Weighted Codes
In a weighted code the decimal value of a code is the algebraic sum of the weights of those
columns in which a 1 appears. In other words, $d = \sum_i w(i)\, b(i)$ where the w(i)s are the
weights and the b(i)s are either 0 or 1. Three weighted codes are given in Table 1.4.
TABLE 1.4
Examples of Weighted Codes

Decimal Digit  Weights    Weights*   Weights
               8 4 2 1    8 4 2̄ 1̄    2 4 2 1
0              0000       0000       0000
1              0001       0111       0001
2              0010       0110       0010
3              0011       0101       0011
4              0100       0100       0100
5              0101       1011       1011
6              0110       1010       1100
7              0111       1001       1101
8              1000       1000       1110
9              1001       1111       1111

*An overbar is used to indicate a negative weight (2̄ = -2)

In a weighted code we may have negative weights. Further, the same weight may be
repeated twice as in the 2, 4, 2, 1 code. The criterion in choosing weights is that we must
be able to represent all the decimal digits from 0 through 9 using these weights.
The 8, 4, 2, 1 code uses the natural weights used in binary number representation.
Thus, it is known as Natural Binary Coded Decimal or NBCD for short. The first 10 groups
of four bits represent 0 through 9. The remaining six groups are unused and are illegal
combinations. They may be used sometimes for error detection.
1.4.2 Self-Complementing Codes
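The weighted-sum rule can be checked mechanically. The sketch below is illustrative only: the names `CODES` and `decode` are hypothetical, and the three code tables are the ones listed in Table 1.4, with the negative weights written as -2 and -1.

```python
# Each entry: (weights, code words for the decimal digits 0 through 9)
CODES = {
    "8 4 2 1": ((8, 4, 2, 1),
                ["0000", "0001", "0010", "0011", "0100",
                 "0101", "0110", "0111", "1000", "1001"]),
    "8 4 -2 -1": ((8, 4, -2, -1),
                  ["0000", "0111", "0110", "0101", "0100",
                   "1011", "1010", "1001", "1000", "1111"]),
    "2 4 2 1": ((2, 4, 2, 1),
                ["0000", "0001", "0010", "0011", "0100",
                 "1011", "1100", "1101", "1110", "1111"]),
}


def decode(code_word: str, weights: tuple) -> int:
    """Return the algebraic sum of the weights of the columns holding a 1."""
    return sum(w * int(b) for w, b in zip(weights, code_word))


for name, (weights, words) in CODES.items():
    print(name, [decode(word, weights) for word in words])
```

Each line of output lists the digits 0 through 9 in order, confirming that every code word decodes to the digit it stands for.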
1.6 ALPHANUMERIC CODES
In the previous sections we saw how numerical data is represented in computers. This is
the simplest data type. Historically, it was the first type of data processed by digital
computers. The versatility of modern computers arises from their ability to process a variety
of data types.
Types of data may be classified as shown in Figure 1.4.
[Figure: Data classified into Numeric, Text, Pictures, Audio and Video]
Fig. 1.4 Types of data.
Textual data consists of letters, special characters and numbers when not used
in calculations (e.g., a telephone number).
Picture data are line drawings, photographs (both monochrome and colour), hand-
written data, fingerprints, medical images, etc. They are two dimensional and time invariant.
Audio data are sound waves such as speech and music. They are continuous and
time varying signals.
Video data are moving pictures such as those taken by movie cameras. They are
actually a sequence of still pictures. Like audio, video data is also time varying. In this
section we will describe the representation of textual data. We will describe other data types
and their representation in the next section.
As stated, textual data consists of letters and special characters besides decimal
numbers. These are normally the 26 English letters, the 10 decimal digits and several
special characters such as +, -, ×, $, etc. In order to code these with binary numbers
one needs a string of binary digits. In the older computers the total number of characters
was less than 64 and a string of six bits was used to code a character. In all current
machines the number of characters has increased. Besides capital letters of the alphabet,
the lower case letters are also used and several mathematical symbols such as >, <, ≥,
etc. have been introduced. This has made the six-bit byte inadequate to code all characters.
New coding schemes use seven or eight bits to code a character. With seven bits we can
code 128 characters, which is quite adequate. In order to ensure uniformity in coding
characters, a standard seven-bit code, ASCII (American Standard Code for Information
Interchange), has evolved.
1.6.1 ASCII Code
The ASCII code is used to code two types of data. One type is the printable characters
such as digits, letters and special characters. The other set is known as control characters,
which represent coded data to control the operation of digital computers and are not
printed. The ASCII code (in Hexadecimal) is given as Table 1.11.
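TABLE 1.11
ASCII Code for Characters

Least significant    Most significant hex digit
hex digit            0     1     2     3     4     5     6     7
0                    NUL   DLE   SP    0     @     P     `     p
1                    SOH   DC1   !     1     A     Q     a     q
2                    STX   DC2   "     2     B     R     b     r
3                    ETX   DC3   #     3     C     S     c     s
4                    EOT   DC4   $     4     D     T     d     t
5                    ENQ   NAK   %     5     E     U     e     u
6                    ACK   SYN   &     6     F     V     f     v
7                    BEL   ETB   '     7     G     W     g     w
8                    BS    CAN   (     8     H     X     h     x
9                    HT    EM    )     9     I     Y     i     y
A                    LF    SUB   *     :     J     Z     j     z
B                    VT    ESC   +     ;     K     [     k     {
C                    FF    FS    ,     <     L     \     l     |
D                    CR    GS    -     =     M     ]     m     }
E                    SO    RS    .     >     N     ^     n     ~
F                    SI    US    /     ?     O     _     o     DEL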
It may be observed from Table 1.11 that the codes for the English letters are in the
same sequence as their lexical order, that is, the order in which they appear in a dictionary.
The hexadecimal codes of A, B, C, D, E, ..., Z in ASCII are respectively 41, 42, 43, 44, 45, ...,
5A. This choice of codes is useful for alphabetical sorting, searching, etc.
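The link between code order and alphabetical order can be seen in a short Python sketch. The helper name `ascii_hex` and the sample words are hypothetical; Python's built-in `ord` returns a character's code, which for these characters is its ASCII value.

```python
def ascii_hex(ch: str) -> str:
    """Return the two-digit hexadecimal ASCII code of a single character."""
    code = ord(ch)
    if code > 0x7F:
        raise ValueError("not a seven-bit ASCII character")
    return format(code, "02X")


print([ascii_hex(ch) for ch in "ABCDE"], ascii_hex("Z"))

# Comparing strings code by code sorts upper case words alphabetically.
words = ["CAT", "APE", "BAT"]
print(sorted(words))
```

Sorting by numeric code gives dictionary order only within one case; in ASCII all capital letters have smaller codes than the lower case letters.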
A parity bit may be added to the seven-bit ASCII character code to yield an eight-bit
code. A group of eight bits is known as a byte. A byte would be sufficient to represent a
character or to represent two binary coded decimal digits. The abbreviation B is universally
used for bytes. We will henceforth use B for byte.
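Adding a parity bit can be sketched in Python as follows. The text does not say which parity convention or bit position is used, so this sketch assumes even parity with the parity bit placed as the most significant of the eight bits; the function name `add_even_parity` is hypothetical.

```python
def add_even_parity(ch: str) -> str:
    """Return the eight-bit code: an even-parity bit followed by the seven ASCII bits."""
    code = ord(ch)
    if code > 0x7F:
        raise ValueError("not a seven-bit ASCII character")
    # Assumption: even parity, so the parity bit makes the total number of 1s even.
    parity = bin(code).count("1") % 2
    return format((parity << 7) | code, "08b")


print(add_even_parity("A"))  # 01000001  (1000001 already has an even number of 1s)
print(add_even_parity("C"))  # 11000011  (1000011 has three 1s, so the parity bit is 1)
```

A single bit flipped in the byte changes the number of 1s from even to odd, which is how such an error can be detected.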
It is possible to add redundant bits to a seven-bit ASCII code to make an error correcting
code. At least four check bits are required to detect and correct a single error. The
construction of Hamming codes for characters is left as an exercise to the reader.
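The figure of four check bits can be checked against the usual condition for single-error correction, which the text does not state explicitly and which is assumed here: with m data bits and r check bits, the r check bits must distinguish m + r possible error positions plus the no-error case, so 2^r must be at least m + r + 1. The function name `min_check_bits` is hypothetical, and the sketch does not construct the code itself.

```python
def min_check_bits(data_bits: int) -> int:
    """Smallest r with 2**r >= data_bits + r + 1 (single-error-correction condition)."""
    r = 0
    while 2 ** r < data_bits + r + 1:
        r += 1
    return r


print(min_check_bits(7))  # 4, for a seven-bit ASCII character
```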