Chapter One
Computer Organization
Computer organization is concerned with the way the hardware components operate and the way they are connected together to form a computer system.
It includes hardware details transparent to the programmer, such as control signals and peripherals.
It describes how the computer performs.
Example: circuit design, control signals, and memory types all fall under computer organization.
It is frequently called microarchitecture.
Physical components
How to do?
Computer Architecture
Computer architecture is concerned with the structure and behavior of the computer system as seen by the user.
It includes the information formats, the instruction set, and techniques for addressing memory.
It describes what the computer does.
Logic
What to do?
Logic gates
The fundamental building block of all digital logic circuits is the gate. Logical functions are
implemented by the interconnection of gates. A gate is an electronic circuit that produces an
output signal that is a simple Boolean operation on its input signals. The basic gates used in
digital logic are AND, OR, NOT, NAND, NOR, and XOR. The following figure depicts these
six gates. Each gate is defined in three ways: graphic symbol, algebraic notation, and truth table. The symbols used here follow the IEEE standard, IEEE Std 91. Note that the inversion (NOT) operation is indicated by a circle.
Each gate shown in the following figure has one or two inputs and one output. However, as already stated, all of the gates except NOT can have more than two inputs; thus, they can be implemented with three or more inputs. In some cases, a gate is implemented with two outputs, one output being the negation of the other output.
Here we introduce a common term: we say that to assert a signal is to cause a signal line to make a transition from its logically false (0) state to its logically true (1) state. The true (1) state is either a high or low voltage state, depending on the type of electronic circuitry.
Typically, not all gate types are used in implementation. Design and fabrication are simpler if only one or two types of gates are used. Thus, it is important to identify functionally complete sets of gates. This means that any Boolean function can be implemented using only the gates in the set. The following are functionally complete sets: {AND, OR, NOT}; {AND, NOT}; {OR, NOT}; {NAND}; {NOR}.
It should be clear that AND, OR, and NOT gates constitute a functionally complete set, because they represent the three operations of Boolean algebra. For the AND and NOT gates to form a functionally complete set, there must be a way to synthesize the OR operation from the AND and NOT operations; De Morgan's theorem provides it:
A + B = (A′ ∙ B′)′
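To make the idea of functional completeness concrete, here is a minimal sketch in Python (not part of the original text) that builds NOT, AND, and OR from NAND alone, with OR obtained through the De Morgan identity just given; all function names are illustrative:

```python
def nand(a, b):
    return 1 - (a & b)

def not_(a):
    return nand(a, a)            # NOT from NAND

def and_(a, b):
    return not_(nand(a, b))      # AND from NAND

def or_(a, b):
    # De Morgan: A + B = (A'. B')'
    return nand(not_(a), not_(b))

# Check every input combination against Python's built-in operators.
for a in (0, 1):
    for b in (0, 1):
        assert and_(a, b) == (a & b)
        assert or_(a, b) == (a | b)
    assert not_(a) == 1 - a
```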
Combinational Circuits
A combinational circuit is an interconnected set of gates whose output at any time is a function
only of the input at that time. As with a single gate, the appearance of the input is followed
almost immediately by the appearance of the output, with only gate delays. In general terms, a
combinational circuit consists of n binary inputs and m binary outputs. As with a gate, a
combinational circuit can be defined in three ways:
Truth table: For each of the 2ⁿ possible combinations of input signals, the binary value of each of the m output signals is listed.
Graphical symbols: The interconnected layout of gates is depicted.
Boolean equations: Each output signal is expressed as a Boolean function of its input
signals.
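As a small illustration of these three definitions, the sketch below describes a half adder (an example circuit of ours, not taken from the text) by its Boolean equations and prints its truth table; with n = 2 inputs there are 2² = 4 rows:

```python
def half_adder(a, b):
    s = a ^ b          # sum   = A XOR B
    c = a & b          # carry = A AND B
    return s, c

print("A B | S C")
for a in (0, 1):
    for b in (0, 1):
        s, c = half_adder(a, b)
        print(a, b, "|", s, c)
```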
Sequential Circuits
Combinational circuits implement the essential functions of a digital computer. However, except
for the special case of ROM, they provide no memory or state information, elements also
essential to the operation of a digital computer. For the latter purposes, a more complex form of
digital logic circuit is used: the sequential circuit. The current output of a sequential circuit
depends not only on the current input, but also on the past history of inputs. Another and
generally more useful way to view it is that the current output of a sequential circuit depends on
the current input and the current state of that circuit.
In this section, we examine sequential circuits, taking flip-flops as an example. As will be seen, the sequential circuit makes use of combinational circuits.
Flip-Flops
The simplest form of sequential circuit is the flip-flop. There are a variety of flip-flops, all of which share two properties:
The flip-flop is a bistable device, i.e., it has two stable states. It exists in one of two states and, in the absence of input, remains in that state. Thus, the flip-flop can function as a 1-bit memory.
The flip-flop has two outputs, which are always the complements of each other. These are generally labeled Q and Q′.
The following figure shows a common configuration known as the S–R flip-flop or S–R latch. The circuit has two inputs, S (Set) and R (Reset), and two outputs, Q and Q′, and consists of two NOR gates connected in a feedback arrangement.
First, let us show that the circuit is bistable. Assume that both S and R are 0 and that Q is 0. The inputs to the lower NOR gate are Q = 0 and S = 0. Thus, the output Q′ = 1 means that the inputs to the upper NOR gate are Q′ = 1 and R = 0, which has the output Q = 0. Thus, the state of the circuit is internally consistent and remains stable as long as S = R = 0. A similar line of reasoning shows that the state Q = 1, Q′ = 0 is also stable for R = S = 0. Thus, this circuit can function as a 1-bit memory. We can view the output Q as the “value” of the bit. The inputs S and R serve to write the values 1 and 0, respectively, into memory. To see this, consider the state Q = 0, Q′ = 1, S = 0, R = 0. Suppose that S changes to the value 1. Now the inputs to the lower NOR gate are S = 1, Q = 0. After some time delay Δt, the output of the lower NOR gate will be Q′ = 0. So, at this point in time, the inputs to the upper NOR gate become R = 0, Q′ = 0.
After another gate delay of Δt, the output Q becomes 1. This is again a stable state. The inputs to the lower gate are now S = 1, Q = 1, which maintain the output Q′ = 0. As long as S = 1 and R = 0, the outputs will remain Q = 1, Q′ = 0. Furthermore, if S returns to 0, the outputs will remain unchanged.
The R input performs the opposite function. When R goes to 1, it forces Q = 0, Q′ = 1 regardless of the previous state of Q and Q′. Again, a time delay of 2Δt occurs before the final state is established.
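The settling behavior described above can be imitated in software. The following sketch (variable names and the update loop are ours) recomputes the two NOR outputs until they stop changing, which stands in for the gate delays Δt:

```python
def nor(a, b):
    return 1 - (a | b)

def sr_latch(s, r, q, q_bar):
    """Iterate the two cross-coupled NOR gates until the outputs settle."""
    while True:
        new_q = nor(r, q_bar)        # upper NOR: inputs R and Q'
        new_q_bar = nor(s, q)        # lower NOR: inputs S and Q
        if (new_q, new_q_bar) == (q, q_bar):
            return q, q_bar
        q, q_bar = new_q, new_q_bar

q, q_bar = 0, 1
q, q_bar = sr_latch(1, 0, q, q_bar)   # set:   Q becomes 1
print(q, q_bar)                        # -> 1 0
q, q_bar = sr_latch(0, 0, q, q_bar)   # hold:  outputs unchanged
q, q_bar = sr_latch(0, 1, q, q_bar)   # reset: Q becomes 0
print(q, q_bar)                        # -> 0 1
```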
The output of the S–R latch changes, after a brief time delay, in response to a change in the
input. This is referred to as asynchronous operation. More typically, events in the digital
computer are synchronized to a clock pulse, so that changes occur only when a clock pulse
occurs. The following figure shows this arrangement. This device is referred to as a clocked S–R
flip-flop. Note that the R and S inputs are passed to the NOR gates only during the clock pulse.
D Flip-Flop
One problem with the S–R flip-flop is that the input combination S = 1, R = 1 must be avoided. One way to do this is to allow just a single input. The D flip-flop accomplishes this. The following figure shows a gate implementation and the characteristic table of the D flip-flop. By using an inverter, the non-clock inputs to the two AND gates are guaranteed to be the opposite of each other.
The D flip-flop is sometimes referred to as the data flip-flop because it is, in effect, storage for
one bit of data. The output of the D flip-flop is always equal to the most recent value applied to
the input. Hence, it remembers and produces the last input. It is also referred to as the delay flip-
flop, because it delays a 0 or 1 applied to its input for a single clock pulse. We can capture the
logic of the D flip-flop in the following truth table:
D    Qₙ₊₁
0    0
1    1
J–K Flip-Flop
Like the S–R flip-flop, the J–K flip-flop has two inputs. However, in this case all possible combinations of input values are valid. The following figure shows a gate implementation of the J–K flip-flop, and the next figure shows its characteristic table (along with those for the S–R and D flip-flops). Note that the first three combinations are the same as for the S–R flip-flop. With no input asserted, the output is stable. If only the J input is asserted, the result is a set function, causing the output to be 1; if only the K input is asserted, the result is a reset function, causing the output to be 0. When both J and K are 1, the function performed is referred to as the toggle function: the output is reversed. Thus, if Q is 1 and 1 is applied to J and K, then Q becomes 0.
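The behavior of the three flip-flops can be summarized by their characteristic (next-state) functions. The sketch below (our names; signals are 0/1 integers) returns Q(n+1) from the inputs and the current Q:

```python
def next_sr(s, r, q):
    assert not (s and r), "the combination S = R = 1 must be avoided"
    return s | (q & (1 - r))          # set, reset, or hold

def next_d(d, q):
    return d                          # Q(n+1) = D: output follows input

def next_jk(j, k, q):
    # Q(n+1) = J.Q' + K'.Q : set, reset, hold, or toggle
    return (j & (1 - q)) | ((1 - k) & q)

q = 1
q = next_jk(1, 1, q)    # toggle function: the output is reversed
print(q)                # -> 0
```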
Binary Counters
Another useful category of sequential circuit is the counter. A counter is a register whose value is easily incremented by 1 modulo the capacity of the register; that is, after the maximum value is reached, the next increment sets the counter value to 0. Thus, a register made up of n flip-flops can count up to 2ⁿ − 1. An example of a counter in the CPU is the program counter.
Ripple Counter
An asynchronous counter is also referred to as a ripple counter, because the change that occurs to increment the counter starts at one end and “ripples” through to the other end. The following figure shows an implementation of a 4-bit counter using J–K flip-flops, together with a timing diagram that illustrates its behavior. The timing diagram is idealized in that it does not show the propagation delay that occurs as the signals move down the series of flip-flops. The output of the leftmost flip-flop (Q0) is the least significant bit. The design could clearly be extended to an arbitrary number of bits by cascading more flip-flops.
In the illustrated implementation, the counter is incremented with each clock pulse. The J and K
inputs to each flip-flop are held at a constant 1. This means that, when there is a clock pulse, the
output at Q will be inverted (1 to 0; 0 to 1). Note that the change in state is shown as occurring
with the falling edge of the clock pulse; this is known as an edge-triggered flip-flop. Using flip-
flops that respond to the transition in a clock pulse rather than the pulse itself provides better
timing control in complex circuits. If one looks at the pattern of outputs for this counter, it can be seen that it cycles through 0000, 0001, …, 1110, 1111, 0000, and so on.
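A behavioral sketch of this ripple counter (names and loop structure are ours): with J = K = 1 every stage toggles, bit 0 on the external clock pulse and each later bit on the falling edge (1 to 0) of the bit before it:

```python
bits = [0, 0, 0, 0]                  # Q0 (least significant) .. Q3

def pulse(bits):
    """One clock pulse: the toggle ripples while bits fall from 1 to 0."""
    for i in range(len(bits)):
        bits[i] ^= 1                 # J = K = 1: this stage toggles
        if bits[i] == 1:             # no falling edge here, so the
            break                    # ripple stops at this stage

for _ in range(5):
    pulse(bits)
    print("".join(str(b) for b in reversed(bits)))   # Q3 Q2 Q1 Q0
# prints 0001, 0010, 0011, 0100, 0101
```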
Synchronous Counters
The ripple counter has the disadvantage of the delay involved in changing value, which is
proportional to the length of the counter. To overcome this disadvantage, CPUs make use of
synchronous counters, in which all of the flip-flops of the counter change at the same time.
Registers
As an example of the use of flip-flops, let us first examine one of the essential elements of the
CPU: the register. As we know, a register is a digital circuit used within the CPU to store one or
more bits of data. Two basic types of registers are commonly used: parallel registers and shift
registers.
Parallel Registers
A parallel register consists of a set of 1-bit memories that can be read or written simultaneously.
It is used to store data. The 8-bit register of the following figure illustrates the operation of a
parallel register using D flip-flops. A control signal, labeled load, controls writing into the
register from signal lines, D11 through D18. These lines might be the output of multiplexers, so
that data from a variety of sources can be loaded into the register.
Shift Registers
A shift register accepts and/or transfers information serially. Consider, for example, the following figure, which shows a 5-bit shift register constructed from clocked D flip-flops. Data are input only to the leftmost flip-flop. With each clock pulse, data are shifted to the right one position, and the rightmost bit is transferred out.
Shift registers can be used to interface to serial I/O devices. In addition, they can be used within
the ALU to perform logical shift and rotate functions. In this latter capacity, they need to be
equipped with parallel read/write circuitry as well as serial.
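A sketch of the 5-bit shift register just described (list representation and names are ours): on each clock pulse every stage loads the bit to its left, the serial input feeds the leftmost flip-flop, and the rightmost bit is shifted out:

```python
reg = [0, 0, 0, 0, 0]

def clock(reg, serial_in):
    serial_out = reg[-1]             # rightmost bit is transferred out
    reg[1:] = reg[:-1]               # shift right by one position
    reg[0] = serial_in               # data enter the leftmost flip-flop
    return serial_out

for bit in (1, 0, 1, 1, 0):          # shift a 5-bit pattern in serially
    clock(reg, bit)
print(reg)                           # -> [0, 1, 1, 0, 1]
```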
Algebraic Simplification
Karnaugh Maps
For purposes of simplification, the Karnaugh map is a convenient way of representing a Boolean function of a small number (up to four) of variables. The map is an array of 2ⁿ squares, representing all possible combinations of values of n binary variables.
The following figure shows the map of four squares for a function of two variables. It is essential
for later purposes to list the combinations in the order 00, 01, 11, 10. Because the squares
corresponding to the combinations are to be used for recording information, the combinations are
customarily written above the squares. The map can be used to represent any Boolean function in
the following way. Each square corresponds to a unique product in the sum-of-products form,
with a 1 value corresponding to the variable and a 0 value corresponding to the NOT of that
variable.
This process can be extended in several ways. First, the concept of adjacency can be extended to
include wrapping around the edge of the map. Thus, the top square of a column is adjacent to the
bottom square, and the leftmost square of a row is adjacent to the rightmost square. These
conditions are illustrated in Figures b and c. Second, we can group not just 2 squares but 2ⁿ adjacent squares (that is, 2, 4, 8, etc.). The next three examples in the above figure show groupings of 4 squares. Note that in this case, two of the variables can be eliminated. The last three examples show groupings of 8 squares, which allow three variables to be eliminated.
1. Among the marked squares (squares with a 1), find those that belong to a unique largest
block of 1, 2, 4, or 8 and circle those blocks.
2. Select additional blocks of marked squares that are as large as possible and as few in number as possible, but include every marked square at least once. The results may not be unique in some cases. For example, if a marked square combines with exactly two other squares, and there is no fourth marked square to complete a larger group, then there is a choice to be made as to which of the two groupings to choose. When you are circling groups, you are allowed to use the same 1 value more than once.
3. Continue to draw loops around single marked squares, or pairs of adjacent marked
squares, or groups of four, eight, and so on in such a way that every marked square
belongs to at least one loop; then use as few of these blocks as possible to include all
marked squares.
For more than four variables, the Karnaugh map method becomes increasingly cumbersome. With five variables, two maps are needed, with one map considered to be on top of the other in three dimensions to achieve adjacency. Six variables require the use of four tables in four dimensions! An alternative approach is a tabular technique, referred to as the Quine–McCluskey method. The method is suitable for programming on a computer to give an automatic tool for producing minimized Boolean expressions.
The method is best explained by means of an example. Consider the following expression:
Let us assume that this expression was derived from a truth table. We would like to produce a
minimal expression suitable for implementation with gates.
The first step is to construct a table in which each row corresponds to one of the product terms of
the expression. The terms are grouped according to the number of complemented variables. That
is, we start with the term with no complements, if it exists, then all terms with one complement,
and so on. The following Table shows the list for the example expression, with horizontal lines
used to indicate the grouping. For clarity, each term is represented by a 1 for each uncomplemented variable and a 0 for each complemented variable. Thus, we group terms according to the number of 1s they contain.
The index column is simply the decimal equivalent and is useful in what follows. The next step
is to find all pairs of terms that differ in only one variable, that is, all pairs of terms that are the
same except that one variable is 0 in one of the terms and 1 in the other. Because of the way in
which we have grouped the terms, we can do this by starting with the first group and comparing
each term of the first group with every term of the second group. Then compare each term of the
second group with all of the terms of the third group, and so on. Whenever a match is found,
place a check next to each term, combine the pair by eliminating the variable that differs in the two terms, and add that to a new list. Thus, for example, the terms ABCD and ABCD′ are combined to produce ABC. This process continues until the entire original table has been examined. The result is a new table with the following entries:
BCD    ACD
ABC    BCD
ABD
The new table is organized into groups, as indicated, in the same fashion as the first table. The
second table is then processed in the same manner as the first. That is, terms that differ in only
one variable are checked and a new term produced for a third table. In this example, the third
table that is produced contains only one term: BD. In general, the process would proceed through
successive tables until a table with no matches was produced. In this case, this has involved three
tables.
Once the process just described is completed, we have eliminated many of the possible terms of
the expression. Those terms that have not been eliminated are used to construct a matrix, as
illustrated in the following Table. Each row of the matrix corresponds to one of the terms that
have not been eliminated (has no check) in any of the tables used so far. Each column
corresponds to one of the terms in the original expression. An X is placed at each intersection of
a row and a column such that the row element is “compatible” with the column element. That is,
the variables present in the row element have the same value as the variables present in the
column element. Next, circle each X that is alone in a column. Then place a square around each
X in any row in which there is a circled X. If every column now has either a squared or a circled
X, then we are done, and those row elements whose Xs have been marked constitute the minimal
expression. Thus, in our example, the final expression is
ABC + ACD + ABC + ACD
In cases in which some columns have neither a circle nor a square, additional processing is
required. Essentially, we keep adding row elements until all columns are covered.
Let us summarize the Quine–McCluskey method to try to justify intuitively why it works. The first phase of the operation is reasonably straightforward. The process eliminates unneeded variables in product terms. Thus, the expression ABC + ABC′ is equivalent to AB, because
ABC + ABC′ = AB(C + C′) = AB
After the elimination of variables, we are left with an expression that is clearly equivalent to the
original expression. However, there may be redundant terms in this expression, just as we found
redundant groupings in Karnaugh maps. The matrix layout assures that each term in the original
expression is covered and does so in a way that minimizes the number of terms in the final
expression.
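The combining phase lends itself directly to programming, as the text notes. The sketch below (our representation: product terms as strings of '0', '1', and '-' for an eliminated variable) repeatedly merges terms that differ in one position and returns the unchecked terms, i.e. the prime implicants:

```python
from itertools import combinations

def combine(a, b):
    """Merge two terms differing in exactly one position, else None."""
    diff = [i for i in range(len(a)) if a[i] != b[i]]
    if len(diff) == 1 and '-' not in (a[diff[0]], b[diff[0]]):
        i = diff[0]
        return a[:i] + '-' + a[i + 1:]
    return None

def prime_implicants(terms):
    terms = set(terms)
    primes = set()
    while terms:
        checked, new_terms = set(), set()
        for a, b in combinations(terms, 2):
            merged = combine(a, b)
            if merged is not None:
                checked |= {a, b}      # place a check next to each term
                new_terms.add(merged)
        primes |= terms - checked      # unchecked terms are prime
        terms = new_terms
    return primes

# ABC + ABC' = AB: minterms 111 and 110 combine to '11-'
print(prime_implicants(['111', '110']))   # -> {'11-'}
```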
Register Transfer Language
Register transfer language is the symbolic notation used to describe the micro-operation transfers among registers. A micro-operation transfers the result of an operation to the same or another register. Information transferred from one register to another is designated in symbolic form by means of the replacement operator. For example,
R2 ← R1
denotes a transfer of the content of register R1 into register R2.
Physical considerations
Gate delays
Propagation delay, or gate delay, is the length of time from when the input to a logic gate becomes stable and valid until the output of that logic gate becomes stable and valid.
Fan-in
The maximum number of inputs that a logic gate can accept.
Fan-out
The maximum number of gates that can be connected to the output of a gate.
Chapter Two
Data Representation
A bit is a binary digit, so a bit is a zero or a one. Bits can be implemented in computer hardware using switches: if the switch is on, the bit is one; if the switch is off, the bit is zero. A bit is limited to representing two values.
Since the alphabet contains more than two letters, a letter cannot be represented by a single bit. A byte is a sequence of bits. Since the mid-1960s, a byte has been 8 bits in length. 01000001 is an example of a byte. Since there are 8 bits in a byte, there are 2⁸ different possible sequences for one byte, ranging from 00000000 to 11111111. This means that a byte can be used to represent any type of value with no more than 2⁸ = 256 possible values. Since the number of things you can enter on a computer keyboard is smaller than 256 (including all keystroke pairs, like shift or control plus another key), a keystroke can be represented by a code within a byte.
A word is the number of bits that are manipulated as a unit by the particular CPU of the
computer. Today most CPUs have a word size of 32 or 64 bits. Data is fetched from memory to
the processor in word size chunks and manipulated by the ALU in word size chunks. All other
things being equal (and they never are), a larger word size implies faster and more flexible processing.
Data in computers is represented in binary form. The represented data can be a number, text, a movie, a color (picture), sound, or anything else; it is up to the application software that presents the data to portray it accordingly. We enter data into a computer using letters, digits, and special symbols, but inside the system unit there are no colors, letters, digits, or other characters; there are only bits.
If the numbers we want to represent are only positive (unsigned) integers, the solution is straightforward: simply represent the unsigned integer with its binary value. For example, 34 is represented as 00100010 in 8 bits. In this section, the discussion is on the representation of signed integers. Signed integers can be represented in several alternative ways. These alternatives are used for various purposes based on their convenience for the applications.
Computer storage has a limited capacity to hold data. The number of bits available for data representation determines the range of integers we can represent. With 4 bits, it is possible to represent a total of 16 integers. If the number of available bits increases to 5, we can represent 32 integers. In fact, with every bit added, the number of possible integers we can represent is doubled. In general, the number of integers we can represent with n bits is 2ⁿ. Signed integers include positive integers, negative integers, and zero, and the 2ⁿ bit patterns are partitioned among them. For example, with 8 bits, it is possible to represent 256 different integers. Typically, 128 of them are used for zero and the positive integers, while the remaining 128 are used for the negative integers.
The number systems (bases) we will discuss are: decimal, binary, octal, and hexadecimal.
The decimal number system, also called the base 10 number system, is the number system we use in our day-to-day life. The preference of humans for this number system is attributed to the fact that humans have ten fingers and are believed to have started counting using them.
Although the number system is easily understood by humans, it cannot be used to represent data
in computers because there are only two (binary) states in a computer system. On the other hand,
the binary number system, also known as base 2 number system, has two digits 0 and 1. This
makes it useful for data representation in computers. The two digits of the binary number system
correspond to the two distinct states of the digital electronics. A binary digit is referred to as a
bit.
We can associate the two digits of the binary number system with two states of electrical systems, magnetic systems, and switches. The following table shows the conventional association. It is also possible to exchange the association, but that becomes confusing since it is the opposite of the convention (what is agreed on).

             0             1
Electronic   No current    There is current
Magnetic     Demagnetized  Magnetized
Switch       Off           On
Data representation using the binary number system results in a large string of 0s and 1s. This
makes the represented data large and difficult to read. Writing such a binary string becomes
tedious as well. For the sake of writing the binary strings in a short hand form and make them
readable, the octal and the hexadecimal number systems are used.
Octal number system, also called base 8 number system, has 8 different symbols: 0, 1, 2, 3, 4, 5,
6, and 7. The octal number system is used to write binary numbers in short form. An octal
number has about one-third of the digits in its binary equivalent.
The hexadecimal number system, also called the base 16 number system, has 16 different symbols: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, and F. The hexadecimal number system is usually referred to as hex for short. It is used to write binary numbers in short form. A hex number has about one-fourth of the digits in its binary equivalent. Memory addresses and MAC addresses are usually written in hex.
To convert a decimal number into another base m, repeated division is used:
Step 1: Divide the given decimal number by m (the desired base). The result will have a quotient and a remainder.
Step 2: Divide the quotient by m. Again you get a quotient and a remainder.
Step 3: Repeat step 2 until the quotient becomes 0. Note that we are conducting integer division; in integer division n/m, the quotient is 0 whenever n < m.
Step 4: Collect and arrange the remainders in such a way that the first remainder is the least significant digit and the last remainder is the most significant digit (i.e., Rₙ Rₙ₋₁ … R₂ R₁).
Example: Convert the decimal number 47 into binary, octal, and hexadecimal.
a. Conversion to binary
In order to convert the given decimal numbers into binary (base 2), they are divided by 2.
          Quotient  Remainder
47 ÷ 2       23        1
23 ÷ 2       11        1
11 ÷ 2        5        1
 5 ÷ 2        2        1
 2 ÷ 2        1        0
 1 ÷ 2        0        1
Since the quotient becomes 0 at the last division, the division has to stop and we should collect the remainders starting from the last one. Hence the result is 101111₂. Note that, starting from the second division, at each consecutive division the quotient of the previous division is used as the dividend.
b. Conversion to octal
Here the numbers are divided by 8 because the required base is octal (base 8).
          Quotient  Remainder
47 ÷ 8        5        7
 5 ÷ 8        0        5
Therefore, 47 = 57₈
c. Conversion to hexadecimal
Since the conversion now is into hexadecimal (base 16) the given decimal numbers are divided
by 16.
          Quotient  Remainder
47 ÷ 16       2       15
 2 ÷ 16       0        2
Remember that the remainders are all in decimal, and when you write the result in the required base, the remainders have to be converted into that base. The hexadecimal equivalent of the decimal 15 is F and that of 2 is 2. For conversion of decimal numbers into binary or octal (or vice versa), there is no problem of looking for the equivalent of each remainder; you need such a conversion only when a remainder is a double-digit number. Therefore, 47 = 2F₁₆
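The repeated-division procedure used in these three conversions can be written compactly. A sketch (routine name is ours; digits above 9 take the letters A-F so the same code covers bases 2 through 16):

```python
DIGITS = "0123456789ABCDEF"

def to_base(n, m):
    """Convert a non-negative decimal integer n to base m (2 <= m <= 16)."""
    if n == 0:
        return "0"
    remainders = []
    while n > 0:                     # stop when the quotient becomes 0
        n, r = divmod(n, m)          # integer division: quotient, remainder
        remainders.append(DIGITS[r])
    return "".join(reversed(remainders))   # last remainder is most significant

print(to_base(47, 2))    # -> 101111
print(to_base(47, 8))    # -> 57
print(to_base(47, 16))   # -> 2F
```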
Conversion from another base into decimal is done by multiplying each digit by its positional value and adding the products. For example:
110001₂ = (1 × 2⁵) + (1 × 2⁴) + (0 × 2³) + (0 × 2²) + (0 × 2¹) + (1 × 2⁰)
        = (1 × 32) + (1 × 16) + (0 × 8) + (0 × 4) + (0 × 2) + (1 × 1)
        = 32 + 16 + 0 + 0 + 0 + 1
        = 49
Therefore, 110001₂ = 49
It is evident that the products involving 0 digits can be skipped, since they are 0 and contribute nothing to the final result. However, you should remember to skip the positional value as well.
22₈ = (2 × 8¹) + (2 × 8⁰)
    = (2 × 8) + (2 × 1)
    = 16 + 2
    = 18
Therefore, 22₈ = 18
D1₁₆ = (13 × 16¹) + (1 × 16⁰); note that the calculations are in decimal, thus the hex digit D must first be converted into its decimal equivalent (13).
     = (13 × 16) + (1 × 1)
     = 208 + 1
     = 209
Therefore, D1₁₆ = 209
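The positional expansion used in the last three examples is equally easy to code. A sketch (our helper; letter digits A-F map to 10-15):

```python
def to_decimal(digits, base):
    value = 0
    for position, d in enumerate(reversed(digits)):
        value += "0123456789ABCDEF".index(d) * base ** position
    return value

print(to_decimal("110001", 2))   # -> 49
print(to_decimal("22", 8))       # -> 18
print(to_decimal("D1", 16))      # -> 209
```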
It is possible to use decimal number system as an intermediate base to convert from any base to
any other base. However, for conversion from binary to octal or vice versa, there is a very simple
method.
Step 1: Group the binary digits (bits) starting from the rightmost digit. Each group should contain
3 bits. If the remaining bits at the leftmost position are fewer than 3, add 0s at the front.
Step 2: For each 3-bit binary string, find the corresponding octal number. Table 3-1 shows the
conversion equivalents.
A. 110011
110  011
 6    3
The bits are grouped in threes, with the equivalent octal digit given below each three-bit group. Thus, 110011₂ = 63₈
B. 1101111
001  101  111
 1    5    7
Since we are left with a single bit at the leftmost position, two 0s are added at the front to create a three-bit group. The result shows that 1101111₂ = 157₈
Conversion from octal to binary is the reverse process:
Step 1: For each octal digit, find the equivalent three-digit binary number.
Step 2: If there are leading 0s in the binary equivalent of the leftmost octal digit, remove them.
Example: Find the binary equivalent for the octal numbers 73 and 160.
A. 73
7 3
111 011
Since there are no leading 0s at the leftmost position (the bits for octal 7), there is no 0 to remove. Therefore, 73₈ = 111011₂
B. 160
1 6 0
001 110 000
The binary equivalent of the leftmost octal digit 1 has two leading 0s. To get the final result, remove them and concatenate the rest. Thus, 160₈ = 1110000₂
One possible way to convert a binary number to hexadecimal is first to convert the binary number to decimal and then from decimal to hex. Nevertheless, the simpler way to convert binary numbers to hex is by grouping, as used in conversion to octal. Here a single group has 4 bits.
Step 1: Starting from the rightmost bit, group the bits in 4. If the remaining bits at the leftmost
position are fewer than 4, add 0s at the front.
Step 2: For each 4-bit group, find the corresponding hexadecimal number. You can use Table 3-1
to find conversion equivalents.
A. 1110110001
0011  1011  0001
  3     B     1
Thus, 1110110001₂ = 3B1₁₆
B. 10011110
1001  1110
  9     E
Thus, 10011110₂ = 9E₁₆
Conversion from hexadecimal to binary is likewise the reverse process:
Step 1: For each hexadecimal digit, find the equivalent four-digit binary number.
Step 2: If there are leading 0s in the binary equivalent of the leftmost hexadecimal digit, remove them.
Example: Find the binary equivalents for the hexadecimal numbers 1C and 823.
A. 1C
1 C
0001 1100
After removing the leading 0s of the binary equivalent of the leftmost hexadecimal digit 1, the result becomes 11100. Thus, 1C₁₆ = 11100₂
B. 823
8 2 3
1000 0010 0011
Since there are no leading 0s in the binary equivalent of the hexadecimal digit 8, we simply concatenate the binary digits to get the final result. Hence, 823₁₆ = 100000100011₂
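Both grouping conversions follow the same recipe, sketched below (helper name is ours): pad the binary string on the left to a multiple of the group size, 3 for octal or 4 for hexadecimal, then replace each group by a single digit:

```python
def group_convert(bits, group):
    pad = (-len(bits)) % group               # 0s added at the front
    bits = "0" * pad + bits
    digits = ""
    for i in range(0, len(bits), group):
        digits += "0123456789ABCDEF"[int(bits[i:i + group], 2)]
    return digits

print(group_convert("1101111", 3))      # -> 157 (octal)
print(group_convert("1110110001", 4))   # -> 3B1 (hexadecimal)
```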
The decimal number system can be used as an intermediate conversion base. As shown in the sections above, however, the binary number system is more convenient for conversion to or from octal and hexadecimal. To convert an octal number to a hexadecimal number, or a hexadecimal number to octal, the binary number system is used as an intermediate base.
Example 2: Find the octal equivalent of the hexadecimal number 3D5.
First convert 3D5₁₆ to binary:
3     D     5
0011  1101  0101
Then regroup the bits in threes from the right and convert each group into an octal digit:
001  111  010  101
 1    7    2    5
Therefore, 3D5₁₆ = 1725₈
In common real-number notation, the location of the radix point is indicated by placing a dot (or, in some countries, a comma). In computers, a representation of numbers similar to scientific notation is available and is known as floating point. Unlike scientific notation, there is no radix point character in floating point numbers. For the representation of integers, the only required information is the magnitude and the sign of the number. To represent floating point numbers, as in scientific notation, we need the following information: the sign of the number, the magnitude of the significand (mantissa), the sign of the exponent, the magnitude of the exponent, and the base.
Of these required pieces of information, only the first four are necessary to represent floating point numbers in computers. The base of the number is not necessary for representation because it is always 2, as the number system of computers is binary.
Among the variety of floating number representations in computers, the IEEE 754 standard is the
most widely used. It is commonly referred as the IEEE floating point. Two formats of IEEE
floating point are:
Single precision: a 32-bit representation, with a 23-bit significand, one sign bit, and 8 bits of exponent. The accuracy of a binary number with a 23-bit significand is equivalent to about 7 decimal digits.
Double precision: a 64-bit representation, of which 52 bits are for the significand, one bit for the sign of the number, and 11 bits for the exponent. Double precision numbers are accurate to about 16 decimal digits.
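The three single-precision fields can be inspected with nothing more than the standard library; the sketch below (function name is ours) packs a float into its raw 32 bits and slices out the sign, exponent, and significand:

```python
import struct

def float_fields(x):
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31                    # 1 sign bit
    exponent = (bits >> 23) & 0xFF       # 8 exponent bits (biased by 127)
    significand = bits & 0x7FFFFF        # 23 significand bits
    return sign, exponent, significand

print(float_fields(-6.5))
# -> (1, 129, 5242880): sign 1, true exponent 129 - 127 = 2,
#    and -6.5 = -1.101 (binary) x 2^2 with the leading 1 implicit
```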
In digital work, two types of complements of a binary number are used for complemental
subtraction:
a) One’s complement
The one’s complement of binary number is obtained by changing its each 0 into a 1 and each 1
into a 0. It is also called radix-minus-one complement. For Example, one’s complement of 100 2
is 0112 and of 11102 is 00012 .
b) Two’s complement
The two’s complement of binary number is obtained by adding 1 to its 1’s complement.
It is also known as the true complement. Suppose we are asked to find the 2's complement of 1011₂. Its 1's complement is 0100₂. Next add 1 to get 0101₂. Hence the 2's complement of 1011₂ is 0101₂. The complement method of subtraction reduces subtraction to an addition process.
In this method, instead of subtracting a number, we add its 1's complement to the minuend. The last carry (whether 0 or 1) is then handled as described below. The rules for subtraction by 1's complement are as follows:
1. Compute the one's complement of the subtrahend by changing all its 1s to 0s and all its 0s to 1s.
2. Add this complement to the minuend.
3. Perform the end-around carry of the last 1 or 0.
4. If there is no end-around carry (i.e., 0 carry), then the answer must be re-complemented and a negative sign attached to it.
5. If the end-around carry is 1, no re-complementing is necessary.
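These five rules translate directly into code. A sketch for a fixed width of 4 bits (function and variable names are ours; inputs are small non-negative integers):

```python
def ones_complement_sub(minuend, subtrahend, width=4):
    mask = (1 << width) - 1
    total = minuend + (~subtrahend & mask)   # rules 1 and 2: add the 1's complement
    if total > mask:                         # rule 5: end-around carry is 1,
        return (total & mask) + 1            # add it back; the result is positive
    return -(~total & mask)                  # rule 4: re-complement, attach minus sign

print(ones_complement_sub(0b1101, 0b1001))   # 13 - 9 -> 4
print(ones_complement_sub(0b0011, 0b0101))   # 3 - 5  -> -2
```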
Chapter Three
The von Neumann model of computer architecture was first described in 1946 in the famous paper by Burks, Goldstine, and von Neumann (1946). A number of very early computers or computer-like devices had been built, starting with the work of Charles Babbage, but the simple structure of a stored-program computer was first described in this landmark paper. The authors pointed out that instructions and data consist of bits with no distinguishing characteristics. Thus a common memory can be used to store both instructions and data. The differentiation between the two is made by the accessing mechanism and context: the program counter accesses instructions, while the effective address register accesses data. If by some chance, such as a programming error, instructions and data are exchanged in memory, the behavior of the program is indeterminate. Before von Neumann posited the single address space architecture, a number of computers were built that had disjoint instruction and data memories.
The von Neumann architecture consists of three major subsystems: instruction processing,
arithmetic unit, and memory, as shown in the following figure. A key feature of this architecture
is that instructions and data share the same address space. Thus there is one source of addresses to the memory: the instruction processing unit. The output of the memory is routed to either the Instruction Processing Unit or the Arithmetic Unit.
Control unit
A control unit (CU) handles all processor control signals. It directs all input and output flow, fetches code for instructions from microprograms, and directs other units and modules by providing control and timing signals. A CU component is considered the brain of the processor because it issues orders to just about everything and ensures correct instruction execution.
A CU takes its input from the instruction and status registers. Its rules of operation, or microprogram, are encoded in a programmable logic array (PLA), random logic, or read-only memory (ROM).
Instruction Cycle
A program residing in the memory unit of the computer consists of a sequence of instructions. The program is executed by going through a cycle for each instruction: the so-called fetch-decode-execute cycle. Each instruction cycle in turn is subdivided into a sequence of phases. In the basic computer, each instruction cycle consists of the following phases:
1. Fetch an instruction from memory.
2. Decode the instruction.
3. Read the effective address from memory if the instruction has an indirect address.
4. Execute the instruction.
Upon the completion of step 4, the control goes back to step 1 to fetch, decode, and execute the next instruction. This process continues indefinitely unless a HALT instruction is encountered.
Initially, the program counter PC is loaded with the address of the first instruction in the program and the sequence counter SC is cleared to 0, providing a decoded timing signal T0. After each clock pulse, SC is incremented by one, so that the timing signals go through a sequence T0, T1, T2, and so on. Since the address lines of the memory unit are hardwired to AR, the address of the instruction to be fetched must first be placed in AR. Thus the first register transfer in an instruction cycle should be transferring the content of PC (the address of the instruction to be brought in for execution) to AR:
T0: AR ← PC
For this transfer the bus selects PC and LD(AR) = 1. Next, the current instruction is brought from memory into IR and PC is incremented.
Therefore, micro-operations for the fetch and decode phases can be specified by the following
register transfer statements.
T0: AR ← PC
T1: IR ← M[AR], PC ← PC + 1
T2: D0, …, D7 ← Decode IR(12-14), AR ← IR(0-11), I ← IR(15)
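A toy sketch of these fetch and decode transfers (the 16-bit word layout follows the basic computer: bit 15 is I, bits 14-12 the operation code, bits 11-0 the address; the memory contents and variable names are ours):

```python
memory = {0: 0b1_010_000000001100}   # one instruction stored at address 0
PC = 0

# T0: AR <- PC
AR = PC
# T1: IR <- M[AR], PC <- PC + 1
IR = memory[AR]
PC = PC + 1
# T2: decode IR(12-14), AR <- IR(0-11), I <- IR(15)
opcode = (IR >> 12) & 0b111
AR = IR & 0xFFF
I = (IR >> 15) & 1

print(opcode, AR, I)   # -> 2 12 1
```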
Instruction set
The instruction set, also called the instruction set architecture (ISA), is the part of a computer that pertains to programming; it is basically the machine language. The instruction set provides commands to the processor, to tell it what it needs to do. The instruction set consists of addressing modes, instructions, native data types, registers, memory architecture, interrupt and exception handling, and external I/O.
Instruction types
a. Data Manipulation Instructions:
i. Arithmetic: for arithmetic operations such as Add, Subtract, Multiply, Divide, etc.
ii. Logical: for logical operations such as AND, OR, NOT, etc.
iii. Bit Manipulation: for binary operations such as Shift, Rotate, Set, Clear, etc.
iv. Compare & Test: for operations such as Compare, Test, etc.
b. Data Transfer Instructions:
To perform frequently required data transfers from one location to another, such as:
i. Transfer between registers: Move, etc.
ii. Transfer between memory locations: Move, etc.
iii. Transfer between registers and memory locations: Load, Store, etc.
iv. Transfer between registers/memory locations and input/output devices: Input, Output, etc.
c. Program Control Instructions:
To enable implementation of non-sequential execution of instructions, such as program loops, if-then-else conditional execution, subroutine/function execution, etc. These are of two categories:
i. Branch or Jump Instructions:
- Unconditional Branch: Branch, Jump, etc.
- Conditional Branch: Branch if Equal, Branch if Greater Than, etc.
ii. Subroutine Instructions: Call Subroutine, Return from Subroutine, etc.
Instruction Formats
The physical and logical structure of computers is normally described in reference manuals
provided with the system. Such manuals explain the internal construction of the CPU, including
the processor registers available and their logical capabilities. They list all hardware
implemented instructions, specify their binary code format and provide a precise definition of
each instruction. A computer will usually have a variety of instruction code formats.
It is the function of the control unit within the CPU to interpret each instruction code and provide
the necessary control functions needed to process the instruction. The format of an instruction is
usually depicted in a rectangular box symbolizing the bits of the instruction as they appear in
memory word or in a control register. The bits of the instruction are divided into groups called fields. The most common fields found in instruction formats are:
1. An operation code field that specifies the operation to be performed
2. An address field that designates a memory address or a processor register
3. A mode field that specifies the way the operand or the effective address is determined
Other special fields are sometimes employed under certain circumstances, for example a field
that gives the number of shifts in a shift type instruction. The operation code field of an
instruction is a group of bits that define various processor operations, such as add, subtract,
complement, and shift. The most common operations available in computer instructions are
enumerated and discussed in Section 5.6. The bits that define the mode field of an instruction
code specify the variety of alternatives for choosing the operand from the given address. The
various addressing modes that have been formulated for digital computers are presented in
Section 5.5. In this section we are concerned with the address field of an instruction format and
consider the effect of including multiple address fields in an instruction.
Operations specified by computer instructions are executed on some data stored in memory or processor registers. Operands residing in memory are specified by their memory address. Operands residing in processor registers are specified with a register address. A register address is a binary number of k bits that defines one of 2ᵏ registers in the CPU. Thus, a CPU with 16 processor registers R0 through R15 will have a register address field of 4 bits. The binary number 0101, for example, will designate register R5.
Computers may have instructions of several lengths containing a varying number of addresses. The number of address fields in the instruction format of a computer depends on the internal organization of its registers. Most computers fall into one of three types of CPU organizations:
1. Single accumulator organization
2. General register organization
3. Stack organization
In an accumulator-type organization, all operations are performed with an implied accumulator
register. The instruction format in this type of computer uses one address field. For example, the
instruction that specifies an arithmetic addition is defined by an assembly language instruction as
ADD X, where X is the address of the operand. The ADD instruction in this case results in the operation AC ← AC + M[X], where AC is the accumulator register and M[X] symbolizes the memory word located at address X.
The instruction format in a computer with a general register organization needs three register address fields. Thus, the instruction for an arithmetic addition may be written in an assembly language as ADD R1, R2, R3 to denote the operation R1 ← R2 + R3. The number of address fields in the instruction can be reduced from three to two if the destination register is the same as one of the source registers. Thus the instruction ADD R1, R2 would denote the operation R1 ← R1 + R2. Only the register addresses of R1 and R2 need be specified in this instruction.
Computers with multiple processor registers use the move instruction with the mnemonic MOV to symbolize a transfer instruction. Thus the instruction MOV R1, R2 denotes the transfer R1 ← R2 (or R2 ← R1, depending on the particular computer). Thus, transfer-type instructions need two address fields to specify the source and the destination.
General register–type computers employ two or three address fields in their instruction format.
Each address field may specify a processor register or a memory word. An instruction
symbolized by ADD R1, X would specify the operation R1 ← R1 + M[X]. It has two address fields, one for register R1 and the other for the memory address X.
Computers with stack organization would have PUSH and POP instructions which require an address field. Thus, the instruction PUSH X will push the word at address X onto the top of the stack. The stack pointer is updated automatically. Operation-type instructions do not need an address field in stack-organized computers. This is because the operation is performed on the two items that are on top of the stack. The instruction ADD in a stack-organized computer consists of an operation code only, with no address field. This operation has the effect of popping the two top numbers from the stack, adding the numbers, and pushing the sum onto the stack. There is no need to specify operands with an address field since all operands are implied to be on the stack.
Most computers fall in to one of the three types of organizations that have just been described.
Some computers combine features from more than one organizational structure. For example, the
Intel 8080 microprocessor has seven CPU registers, one of which is an accumulator register. As
a consequence, the processor has some of the characteristics of a general register type and some
of the characteristics of an accumulator type. All arithmetic and logic instructions, as well as the
load and store instructions, use the accumulator register, so these instructions have only one
address field. On the other hand, instructions that transfer data among the seven processor registers have a format that contains two register address fields.
To illustrate the influence of the number of addresses on computer programs, we will evaluate the arithmetic statement X = (A + B) ∗ (C + D) using zero-, one-, two-, or three-address instructions. We will use the symbols ADD, SUB, MUL, and DIV for the four arithmetic operations, and LOAD and STORE for transfers between memory and the AC register. We will assume that the operands are in memory addresses A, B, C, and D, and that the result must be stored in memory at address X.
Three-address Instruction
Computers with three-address instruction formats can use each address field to specify either a
processor register or a memory operand. The program in assembly language that evaluates X = (A + B) ∗ (C + D) is shown below, together with comments that explain the register transfer operation of each instruction.
ADD R1, A, B      R1 ← M[A] + M[B]
ADD R2, C, D      R2 ← M[C] + M[D]
MUL X, R1, R2     M[X] ← R1 ∗ R2
It is assumed that the computer has two processor registers, R1 and R2. The symbol M[A]
denotes the operand at memory address symbolized by A. The advantage of three-address format
is that it results in short programs when evaluating arithmetic expressions. The disadvantage is
that the binary-coded instructions require too many bits to specify three addresses.
One-Address Instruction
One-address instructions use an implied accumulator (AC) register for all data manipulations.
For multiplication and division there is a need for a second register. However, here we will
neglect the second register and assume that the AC contains the result of all operations. The
program to evaluate X = (A + B) ∗ (C + D) is:
LOAD A       AC ← M[A]
ADD B        AC ← AC + M[B]
STORE T      M[T] ← AC
LOAD C       AC ← M[C]
ADD D        AC ← AC + M[D]
MUL T        AC ← AC ∗ M[T]
STORE X      M[X] ← AC
All operations are done between the AC register and a memory operand. T is the address of a
temporary memory location required for storing the intermediate result.
Zero-Address Instruction
A stack-organized computer does not use an address field for the instructions ADD and MUL. The PUSH and POP instructions, however, need an address field to specify the operand that communicates with the stack. The following program shows how X = (A + B) ∗ (C + D) would be written for a stack-organized computer (TOS stands for top-of-stack):
PUSH A       TOS ← A
PUSH B       TOS ← B
ADD          TOS ← (A + B)
PUSH C       TOS ← C
PUSH D       TOS ← D
ADD          TOS ← (C + D)
MUL          TOS ← (C + D) ∗ (A + B)
POP X        M[X] ← TOS
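The stack discipline is easy to see in executable form. A sketch of a tiny stack machine running the program above (memory values and helper names are ours):

```python
memory = {"A": 2, "B": 3, "C": 4, "D": 5}
stack = []

def push(addr): stack.append(memory[addr])   # PUSH X: TOS <- M[X]
def pop(addr):  memory[addr] = stack.pop()   # POP X:  M[X] <- TOS
def add():      stack.append(stack.pop() + stack.pop())
def mul():      stack.append(stack.pop() * stack.pop())

push("A"); push("B"); add()      # TOS <- (A + B)
push("C"); push("D"); add()      # TOS <- (C + D)
mul()                            # TOS <- (C + D) * (A + B)
pop("X")                         # M[X] <- TOS

print(memory["X"])               # -> (2 + 3) * (4 + 5) = 45
```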
RISC Instructions
RISC stands for reduced instruction set computer. The instruction set of a typical RISC processor
is restricted to the use of load and store instructions when communicating between memory and
CPU. All other instructions are executed within the registers of the CPU without referring to the
memory. A program for a RISC-type CPU consists of LOAD and STORE instructions that have one memory and one register address, and computational-type instructions that have three addresses with all three specifying processor registers. The following is a program to evaluate X = (A + B) ∗ (C + D):
LOAD R1, A         R1 ← M[A]
LOAD R2, B         R2 ← M[B]
LOAD R3, C         R3 ← M[C]
LOAD R4, D         R4 ← M[D]
ADD R1, R1, R2     R1 ← R1 + R2
ADD R3, R3, R4     R3 ← R3 + R4
MUL R1, R1, R3     R1 ← R1 ∗ R3
STORE X, R1        M[X] ← R1
The load instructions transfer the operands from memory to CPU registers. The add and multiply operations are executed with the data in the registers without accessing memory. The result of the computation is stored in memory with the store instruction.
Addressing Mode
The operation field of an instruction specifies the operation to be performed. This operation must be executed on some data stored in computer registers or memory words. The way the operands are chosen during program execution is dependent on the addressing mode of the instruction. The addressing mode specifies a rule for interpreting or modifying the address field of the instruction before the operand is actually referenced. Computers use addressing mode techniques to accommodate one or both of the following provisions:
1. To give programming versatility to the user by providing such facilities as pointers to memory, counters for loop control, indexing of data, and program relocation.
2. To reduce the number of bits in the address field of the instruction.
The availability of addressing modes gives the experienced assembly language programmer flexibility for writing programs that are more efficient with respect to the number of instructions and execution time. To understand the various addressing modes to be presented in this section, it is important that we understand the basic operation cycle of the computer. The control unit of a computer is designed to go through an instruction cycle that is divided into three major phases:
1. Fetch the instruction from memory.
2. Decode the instruction.
3. Execute the instruction.
There is one register in the computer called the program counter (PC) that keeps track of the instructions in the program stored in memory. The PC holds the address of the instruction to be executed next and is incremented each time an instruction is fetched from memory. The decoding done in step 2 determines the operation to be performed, the addressing mode of the instruction, and the location of the operands. The computer then executes the instruction and returns to step 1 to fetch the next instruction in sequence.
In some computers the addressing mode of the instruction is specified with a distinct binary
code, just like the operation code is specified. Other computers use a single binary code that
designates the operation and the mode of the instruction. Instructions may be defined with a variety of addressing modes, and sometimes two or more addressing modes are combined in one instruction.
An example of an instruction format with a distinct addressing mode field is shown in the
following Figure. The operation code specifies the operation to be performed. The mode field is
used to locate the operands needed for the operation. There may or may not be an address field in
the instruction. If there is an address field, it may designate a memory address or a processor
register. Moreover, the instruction may have more than one address field, and each address field
may be associated with its own particular addressing mode.
Although most addressing modes modify the address field of the instruction, there are two modes that need no address field at all: the implied and immediate modes.
Implied Mode: In this mode the operands are specified implicitly in the definition of the instruction. For example, the instruction “complement accumulator” is an implied-mode instruction because the operand in the accumulator register is implied in the definition of the instruction. In fact, all register-reference instructions that use an accumulator are implied-mode instructions. Zero-address instructions in stack-organized computers are implied-mode instructions, since the operands are implied to be on top of the stack.
Immediate Mode: In this mode the operand is specified in the instruction itself. In other words,
an immediate mode instruction has an operand field rather than an address field. The operand
field contains the actual operand to be used in conjunction with the operation specified in the
instruction. Immediate mode instructions are useful for initializing registers to a constant value.
When the address field specifies a processor register, the instruction is said to be in the register
mode.
Register Mode: In this mode the operands are in registers that reside within the CPU. The
particular register is selected from a register field in the instruction. A k-bit field can specify any one of 2ᵏ registers.
Register Indirect Mode: In this mode the instruction specifies a register in the CPU whose
contents give the address of the operand in memory. In other words, the selected register
contains the address of the operand rather than the operand itself. Before using a register indirect
mode instruction, the programmer must ensure that the memory address of the operand is placed
in the processor register with a previous instruction. A reference to the register is then equivalent
to specifying a memory address. The advantage of the register indirect mode instruction is that
the address field of the instruction uses fewer bits to select a register than would have been
required to specify a memory address directly.
Autoincrement and Autodecrement Mode: This is similar to the register indirect mode except that the register is incremented or decremented after (or before) its value is used to access memory. When the address stored in the register refers to a table of data in memory, it is necessary to increment or decrement the register after every access to the table. The address field of an instruction is used by the control unit in the CPU to obtain the operand from memory.
Sometimes the value given in the address field is the address of the operand, but sometimes it is just an address from which the address of the operand is calculated. To differentiate among the various addressing modes it is necessary to distinguish between the address part of the instruction and the effective address used by the control unit when executing the instruction. The effective address is defined to be the memory address obtained from the computation dictated by the given addressing mode. The effective address is the address of the operand in a computational-type instruction, and it is the address to which control branches in response to a branch-type instruction.
Direct Addressing Mode: In this mode the effective address is equal to the address part of the
instruction. The operand resides in memory and its address is given directly by the address field
of the instruction. In a branch type of instruction the address field specifies the actual branch
address.
Indirect Addressing Mode: In this mode the address field of the instruction gives the address
where the effective address is stored in memory. Control fetches the instruction from memory
and uses its address part to access memory again to read the effective address.
Relative Addressing Mode: In this mode the content of the program counter is added to the address part of the instruction in order to obtain the effective address. The address part of the instruction
is usually a signed number (in 2’s complement representation) which can be either negative or
positive. When this number is added to the content of the program counter, the result produces an
effective address whose position in memory is relative to the address of the next instruction.
Indexed Addressing Mode: In this mode the content of an index register is added to the address
part of the instruction to obtain the effective address. The address field of the instruction defines
the beginning address of a data array in memory. Each operand in the array is stored in memory
relative to the beginning address. The distance (offset) between the beginning address and the
address of the operand is the index value stored in the index register. In computers with a
dedicated index register, the index register is involved implicitly in index mode instructions.
Base Register Addressing Mode: In this mode the content of a base register is added to the address part of the instruction to obtain the effective address. It is similar to indexed addressing; the difference is that the base register contains the base address while the address field of the instruction contains the displacement (offset) relative to this base address. In indexed mode, by contrast, the index register contains the offset relative to the address part of the instruction.
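The effective-address rules above can be summarized in a short C simulation. All register contents, addresses, and memory values below are invented for the example:

    #include <stdio.h>

    #define MEM_SIZE 32

    int main(void) {
        int mem[MEM_SIZE] = {0};
        mem[5]  = 12;                /* at address 5, memory holds the value 12 */
        mem[12] = 99;                /* the operand itself lives at address 12  */

        int addr  = 5;               /* address field of the instruction        */
        int pc    = 8;               /* program counter (next instruction)      */
        int index = 7;               /* content of the index register           */
        int base  = 4;               /* content of the base register            */

        int ea_direct   = addr;          /* direct:   EA = address field        */
        int ea_indirect = mem[addr];     /* indirect: EA is stored at mem[addr] */
        int ea_relative = pc + addr;     /* relative: EA = PC + (signed) offset */
        int ea_indexed  = addr + index;  /* indexed:  EA = address field + index */
        int ea_based    = base + addr;   /* base reg: EA = base + displacement  */

        printf("direct=%d indirect=%d relative=%d indexed=%d based=%d\n",
               ea_direct, ea_indirect, ea_relative, ea_indexed, ea_based);
        printf("operand fetched via indirect mode = %d\n", mem[ea_indirect]);
        return 0;
    }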
The Stack stores the return address whenever a subroutine is called during the execution of a
program. The Jump or Branch to subroutine instruction pushes the address of the next instruction
following it onto the Stack. The RTS instruction removes the address from the Stack so the
program returns to the next instruction following the subroutine call.
In the PIC 16F84 the CALL instruction is used when calling a subroutine, saving the current program counter so that the Return operation knows where to restore the program counter. This is accomplished automatically (as part of the CALL instruction) by pushing the return address onto the Stack; when a Return instruction is executed, this address is popped off the Stack and put into the program counter.
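The effect of CALL and RETURN can be mimicked in C with an explicit array serving as the stack. This is a sketch of the mechanism only; the addresses are arbitrary integers, not real PIC 16F84 program addresses:

    #include <stdio.h>

    int stack[8];            /* return-address stack */
    int sp = 0;              /* stack pointer        */

    void call(int *pc, int target) {
        stack[sp++] = *pc + 1;   /* push address of the next instruction */
        *pc = target;            /* jump to the subroutine               */
    }

    void ret(int *pc) {
        *pc = stack[--sp];       /* pop the return address into the PC   */
    }

    int main(void) {
        int pc = 100;            /* pretend the CALL sits at address 100 */
        call(&pc, 200);          /* CALL subroutine at address 200       */
        printf("in subroutine, pc=%d\n", pc);    /* 200 */
        ret(&pc);                /* RETURN                               */
        printf("back at caller, pc=%d\n", pc);   /* 101 */
        return 0;
    }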
Subroutine calls in machine language are characterized by how parameters are passed and how control returns to the caller. The PALM processor has no instructions for subroutine mechanisms, so the programmer has to be creative. Subroutine calls and parameter passing are independent of each other, so they are described in two separate paragraphs.
A subroutine call consists of storing the current program counter and jumping to another
location. At the end of the subroutine a jump back to the origin is performed by jumping to the
location indicated by the stored program counter.
A jump instruction is any instruction that modifies register R0. Normally ADD/SUB R0,#xx is used for relative jumps, LWI R0,#xxxx for absolute jumps, and MOVE R0,xx or MOVE R0,(Rx), respectively, for indirect jumps.
The return address is always passed in a register, most commonly in R2. Of course, another address can be calculated, or a different location in the program can be used, if it is desired to return somewhere else. But normally a return to the instruction following the subroutine call is desired, so the INC2 instruction is used.
There are specially created mnemonics for the AS assembler for subroutines and branches:
The disadvantage of passing the return address in a register becomes obvious when nested or recursive calls are needed, because only a limited set of registers is available. If all registers are used by the program, only one single call can be made. Therefore it is advisable to create a stack, e.g. with R15 as stack pointer, that stores all return addresses (and other data if desired).
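The limitation, and the stack-based remedy, can be sketched in C: with a single link register a nested call overwrites the caller's return address, while a small software stack (with a variable standing in for R15 as the stack pointer) preserves both. The names merely echo the register convention described above; this is an illustration, not PALM assembly:

    #include <stdio.h>

    int r2;                       /* single link "register": holds one return address */
    int ra_stack[8], r15 = 0;     /* software stack, r15 acting as the stack pointer  */

    void push_ra(int ra) { ra_stack[r15++] = ra; }
    int  pop_ra(void)    { return ra_stack[--r15]; }

    int main(void) {
        /* Register convention: a nested call clobbers the link register. */
        r2 = 101;                 /* outer call stores its return address   */
        r2 = 201;                 /* nested call overwrites it: 101 is lost */
        printf("link register holds %d; the outer address is gone\n", r2);

        /* Stack convention: nested return addresses coexist. */
        push_ra(101);             /* outer call  */
        push_ra(201);             /* nested call */
        int inner = pop_ra();
        int outer = pop_ra();
        printf("inner returns to %d, outer to %d\n", inner, outer);
        return 0;
    }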
Parameters
The most common forms of passing parameters are register values, a common memory area, or a stack (e.g. with R15 as SP, see above). Implementing these forms is an easy task. Another form that is often found in PALM programs is a parameter list following the call instruction. This is realized in the following manner:
The subroutine fills a memory area with a certain value. This example fills the area starting at $0200, of length $0400, with blanks, so the screen will be cleared. The parameters form a kind of mini-stack that lies directly after the subroutine call; the return address is the "stack pointer" after all parameters have been popped.
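A rough C simulation of this convention follows. Memory is modelled as an array, and the three-parameter layout (start address, length, fill value) mirrors the description above; it is not transcribed from an actual PALM listing:

    #include <stdio.h>

    int mem[16];

    /* The "return address" ra initially points at the parameter list that
       follows the call; the subroutine consumes the parameters and returns
       the address just past the list, which is the real return address.   */
    int fill(int ra) {
        int start = mem[ra++];    /* parameter 1: start address */
        int len   = mem[ra++];    /* parameter 2: length        */
        int value = mem[ra++];    /* parameter 3: fill value    */
        for (int i = 0; i < len; i++)
            mem[start + i] = value;
        return ra;                /* return past the mini-stack */
    }

    int main(void) {
        mem[0] = 10; mem[1] = 3; mem[2] = 32;   /* parameters after the "call"; 32 is an ASCII blank */
        int ret = fill(0);
        printf("return to %d; mem[10..12] = %d %d %d\n",
               ret, mem[10], mem[11], mem[12]);
        return 0;
    }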
Input/Output Organization
I/O operations are accomplished through a wide assortment of external devices that provide a means of exchanging data between the external environment and the computer. An external device attaches to the computer by a link to an I/O module, as shown in the figure below. The link is used to exchange control, status, and data between the I/O module and the external device. An external device connected to an I/O module (also called an interface) is often referred to as a peripheral device or simply a peripheral.
Figure: An I/O bus (data, address, and control lines) connects the processor to several interface modules, each serving a peripheral such as a keyboard and display terminal, a printer, a magnetic disk, or a magnetic tape.
Input/output Problems
There is a wide variety of peripherals, delivering different amounts of data at different speeds and in different formats, and almost all of them are slower than the CPU and main memory. For these reasons peripherals are not connected directly to the system bus; each is attached through an I/O module.
Input/output Module
It is the entity within a computer that is responsible for the control of one or more external devices and for the exchange of data between those devices and main memory and/or the CPU. Thus the I/O module is an:
Interface to CPU and memory
Interface to one or more peripherals
The major functions or requirements for an I/O module fall into the following five categories.
Control & Timing
CPU Communication
Device Communication
Data Buffering
Error Detection
During any period of time, the CPU may communicate with one or more external devices in unpredictable patterns, depending on the program's need for I/O. The internal resources, main memory and the CPU, must be shared among a number of activities, including handling data I/O. Thus the I/O module includes a control and timing requirement to coordinate the flow of traffic between internal resources and external devices. The CPU might be involved in a sequence of operations such as:
CPU checks I/O module device status
I/O module returns device status
If ready, CPU requests data transfer
I/O module gets data from device
I/O module transfers data to CPU
Variations for output, DMA, etc.
The I/O module must have the capability to engage in communication with the CPU and with the external device. CPU communication involves:
Command decoding: The I/O module accepts commands from the CPU carried on the
control bus.
Data: data are exchanged between the CPU and the I/O module over the data bus.
Status reporting: because peripherals are slow, it is important to know the status of the I/O device. The I/O module can report with status signals; commonly used status signals are BUSY and READY. Various other status signals may be used to report error conditions.
Address recognition: just as each memory word has an address, there is an address associated with every I/O device. Thus the I/O module must recognize a unique address for each peripheral it controls.
The I/O module must also be able to perform device communication. This communication involves commands, status information, and data. Some of the essential tasks are listed below:
Error detection: the I/O module is often responsible for error detection and for subsequently reporting errors to the CPU.
Data buffering: the transfer rate into and out of main memory or the CPU is quite high, while the rate is much lower for most peripherals. Data are buffered in the I/O module and then sent to the peripheral device at its rate. In the opposite direction, data are buffered so as not to tie up the memory in a slow transfer operation. Thus the I/O module must be able to operate at both device and memory speeds.
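The buffering idea can be sketched as a small byte FIFO in C: the fast side deposits data at its own pace and the slow device side drains it at its pace. The sizes and names are invented for the example:

    #include <stdio.h>

    #define BUF_SIZE 8

    unsigned char buf[BUF_SIZE];
    int head = 0, tail = 0, count = 0;

    int buf_put(unsigned char b) {        /* fast side: CPU/memory deposits data */
        if (count == BUF_SIZE) return 0;  /* buffer full                         */
        buf[head] = b;
        head = (head + 1) % BUF_SIZE;
        count++;
        return 1;
    }

    int buf_get(unsigned char *b) {       /* slow side: device drains data       */
        if (count == 0) return 0;         /* buffer empty                        */
        *b = buf[tail];
        tail = (tail + 1) % BUF_SIZE;
        count--;
        return 1;
    }

    int main(void) {
        for (unsigned char c = 'A'; c <= 'D'; c++)  /* burst from the fast side */
            buf_put(c);
        unsigned char c;
        while (buf_get(&c))                         /* drained at device speed  */
            printf("%c", c);
        printf("\n");                               /* prints ABCD              */
        return 0;
    }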
Three techniques are possible for I/O operations (data transfer modes). They are:
Programmed I/O
Interrupt driven
Direct Memory Access (DMA)
Programmed I/O
With programmed I/O, data are exchanged between the CPU and the I/O module. The CPU executes a program that gives it direct control of the I/O operation, including sensing device status, sending a read or write command, and transferring data. When the CPU issues a command to the I/O module, it must wait until the I/O operation is complete. If the CPU is faster than the I/O module, CPU time is wasted.
The I/O module does not take any further action to alert the CPU; that is, it does not interrupt the CPU. Hence it is the responsibility of the CPU to periodically check the status of the I/O module until it finds that the operation is complete.
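In C, programmed I/O amounts to a busy-wait loop on a status register, as in the sketch below. The register addresses and the READY bit position are hypothetical; real values come from the device documentation:

    #include <stdint.h>

    /* Hypothetical memory-mapped device registers (addresses invented). */
    #define STATUS_REG (*(volatile uint8_t *)0x4000)
    #define DATA_REG   (*(volatile uint8_t *)0x4001)
    #define READY_BIT  0x01

    uint8_t read_byte(void) {
        while ((STATUS_REG & READY_BIT) == 0)
            ;                      /* busy-wait: CPU time is wasted here */
        return DATA_REG;           /* operation complete: fetch the data */
    }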
The sequence of actions that takes place with programmed I/O is: the CPU requests an I/O operation; the I/O module performs the operation and sets the appropriate status bits; and the CPU periodically checks those status bits until it finds the operation complete.
I/O Commands
To execute an I/O-related instruction, the CPU issues an address, specifying the particular I/O module and external device, and an I/O command. Four types of I/O commands can be received by the I/O module when it is addressed by the CPU. They are:
A control command: is used to activate a peripheral and tell it what to do. Example: a magnetic tape unit may be directed to rewind or to move forward one record.
A test command: is used to test various status conditions associated with an I/O module and its peripherals. The CPU will want to know whether the peripheral of interest is available for use, whether the most recent I/O operation has completed, and whether any errors have occurred.
A read command: causes the I/O module to obtain an item of data from the peripheral and place it in an internal buffer. The CPU then gets the data item by requesting the I/O module to place it on the data bus.
A write command: causes the I/O module to take an item of data from the data bus and subsequently transmit it to the peripheral.
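Seen from the I/O module's side, handling these four commands is essentially a decode-and-dispatch step. The following C sketch uses invented enum names and stub device operations standing in for real peripheral logic:

    #include <stdio.h>
    #include <stdint.h>

    typedef enum { CMD_CONTROL, CMD_TEST, CMD_READ, CMD_WRITE } io_command;

    /* Stub device operations, standing in for real peripheral behavior. */
    static void    do_control(void)        { printf("control: rewind\n"); }
    static uint8_t do_test(void)           { return 0x01; /* READY */     }
    static uint8_t device_read(void)       { return 'x';                  }
    static void    device_write(uint8_t b) { printf("wrote %c\n", b);     }

    /* Decode a command received from the CPU over the control bus. */
    void io_dispatch(io_command cmd, uint8_t *data) {
        switch (cmd) {
        case CMD_CONTROL: do_control();          break;
        case CMD_TEST:    *data = do_test();     break;  /* status back to CPU      */
        case CMD_READ:    *data = device_read(); break;  /* device -> buffer -> CPU */
        case CMD_WRITE:   device_write(*data);   break;  /* data bus -> device      */
        }
    }

    int main(void) {
        uint8_t data = 'y';
        io_dispatch(CMD_CONTROL, &data);
        io_dispatch(CMD_WRITE, &data);
        return 0;
    }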
I/O Mapping
When the CPU, main memory, and I/O module share a common bus, two modes of addressing are possible:
1. Memory-mapped I/O: a single address space is shared by memory locations and I/O device registers, so the CPU treats the status and data registers of I/O modules as ordinary memory locations.
2. Isolated I/O: separate address spaces are used for memory and for I/O, distinguished by special I/O instructions or control lines.
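With memory-mapped I/O, a device register is accessed with ordinary load and store instructions, which in C is just a volatile pointer dereference; isolated I/O instead requires special I/O instructions (for example the x86 IN/OUT family) that a plain C dereference cannot express. The address below is invented:

    #include <stdint.h>

    /* Memory-mapped I/O: the device register occupies an ordinary memory
       address (0x8000 here is hypothetical) and is reached with normal
       load/store operations.                                             */
    #define UART_DATA (*(volatile uint8_t *)0x8000)

    void send(uint8_t b) {
        UART_DATA = b;     /* a plain store drives the device register */
    }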
Interrupt Driven I/O
An interrupt is indicated by a signal sent by the device interface to the CPU via an interrupt request line (on an external bus). This signal notifies the CPU that the signaling interface needs to be serviced. The signal is held until the CPU acknowledges or otherwise services the interface from which the interrupt originated.
The CPU checks periodically to determine whether an interrupt signal is pending. This check is usually done at the end of each instruction, although some modern machines allow interrupts to be checked several times during the execution of very long instructions.
When the CPU detects an interrupt, it saves its current state (at least the PC and the processor status register containing the condition codes); this state information is usually saved in memory. After the interrupt has been serviced, the state information is restored in the CPU and the previously executing software resumes execution as if nothing had happened.
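The check-at-the-end-of-each-instruction behaviour can be sketched as a simulator loop in C; all names, addresses, and the saved-state layout are invented for the illustration:

    #include <stdio.h>

    int pc = 0, psr = 0;          /* program counter and status register */
    int interrupt_pending = 0;
    int saved_pc, saved_psr;

    void execute_one_instruction(void) { pc++; }  /* stand-in for real work */

    void service_interrupt(void) {
        saved_pc  = pc;           /* save state so execution can resume */
        saved_psr = psr;
        pc = 500;                 /* jump to the interrupt handler      */
        printf("servicing interrupt, handler at pc=%d\n", pc);
        pc  = saved_pc;           /* restore: resume as if nothing      */
        psr = saved_psr;          /* had happened                       */
        interrupt_pending = 0;
    }

    int main(void) {
        for (int i = 0; i < 3; i++) {
            execute_one_instruction();
            if (i == 1) interrupt_pending = 1;   /* a device raises a request */
            if (interrupt_pending)               /* checked after each        */
                service_interrupt();             /* instruction completes     */
        }
        printf("done, pc=%d\n", pc);
        return 0;
    }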