Computer Organisation - 1 Basic Structure of Computers
CHAPTER – 1
BASIC STRUCTURE OF COMPUTERS
Computer types: -
A list of instructions is called a program, and the internal storage is called computer
memory.
Functional unit: -
A computer consists of five functionally independent main parts: input, memory,
arithmetic logic unit (ALU), output and control unit.
[Figure: the five functional units of a computer — input, memory, ALU, output and control]
The input device accepts coded information, i.e. a source program in a high-level
language. This is either stored in the memory or immediately used by the processor to
perform the desired operations. The program stored in the memory determines the
processing steps. Basically, the computer converts a source program into an object
program, i.e. into machine language.
Finally, the results are sent to the outside world through an output device. All of
these actions are coordinated by the control unit.
Input unit: -
The source program / high-level language program / coded information / simply data
is fed to a computer through input devices; the keyboard is the most common type. Whenever a
key is pressed, the corresponding letter or digit is translated into its equivalent binary
code and sent over a cable either to the memory or to the processor.
Memory unit: -
Its function is to store programs and data. It is basically of two types:
1. Primary memory
2. Secondary memory
1. Primary memory: - It is the one exclusively associated with the processor and operates
at electronic speeds. Programs must be stored in this memory while they are being
executed. The memory contains a large number of semiconductor storage cells, each
capable of storing one bit of information. These are processed in groups of fixed size
called words.
The number of bits in each word is called the word length of the computer. Programs
must reside in the memory during execution. Instructions and data can be written into the
memory or read out under the control of the processor.
Memory in which any location can be reached in a short and fixed amount of
time after specifying its address is called random-access memory (RAM).
The time required to access one word is called the memory access time. Memory
which is only readable by the user, and whose contents cannot be altered, is called read-only
memory (ROM); it contains the operating system.
Caches are small, fast RAM units which are coupled with the processor and
are often contained on the same IC chip to achieve high performance. Although primary
storage is essential, it tends to be expensive.
2. Secondary memory: - It is used where large amounts of data and programs have to be
stored, particularly information that is accessed infrequently.
Examples: - Magnetic disks and tapes, optical disks (i.e. CD-ROMs), floppies, etc.
The control unit and the ALU are many times faster than other devices connected to a
computer system. This enables a single processor to control a number of external devices
such as keyboards, displays, magnetic and optical disks, sensors and other mechanical
controllers.
Output unit:-
This is the counterpart of the input unit. Its basic function is to send the
processed results to the outside world.
Control unit:-
It effectively is the nerve center that sends signals to other units and senses their
states. The actual timing signals that govern the transfer of data between input unit,
processor, memory and output unit are generated by the control unit.
Consider the instruction Add LOCA, R0, which adds the operand at memory location
LOCA to the contents of register R0:
1. First the instruction is fetched from the memory into the processor.
2. The operand at LOCA is fetched and added to the contents of R0.
3. Finally the resulting sum is stored in the register R0.
The preceding Add instruction combines a memory access operation with an ALU
operation. In some other types of computers, these two types of operations are performed
by separate instructions for performance reasons:
Load LOCA, R1
Add R1, R0
Transfers between the memory and the processor are started by sending the
address of the memory location to be accessed to the memory unit and issuing the
appropriate control signals. The data are then transferred to or from the memory.
[Figure: connection between the memory and the processor — the processor contains the
MAR, MDR, PC, IR, the ALU, control circuitry and n general-purpose registers R0 through Rn-1]
The fig shows how memory & the processor can be connected. In addition to the
ALU & the control circuitry, the processor contains a number of registers used for several
different purposes.
The program counter (PC): - Keeps track of the execution of a program. It contains the
memory address of the next instruction to be fetched and executed.
The instruction register (IR): - Holds the instruction that is currently being executed.
Its output is available to the control circuits, which generate the timing signals that
control the various processing elements involved in executing the instruction.
Besides the IR and PC, there are n general-purpose registers R0 through Rn-1.
The other two registers which facilitate communication with the memory are: -
1. MAR – (Memory Address Register):- It holds the address of the location to be
accessed.
2. MDR – (Memory Data Register):- It contains the data to be written into or read
out of the addressed location.
An interrupt is a request signal from an I/O device for service by the processor.
The processor provides the requested service by executing an appropriate interrupt
service routine.
The diversion may change the internal state of the processor, so its state must be
saved in memory locations before the interrupt is serviced. When the interrupt-service
routine is completed, the state of the processor is restored so that the interrupted program
may continue.
Bus structure: -
The simplest and most common way of interconnecting various parts of the
computer.
To achieve a reasonable speed of operation, a computer must be organized so that
all its units can handle one full word of data at a given time.
A group of lines that serve as a connecting port for several devices is called a
bus.
In addition to the lines that carry the data, the bus must have lines for address and
control purpose.
Since the bus can be used for only one transfer at a time, only two units can
actively use the bus at any given time. Bus control lines are used to arbitrate multiple
requests for use of the bus.
A single-bus structure has two main advantages: low cost, and great flexibility for
attaching peripheral devices.
A multiple-bus structure certainly increases performance, but it also increases the
cost significantly.
The interconnected devices are not all of the same speed, which leads to a
problem. This is solved by using cache registers (i.e. buffer registers). These buffers are
electronic registers of small capacity when compared to the main memory, but of
comparable speed.
The instructions from the processor are loaded into these buffers at once, and then
the complete transfer of data takes place at a fast rate.
Performance: -
The most important measure of the performance of a computer is how quickly it
can execute programs. The speed with which a computer executes programs is affected by
the design of its hardware. For best performance, it is necessary to design the compiler,
the machine instruction set, and the hardware in a coordinated way.
The total time required to execute a program, the elapsed time, is a measure of the
performance of the entire computer system. It is affected by the speed of the processor,
the disk and the printer. The time needed to execute the instructions of a program is called
the processor time.
Just as the elapsed time for the execution of a program depends on all units in a
computer system, the processor time depends on the hardware involved in the execution
of individual machine instructions. This hardware comprises the processor and the
memory, which are usually connected by a bus, as shown in fig. c.
The pertinent parts of fig. c are repeated in fig. d, which includes the cache
memory as part of the processor unit.
Let us examine the flow of program instructions and data between the memory
and the processor. At the start of execution, all program instructions and the required data
are stored in the main memory. As the execution proceeds, instructions are fetched one
by one over the bus into the processor, and a copy is placed in the cache. Later, if the same
instruction or data item is needed a second time, it is read directly from the cache.
The processor and relatively small cache memory can be fabricated on a single
IC chip. The internal speed of performing the basic steps of instruction processing on
chip is very high and is considerably faster than the speed at which the instruction and
data can be fetched from the main memory. A program will be executed faster if the
movement of instructions and data between the main memory and the processor is
minimized, which is achieved by using the cache.
For example:- Suppose a number of instructions are executed repeatedly over a short
period of time as happens in a program loop. If these instructions are available in the
cache, they can be fetched quickly during the period of repeated use. The same applies to
the data that are used repeatedly.
Processor clock: -
Processor circuits are controlled by a timing signal called the clock. The clock
defines regular time intervals called clock cycles. To execute a machine instruction,
the processor divides the actions to be performed into a sequence of basic steps, such that each
step can be completed in one clock cycle. The length P of one clock cycle is an important
parameter that affects processor performance.
Processors used in today's personal computers and workstations have clock rates
that range from a few hundred million to over a billion cycles per second.
Suppose that a program contains N machine instructions and that the average number of
basic steps needed to execute one machine instruction is S, where each basic step is
completed in one clock cycle. If the clock rate is R cycles per second, the program
execution time is given by

T = (N × S) / R

This is often referred to as the basic performance equation.
We must emphasize that N, S & R are not independent parameters changing one
may affect another. Introducing a new feature in the design of a processor will lead to
improved performance only if the overall result is to reduce the value of T.
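As a quick illustration, the basic performance equation can be evaluated directly; the numbers below are hypothetical, chosen only to show how N, S and R combine.

```python
def execution_time(N, S, R):
    """Basic performance equation: T = (N * S) / R.
    N: instructions executed, S: average basic steps per instruction,
    R: clock rate in cycles per second."""
    return (N * S) / R

# Hypothetical program: 10 million instructions, 4 steps each, 500 MHz clock
print(execution_time(10_000_000, 4, 500_000_000))  # 0.08 (seconds)
```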
A substantial improvement in performance can be achieved by overlapping the execution
of successive instructions using a technique called pipelining.
Consider Add R1, R2, R3
This adds the contents of R1 and R2 and places the sum into R3.
The contents of R1 and R2 are first transferred to the inputs of the ALU. After the
addition operation is performed, the sum is transferred to R3. The processor can read the
next instruction from the memory while the addition operation is being performed. If
that instruction also uses the ALU, its operands can be transferred to the ALU inputs at
the same time that the result of the Add instruction is being transferred to R3.
In the ideal case, if all instructions are overlapped to the maximum degree
possible, execution proceeds at the rate of one instruction completed in each clock
cycle. Individual instructions still require several clock cycles to complete, but for the
purpose of computing T, the effective value of S is 1.
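The effect of ideal pipelining on the total cycle count can be sketched numerically (the function name and values below are illustrative, not from the notes):

```python
def total_cycles(n_instructions, steps_per_instruction, pipelined):
    """Without overlap, every instruction takes S cycles in sequence.
    With an ideal pipeline, one instruction completes per cycle once
    the first instruction has filled the pipeline."""
    if pipelined:
        return steps_per_instruction + (n_instructions - 1)
    return n_instructions * steps_per_instruction

print(total_cycles(100, 4, False))  # 400
print(total_cycles(100, 4, True))   # 103
```

For large N the pipelined count approaches N cycles, which is why the effective value of S becomes 1.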
Clock rate: - There are two possibilities for increasing the clock rate R.
1. Improving the IC technology makes logic circuits faster, which reduces the time
needed to execute basic steps. This allows the clock period P to be reduced and the
clock rate R to be increased.
2. Reducing the amount of processing done in one basic step also makes it possible
to reduce the clock period P. However, if the actions that have to be performed by
an instruction remain the same, the number of basic steps needed may increase.
If individual instructions perform simpler operations, a large number of instructions
may be needed to perform a given programming task. This could lead to a large value of N
and a small value of S. On the other hand, if individual instructions perform more complex
operations, fewer instructions will be needed, leading to a lower value of N and a larger
value of S. It is not obvious whether one choice is better than the other.
Performance measurements:-
It is very important to be able to assess the performance of a computer; computer
designers use performance estimates to evaluate the effectiveness of new features.
The performance measure is the time taken by the computer to execute a given
benchmark. Initially, some attempts were made to create artificial programs that could be
used as benchmark programs, but synthetic programs do not properly predict the
performance obtained when real application programs are run.
The programs selected range from game playing, compiler, and database
applications to numerically intensive programs in astrophysics and quantum chemistry. In
each case, the program under test is compiled, and its running time on a real computer is
measured. The same program is also compiled and run on one computer selected as a
reference.
The SPEC rating is computed as follows:

SPEC rating = (Running time on the reference computer) / (Running time on the computer under test)

A SPEC rating of 50 means that the computer under test is 50 times as fast as the
reference computer (an UltraSPARC 10). This is repeated for all the programs in the SPEC
suite, and the geometric mean of the results is computed.
Let SPECi be the rating for program i in the suite. The overall SPEC rating for
the computer is given by

SPEC rating = (SPEC1 × SPEC2 × … × SPECn)^(1/n)

where n is the number of programs in the suite.
Since actual execution time is measured, the SPEC rating is a measure of the
combined effect of all factors affecting performance, including the compiler, the operating
system, the processor and the memory of the computer being tested.
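The geometric mean above is easy to compute directly; here is a minimal sketch with hypothetical per-program ratings.

```python
import math

def overall_spec_rating(ratings):
    """Geometric mean of the per-program SPEC ratings:
    (SPEC1 * SPEC2 * ... * SPECn) ** (1/n)."""
    return math.prod(ratings) ** (1.0 / len(ratings))

# Hypothetical ratings for a two-program suite
print(overall_spec_rating([2.0, 8.0]))  # 4.0
```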
UNIT 3 COMBINATIONAL LOGIC
Analysis Procedure
1. Label all gate outputs that are a function of input variables with arbitrary
symbols. Determine the Boolean functions for each gate output.
2. Label the gates that are a function of input variables and previously labeled
gates with other arbitrary symbols. Find the Boolean functions for these
gates.
3. Repeat the process outlined in step 2 until the outputs of the circuit are
obtained.
Example:
F1 = T3 + T2 = F2’T1 + ABC = A’BC’ + A’B’C + AB’C’ + ABC
We can derive the truth table in Table 4-1 by using the circuit of Fig.4-2.
Design procedure :
For each symbol of the Excess-3 code, we use 1's to draw the map
for simplifying the Boolean function:
Circuit implementation:
w = A + BC + BD = A + B(C + D)
Binary Adder-Subtractor:
SUM = AB’+A’B
CARRY = AB
These expressions show that the SUM output is realized by an EX-OR gate and the CARRY
output by an AND gate. The figure shows the implementation of the half-adder with all the
combinations, including the implementation using NAND gates only.
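The half-adder equations can be checked with a short sketch (bits are modelled as 0/1 integers; the function name is mine):

```python
def half_adder(a, b):
    """SUM = A XOR B, CARRY = A AND B."""
    return a ^ b, a & b

for a in (0, 1):
    for b in (0, 1):
        s, c = half_adder(a, b)
        print(a, b, '->', s, c)
```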
FULL ADDER : A circuit that performs the addition of three bits (two
significant bits and a previous carry) is a full adder.
Simplified Expressions :
S = x’y’z + x’yz’ + xy’z’ + xyz
C = xy + xz + yz
Another implementation :
A full-adder can also be implemented with two half-adders and one OR gate.
S = z ⊕ (x ⊕ y)
A cascade of full adders, with the carry output of each stage connected to the carry
input of the next, is called a Ripple Carry Adder.
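The two-half-adder construction and the ripple-carry cascade can be sketched as follows (bit lists are least-significant-bit first; names are mine):

```python
def half_adder(a, b):
    return a ^ b, a & b

def full_adder(x, y, z):
    """Two half adders and one OR gate: S = z XOR (x XOR y)."""
    s1, c1 = half_adder(x, y)
    s, c2 = half_adder(s1, z)
    return s, c1 | c2          # the OR gate merges the two carries

def ripple_carry_add(a, b):
    """Cascade of full adders; the carry ripples from stage to stage."""
    carry, out = 0, []
    for ai, bi in zip(a, b):
        s, carry = full_adder(ai, bi, carry)
        out.append(s)
    return out, carry

# 13 + 3 = 16: four sum bits 0000 with a final carry of 1
print(ripple_carry_add([1, 0, 1, 1], [1, 1, 0, 0]))  # ([0, 0, 0, 0], 1)
```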
Carry Propagation :
Fig.4-9 causes an unstable factor on the carry bit and produces the longest propagation
delay. The signal from Ci to the output carry Ci+1 propagates through an AND gate and an
OR gate, so for an n-bit RCA there are 2n gate levels for the carry to propagate
from input to output. Because the propagation delay affects the output signals at
different times, the signals must be given enough time to settle to precise and stable
outputs.
The most widely used technique employs the principle of carry look-ahead to improve
the speed of the algorithm.
Boolean functions :
Pi = Ai ⊕ Bi (carry propagate), Gi = AiBi (carry generate)
Si = Pi ⊕ Ci
Ci+1 = Gi + PiCi
C0 = input carry
C1 = G0 + P0C0
M = 1 → subtractor ; M = 0 → adder
The Boolean expression for the sum can be implemented by using a two-input EX-OR gate in
which one input is the carry-in and the other input is the output of another two-input
EX-OR gate with A and B as its inputs. The carry output can be implemented by
ORing the outputs of the AND gates.
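The look-ahead recurrence Ci+1 = Gi + PiCi can be unrolled so that every carry is a two-level expression of the G's, P's and C0 only, instead of waiting on the previous stage. A 4-bit sketch (function name and bit ordering, LSB first, are my choices):

```python
def cla_add4(a, b, c0=0):
    """4-bit carry look-ahead adder. Gi = Ai·Bi, Pi = Ai⊕Bi;
    each carry below depends only on G, P and c0, not on earlier carries."""
    g = [ai & bi for ai, bi in zip(a, b)]   # generate
    p = [ai ^ bi for ai, bi in zip(a, b)]   # propagate
    c1 = g[0] | (p[0] & c0)
    c2 = g[1] | (p[1] & g[0]) | (p[1] & p[0] & c0)
    c3 = g[2] | (p[2] & g[1]) | (p[2] & p[1] & g[0]) | (p[2] & p[1] & p[0] & c0)
    c4 = (g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1])
          | (p[3] & p[2] & p[1] & g[0]) | (p[3] & p[2] & p[1] & p[0] & c0))
    s = [p[0] ^ c0, p[1] ^ c1, p[2] ^ c2, p[3] ^ c3]
    return s, c4

# 13 + 3 = 16
print(cla_add4([1, 0, 1, 1], [1, 1, 0, 0]))  # ([0, 0, 0, 0], 1)
```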
HALF-SUBTRACTOR
A half-subtractor is used to subtract one binary digit from another
to give a DIFFERENCE output and a BORROW output. The truth table of a half-subtractor
is shown in the figure. The Boolean expressions for the half-subtractor are,
D = A'B + AB' and Bo = A'B
Here, the DIFFERENCE output D is an EX-OR gate and the BORROW output Bo is an
AND gate with the input A complemented. The figure shows the logic implementation of a half-
subtractor. Comparing a half-subtractor with a half-adder, it can be seen that the
expressions for the SUM and DIFFERENCE outputs are the same. The expression for BORROW
in the case of the half-subtractor is more or less the same as the CARRY of the half-adder;
however, in the case of the BORROW output, the minuend is complemented before the
ANDing is done.
FULL SUBTRACTOR
A full subtractor performs subtraction of two bits, one the minuend and the other the
subtrahend. In a full subtractor, '1' may have been borrowed by the previous adjacent
lower minuend bit; hence three bits are considered at the input of a full subtractor. There are
two outputs, the DIFFERENCE output D and the BORROW output Bo. The BORROW
output indicates that the minuend bit requires a borrow of '1' from the next minuend bit.
The figure shows the truth table of a full
subtractor. The K-maps for the two outputs are shown in the figure. If we compare the
DIFFERENCE output D and the BORROW output Bo with those of the full adder, it can be seen
that the DIFFERENCE output D is the same as the SUM output. Further, the BORROW
output Bo is similar to CARRY-OUT, except that, as in the half-subtractor, the A input is
complemented.
From the truth table, the Difference and Borrow can be written as
Difference = A'B'C + A'BC' + AB'C' + ABC
Reducing it as for the adder, we get
Difference = A ⊕ B ⊕ C
Borrow = A'B'C + A'BC' + A'BC + ABC
       = A'B'C + A'BC' + A'BC + A'BC + A'BC + ABC   (since A'BC = A'BC + A'BC + A'BC)
       = A'C(B' + B) + A'B(C' + C) + BC(A' + A)
Borrow = A'C + A'B + BC
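The simplified expressions can be verified exhaustively; the pair (D, Bo) encodes the arithmetic value A − B − C as D − 2·Bo (the function name is mine; C is the borrow-in):

```python
def full_subtractor(a, b, c):
    """D = A xor B xor C,  Bo = A'C + A'B + BC."""
    d = a ^ b ^ c
    bo = ((1 - a) & c) | ((1 - a) & b) | (b & c)
    return d, bo

# Check the arithmetic identity for all 8 input combinations
for i in range(8):
    a, b, c = (i >> 2) & 1, (i >> 1) & 1, i & 1
    d, bo = full_subtractor(a, b, c)
    assert a - b - c == d - 2 * bo
print("ok")
```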
Block diagram
BINARY ADDER/SUBTRACTOR
Subtraction of binary numbers can be carried out by adding the 2's complement of the
subtrahend to the minuend. In this operation, if the MSB of the result is a '0', the answer
is positive as it stands; if the MSB is '1', the answer is negative and is in 2's complement
form. Hence, subtraction can be carried out by using full adders.
The figure above shows the realization of a 4-bit adder-subtractor. From the figure it can
be seen that the bits of the binary numbers are given to the full adders through XOR gates.
The control input controls the addition or subtraction operation.
When the SUBTRACTION control input is logic '0', the bits B3 B2 B1 B0 are passed
unchanged to the full adders. Hence, the output of the full adders is the addition of the
two numbers.
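The XOR-gate trick can be sketched directly: when the control M is 1, each Bi is complemented and M itself supplies the +1 of the 2's complement as the initial carry (bit lists are LSB first; names are mine):

```python
def add_sub4(a, b, m):
    """M = 0: out = A + B.  M = 1: out = A + B' + 1 (i.e. A - B)."""
    carry, out = m, []                     # M also feeds the initial carry
    for ai, bi in zip(a, b):
        bx = bi ^ m                        # XOR gate: passes or complements Bi
        s = ai ^ bx ^ carry                # full-adder sum
        carry = (ai & bx) | (ai & carry) | (bx & carry)
        out.append(s)
    return out, carry

print(add_sub4([1, 0, 1, 1], [1, 1, 0, 0], 0))  # 13 + 3: ([0, 0, 0, 0], 1)
print(add_sub4([1, 0, 1, 1], [1, 1, 0, 0], 1))  # 13 - 3: ([0, 1, 0, 1], 1)
```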
Magnitude comparator:
The equality relation of each pair of bits can be expressed logically with an
exclusive-NOR function as:
A = A3A2A1A0 ; B = B3B2B1B0
xi=AiBi+Ai’Bi’ for i = 0, 1, 2, 3
(A = B) = x3x2x1x0
We inspect the relative magnitudes of pairs of MSB. If equal, we compare the
next lower significant pair of digits until a pair of unequal digits is reached.
If the corresponding digit of A is 1 and that of B is 0, we conclude that A>B.
(A>B) = A3B3' + x3A2B2' + x3x2A1B1' + x3x2x1A0B0'
(A<B) = A3'B3 + x3A2'B2 + x3x2A1'B1 + x3x2x1A0'B0
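The comparator equations can be checked with a small sketch (inputs are 4-bit tuples, MSB first, i.e. (A3, A2, A1, A0); the function name is mine):

```python
def compare4(a, b):
    """Returns 'A>B', 'A=B' or 'A<B' using the xi = XNOR(Ai, Bi) terms."""
    x = [1 - (ai ^ bi) for ai, bi in zip(a, b)]        # equality per bit
    eq = x[0] & x[1] & x[2] & x[3]
    gt = ((a[0] & (1 - b[0]))
          | (x[0] & a[1] & (1 - b[1]))
          | (x[0] & x[1] & a[2] & (1 - b[2]))
          | (x[0] & x[1] & x[2] & a[3] & (1 - b[3])))
    if eq:
        return 'A=B'
    return 'A>B' if gt else 'A<B'

print(compare4((1, 0, 0, 1), (0, 1, 1, 1)))  # A>B  (9 vs 7)
```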
Decoders :
S(x, y, z) = ∑(1, 2, 4, 7)
C(x, y, z) = ∑(3, 5, 6, 7)
Encoders:
z = D1 + D3 + D5 + D7
y = D2 + D3 + D6 + D7
x = D4 + D5 + D6 + D7
Priority encoder :
x = D2 + D3
y = D3 + D1D’2
V = D0 + D1 + D2 + D3
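The priority-encoder equations above can be exercised directly; (x, y) encodes the highest-numbered active input and V is the valid flag (the function name is mine):

```python
def priority_encoder(d0, d1, d2, d3):
    """x = D2 + D3,  y = D3 + D1·D2',  V = D0 + D1 + D2 + D3."""
    x = d2 | d3
    y = d3 | (d1 & (1 - d2))
    v = d0 | d1 | d2 | d3
    return x, y, v

print(priority_encoder(1, 1, 0, 0))  # (0, 1, 1): D1 wins over D0
```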
Multiplexers
A multiplexer is a special type of combinational circuit. There are n data
inputs, one output and m select inputs, with 2^m = n. It is a digital circuit
which selects one of the n data inputs and routes it to the output. The
selection of one of the n inputs is done by the select inputs. Depending
on the digital code applied at the select inputs, one out of the n data
sources is selected and transmitted to the single output Y. E is called the
strobe or enable input, which is useful for cascading. It is generally an
active-low terminal, which means it performs the required operation
when it is low.
Block diagram
2 to 1 Multiplexer:
S = 0 → Y = I0 ; S = 1 → Y = I1
2 : 1 multiplexer
4 : 1 multiplexer
16 : 1 multiplexer
32 : 1 multiplexer
Block Diagram
Truth Table
4-to-1 Line Multiplexer:
Multiplexer circuits can be combined with common selection inputs to provide multiple-bit
selection logic. Compare with Fig4-24.
Boolean function implementation :
There is a more efficient method for implementing a Boolean function of n variables
with a multiplexer that has n-1 selection inputs.
F(x, y, z) = ∑(1, 2, 6, 7)
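For this function, x and y drive the two selection inputs and the data inputs are read off the truth table as I0 = z, I1 = z', I2 = 0, I3 = 1 — a sketch (function names are mine):

```python
def mux4(i0, i1, i2, i3, s1, s0):
    """4-to-1 line multiplexer: outputs the input selected by (s1 s0)."""
    return [i0, i1, i2, i3][(s1 << 1) | s0]

def F(x, y, z):
    # data inputs derived from the truth table: I0 = z, I1 = z', I2 = 0, I3 = 1
    return mux4(z, 1 - z, 0, 1, x, y)

minterms = [i for i in range(8) if F((i >> 2) & 1, (i >> 1) & 1, i & 1)]
print(minterms)  # [1, 2, 6, 7]
```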
1 : 2 demultiplexer
1 : 4 demultiplexer
1 : 16 demultiplexer
1 : 32 demultiplexer
Block diagram
Truth Table
College of Information Technology / Software Department
Lec. 10
Logic Design / First Class / 2018-2019
Figure (1)
The 3-to-8 line decoder consists of three input variables and eight output lines. Note that each of the
output lines represents one of the minterms generated from the three variables. The internal combinational
circuit is realized with INVERTER gates and AND gates. The operation of the decoder
circuit may be further illustrated by the input-output relationship given in the above table. Note
that the output variables are mutually exclusive, as only one output can be logic
1 at any one time.
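The one-output-at-a-time behaviour can be sketched in a few lines (the function name and bit ordering, a = MSB, are my choices):

```python
def decoder_3to8(a, b, c):
    """Returns the eight outputs D0..D7; exactly one line is logic 1,
    the one whose index is the minterm number of (a, b, c)."""
    index = (a << 2) | (b << 1) | c
    return [1 if i == index else 0 for i in range(8)]

print(decoder_3to8(1, 0, 1))  # D5 is high: [0, 0, 0, 0, 0, 1, 0, 0]
```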
Figure (2)
The above expression can be realized in Figure 3.
Figure (3)
Encoders
An encoder is a combinational network that performs the reverse operation of the decoder. An
encoder has 2^n or fewer inputs and n output lines. The output lines of an encoder generate the
binary code for the 2^n input variables. Figure 4 illustrates an eight-input/three-output encoder. It may
also be referred to as an octal-to-binary encoder, where binary codes are generated at the outputs according
to the input conditions. The truth table is given below.
The encoder in Figure 4 assumes that only one input line is activated to logic 1 at any particular
time; otherwise the circuit has no meaning. It may be noted that for eight inputs there are
2^8 = 256 possible combinations, but only eight input combinations are useful and the rest are don't-care
conditions. It may also be noted that the D0 input is not connected to any of the gates, so the binary
outputs A, B, and C are all 0s in this case. An all-0s output is also obtained if all input variables
D0 to D7 are logic 0. This is the main discrepancy of this circuit. It can be eliminated by
introducing another output indicating that not all the inputs are logic 0.
However, this type of encoder is not available in an IC package, as it is easy to build from a
few OR gates when needed. The type of encoder available in an IC package is
called a priority encoder. These encoders establish an input priority to ensure that only the highest-priority
input is encoded. As an example, if both the D2 and D4 inputs are logic 1 simultaneously, then the output will
be according to D4 only, i.e., the output is 100.
UNIT 1: DIGITAL LOGICAL CIRCUITS
What is Digital Computer? OR Explain the block diagram of digital computers.
[Figure: block diagram of a digital computer — input devices, input-output processor, CPU, memory (RAM), output devices]
The hardware of the computer is usually divided into three major parts.
The Central processing Unit (CPU) contains an arithmetic and logic unit for
manipulating data, a number of registers for storing data and control circuits for
fetching and executing instructions.
The memory of a computer contains storage for instructions and data, it is called a
Random Access Memory (RAM) because the CPU can access any location in memory
at random and retrieve the binary information within a fixed interval of time.
The input and output processor contains electronic circuit for communication and
controlling the transfer of information between the computer and the outside world.
The input and output device connected to the computer include keyboards, printers,
terminals, magnetic disk drives and other communication devices.
What are Gates? Explain the Logic Gates in brief.
LOGICAL GATES
Basic Gates
AND Gate:
In this type of gate the output is high only when all its inputs are high.
If any single input is low, then the output will remain low.
So it is said that in an AND gate the output is high only when all the inputs are high.
SYMBOL:
TRUTH-TABLE:
INPUT OUTPUT
A B A AND B
0 0 0
0 1 0
1 0 0
1 1 1
OR Gate:
In this type of gate, if any input signal is high, then the output will be high.
The output is low only when all the inputs are low.
SYMBOL:
TRUTH-TABLE:
INPUT OUTPUT
A B A OR B
0 0 0
0 1 1
1 0 1
1 1 1
NOT Gate:
This type of gate is also known as “Inverter”.
It is a gate that contains only one input and only one output.
The output is always opposite than the input signals.
SYMBOL:
TRUTH-TABLE:
INPUT OUTPUT
A NOT A
(A’)
0 1
1 0
Universal Gates
NAND and NOR gates are known as universal gates because we can construct any gate using
NAND & NOR gate.
NOR Gate:
The NOR gate is the complement of the OR gate.
As shown in the truth table, the output of a NOR gate is exactly opposite to the
output of an OR gate.
This means that the output will be high only when all the inputs are low.
SYMBOL:
TRUTH-TABLE:
INPUT OUTPUT
A B A NOR B
0 0 1
0 1 0
1 0 0
1 1 0
NAND Gate:
The NAND gate is an AND gate followed by NOT gate.
As shown in the truth table, the output of a NAND gate is exactly opposite to the
output of an AND gate.
This means that the output will be low only when all the inputs are high.
SYMBOL:
TRUTH-TABLE:
INPUT OUTPUT
A B A NAND B
0 0 1
0 1 1
1 0 1
1 1 0
Exclusive Gates
EX-OR Gate:
This gate produces a high output whenever the two inputs are at opposite levels.
The EX-OR gate is the gate that produces a high output for an odd number of high inputs.
SYMBOL:
TRUTH-TABLE:
INPUT OUTPUT
A B A EX-OR B
0 0 0
0 1 1
1 0 1
1 1 0
EX-NOR Gate:
This gate produces a high output whenever the two inputs are at the same level.
The EX-NOR gate is the gate that produces a high output for an even number of high inputs.
The truth table shows that the output of this gate is exactly opposite to that of the EX-OR gate.
SYMBOL:
TRUTH-TABLE:
INPUT OUTPUT
A B A EX-NOR B
0 0 1
0 1 0
1 0 0
1 1 1
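All of the gate truth tables above can be reproduced with a short sketch, modelling bits as 0/1 integers (function names are mine):

```python
def AND(a, b):  return a & b
def OR(a, b):   return a | b
def NOT(a):     return 1 - a
def NAND(a, b): return NOT(AND(a, b))   # AND followed by NOT
def NOR(a, b):  return NOT(OR(a, b))    # complement of OR
def XOR(a, b):  return a ^ b            # high for an odd number of 1s
def XNOR(a, b): return NOT(XOR(a, b))   # high for an even number of 1s

for a in (0, 1):
    for b in (0, 1):
        print(a, b, AND(a, b), OR(a, b), NAND(a, b),
              NOR(a, b), XOR(a, b), XNOR(a, b))
```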
A Boolean function can be represented by:
a. Truth tables
b. Logic diagrams
c. Algebraic expression
x y z F
0 0 0 0
0 0 1 1
0 1 0 0
0 1 1 1
1 0 0 1
1 0 1 1
1 1 0 1
1 1 1 1
Boolean Operations
THEOREM 1 (DeMorgan): (x + y)' = x' . y'
F =xy’ + x’y
RULES FOR K-MAP:
Each cell with a 1 must be included in at least 1 group.
Try to form the largest possible groups.
Try to end up with as few groups as possible.
Groups may be in sizes that are powers of 2.
Groups may be square or rectangular only.
Groups may be horizontal or vertical but not diagonal.
Groups may wrap around the table.
Groups may overlap.
The larger a group is, the more redundant inputs there are:
o A group of 1 has no redundant input.
o A group of 2, known as a pair, has 1 redundant input.
o A group of 4, known as a quad, has 2 redundant inputs.
o A group of 8, known as an octet, has 3 redundant inputs.
Sum-of-Products Simplification
A Boolean function represented by a truth table is plotted into the map by inserting 1's
into those squares where the function is 1.
Boolean functions can then be simplified by identifying adjacent squares in the
Karnaugh map that contain a 1.
A square is considered adjacent to another square if it is next to, above, or below it. In
addition, squares at the extreme ends of the same horizontal row are also considered
adjacent. The same applies to the top and bottom squares of a column. The objective is
to identify adjacent squares containing 1's and group them together.
Groups must contain a number of squares that is an integral power of 2.
Groups of combined adjacent squares may share one or more squares with one or
more groups.
Each group of squares represents an algebraic term, and the OR of those terms gives
the simplified algebraic expression for the function.
To find the most simplified algebraic expression, the goal of map simplification is to
identify the least number of groups with the largest number of members.
The three-variable map for this function is shown in figure 2.4.
There are four squares marked with 1's, one for each minterm that produces a 1 for the
function. These squares belong to minterms 3, 4, 6, 7 and are recognized from figure
b.
Two adjacent squares are combined in the third column. This column belongs to both
B and C and produces the term BC.
The remaining two squares with 1's in the two corners of the second row are adjacent
and belong to the row of A and the columns of C', so they produce the term AC'.
The simplified expression for the function is the OR of the two terms:
F = BC + AC'
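A quick exhaustive check confirms that the simplified form covers exactly minterms 3, 4, 6, 7 (the function name is mine):

```python
def F(a, b, c):
    # simplified form read off the map: F = BC + AC'
    return (b & c) | (a & (1 - c))

minterms = [i for i in range(8) if F((i >> 2) & 1, (i >> 1) & 1, i & 1)]
print(minterms)  # [3, 4, 6, 7]
```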
The area in the map covered by this four-variable function consists of the squares marked with
1's in fig 1.10. The function contains 1's in the four corners that, when taken as a group,
give the term B'D'. This is possible because these four squares are adjacent when the
map is considered with the top and bottom or left and right edges touching.
The two 1's on the bottom row are combined with the two 1's on the left of the
bottom row to give the term B'C'.
The remaining 1 in the square of minterm 6 is combined with minterm 2 to give the
term A'CD'.
Product-of-Sums Simplification
Another method for simplifying Boolean expressions can be to represent the function
as a product of sums.
This approach is similar to the Sum-of-Products simplification, but identifying adjacent
squares containing 0’s instead of 1’s forms the groups of adjacent squares.
Then, instead of representing the function as a sum of products, the function is
represented as a product of sums.
Examples
F(A,B,C,D) = ∑(0,1,2,5,8,9,10)
The 1's marked in the map of figure 2.7 represent the minterms that produce a 1 for
the function.
The squares marked with 0's represent the minterms not included in F and therefore
denote the complement of F.
Combining the squares with 1's gives the simplified function in sum-of-products form:
F = B'D' + B'C' + A'C'D
If the squares marked with 0's are combined, as shown in the diagram, we obtain the
simplified complement function:
F' = AB + CD + BD'
Taking its complement gives F in product-of-sums form:
F = (A' + B')(C' + D')(B' + D)
The logic diagrams of the two simplified expressions are shown in fig 2.8.
The sum-of-products expression is implemented in fig 2.8(a) with a group of AND
gates, one for each AND term.
The outputs of the AND gates are connected to the inputs of a single OR gate. The
same function is implemented in fig 2.8(b) in product-of-sums form with a group of OR
gates, one for each OR term; the outputs of the OR gates are connected to the inputs
of a single AND gate.
In each case it is assumed that the input variables are directly available in both their normal
and complemented forms, so inverters are not included.
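The two forms can be checked against each other exhaustively; the sketch below uses the standard simplification of this function, F = B'D' + B'C' + A'C'D and F = (A'+B')(C'+D')(B'+D) (function names are mine):

```python
def f_sop(a, b, c, d):
    """Sum-of-products form: F = B'D' + B'C' + A'C'D."""
    return ((1-b) & (1-d)) | ((1-b) & (1-c)) | ((1-a) & (1-c) & d)

def f_pos(a, b, c, d):
    """Product-of-sums form: F = (A'+B')(C'+D')(B'+D)."""
    return ((1-a) | (1-b)) & ((1-c) | (1-d)) & ((1-b) | d)

bits = [((i >> 3) & 1, (i >> 2) & 1, (i >> 1) & 1, i & 1) for i in range(16)]
assert all(f_sop(*v) == f_pos(*v) for v in bits)   # both forms agree everywhere
print([i for i, v in enumerate(bits) if f_sop(*v)])  # [0, 1, 2, 5, 8, 9, 10]
```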
A combinational circuit is a circuit where more than one logic function is designed into a single
component.
It has n inputs and m outputs.
It is basically used to design digital applications, and it transforms data in a
digital manner.
A combinational circuit is a connected arrangement of logic gates with a set of inputs
and outputs.
At any given time, the binary values of the outputs are a function of the binary values
of the inputs.
The design of a combinational circuit starts from a verbal outline of the problem and
ends in a logic circuit diagram. The procedure involves the following steps:
Arithmetic circuits:
These are made of different arithmetic operators. There can be addition, subtraction,
division, modulus and other arithmetic operations.
Half-Adder
Full-Adder
Combinational circuit | Sequential circuit
It is a digital logic circuit whose output depends on the present inputs. | It is a digital logic circuit whose output depends on the present inputs as well as previous inputs.
It can be described by the output values. | It can be described by the output values as well as state values.
Examples of combinational circuits are the half adder and full adder. | Examples of sequential circuits are flip-flops like RS, Clocked RS, D and JK.
At any given time, the binary values of the outputs are a function of the binary values of the inputs. | Hence, a sequential circuit is an interconnection of flip-flops and gates.
What are Flip-Flops?
SR Flip-Flop
Figure SR Flip-Flop
Inputs:
S (for set)
R (for reset)
C (for clock)
Outputs:
Q
Q'
If there is no signal at the clock input C, the output of the circuit cannot change
irrespective of the values at inputs S and R.
Only when the clock signals changes from 0 to 1 can the output be affected according
to the values in inputs S and R
If S = 1 and R = 0 when C changes from 0 to 1, output Q is set to 1. If S = 0 and R = 1
when C changes from 0 to 1, output Q is cleared to 0.
If both S and R are 0 during the clock transition, the output does not change.
When both S and R are equal to 1, the output is unpredictable and may go to either 0
or 1, depending on internal timing delays that occur within the circuit.
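The clocked behaviour described above can be sketched as a small Python model (illustrative only, not part of the original notes; the forbidden S = R = 1 case is modelled as an error):

```python
# Minimal sketch of a positive-edge-triggered SR flip-flop.
def sr_flip_flop(q, s, r):
    """Return the next value of Q after a 0-to-1 clock transition."""
    if s == 1 and r == 0:
        return 1          # set
    if s == 0 and r == 1:
        return 0          # reset
    if s == 0 and r == 0:
        return q          # no change
    raise ValueError("S = R = 1: the output is unpredictable")

print(sr_flip_flop(0, 1, 0))  # → 1 (set)
print(sr_flip_flop(1, 0, 1))  # → 0 (reset)
print(sr_flip_flop(1, 0, 0))  # → 1 (hold)
```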
D Flip-Flop
D Flip-flop
Inputs:
D (for data)
C (for clock)
Outputs:
Q
Q'
The operation of the D flip-flop is as follows: on each 0-to-1 clock transition the value at
input D is transferred to output Q, so Q takes the value of D after every clock edge.
JK Flip-Flop
Jk Flip-Flop
Inputs:
J
K
C (for clock)
Outputs:
Q
Q'
T Flip-Flop
T Flip-Flop
Inputs:
T (for toggle)
C (for clock)
Outputs:
Q
Q'
Most flip-flops are edge-triggered flip-flops, which means that the state transition occurs at
a specific edge (transition) of the clock pulse.
A positive-edge transition occurs on the rising edge of the clock signal.
A negative-edge transition occurs on the falling edge of the clock signal.
Another type of flip-flop is called a master-slave flip-flop that is basically two flip-flops
in series.
Flip-flops can also include special input terminals for setting or clearing the flip-flop
asynchronously. These inputs are usually called preset and clear and are useful for
initializing the flip-flops before clocked operations are initiated.
Flip-Flop Excitation Tables
During the design of sequential circuits, the required transition from present state to
next state is known.
What the designer needs to know is what input conditions must exist to implement
the required transition.
This requires the use of flip-flop excitation tables.
Excitation Tables

SR Flip-Flop Excitation Table
Q(t) Q(t+1) S R
0 0 0 X
0 1 1 0
1 0 0 1
1 1 X 0

JK Flip-Flop Excitation Table
Q(t) Q(t+1) J K
0 0 0 X
0 1 1 X
1 0 X 1
1 1 X 0
T Flip-Flop Excitation Table
Q(t) Q(t+1) T
0 0 0
0 1 1
1 0 1
1 1 0
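The excitation tables can be captured as simple lookup tables; a hypothetical Python sketch (an 'X' entry marks a don't-care input):

```python
# Required flip-flop inputs for each (Q(t), Q(t+1)) transition.
JK_EXCITATION = {(0, 0): ('0', 'X'), (0, 1): ('1', 'X'),
                 (1, 0): ('X', '1'), (1, 1): ('X', '0')}
T_EXCITATION = {(0, 0): '0', (0, 1): '1', (1, 0): '1', (1, 1): '0'}

# To go from Q(t) = 0 to Q(t+1) = 1, a JK flip-flop needs J = 1 (K is don't care):
print(JK_EXCITATION[(0, 1)])  # → ('1', 'X')
# A T flip-flop toggles, so T = 1 exactly when the state must change:
print(T_EXCITATION[(1, 0)])   # → 1
```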
Sequential Circuits
Design Procedure
For m flip-flops and n inputs, the state table will consist of m columns for the present
state, n columns for the inputs, and m columns for the next state. The number of rows
in the table will be up to 2^(m+n), one row for each binary combination of present state
and inputs.
Each flip-flop input equation specifies a logic diagram whose output must be
connected to one of the flip-flop inputs.
The three data inputs, A0, A1, and A2, are decoded into eight outputs, each output
representing one of the combinations of the three binary input variables.
The three inverters provide the complements of the inputs, and each of the eight AND
gates generates one of the binary combinations.
A particular application of this decoder is a binary-to-octal conversion. The input
variables represent a binary number and the outputs represent the eight digits of the
octal number system.
However, a 3-to-8-line decoder can be used for decoding any 3-bit code to provide
eight outputs, one for each combination of the binary code.
Commercial decoders include one or more enable inputs to control the operation of
the circuit. The decoder of the Figure has one enable input, E.
The decoder is enabled when E is equal to 1 and disabled when E is equal to 0. The
operation of the decoder can be clarified using the truth table listed in Table.
When the enable input E is equal to 0, all the outputs are equal to 0 regardless of the
values of the other three data inputs.
The three x's in the table designate don't-care conditions. When the enable input is
equal to 1, the decoder operates in a normal fashion.
For each possible input combination, there are seven outputs that are equal to 0 and
only one that is equal to 1.
The output variable whose value is equal to 1 represents the octal number equivalent
of the binary number that is available in the input data lines.
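As a sketch of the behaviour just described (hypothetical Python model; the enable convention E = 1 means enabled):

```python
# 3-to-8-line decoder with enable input E: exactly one output is 1 when enabled.
def decoder_3to8(a2, a1, a0, e):
    """Return outputs D0..D7; all 0 when the decoder is disabled (E = 0)."""
    outputs = [0] * 8
    if e == 1:
        outputs[(a2 << 2) | (a1 << 1) | a0] = 1   # binary-to-octal selection
    return outputs

print(decoder_3to8(1, 0, 1, 1))  # → [0, 0, 0, 0, 0, 1, 0, 0] (D5 = 1)
print(decoder_3to8(1, 0, 1, 0))  # disabled → all outputs 0
```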
Some decoders are constructed with NAND instead of AND gates. Since a NAND gate
produces the AND operation with an inverted output, it becomes more economical to
generate the decoder outputs in their complement form.
A 2-to-4-line decoder with an enable input constructed with NAND gates is shown in
Figure.
The circuit operates with complemented outputs and a complemented enable input
E. The decoder is enabled when E is equal to 0. As indicated by the truth table, only
one output is equal to 0 at any given time; the other three outputs are equal to 1.
The output whose value is equal to 0 represents the equivalent binary number in
inputs A1 and A0.
The circuit is disabled when E is equal to 1, regardless of the values of the other two
inputs.
Decoder Expansion
A technique called decoder expansion can be utilized to construct larger decoders out
of smaller ones.
For example, two 2-to-4-line decoders can be combined to construct a 3-to-8-line
decoder. The figure below shows a 3-to-8-line decoder constructed with two 2×4 decoders.
The above given Figure shows how the decoders with enable inputs can be connected
to form a larger decoder.
As you can see, two 2-to-4-line decoders are combined to achieve a 3-to-
8-line decoder.
The two least significant bits of the input are connected to both decoders.
The most significant bit is connected to the enable input of one decoder and through
an inverter to the enable input of the other decoder.
It is assumed that each decoder is enabled when its E input is equal to 1. When E is
equal to 0, the decoder is disabled and all its outputs are in the 0 level. When A2 = 0,
the upper decoder is enabled and the lower is disabled.
The lower decoder outputs become inactive with all outputs at 0. The outputs of the
upper decoder generate outputs D0 through D3, depending on the values of A1 and
A0 (while A2 = 0).
When A2= 1, the lower decoder is enabled and the upper is disabled. The lower
decoder output generates the binary equivalent D4, through D7 since these binary
numbers have a 1 in the A2 position.
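The expansion just described can be sketched in Python (illustrative model; the function names are assumptions):

```python
# Two 2-to-4-line decoders combined into a 3-to-8-line decoder.
def decoder_2to4(a1, a0, e):
    """2-to-4-line decoder with an active-high enable input E."""
    outputs = [0] * 4
    if e == 1:
        outputs[(a1 << 1) | a0] = 1
    return outputs

def decoder_3to8_from_2to4(a2, a1, a0):
    """A2 enables the lower decoder directly and the upper one via an inverter."""
    upper = decoder_2to4(a1, a0, 1 - a2)  # generates D0..D3 (active when A2 = 0)
    lower = decoder_2to4(a1, a0, a2)      # generates D4..D7 (active when A2 = 1)
    return upper + lower

print(decoder_3to8_from_2to4(1, 1, 0))  # → [0, 0, 0, 0, 0, 0, 1, 0] (D6 = 1)
```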
What is Encoder?
An encoder is a digital circuit that performs the inverse operation of a decoder. An
encoder has 2n (or less) input lines and n output lines.
The output lines generate the binary code corresponding to the input value. An
example of an encoder is the octal-to-binary encoder, whose truth table is given
below.
Inputs outputs
D7 D6 D5 D4 D3 D2 D1 D0 A2 A1 A0
0 0 0 0 0 0 0 1 0 0 0
0 0 0 0 0 0 1 0 0 0 1
0 0 0 0 0 1 0 0 0 1 0
0 0 0 0 1 0 0 0 0 1 1
0 0 0 1 0 0 0 0 1 0 0
0 0 1 0 0 0 0 0 1 0 1
0 1 0 0 0 0 0 0 1 1 0
1 0 0 0 0 0 0 0 1 1 1
The encoder can be implemented with OR gates whose inputs are determined directly
from the truth table.
Output A0 = 1 if the input octal digit is 1, 3, 5 or 7. Similar conditions apply for the other
two outputs.
Octal-to-binary Encoder
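The OR-gate implementation above can be sketched as follows (hypothetical Python model of the octal-to-binary truth table):

```python
# OR-gate implementation of the octal-to-binary encoder.
def octal_encoder(d):
    """d is a list [D0..D7] with exactly one input equal to 1."""
    a0 = d[1] | d[3] | d[5] | d[7]   # octal digit is 1, 3, 5 or 7
    a1 = d[2] | d[3] | d[6] | d[7]   # octal digit is 2, 3, 6 or 7
    a2 = d[4] | d[5] | d[6] | d[7]   # octal digit is 4, 5, 6 or 7
    return (a2, a1, a0)

print(octal_encoder([0, 0, 0, 0, 0, 1, 0, 0]))  # D5 active → (1, 0, 1)
```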
Select Output
S1 S0 Y
0 0 I0
0 1 I1
1 0 I2
1 1 I3
4-to-1 line Multiplexer
Input Output
D S0 S1 F0 F1 F2 F3
0 0 0 0 0 0 0
0 0 1 0 0 0 0
0 1 0 0 0 0 0
0 1 1 0 0 0 0
1 0 0 1 0 0 0
1 0 1 0 1 0 0
1 1 0 0 0 1 0
1 1 1 0 0 0 1
Applications of Multiplexer:
o It is used to connect two or more sources to a single destination among
computer units.
o It is used in digital circuits for signal control and data routing.
o It is also useful in operation sequencing.
o It is useful for constructing a common bus system.
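The select table above can be modelled with a short sketch (hypothetical Python; S1 S0 pick which data input reaches the output Y):

```python
# 4-to-1-line multiplexer: the select lines route one data input to Y.
def mux_4to1(i, s1, s0):
    """i is the list of data inputs [I0, I1, I2, I3]."""
    return i[(s1 << 1) | s0]

print(mux_4to1(['I0', 'I1', 'I2', 'I3'], 1, 0))  # → I2
```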
A 4-bit register is shown in the figure below. A clock transition applied to the C inputs
of the register will load all four inputs I0 through I3 in parallel.
Shift Registers
A register capable of shifting its binary information in one or both directions is called a
shift register.
Shift registers are constructed by connecting flip-flops in cascade, where the output of
one flip-flop is connected to the input of the next flip-flop.
All flip-flops receive common clock pulses that initiate the shift from one stage to the
next.
A serial input shift register has a single external input (called the serial input) entering
an outermost flip-flop. Each remaining flip-flop uses the output of the previous flip-
flop as its input, with the last flip-flop producing the external output (called the serial
output).
A register capable of shifting in one direction is called a unidirectional shift register.
A register that can shift in both directions is called a bi-directional shift register.
The most general shift register has the following capabilities:
An input for clock pulses to synchronize all operations.
A shift-right operation and a serial input line associated with the shift-right.
A shift-left operation and a serial input line associated with the shift-left.
A parallel load operation and n input lines associated with the parallel transfer.
N parallel output lines.
A control state that leaves the information in the register unchanged even
though clock pulses are applied continuously.
A mode control to determine which type of register operation to perform.
The simplest possible shift register is one that uses only flip-flops, as shown in the
figure below.
A 4-bit bidirectional shift register with parallel load is shown in figure below. Each
stage consists of a D flip-flop and a 4X1 MUX.
The two selection inputs S1 and S0 select one of the MUX data inputs for the D flip-
flop. The selection lines control the mode of operation of the register according to the
function table shown in table below.
When the mode control S1S0 = 00, data input 0 of each MUX is selected.
This condition forms a path from the output of each flip-flop into the input of the
same flip-flop.
The next clock transition transfers into each flip-flop the binary value it held previously,
and no change of state occurs. When S1S0 = 01, the terminal marked 1 in each MUX has
a path to the D input of the corresponding flip-flop.
This causes a shift-right operation, with the serial input data transferred into flip-flop
A0 and the content of each flip-flop Ai-1 transferred into flip-flop Ai for i = 1, 2, 3. When
S1S0 = 10 a shift-left operation results, with the other serial input data going into flip-
flop A3 and the content of flip-flop Ai+1 transferred into flip-flop Ai for i = 0, 1, 2. When
S1S0 = 11, the binary information from each input I0 through I3 is transferred into the
corresponding flip-flop, resulting in a parallel load operation.
In the diagram, the shift-right operation shifts the contents of the register in the down
direction while the shift left operation causes the contents of the register to shift in
the upward direction.
Shift registers are often used to interface digital systems situated remotely from each
other. For example, suppose that it is necessary to transmit an n-bit quantity between
two points.
If the distance between the source and the destination is too far, it will be expensive
to use n lines to transmit the n bits in parallel.
It may be more economical to use a single line and transmit the information serially
one bit at a time.
The transmitter loads the n-bit data in parallel into a shift register and then transmits
the data from the serial output line.
The receiver accepts the data serially into a shift register through its serial input line.
When the entire n bits are accumulated they can be taken from the outputs of the
register in parallel.
Thus the transmitter performs a parallel-to-serial conversion of data and the receiver
converts the incoming serial data back to parallel data.
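The parallel-to-serial and serial-to-parallel transfer described above can be sketched as follows (illustrative Python; by assumption the least significant bit is transmitted first):

```python
# Serial transmission between two shift registers.
def transmit(bits):
    """Transmitter: parallel load, then shift out one bit at a time."""
    register = list(bits)
    while register:
        yield register.pop()          # serial output, LSB first

def receive(serial, n):
    """Receiver: shift each incoming bit in, then read all n bits in parallel."""
    register = []
    for bit in serial:
        register.insert(0, bit)       # shift as each bit arrives
    assert len(register) == n
    return register

data = [1, 0, 1, 1]
print(receive(transmit(data), 4))     # → [1, 0, 1, 1]
```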
UNIT 3: DATA REPRESENTATION
What is Number System?
A number system is a set of numbers, together with one or more operations such as
addition or multiplication.
There are mainly two types of number system
o Positional number system
o Non positional number system
NON POSITIONAL NUMBER SYSTEM
o In this system each symbol represents the same value regardless of its position, so it
is difficult to perform arithmetic operations with such numbers.
POSITIONAL NUMBER SYSTEM
o In this system each digit is identified by the position where it is placed, i.e.
the value of a number depends on the positions of its digits. In this kind of
system it is possible to perform arithmetic operations very easily.
o Mainly 4 types of positional number systems are commonly used.
Binary
Octal
Decimal
HexaDecimal
Binary Number System
o This number system is used in computers and digital systems.
o In the binary number system the base is 2.
o Each digit entered in the system is represented in the form of 0s and 1s.
Octal Number Systems
o In this system 8 different symbols are used to represent the
numbers.
o The symbols 0 to 7 are used.
o In the octal number system the base is 8.
o In this number system place values increase from right to left as 1, 8, 64, 512, 4096…
Decimal Number System
o In this system 10 different symbols are used to represent the
numbers.
o The symbols 0,1,2,3,4,5,6,7,8,9 are used.
o In the decimal number system the base is 10.
o In this number system each position of a number is given a weight.
o For example: 4123 = 4×10^3 + 1×10^2 + 2×10^1 + 3×10^0
input        Multiplication of A*B
A B
0 0          0
0 1          0
1 0          0
1 1          1

e.g. 1010 × 1001

   1010
 × 1001
 ------
   1010
  0000
 0000
1010
-------
1011010

The answer is (1011010)
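The worked example can be checked with a short shift-and-add sketch (illustrative Python, not part of the original notes):

```python
# Shift-and-add binary multiplication: one shifted partial product per 1 bit.
def bin_multiply(a, b):
    product = 0
    for position, bit in enumerate(reversed(b)):
        if bit == '1':
            product += int(a, 2) << position   # shifted partial product
    return bin(product)[2:]

print(bin_multiply('1010', '1001'))  # → 1011010 (10 × 9 = 90)
```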
Division
Table for binary division is given as under:
input Division of A / B
A B
0 0 Undefined
0 1 0
1 0 Undefined
1 1 1
o The steps for binary division are:
o Start from the left of the dividend.
o Perform subtraction in which the divisor is subtracted from the dividend
o If subtraction is possible put a 1 in the quotient and subtract the divisor from
the corresponding digits of the dividend else put a 0 in the quotient
o Bring down the next digit to the right of the remainder.
o Repeat step 2 till there are no more digits left to bring down from the
dividend.
o e.g.
o 100001 / 110
     0101  (quotient)
110 )100001
     110        (110 > 100, quotient bit 0)
     ----
     1000
      110       (subtract, quotient bit 1)
     ----
      100       (100 < 110, quotient bit 0; bring down the next digit)
     ----
     1001
      110       (subtract, quotient bit 1)
     ----
       11  (remainder)
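The same long division can be checked with a restoring-division sketch (illustrative Python, not part of the original notes):

```python
# Restoring long division on binary strings, digit by digit.
def bin_divide(dividend, divisor):
    q, remainder = '', 0
    for digit in dividend:
        remainder = remainder * 2 + int(digit)   # bring down the next digit
        if remainder >= int(divisor, 2):
            q += '1'                             # subtraction is possible
            remainder -= int(divisor, 2)
        else:
            q += '0'
    return q.lstrip('0') or '0', bin(remainder)[2:]

print(bin_divide('100001', '110'))  # → ('101', '11'): 33 ÷ 6 = 5 remainder 3
```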
Write a note on Floating Point Representation.
A number which has both an integer part and a fractional part is called a Real or
Floating point number.
For e.g. 2.365, 78.738789, 6.5643 etc. are examples of floating point numbers.
The floating point representation of a number has two parts.
The first part represents a signed, fixed-point number called the Mantissa.
The second part designates the position of the decimal (or binary) point; this part is known
as the Exponent.
The fixed point mantissa may be integer or fractional.
For example: the decimal number +6132.789 is represented in floating point with exponent
and fraction as follows:
The value of the exponent indicates that the actual position of the decimal point is 4 positions
to the right of the indicated decimal point in the fraction.
We can also represent this number as +0.6132789 × 10^4, like scientific notation.
In general floating point is represented in the form of m × re.
where
m = mantissa
r = radix
e = exponent.
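A small Python check of the m × r^e idea (illustrative only; `math.frexp` is used for the radix-2 case):

```python
import math

# The decimal example above: +6132.789 as mantissa × radix^exponent (r = 10).
mantissa, exponent = +0.6132789, 4
value = mantissa * 10 ** exponent
assert abs(value - 6132.789) < 1e-9

# The same idea with radix 2: math.frexp returns (m, e) with value = m × 2**e
# and 0.5 <= |m| < 1.
m, e = math.frexp(20.0)
print(m, e)  # → 0.625 5   (20 = 0.625 × 2^5)
```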
An error detection code is a binary code that detects error during information
transmission.
An error detection code cannot correct the error; it just gives an indication that an
error is present.
There are various techniques to detect error but the most common is parity bit.
Parity Bit
A parity bit is an extra bit included with a binary message to make the total number of
1’s either odd or even.
Generally 2 techniques, even parity and odd parity, are used.
With even parity the total number of 1's (including the parity bit) is even, while with odd
parity the total number of 1's is odd.
For e.g. a three bit message has two possible parity bits, as shown in the table below.
Message P(odd) P(even)
000 1 0
001 0 1
010 0 1
011 1 0
100 0 1
101 1 0
110 1 0
111 0 1
During transmission of information from one location to another, the parity bit is handled
as follows:
o At sending end, the message transferred to parity generator.
o The parity generator generates required parity.
o Then the message transferred to destination.
o At the destination the parity checker checks the adopted parity.
o An error is detected if the checked parity doesn't conform to the adopted
parity.
The parity generator & checker networks are logic circuits constructed with exclusive-OR
gates.
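The generator/checker pair can be sketched as follows (illustrative Python; the message is given as a bit string, and the function names are assumptions):

```python
# Parity generator and checker for a binary message.
def parity_bit(message, even=True):
    """Return the parity bit that makes the total number of 1s even (or odd)."""
    ones = message.count('1')
    return ones % 2 if even else 1 - ones % 2

def check(message, bit, even=True):
    """Detect a single-bit error: True means the adopted parity is confirmed."""
    return (message.count('1') + bit) % 2 == (0 if even else 1)

p = parity_bit('011')          # two 1s → even parity bit is 0
print(p)                       # → 0
print(check('011', p))         # → True
print(check('001', p))         # a bit flipped in transit → False
```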
UNIT 4: CENTRAL PROCESSING UNIT
Write a note on Central Processing Unit.
REGISTER SET
CU
ALU
The control unit, register sets and ALU are the major three components of central
processing unit.
The main memory is closely connected to the central processing unit, which fetches its
instructions and data from it.
The CPU of a small computer contains a single microprocessor, while the CPU of a large
computer may contain multiple microprocessors.
The microprocessor contains two major parts: the CU (Control Unit) and the ALU
(Arithmetic and Logic Unit).
Control Unit: it controls the entire operation of the system. It also controls the input
and output devices.
Arithmetic & Logical Unit: it performs the mathematical as well as logical operations
on the data. This unit is responsible for generating the output for the inputted
instructions.
Register sets: a register set contains more than one register. The registers are
basically used for storing intermediate results during the execution of an
instruction. Various registers perform various tasks.
Explain General Register Organization.
The set of registers in a computer is connected to the ALU using buses and
multiplexers.
A 14-bit control word specifies two source registers (SELA & SELB), a destination
register (SELD), and an operation (OPR).
The registers can be specified using three bits each as follows:
There are 14 binary selection inputs in the unit, and their combined value specifies
a control word.
The figure of the 14-bit control word is as given below:
The concept of program interrupt is used to handle a variety of problems that arise out
of the normal program sequence.
The interrupt facility is useful in a multiprogramming environment, when two or more
programs reside in the memory at the same time.
The function of the interrupt facility is to take care of data transfers of one or more
programs while another program is currently being executed.
TYPES OF INTERRUPT :
External interrupt :
o External interrupts are caused by external events; they come from input-
output devices, timing devices, circuits monitoring the power supply, or any other
external sources.
o External interrupts are asynchronous and they are independent of the
program being executed at the time.
Internal interrupt:
o An internal interrupt is generated by some exceptional condition caused by the
program itself.
o This type of interrupt arises from illegal or erroneous use of an instruction or
data. Such interrupts are known as traps.
o Examples of conditions that cause internal interrupts are register overflow, stack
overflow, etc.
o These interrupts are synchronous with the program: if the program is rerun, the
internal interrupts will occur in the same place each time.
Software interrupt
o A software interrupt is initiated by executing a special instruction (a supervisor
call) that acts like an interrupt; it is commonly used by a program to request
services from the operating system.
A stack is an ordered set of elements, only one of which can be accessed at a time
The point of access is called the top of the stack
The number of elements in the stack, or length of the stack, is a variable
Items may be added to or deleted from the top of the stack, so the stack is also known
as a Push-Down List or Last-In First-Out (LIFO) list
A useful feature that is included in the CPU of most computers is a stack, or Last-In
First-Out (LIFO) list
A stack is a storage device that stores information in such a manner that the item
stored last is the first item retrieved.
The register that holds the address for the stack is called a Stack Pointer (SP) because
its value always points at the top item in the stack
The two operation of stack are PUSH and POP
The operation of insertion is called PUSH
The operation of deletion is called POP
There are two types of stack operation like REGISTER STACK & MEMORY STACK
REGISTER STACK
[Figure: organization of a 64-word register stack (addresses 0 to 63) with FULL and
EMPTY flip-flops; items X, Y, Z are stored at addresses 1, 2, 3 and SP points at address 3.]
The figure shows the organization of a 64-word register stack.
As shown in the figure, three items, X, Y and Z, are placed in the stack; Z is on top of
the stack, so SP is now 3.
SP means Stack pointer.
DR means Data Register
Two operations, PUSH and POP, are used with a register stack.
PUSH
If the stack is not full then a new item is inserted with a push operation
If the stack is full and the user tries to insert a new item into the stack, a Stack
Overflow error occurs.
POP
An item is deleted from the stack if the stack is not empty.
If the stack is empty and the user tries to delete an item from the stack, a Stack
Underflow error occurs.
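The PUSH/POP rules above, including overflow and underflow detection, can be sketched as follows (illustrative Python; the class name is an assumption):

```python
# A 64-word register stack with overflow/underflow detection.
class RegisterStack:
    def __init__(self, size=64):
        self.words, self.size, self.sp = [], size, 0  # SP counts items on the stack

    def push(self, item):
        if self.sp == self.size:
            raise OverflowError("stack overflow: the stack is full")
        self.words.append(item)
        self.sp += 1

    def pop(self):
        if self.sp == 0:
            raise IndexError("stack underflow: the stack is empty")
        self.sp -= 1
        return self.words.pop()

stack = RegisterStack()
for item in ('x', 'y', 'z'):
    stack.push(item)
print(stack.sp)     # → 3, as in the figure
print(stack.pop())  # → z, the last item pushed is the first retrieved (LIFO)
```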
MEMORY STACK
A stack can exist as a stand-alone unit or it can be implemented in the RAM attached to
the CPU
Three segments are used in the memory stack organization: Program, Data
and Stack.
The Program Counter (PC) points at the address of the next instruction in the program
The address Register (AR) points at an array of data
The Stack Pointer (SP) points at the top of the stack
These three segments are connected to a common address bus
Two operations, PUSH and POP, are used with a memory stack.
PUSH
If the memory stack is not full then a new item is inserted with a push operation
If the memory stack is full and the user tries to insert a new item, a Memory Overflow
(Full Memory) error occurs.
POP
An item is deleted from the memory stack if it is not empty.
If the memory stack is empty and the user tries to delete an item, a Memory
Underflow (Empty Memory) error occurs.
RPN has a number of advantages over infix notation for expressing algebraic formulas.
Any formula can be expressed without parentheses.
It is convenient for evaluating formulas on a computer with a stack.
Infix operators have precedence rules, which are undesirable for machine evaluation.
This disadvantage is eliminated by reverse Polish notation.
The main advantage of reverse Polish notation is that any formula can be expressed
without parentheses.
This is possible by combining reverse Polish notation with a register stack.
Hence the stack is very useful for handling long and complex problems involving
arithmetic expressions.
This procedure is employed in some electronic calculators and computers.
In this method the arithmetic expression is first converted to reverse Polish notation
and then the operands are pushed onto the stack in the order in which they appear.
Let us discuss one example for more clarification of this procedure. Consider the
arithmetic expression
(6*3) + (5*9)
In reverse Polish notation it is expressed as
63*59*+
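Evaluating this expression with a stack, as the procedure describes, can be sketched in Python (illustrative only; operands are pushed in order and each operator pops two):

```python
# Stack evaluation of a reverse Polish expression.
def evaluate_rpn(tokens):
    stack = []
    for token in tokens:
        if token.isdigit():
            stack.append(int(token))       # operands are pushed in order
        else:                              # operator: pop two, push the result
            b, a = stack.pop(), stack.pop()
            stack.append({'+': a + b, '-': a - b, '*': a * b}[token])
    return stack.pop()

print(evaluate_rpn(list('63*59*+')))  # → 63
```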
SHORT NOTE: ACCUMULATOR REGISTER
Some processor units set one register apart from all the others; this type of register is
called an Accumulator Register
This register is also called the AC or A register.
The name of this register is derived from the arithmetic addition process, in which
results are accumulated.
[Figure: input data and a selected B source from the processor or memory unit feed the
ALU; the ALU result is loaded into the accumulator register (A), from which output data
is taken.]
An Input – Output Processor (IOP) is a processor with direct memory access capability
that is used to control input – output operations
An Input – Output Processor (IOP) controls sending and receiving data between Input
– Output Process
An Input – Output Processor (IOP) finds and corrects sending and receiving errors
between Input – Output Process
An Input – Output Processor (IOP) assembles and disassembles messages between
Input – Output Process
An Input – Output operation is initiated by the CPU but carried out by the IOP
In the figure, we can see that the memory bus connects the CPU and the IOP for
communication
The Input/Output (I/O) bus connects the Input – Output Processor (IOP) with the
peripheral devices (PD).
An Input – Output Processor (IOP) communicates with the peripheral device through
the I/O bus and with the memory through the memory bus.
When the IOP needs to transfer data to or from memory, it steals a memory cycle from
the CPU and then transfers the data to or from memory.
In the computer, the CPU is the master processor and the IOP is a slave processor.
The CPU sends instructions to the IOP over the communication path.
The IOP places a status word in a specified memory location.
The CPU checks the status word; if the status word is correct, then the CPU informs the
IOP to proceed with the transfer.
After that, the IOP reads and executes commands from the memory located at the
specified location.
When the IOP completes the input-output data transfer, it informs the CPU that the
data transfer is completed.
UNIT 5: INPUT OUTPUT ORGANIZATION
WRITE A NOTE ON MEMORY BUSES.
A collection of wires through which data is transmitted from one part of a computer to
another is known as a Bus
Buses can carry data information, address information and special instructions from
one device to another.
Many types of buses are used, such as:
o Internal bus
o External bus
o System bus
o Instruction bus
o Memory bus
o Data bus
o Address bus
o Control bus
o Input output bus
Internal Bus
Internal bus is placed inside the processor.
Internal bus is used to send information between processor registers and the internal
components of the processor
External Bus
External bus is placed outside the processor.
External bus is used to send information from one part of the computer to another
part of the computer
System Bus
System bus is used to connect the processor with main memory
Instruction Bus
Instruction bus is used to fetch the instruction from main memory
Memory Bus
Memory bus is used to carry memory location.
Memory bus is used to connect system memory with the processor
Data Bus
Data bus is used to send data from one part of the computer to another part of the
computer
Data bus carries data to the processor, memory and other parts of the computer
Data bus connects the processor, memory and other parts of the computer so that they
can communicate with each other for transferring data
Address Bus
Address bus is used to send addresses from one part of the computer to another part
of the computer
Address bus carries addresses to the processor, memory and other parts of the
computer
Control Bus
Control bus is used to control the data and information of the data bus and address
bus
Input – Output Bus
Input – Output bus is used to connect between processor and input – output parts for
data transmission.
Input – Output bus contains data line, address line and control line for data
transmission
DMA Transfer
In the DMA process, when an input-output device needs to transfer data to or from
memory, it sends a request to the DMA controller; the controller then activates the
CPU's Bus Request (BR) input, and the CPU responds by activating Bus Grant (BG).
Here 1 means Enable and 0 means Disable
Bus Request (BR) is used to pass the request to the CPU.
After that, the CPU accepts the request on the Bus Request (BR) signal and enables the
Bus Grant (BG) signal.
Once the bus is granted, the DMA controller can read and write memory by using the
Read (RD) and Write (WR) signals.
The DMA controller places the address on the Address Bus (A BUS) and transfers the
data on the Data Bus (D BUS).
In this way the data is transferred from one place to another through the DMA
controller.
When the data transfer is completed, the Bus Request and Bus Grant signals become
disabled.
DMA method allows the Input / Output devices to directly communicate with the main
memory of a computer.
The DMA method directly reads and writes data from the memory by using the memory bus
DMA is a technique for moving data directly between main memory and a peripheral of
the computer without the need of the CPU
DMA Controller contains 3 types of register like Address Register, Word Count
Register and Control Register.
ADDRESS REGISTER
Address register contains the address of memory where the data from the input
device needs to be stored or from where the data for the output device needs to be
fetched
CPU gives this address to the data bus.
The address is transferred to the Address Register from the Data Bus through the
Internal Bus.
Address in this register is incremented after transferring each word to or from
memory
WORD COUNT REGISTER
The Word Count Register stores the number of words that need to be read from or
written into memory
It is decremented after each word transfer.
When the value in this register becomes zero, it is assumed that data transfer has
completed
CONTROL REGISTER
The control register specifies the mode of data transfer, i.e. whether it is a read
operation or a write operation
The CPU reads the data through the data bus after setting the Read input to 1
The CPU writes the data through the data bus after setting the Write input to 1
Universal Gates
Lesson Objectives:
In addition to AND, OR, and NOT gates, other logic gates like NAND and NOR are
also used in the design of digital circuits.
The small circle (bubble) at the output of the graphic symbol of a NOT gate is
formally called a negation indicator and designates the logical complement.
NAND Gate:
The NAND gate represents the complement of the AND operation. Its name is an
abbreviation of NOT AND.
The graphic symbol for the NAND gate consists of an AND symbol with a bubble on
the output, denoting that a complement operation is performed on the output of the
AND gate.
The truth table and the graphic symbol of NAND gate is shown in the figure.
The truth table clearly shows that the NAND operation is the complement of the
AND.
NOR Gate:
The NOR gate represents the complement of the OR operation. Its name is an
abbreviation of NOT OR.
The graphic symbol for the NOR gate consists of an OR symbol with a bubble on the
output, denoting that a complement operation is performed on the output of the OR
gate.
The truth table and the graphic symbol of NOR gate is shown in the figure.
The truth table clearly shows that the NOR operation is the complement of the OR.
Universal Gates:
A universal gate is a gate which can implement any Boolean function without need to
use any other gate type.
In practice, this is advantageous since NAND and NOR gates are economical and
easier to fabricate and are the basic gates used in all IC digital logic families.
A NAND gate can implement the NOT function in two ways:
1. All NAND input pins are connected to the input signal A, which gives an output A'.
2. One NAND input pin is connected to the input signal A while all other input pins
are connected to logic 1. The output will be A'.
Thus, the NAND gate is a universal gate since it can implement the AND, OR
and NOT functions.
Similarly, a NOR gate can implement the NOT function in two ways:
1. All NOR input pins are connected to the input signal A, which gives an output A'.
2. One NOR input pin is connected to the input signal A while all other input pins are
connected to logic 0. The output will be A'.
Thus, the NOR gate is a universal gate since it can implement the AND, OR and
NOT functions.
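The universality argument for NAND can be demonstrated directly (illustrative Python, modelling each gate as a function on bits):

```python
# NOT, AND and OR built from NAND alone.
def nand(a, b):
    return 1 - (a & b)

def not_(a):          # all NAND inputs tied to the same signal
    return nand(a, a)

def and_(a, b):       # NAND followed by an inverter
    return not_(nand(a, b))

def or_(a, b):        # De Morgan: a + b = (a'·b')'
    return nand(not_(a), not_(b))

pairs = [(0, 0), (0, 1), (1, 0), (1, 1)]
print([and_(a, b) for a, b in pairs])  # → [0, 0, 0, 1]
print([or_(a, b) for a, b in pairs])   # → [0, 1, 1, 1]
```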
Equivalent Gates:
The shown figure summarizes important cases of gate equivalence. Note that bubbles
indicate a complement operation (inverter).
Two NOT gates in series are the same as a buffer because they cancel each other, as
A'' = A.
Two-Level Implementations:
We have seen before that Boolean functions in either SOP or POS forms can be
implemented using 2-Level implementations.
For SOP forms AND gates will be in the first level and a single OR gate will be in the
second level.
For POS forms OR gates will be in the first level and a single AND gate will be in the
second level.
Note that using inverters to complement input variables is not counted as a level.
We will show that SOP forms can be implemented using only NAND gates, while
POS forms can be implemented using only NOR gates.
Introducing two successive inverters at the inputs of the AND gate results in the
shown equivalent implementation. Since two successive inverters on the same line
will not have an overall effect on the logic as it is shown before.
By associating one of the inverters with the output of the first level OR gates and the
other with the input of the AND gate, it is clear that this implementation is reducible
to 2-level implementation where both levels are NOR gates as shown in Figure.
There are some other types of 2-level combinational circuits which are
• NAND-AND
• AND-NOR,
• NOR-OR,
• OR-NAND
AND-NOR functions:
Example 3: Implement the following function
F = XZ + Y Z + X YZ or
F = XZ + Y Z + XYZ
Instruction Code
An instruction code is a group of bits that instruct the computer to perform a specific operation.
Operation Code
The operation code of an instruction is a group of bits that define such operations as add,
subtract, multiply, shift, and complement. The number of bits required for the operation code of
an instruction depends on the total number of operations available in the computer. The
operation code must consist of at least n bits for a given 2^n (or fewer) distinct operations.
Accumulator (AC)
Computers that have a single processor register usually assign to it the name
accumulator and label it AC. The operation is performed with the memory operand and the
content of AC.
UNIT 2: BASIC COMPUTER ORGANIZATION AND DESIGN
If we store each instruction code in one 16-bit memory word, we have available four bits
for operation code (abbreviated opcode) to specify one out of 16 possible operations,
and 12 bits to specify the address of an operand.
The control reads a 16-bit instruction from the program portion of memory.
It uses the 12-bit address part of the instruction to read a 16-bit operand from the data
portion of memory.
It then executes the operation specified by the operation code.
Computers that have a single-processor register usually assign to it the name
accumulator and label it AC.
If an operation in an instruction code does not need an operand from memory, the rest
of the bits in the instruction can be used for other purposes.
For example, operations such as clear AC, complement AC, and increment AC operate on
data stored in the AC register. They do not need an operand from memory. For these
types of operations, the second part of the instruction code (bits 0 through 11) is not
needed for specifying a memory address and can be used to specify other operations for
the computer.
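The field layout just described (one mode bit, a 4-bit opcode region of which bits 14-12 carry the opcode, and a 12-bit address) can be sketched as a simple extraction routine. This is an illustration only; the function name is our own:

```python
def decode(instruction: int):
    """Split a 16-bit basic-computer instruction word into its fields:
    bit 15 = indirect bit I, bits 14-12 = opcode, bits 11-0 = address."""
    i_bit   = (instruction >> 15) & 0x1
    opcode  = (instruction >> 12) & 0x7
    address = instruction & 0xFFF
    return i_bit, opcode, address

# 0x1457 has I = 0, opcode = 001, address = 0x457;
# setting bit 15 (0x9457) turns the same word into an indirect reference.
assert decode(0x1457) == (0, 1, 0x457)
assert decode(0x9457) == (1, 1, 0x457)
```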
Direct and Indirect addressing of basic computer.
When the second part of an instruction specifies the address of an operand, the
instruction is said to have a direct address.
In Indirect address, the bits in the second part of the instruction designate an address of
a memory word in which the address of the operand is found.
One bit of the instruction code can be used to distinguish between a direct and an
indirect address.
It consists of a 3-bit operation code, a 12-bit address, and an indirect address mode bit
designated by I.
The mode bit is 0 for a direct address and 1 for an indirect address.
A direct address instruction is shown in Figure 2.2. It is placed in address 22 in memory.
The I bit is 0, so the instruction is recognized as a direct address instruction.
The opcode specifies an ADD instruction, and the address part is the binary equivalent of
457.
The control finds the operand in memory at address 457 and adds it to the content of
AC.
The instruction in address 35 shown in Figure 2.3 has a mode bit I = 1, recognized as an
indirect address instruction.
The address part is the binary equivalent of 300.
The control goes to address 300 to find the address of the operand. The address of the
operand in this case is 1350. The operand found in address 1350 is then added to the
content of AC.
2|Page
UNIT-II
Unit 2 – Basic Computer Organization and Design
The indirect address instruction needs two references to memory to fetch an operand.
1. The first reference is needed to read the address of the operand
2. Second reference is for the operand itself.
The memory word that holds the address of the operand in an indirect address
instruction is used as a pointer to an array of data.
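The one-extra-memory-reference cost of indirect addressing can be sketched as follows (Python, illustrative; the memory image mirrors the Figure 2.3 example in which location 300 holds 1350, and the operand value 7 is a made-up placeholder):

```python
def effective_address(i_bit: int, address: int, memory: dict) -> int:
    """Return the effective address of the operand.
    Direct (I = 0): the address field itself.
    Indirect (I = 1): memory[address] holds the operand's address."""
    return memory[address] if i_bit else address

memory = {300: 1350, 1350: 7}            # 7 is a hypothetical operand value
assert effective_address(0, 300, memory) == 300    # direct
assert effective_address(1, 300, memory) == 1350   # indirect: one extra read
```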
Instruction format: bit 15 = I (mode bit), bits 14–12 = opcode, bits 11–0 = address.
[Figures 2.2 and 2.3: memory diagrams. In Figure 2.2 (direct address), the operand at address 457 is added to AC. In Figure 2.3 (indirect address), location 300 holds 1350, and the operand at address 1350 is added to AC.]
The data register (DR) holds the operand read from memory.
The accumulator (AC) register is a general purpose processing register.
The instruction read from memory is placed in the instruction register (IR).
The temporary register (TR) is used for holding temporary data during the processing.
The memory address register (AR) has 12 bits.
The program counter (PC) also has 12 bits and it holds the address of the next instruction
to be read from memory after the current instruction is executed.
Instruction words are read and executed in sequence unless a branch instruction is
encountered. A branch instruction calls for a transfer to a nonconsecutive instruction in
the program.
Two registers are used for input and output. The input register (INPR) receives an 8-bit
character from an input device. The output register (OUTR) holds an 8-bit character for
an output device.
Five registers have three control inputs: LD (load), INR (increment), and CLR (clear). Two
registers have only a LD input.
AR must always be used to specify a memory address; therefore memory address is
connected to AR.
The 16 inputs of AC come from an adder and logic circuit. This circuit has three sets of
inputs.
1. Set of 16-bit inputs come from the outputs of AC.
2. Set of 16-bits come from the data register DR.
3. Set of 8-bit inputs come from the input register INPR.
The result of an addition is transferred to AC and the end carry-out of the addition is
transferred to flip-flop E (extended AC bit).
The clock transition at the end of the cycle transfers the content of the bus into the
designated destination register and the output of the adder and logic circuit into AC.
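The behaviour of the adder and logic circuit on an addition (16-bit sum to AC, end carry to E) can be sketched numerically. This is an illustration under our own naming, not the actual hardware description:

```python
MASK16 = 0xFFFF

def add_to_ac(ac: int, dr: int):
    """16-bit addition as performed by the adder and logic circuit:
    the low 16 bits go to AC, the end carry goes to flip-flop E."""
    total = ac + dr
    return total & MASK16, (total >> 16) & 0x1  # (AC, E)

print(add_to_ac(0xFFFF, 1))   # (0, 1): the sum wraps and the end carry sets E
print(add_to_ac(0x1234, 2))   # (4662, 0): no carry out of bit 15
```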
The timing signal labeled T0 in the diagram will trigger only those registers whose control
inputs are connected to timing signal T0.
SC is incremented with every positive clock transition, unless its CLR input is active.
This produces the sequence of timing signals T0, T1, T2, T3, T4, and so on. If SC is not
cleared, the timing signals will continue with T5, T6, up to T15 and back to T0.
[Timing diagram: clock waveform with timing signals T0 through T4, decoder output D3, and the CLR SC pulse generated by D3T4.]
The last three waveforms show how SC is cleared when D3T4 = 1. Output D3 from the
operation decoder becomes active at the end of timing signal T2. When timing signal T4
becomes active, the output of the AND gate that implements the control function D3T4
becomes active.
This signal is applied to the CLR input of SC. On the next positive clock transition the
counter is cleared to 0. This causes timing signal T0 to become active instead of T5,
which would have been active if SC were incremented instead of cleared.
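The increment-or-clear behaviour of the sequence counter can be sketched as follows (Python, illustrative; the class name is our own and the counter is modelled as a 4-bit counter producing T0-T15):

```python
class SequenceCounter:
    """4-bit sequence counter: increments on every positive clock
    transition unless its CLR input is active, in which case it
    returns to 0 (timing signal T0)."""
    def __init__(self):
        self.sc = 0
    def clock(self, clr: bool) -> int:
        self.sc = 0 if clr else (self.sc + 1) % 16
        return self.sc

sc = SequenceCounter()
for _ in range(4):          # advance through T1, T2, T3, T4
    sc.clock(clr=False)
print(sc.sc)                # 4: timing signal T4 is active
print(sc.clock(clr=True))   # 0: cleared (e.g. D3T4 = 1), next signal is T0, not T5
```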
Instruction cycle
A program residing in the memory unit of the computer consists of a sequence of
instructions. In the basic computer each instruction cycle consists of the following
phases:
1. Fetch an instruction from memory.
2. Decode the instruction.
3. Read the effective address from memory if the instruction has an indirect address.
4. Execute the instruction.
After step 4, the control goes back to step 1 to fetch, decode and execute the next
instruction.
This process continues unless a HALT instruction is encountered.
The flowchart presents an initial configuration for the instruction cycle and shows how
the control determines the instruction type after the decoding.
If D7 = 1, the instruction must be a register-reference or input-output type. If D7 = 0, the
operation code must be one of the other seven values 000 through 110, specifying a memory-
reference instruction. Control then inspects the value of the first bit of the instruction,
which is now available in flip-flop I.
If D7 = 0 and I = 1, we have a memory-reference instruction with an indirect address. It is
then necessary to read the effective address from memory.
The three instruction types are subdivided into four separate paths. The selected
operation is activated with the clock transition associated with timing signal T3. This can
be symbolized as follows:
D’7 I T3: AR ← M[AR]
D’7 I’ T3: Nothing
D7 I’ T3: Execute a register-reference instruction
D7 I T3: Execute an input-output instruction
When a memory-reference instruction with I = 0 is encountered, it is not necessary to do
anything since the effective address is already in AR.
However, the sequence counter SC must be incremented when D’7T3 = 1, so that the
execution of the memory-reference instruction can be continued with timing variable T4.
A register-reference or input-output instruction can be executed with the clock transition
associated with timing signal T3. After the instruction is executed, SC is cleared to 0 and
control returns to the fetch phase with T0 = 1. SC is either incremented or cleared to 0
with every positive clock transition.
Register reference instruction.
When a register-reference instruction is decoded, bit D7 is set to 1.
Each control function needs the Boolean relation D7 I' T3.
Instruction format: bits 15–12 = 0111 (I = 0, opcode = 111), bits 11–0 specify the register operation.
These 12 bits are available in IR (0-11). They were also transferred to AR during time T2.
These instructions are executed at timing cycle T3.
The first seven register-reference instructions perform clear, complement, circular shift,
and increment microoperations on the AC or E registers.
The next four instructions cause a skip of the next instruction in sequence when a stated
condition is satisfied.
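A few of the register-reference microoperations on AC and E can be sketched as follows (Python, illustrative; the mnemonics CLA, CLE, CMA, CME, INC are the conventional basic-computer names, and the function is our own):

```python
MASK16 = 0xFFFF

def register_reference(op: str, ac: int, e: int):
    """Sketch of some register-reference microoperations:
    CLA clears AC, CLE clears E, CMA complements AC,
    CME complements E, INC increments AC (wrapping at 16 bits)."""
    if op == "CLA":
        return 0, e
    if op == "CLE":
        return ac, 0
    if op == "CMA":
        return ac ^ MASK16, e
    if op == "CME":
        return ac, e ^ 1
    if op == "INC":
        return (ac + 1) & MASK16, e
    raise ValueError(f"unknown mnemonic {op}")

assert register_reference("CMA", 0x0F0F, 0) == (0xF0F0, 0)  # complement AC
assert register_reference("INC", 0xFFFF, 1) == (0x0000, 1)  # AC wraps to 0
```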
Memory-reference instructions
The effective address of the instruction is in the address register AR and was placed
there during timing signal T2 when I = 0, or during timing signal T3 when I = 1.
The execution of the memory-reference instructions starts with timing signal T4.
AND to AC
This is an instruction that performs the AND logic operation on pairs of bits in AC and the
memory word specified by the effective address. The result of the operation is
transferred to AC.
D0T4: DR ← M[AR]
D0T5: AC ← AC ∧ DR, SC ← 0
ADD to AC
This instruction adds the content of the memory word specified by the effective address
to the value of AC. The sum is transferred into AC and the output carry Cout is transferred
to the E (extended accumulator) flip-flop.
D1T4: DR ← M[AR]
D1T5: AC ← AC + DR, E ← Cout, SC ← 0
LDA: Load to AC
This instruction transfers the memory word specified by the effective address to AC.
D2T4: DR ← M[AR]
D2T5: AC ← DR, SC ← 0
STA: Store AC
This instruction stores the content of AC into the memory word specified by the effective
address.
D3T4: M[AR] ← AC, SC ← 0
BUN: Branch Unconditionally
This instruction transfers program control to the instruction specified by the effective address.
The BUN instruction allows the programmer to specify an instruction out of sequence
and the program branches (or jumps) unconditionally.
D4T4: PC ← AR, SC ← 0
BSA: Branch and Save Return Address
This instruction is useful for branching to a portion of the program called a subroutine or
procedure. When executed, the BSA instruction stores the address of the next
instruction in sequence (which is available in PC) into a memory location specified by the
effective address.
M[AR] ← PC, PC ← AR + 1
For example: M[135] ← 21, PC ← 135 + 1 = 136
It is not possible to perform the operation of the BSA instruction in one clock cycle when
we use the bus system of the basic computer. To use the memory and the bus properly,
the BSA instruction must be executed with a sequence of two microoperations:
D5T4: M[AR] ← PC, AR ← AR + 1
D5T5: PC ← AR, SC ← 0
ISZ: Increment and Skip if Zero
This instruction increments the word specified by the effective address, and if the
incremented value is equal to 0, PC is incremented by 1. Since it is not possible to
increment a word inside the memory, it is necessary to read the word into DR, increment
DR, and store the word back into memory.
D6T4: DR ← M[AR]
D6T5: DR ← DR + 1
D6T6: M[AR] ← DR, if (DR = 0) then (PC ← PC + 1), SC ← 0
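The register transfers of the seven memory-reference instructions can be gathered into one sketch (Python, illustrative; the function and state layout are our own, and AR is assumed to already hold the effective address):

```python
MASK16 = 0xFFFF   # 16-bit word mask
MASK12 = 0xFFF    # 12-bit address mask

def execute_memory_reference(op: str, ar: int, state: dict) -> None:
    """Sketch of the execute phase; `state` maps 'AC', 'E', 'PC'
    and 'M' (memory, modelled as a dict)."""
    M = state["M"]
    if op == "AND":                       # AC <- AC AND M[AR]
        state["AC"] &= M[ar]
    elif op == "ADD":                     # AC <- AC + M[AR], E <- Cout
        total = state["AC"] + M[ar]
        state["AC"] = total & MASK16
        state["E"] = (total >> 16) & 1
    elif op == "LDA":                     # AC <- M[AR]
        state["AC"] = M[ar]
    elif op == "STA":                     # M[AR] <- AC
        M[ar] = state["AC"]
    elif op == "BUN":                     # PC <- AR
        state["PC"] = ar
    elif op == "BSA":                     # M[AR] <- PC, PC <- AR + 1
        M[ar] = state["PC"]
        state["PC"] = (ar + 1) & MASK12
    elif op == "ISZ":                     # increment M[AR], skip if zero
        M[ar] = (M[ar] + 1) & MASK16
        if M[ar] == 0:
            state["PC"] = (state["PC"] + 1) & MASK12

# BSA example from the text: return address 21 is saved at 135, PC becomes 136.
state = {"AC": 0, "E": 0, "PC": 21, "M": {135: 0}}
execute_memory_reference("BSA", 135, state)
print(state["M"][135], state["PC"])  # 21 136
```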
Control Flowchart
The terminal sends and receives serial information and each quantity of information has
eight bits of an alphanumeric code.
The serial information from the keyboard is shifted into the input register INPR.
The serial information for the printer is stored in the output register OUTR.
These two registers communicate with a communication interface serially and with the
AC in parallel.
The transmitter interface receives serial information from the keyboard and transmits it
to INPR. The receiver interface receives information from OUTR and sends it to the
printer serially.
The 1-bit input flag FGI is a control flip-flop. It is set to 1 when new information is
available in the input device and is cleared to 0 when the information is accepted by the
computer.
The flag is needed to synchronize the timing rate difference between the input device
and the computer.
The process of information transfer is as follows: initially, the input flag FGI is cleared to 0.
When a key is struck on the keyboard, an 8-bit alphanumeric code is shifted into INPR and
FGI is set to 1. As long as the flag is set, the information in INPR cannot be changed by
striking another key. The computer checks the flag bit; if it is 1, the information from INPR
is transferred in parallel into AC and FGI is cleared to 0.
Once the flag is cleared, new information can be shifted into INPR by striking another
key.
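The FGI handshake can be sketched as follows (Python, illustrative; the class and method names are our own):

```python
class InputInterface:
    """Sketch of the FGI handshake: the device sets FGI when a new
    character is in INPR; the computer transfers INPR and clears FGI,
    allowing the next character to be shifted in."""
    def __init__(self):
        self.inpr = 0
        self.fgi = 0
    def key_struck(self, code: int) -> None:
        if self.fgi == 0:        # INPR may change only while FGI = 0
            self.inpr = code & 0xFF
            self.fgi = 1
    def computer_reads(self):
        if self.fgi == 1:        # computer accepts only when FGI = 1
            ac = self.inpr
            self.fgi = 0
            return ac
        return None

io = InputInterface()
io.key_struck(0x41)              # 'A' arrives and FGI is set to 1
io.key_struck(0x42)              # ignored: FGI is still 1, INPR must not change
print(hex(io.computer_reads()))  # 0x41, and FGI is cleared for the next key
```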
Interrupt Cycle
The way that the interrupt is handled by the computer can be explained by means of the
flowchart shown in figure 2.13.
An interrupt flip-flop R is included in the computer.
When R = 0, the computer goes through an instruction cycle.
During the execute phase of the instruction cycle IEN is checked by the control.
If it is 0, it indicates that the programmer does not want to use the interrupt, so control
continues with the next instruction cycle.
If IEN is 1, control checks the flag bits.
If both flags are 0, it indicates that neither the input nor the output registers are ready
for transfer of information.
In this case, control continues with the next instruction cycle. If either flag is set to 1
while IEN = 1, flip-flop R is set to 1.
At the end of the execute phase, control checks the value of R, and if it is equal to 1, it
goes to an interrupt cycle instead of an instruction cycle.
During the interrupt cycle, the content of PC (the return address) is stored in memory
location 0, a specific location reserved for the return address.
Control then inserts address 1 into PC and clears IEN and R so that no more interruptions
can occur until the interrupt request from the flag has been serviced.
An example that shows what happens during the interrupt cycle is shown in Figure 2.14:
Suppose that an interrupt occurs and R = 1, while the control is executing the instruction
at address 255. At this time, the return address 256 is in PC.
The programmer has previously placed an input-output service program in memory
starting from address 1120 and a BUN 1120 instruction at address 1.
The content of PC (256) is stored in memory location 0, PC is set to 1, and R is cleared to
0.
At the beginning of the next instruction cycle, the instruction that is read from memory is
in address 1 since this is the content of PC. The branch instruction at address 1 causes
the program to transfer to the input-output service program at address 1120.
This program checks the flags, determines which flag is set, and then transfers the
required input or output information. Once this is done, the instruction ION is executed
to set IEN to 1 (to enable further interrupts), and the program returns to the location
where it was interrupted.
The instruction that returns the computer to the original place in the main program is a
branch indirect instruction with an address part of 0. This instruction is placed at the end
of the I/O service program.
The execution of the indirect BUN instruction results in placing into PC the return
address from location 0.
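The save-and-return mechanism of the example in Figure 2.14 can be sketched as follows (Python, illustrative; the function names and state layout are our own):

```python
def interrupt_cycle(state: dict) -> None:
    """Sketch of the interrupt cycle: the return address in PC is saved
    in memory location 0, PC is set to 1 (which holds a BUN to the
    service routine), and IEN and R are cleared."""
    state["M"][0] = state["PC"]   # save return address
    state["PC"] = 1               # branch to location 1
    state["IEN"] = 0
    state["R"] = 0

def return_from_interrupt(state: dict) -> None:
    """Indirect BUN with address part 0: PC receives the return
    address stored in location 0."""
    state["PC"] = state["M"][0]

# Interrupt while executing the instruction at 255; return address 256 is in PC.
state = {"PC": 256, "IEN": 1, "R": 1, "M": {0: 0, 1: "BUN 1120"}}
interrupt_cycle(state)
print(state["M"][0], state["PC"])   # 256 1
return_from_interrupt(state)
print(state["PC"])                  # 256
```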
REFERENCE:
1. Computer System Architecture, M. Morris Mano, 3rd Edition, Prentice Hall India.
UNIT-II
Stack Organization
Instruction Formats
Addressing Modes
Program Control
Introduction:
The main part of the computer that performs the bulk of data-processing operations is called the central
processing unit and is referred to as the CPU.
The CPU is made up of three major parts, as shown in Fig. 8-1.
The register set stores intermediate data used during the execution of the instructions.
The arithmetic logic unit (ALU) performs the required microoperations for executing the instructions.
The control unit supervises the transfer of information among the registers and instructs the ALU as to which
operation to perform.
1. Stack Organization:
A stack or last-in, first-out (LIFO) list is a useful feature that is included in the CPU of most computers.
Stack:
o A stack is a storage device that stores information in such a manner that the item stored last is the first
item retrieved.
The operation of a stack can be compared to a stack of trays. The last tray placed on top of the stack is the first
to be taken off.
In the computer, the stack is a memory unit with an address register that can only count (after an initial value is loaded into it).
The register that holds the address for the stack is called a stack pointer (SP). It always points at the top item in
the stack.
The two operations that are performed on stack are the insertion and deletion.
The operation of insertion is called PUSH.
The operation of deletion is called POP.
These operations are simulated by incrementing and decrementing the stack pointer register (SP).
Register Stack:
A stack can be placed in a portion of a large memory or it can be organized as a collection of a finite number of
memory words or registers.
The below figure shows the organization of a 64-word register stack.
The stack pointer register SP contains a binary number whose value is equal to the address of the word
that is currently on top of the stack. Three items are placed in the stack: A, B, C, in that order.
In the figure, C is on top of the stack, so the content of SP is 3.
For removing the top item, the stack is popped by reading the memory word at address 3 and decrementing the
content of SP.
Now the top of the stack is B, so that the content of SP is 2.
Similarly for inserting the new item, the stack is pushed by incrementing SP and writing a word in the next-
higher location in the stack.
In a 64-word stack, the stack pointer contains 6 bits because 2^6 = 64.
Since SP has only six bits, it cannot hold a number greater than 63 (111111 in binary).
When 63 is incremented by 1, the result is 0, since 111111 + 1 = 1000000 in binary, but SP can accommodate
only the six least significant bits.
The one-bit register FULL is set to 1 when the stack is full.
Similarly, when 000000 is decremented by 1, the result is 111111, and the one-bit register EMTY is set to 1
when the stack is empty of items.
DR is the data register that holds the binary data to be written into or read out of the stack.
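The 64-word register stack with its SP, FULL, and EMTY behaviour can be sketched as follows (Python, illustrative; the class and method names are our own):

```python
class RegisterStack:
    """Sketch of a 64-word register stack: SP is a 6-bit counter that
    wraps around, FULL and EMTY are one-bit status flags."""
    def __init__(self, words: int = 64):
        self.mem = [0] * words
        self.words = words
        self.sp, self.full, self.emty = 0, 0, 1
    def push(self, dr):
        if self.full:
            raise OverflowError("stack full")
        self.sp = (self.sp + 1) % self.words   # SP <- SP + 1 (wraps at 2**6)
        self.mem[self.sp] = dr                 # M[SP] <- DR
        if self.sp == 0:                       # wrapped: location 0 just filled
            self.full = 1
        self.emty = 0
    def pop(self):
        if self.emty:
            raise IndexError("stack empty")
        dr = self.mem[self.sp]                 # DR <- M[SP]
        self.sp = (self.sp - 1) % self.words   # SP <- SP - 1
        if self.sp == 0:                       # last item removed
            self.emty = 1
        self.full = 0
        return dr

s = RegisterStack()
for item in ("A", "B", "C"):       # first item lands at address 1
    s.push(item)
print(s.sp)      # 3: C is on top of the stack
print(s.pop())   # C, after which SP is 2 and B is on top
```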
PUSH:
Initially, SP is cleared to 0, EMTY is set to 1, and FULL is cleared to 0, so that SP points to the word at address 0
and the stack is marked empty and not full.
If the stack is not full (if FULL = 0), a new item is inserted with a push operation.
The push operation is implemented with the following sequence of microoperations:
SP ← SP + 1 (increment stack pointer)
M[SP] ← DR (write item on top of the stack)
If (SP = 0) then (FULL ← 1) (check if stack is full)
EMTY ← 0 (mark the stack not empty)
The stack pointer is incremented so that it points to the address of the next-higher word.
A memory write operation inserts the word from DR onto the top of the stack.
The first item stored in the stack is at address 1.
The last item is stored at address 0.
If SP reaches 0, the stack is full of items, so FULL is set to 1.
This condition is reached if the top item prior to the last push was in location 63 and, after incrementing SP, the
last item is stored in location 0.
Once an item is stored in location 0, there are no more empty registers in the stack, so the EMTY is cleared to 0.
POP:
A new item is deleted from the stack if the stack is not empty (if EMTY = 0).
The pop operation consists of the following sequence of microoperations:
DR ← M[SP] (read item from the top of the stack)
SP ← SP − 1 (decrement stack pointer)
If (SP = 0) then (EMTY ← 1) (check if stack is empty)
FULL ← 0 (mark the stack not full)
Memory Stack:
In the above discussion a stack can exist as a stand-alone unit. But in the CPU, a stack is implemented
by assigning a portion of memory to stack operations and using a processor register as the stack pointer.
The below figure shows a portion of computer memory partitioned into three segments: program, data, and
stack.
The program counter PC points at the address of the next instruction in program.
The address register AR points at an array of data.
SP ← SP − 1
M[SP] ← DR
The stack pointer is decremented so that it points at the address of the next word.
A memory write operation inserts the word from DR into the top of the stack. A new item is deleted with a pop
operation as follows:
DR ← M[SP]
SP ← SP + 1
The top item is read from the stack into DR. The stack pointer is then incremented to point at the next item in
the stack.
Most computers do not provide hardware to check for stack overflow (full stack) or underflow (empty stack).
The stack limits can be checked by using processor registers:
o one to hold the upper limit (3000 in this case)
o Other to hold the lower limit (4001 in this case).
After a push operation, SP is compared with the upper-limit register, and after a pop operation, SP is compared
with the lower-limit register.
The two microoperations needed for either the push or pop are (1) an access to memory through SP, and (2) updating SP.
The advantage of a memory stack is that the CPU can refer to it without having specify an address, since the
address is always available and automatically updated in the stack pointer.
Reverse Polish notation, combined with a stack arrangement of registers, is the most efficient way known for
evaluating arithmetic expressions.
This procedure is employed in some electronic calculators and also in some computers.
The following numerical example may clarify this procedure. Consider the arithmetic expression
(3 * 4) + (5 * 6)
In reverse Polish notation, it is expressed as
3 4 * 5 6 * +
Each box represents one stack operation and the arrow always points to the top of the stack.
Scanning the expression from left to right, we encounter two operands.
First the number 3 is pushed into the stack, then the number 4.
The next symbol is the multiplication operator *.
This causes a multiplication of the two topmost items in the stack.
The stack is then popped and the product is placed on top of the stack, replacing the two original operands.
Next we encounter the two operands 5 and 6, so they are pushed into the stack.
The stack operation results from the next * replaces these two numbers by their product.
The last operation causes an arithmetic addition of the two topmost numbers in the stack to produce the final
result of 42.
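The scan-and-stack procedure just described can be sketched as a short evaluator (Python, illustrative; the function name is our own):

```python
def eval_rpn(tokens):
    """Evaluate a reverse Polish (postfix) expression with a stack:
    operands are pushed; an operator pops the two topmost items and
    pushes the result back on the stack."""
    stack = []
    ops = {"+": lambda a, b: a + b, "-": lambda a, b: a - b,
           "*": lambda a, b: a * b, "/": lambda a, b: a / b}
    for tok in tokens:
        if tok in ops:
            b = stack.pop()            # topmost item
            a = stack.pop()            # next item down
            stack.append(ops[tok](a, b))
        else:
            stack.append(int(tok))
    return stack.pop()

# (3 * 4) + (5 * 6) in reverse Polish notation:
print(eval_rpn("3 4 * 5 6 * +".split()))  # 42
```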
2. Instruction Formats:
The format of an instruction is usually depicted in a rectangular box symbolizing the bits of the instruction as
they appear in memory words or in a control register.
The bits of the instruction are divided into groups called fields.
The most common fields found in instruction formats are:
1. An operation code field that specifies the operation to be performed.
2. An address field that designates a memory address or a processor register.
3. A mode field that specifies the way the operand or the effective address is determined.
Computers may have instructions of several different lengths containing a varying number of addresses.
The number of address fields in the instruction format of a computer depends on the internal organization of its
registers.
Most computers fall into one of three types of CPU organizations:
1. Single accumulator organization.
2. General register organization.
3. Stack organization.
In an accumulator-type organization all the operations are performed with an implied accumulator register.
The instruction format in this type of computer uses one address field.
For example, the instruction that specifies an arithmetic addition is defined by an assembly language
instruction as
ADD X
Where X is the address of the operand. The ADD instruction in this case results in the operation AC ← AC
+ M[X]. AC is the accumulator register and M[X] symbolizes the memory word located at address X.
General register organization:
The instruction format in this type of computer needs three register address fields.
Thus the instruction for an arithmetic addition may be written in an assembly language as
ADD R1, R2, R3
to denote the operation R1 ← R2 + R3. The number of address fields in the instruction can be
reduced from three to two if the destination register is the same as one of the source registers.
Thus the instruction ADD R1, R2 would denote the operation R1 ← R1 + R2. Only register addresses for R1
and R2 need be specified in this instruction.
General register-type computers employ two or three address fields in their instruction format.
Each address field may specify a processor register or a memory word.
An instruction symbolized by ADD R1, X would specify the operation R1 ← R1 + M[X].
It has two address fields, one for register R1 and the other for the memory address X.
Stack organization:
The stack-organized CPU has PUSH and POP instructions which require an address field.
Thus the instruction PUSH X will push the word at address X to the top of the stack.
The stack pointer is updated automatically.
Operation-type instructions do not need an address field in stack-organized computers.
This is because the operation is performed on the two items that are on top of the stack.
The instruction ADD in a stack computer consists of an operation code only with no address field.
This operation has the effect of popping the two top numbers from the stack, adding the numbers, and
pushing the sum into the stack.
There is no need to specify operands with an address field since all operands are implied to be in the stack.
To illustrate the influence of the number of addresses on computer programs, we will evaluate the arithmetic statement
X = (A + B) * (C + D)
using zero, one, two, or three address instructions, and using the symbols ADD, SUB, MUL and DIV for the four
arithmetic operations; MOV for the transfer-type operations; and LOAD and STORE for transfers to and from
memory and the AC register.
We assume that the operands are in memory addresses A, B, C, and D, that the result must be stored in memory at
address X, and that the CPU has general purpose registers R1, R2, R3 and R4.
Three-address instruction formats can use each address field to specify either a processor register or a
memory operand.
The program in assembly language that evaluates X = (A+B) * (C+D) is shown below, together with
comments that explain the register transfer operation of each instruction:
ADD R1, A, B (R1 ← M[A] + M[B])
ADD R2, C, D (R2 ← M[C] + M[D])
MUL X, R1, R2 (M[X] ← R1 * R2)
Two-address instruction formats use address fields that can each specify either a processor register or a
memory word.
The program to evaluate X = (A+B) * (C+D) is as follows:
MOV R1, A (R1 ← M[A])
ADD R1, B (R1 ← R1 + M[B])
MOV R2, C (R2 ← M[C])
ADD R2, D (R2 ← R2 + M[D])
MUL R1, R2 (R1 ← R1 * R2)
MOV X, R1 (M[X] ← R1)
The MOV instruction moves or transfers the operands to and from memory and processor registers.
The first symbol listed in an instruction is assumed be both a source and the destination where the
result of the operation transferred.
One-address instructions use an implied accumulator (AC) register for all data manipulation.
For multiplication and division there is a need for a second register. But for the basic discussion we will
neglect the second register and assume that the AC contains the result of all operations.
The program to evaluate X = (A+B) * (C+D) is:
LOAD A (AC ← M[A])
ADD B (AC ← AC + M[B])
STORE T (M[T] ← AC)
LOAD C (AC ← M[C])
ADD D (AC ← AC + M[D])
MUL T (AC ← AC * M[T])
STORE X (M[X] ← AC)
All operations are done between the AC register and a memory operand.
T is the address of a temporary memory location required for storing the intermediate result.
A stack-organized computer does not use an address field for the instructions ADD and MUL.
The PUSH and POP instructions, however, need an address field to specify the operand that
communicates with the stack.
The following program shows how X = (A+B) * (C+D) will be written for a stack-organized computer.
(TOS stands for top of stack.)
PUSH A (TOS ← A)
PUSH B (TOS ← B)
ADD (TOS ← (A + B))
PUSH C (TOS ← C)
PUSH D (TOS ← D)
ADD (TOS ← (C + D))
MUL (TOS ← (C + D) * (A + B))
POP X (M[X] ← TOS)
To evaluate arithmetic expressions in a stack computer, it is necessary to convert the expression into
reverse Polish notation.
The name "zero-address" is given to this type of computer because of the absence of an address field in
the computational instructions.
RISC Instructions:
The instruction set of a typical RISC processor is restricted to the use of load and store instructions for
communicating between memory and CPU.
All other instructions are executed within the registers of CPU without referring to memory.
It has LOAD and STORE instructions that have one memory and one register address, and computational-type
instructions that have three addresses with all three specifying processor registers.
The following is a program to evaluate X = (A+B) * (C+D):
LOAD R1, A (R1 ← M[A])
LOAD R2, B (R2 ← M[B])
LOAD R3, C (R3 ← M[C])
LOAD R4, D (R4 ← M[D])
ADD R1, R1, R2 (R1 ← R1 + R2)
ADD R3, R3, R4 (R3 ← R3 + R4)
MUL R1, R1, R3 (R1 ← R1 * R3)
STORE X, R1 (M[X] ← R1)
The load instructions transfer the operands from memory to CPU registers.
The add and multiply operations are executed with data in the registers without accessing memory.
The result of the computation is then stored in memory with a store instruction.
3. Addressing Modes
The way the operands are chosen during program execution is dependent on the addressing mode of the
instruction.
Computers use addressing mode techniques for the purpose of accommodating one or both of the following
provisions:
o To give programming versatility to the user by providing such facilities as pointers to memory, counters
for loop control, indexing of data, and program relocation.
o To reduce the number of bits in the addressing field of the instruction
Most addressing modes modify the address field of the instruction; there are two modes that need no address
field at all. These are implied and immediate modes.
Implied Mode:
In this mode the operands are specified implicitly in the definition of the instruction.
For example, the instruction "complement accumulator" is an implied-mode instruction because the
operand in the accumulator register is implied in the definition of the instruction.
All register reference instructions that use an accumulator are implied mode instructions.
Zero-address instructions in a stack-organized computer are implied-mode instructions.
Immediate Mode:
In this mode the operand is specified in the instruction itself. In other words, an immediate-mode
instruction has an operand field rather than an address field. Immediate-mode instructions are
useful for initializing registers to a constant value.
When the address field specifies a processor register, the instruction is said to be in the register mode.
Register Mode:
In this mode the operands are in registers that reside within the CPU.
The particular register is selected from a register field in the instruction.
Register Indirect Mode:
In this mode the instruction specifies a register in the CPU whose contents give the address of the operand
in memory.
In other words, the selected register contains the address of the operand rather than the operand itself.
The advantage of a register indirect mode instruction is that the address field of the instruction uses fewer
bits to select a register than would have been required to specify a memory address directly.
Autoincrement or Autodecrement Mode:
This is similar to the register indirect mode except that the register is incremented or decremented after
(or before) its value is used to access memory.
The address field of an instruction is used by the control unit in the CPU to obtain the operand from memory.
Sometimes the value given in the address field is the address of the operand, but sometimes it is just an
address from which the address of the operand is calculated.
The basic two mode of addressing used in CPU are direct and indirect address mode.
Direct Address Mode:
In this mode the effective address is equal to the address part of the instruction.
The operand resides in memory and its address is given directly by the address field of the instruction.
In a branch-type instruction the address field specifies the actual branch address.
Indirect Address Mode:
In this mode the address field of the instruction gives the address where the effective address is stored
in memory.
Control fetches the instruction from memory and uses its address part to access memory again to read
the effective address.
A few addressing modes require that the address field of the instruction be added to the content of a specific
register in the CPU.
The effective address in these modes is obtained from the following computation:
effective address = address part of instruction + content of CPU register
Relative Address Mode:
In this mode the content of the program counter is added to the address part of the instruction in order
to obtain the effective address.
Indexed Addressing Mode:
In this mode the content of an index register is added to the address part of the instruction to obtain
the effective address.
An index register is a special CPU register that contains an index value.
Base Register Addressing Mode:
In this mode the content of a base register is added to the address part of the instruction to obtain the
effective address.
This is similar to the indexed addressing mode except that the register is now called a base register
instead of an index register.
To show the differences between the various modes, we will show the effect of the addressing modes on the
instruction defined in Fig. 8-7.
The two-word instruction at address 200 and 201 is a "load to AC" instruction with an address field equal to
500.
The first word of the instruction specifies the operation code and mode, and the second word specifies the
address part.
PC has the value 200 for fetching this instruction. The content of processor register R1 is 400, and the content of
an index register XR is 100.
AC receives the operand after the instruction is executed.
In the direct address mode the effective address is the address part of the instruction, 500, and the operand to
be loaded into AC is 800 (the content of memory location 500).
In the immediate mode the second word of the instruction is taken as the operand rather than an address, so
500 is loaded into AC.
In the indirect mode the effective address is stored in memory at address 500. Therefore, the effective address
is 800 and the operand is 300.
In the relative mode the effective address is 500 + 202 = 702 and the operand is 325. (The value in PC after the
fetch phase and during the execute phase is 202.)
In the index mode the effective address is XR+ 500 = 100 + 500 = 600 and the operand is 900.
In the register mode the operand is in R1 and 400 is loaded into AC.
In the register indirect mode the effective address is 400, equal to the content of R1 and the operand loaded
into AC is 700.
The auto-increment mode is the same as the register indirect mode except that R1 is incremented to 401 after
the execution of the instruction.
The auto-decrement mode decrements R1 to 399 prior to the execution of the instruction. The operand loaded
into AC is now 450.
Table 8-4 lists the values of the effective address and the operand loaded into AC for the nine addressing
modes.
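The mode-by-mode results summarized above can be checked with a small simulation. A minimal sketch: the memory contents and register values below are taken from the Fig. 8-7 example, while the function and mode names are illustrative only.

```python
# Sketch of the nine addressing modes, using the memory contents and
# register values of the Fig. 8-7 example (names here are illustrative).
M = {500: 800, 800: 300, 702: 325, 600: 900, 400: 700, 399: 450}

PC = 202      # value of PC during the execute phase (after the two-word fetch)
XR = 100      # index register
ADDR = 500    # address field of the instruction (stored at location 201)

def effective(mode, R1=400):
    """Return (effective address, operand loaded into AC) for each mode."""
    if mode == "direct":            return ADDR, M[ADDR]
    if mode == "immediate":         return 201, ADDR          # operand is the 2nd instruction word
    if mode == "indirect":          return M[ADDR], M[M[ADDR]]
    if mode == "relative":          return PC + ADDR, M[PC + ADDR]
    if mode == "indexed":           return XR + ADDR, M[XR + ADDR]
    if mode == "register":          return None, R1           # operand is in R1 itself
    if mode == "register indirect": return R1, M[R1]
    if mode == "autoincrement":     return R1, M[R1]          # R1 becomes 401 afterwards
    if mode == "autodecrement":     return R1 - 1, M[R1 - 1]  # R1 decremented before use
```

Each call reproduces one row of Table 8-4; the auto-increment and auto-decrement side effects on R1 are noted in comments rather than modeled as state.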
4. Data Transfer and Manipulation:
Data transfer instructions move data from one place in the computer to another without changing the data
content.
The most common transfers are between memory and processor registers, between processor registers and
input or output, and between the processor registers themselves.
Table 8-5 gives a list of eight data transfer instructions used in many computers.
The load instruction has been used mostly to designate a transfer from memory to a processor register, usually
an accumulator.
The store instruction designates a transfer from a processor register into memory.
The move instruction has been used in computers with multiple CPU registers to designate a transfer from one
register to another and also between CPU registers and memory or between two memory words.
The exchange instruction swaps information between two registers or a register and a memory word.
The input and output instructions transfer data among processor registers and input or output terminals.
The push and pop instructions transfer data between processor registers and a memory stack.
Different computers use different mnemonic symbols to differentiate the addressing modes.
As an example, consider the load to accumulator instruction when used with eight different addressing modes.
Table 8-6 shows the recommended assembly language convention and the actual transfer accomplished in each
case.
Data manipulation instructions perform operations on data and provide the computational capabilities for the
computer.
The data manipulation instructions in a typical computer are usually divided into three basic types:
1. Arithmetic instructions
2. Logical and bit manipulation instructions
3. Shift instructions
1. Arithmetic instructions
The four basic arithmetic operations are addition, subtraction, multiplication and division.
Most computers provide instructions for all four operations.
Some small computers have only addition and possibly subtraction instructions. Multiplication and division
must then be generated by means of software subroutines.
A list of typical arithmetic instructions is given in Table 8-7.
The increment instruction adds 1 to the value stored in a register or memory word.
A number with all 1's, when incremented, produces a number with all 0's.
The decrement instruction subtracts 1 from a value stored in a register or memory word.
A number with all 0's, when decremented, produces number with all 1's.
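The wraparound behaviour of increment and decrement can be sketched on an assumed 8-bit register (the word size here is an assumption for illustration):

```python
# Increment/decrement wraparound on an assumed 8-bit register.
WORD_MASK = 0xFF

def increment(value):
    return (value + 1) & WORD_MASK   # all 1's + 1 -> all 0's

def decrement(value):
    return (value - 1) & WORD_MASK   # all 0's - 1 -> all 1's
```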
The add, subtract, multiply, and divide instructions may use different types of data.
The data type assumed to be in the processor registers during the execution of these arithmetic operations is defined
by the operation code.
An arithmetic instruction may specify fixed-point or floating-point data, binary or decimal data, single-precision
or double-precision data.
The mnemonics for three add instructions that specify different data types are shown below.
ADDI Add two binary integer numbers
ADDF Add two floating-point numbers
ADDD Add two decimal numbers in BCD
A special carry flip-flop is used to store the carry from an operation.
The instruction "add with carry" performs the addition of two operands plus the value of the carry from the previous
computation.
Similarly, the "subtract with borrow" instruction subtracts two words and a borrow that may have resulted from
a previous subtract operation.
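The point of "add with carry" is double-precision arithmetic: a wide number is added one word at a time, with the carry saved between words. A minimal sketch, assuming 8-bit words:

```python
# Double-precision addition with "add with carry", assuming 8-bit words.
MASK = 0xFF

def add(a, b):
    """Plain ADD: returns (8-bit sum, carry out)."""
    s = a + b
    return s & MASK, s >> 8

def addc(a, b, carry_in):
    """ADD with carry: includes the carry from the previous word."""
    s = a + b + carry_in
    return s & MASK, s >> 8

# 16-bit addition done 8 bits at a time: 0x01FF + 0x0001 = 0x0200
lo, c  = add(0xFF, 0x01)        # low-order words first; produces a carry
hi, c2 = addc(0x01, 0x00, c)    # high-order words plus the saved carry
```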
The negate instruction forms the 2's complement of a number, effectively reversing the sign of an integer when it is
represented in signed-2's complement form.
3. Shift Instructions:
Shifts are operations in which the bits of a word are moved to the left or right.
The bit shifted in at the end of the word determines the type of shift used.
Shift instructions may specify logical shifts, arithmetic shifts, or rotate-type operations.
In either case the shift may be to the right or to the left.
Table 8-9 lists four types of shift instructions.
The arithmetic shift-right instruction is a shift-right operation with the end (sign) bit remaining the same.
The arithmetic shift-left instruction inserts 0 into the end position and is identical to the logical shift-left instruction.
The rotate instructions produce a circular shift. Bits shifted out at one end of the word are not lost as in a logical
shift but are circulated back into the other end.
The rotate through carry instruction treats a carry bit as an extension of the register whose word is being
rotated.
Thus a rotate-left through carry instruction transfers the carry bit into the rightmost bit position of the
register, transfers the leftmost bit position into the carry, and at the same time shifts the entire register to the
left.
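The shift categories described above can be sketched on an assumed 8-bit word; rotate-left through carry follows the description just given, treating the carry as a ninth bit.

```python
# Shift and rotate operations sketched on an assumed 8-bit word.
MASK = 0xFF

def shl(v):                  # logical shift left: 0 enters at the right
    return (v << 1) & MASK

def shr(v):                  # logical shift right: 0 enters at the left
    return v >> 1

def ashr(v):                 # arithmetic shift right: sign bit stays the same
    return (v >> 1) | (v & 0x80)

def rol(v):                  # rotate left: the bit shifted out re-enters at the right
    return ((v << 1) | (v >> 7)) & MASK

def rolc(v, carry):          # rotate left through carry: carry enters bit 0,
    return ((v << 1) | carry) & MASK, v >> 7   # old leftmost bit becomes the new carry
```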
5. Program Control:
Program control instructions specify conditions for altering the content of the program counter.
The change in value of the program counter as a result of the execution of a program control instruction causes
a break in the sequence of instruction execution.
These instructions provide control over the flow of program execution and a capability for branching to different
program segments.
Some typical program control instructions are listed in Table 8.10.
The ALU circuit in the CPU has a status register for storing the status bit conditions.
Status bits are also called condition-code bits or flag bits.
Figure 8-8 shows block diagram of an 8-bit ALU with a 4-bit status register.
The four status bits are symbolized by C, S, Z, and V. The bits are set or cleared as a result of an
operation performed in the ALU.
o Bit C (carry) is set to 1 if the end carry C8 is 1. It is cleared to 0 if the carry is 0.
o Bit S (sign) is set to 1 if the highest-order bit F7 is 1. It is cleared to 0 if the bit is 0.
o Bit Z (zero) is set to 1 if the output of the ALU contains all 0's. It is cleared to 0 otherwise. In other
words, Z = 1 if the output is zero and Z = 0 if the output is not zero.
o Bit V (overflow) is set to 1 if the exclusive-OR of the last two carries is equal to 1, and cleared to 0
otherwise.
The above status bits are used in conditional jump and branch instructions.
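The four flag rules can be sketched for an 8-bit addition as in Fig. 8-8; the function below is a minimal sketch, not the actual ALU circuit:

```python
# Status bits C, S, Z, V for an assumed 8-bit ALU addition (Fig. 8-8 style).
def add_flags(a, b):
    """Return (result, C, S, Z, V) for the 8-bit addition a + b."""
    full = a + b
    f = full & 0xFF
    c = full >> 8                            # C: end carry C8
    s = (f >> 7) & 1                         # S: highest-order bit F7
    z = 1 if f == 0 else 0                   # Z: output contains all 0's
    c7 = ((a & 0x7F) + (b & 0x7F)) >> 7      # carry into the sign position
    v = c ^ c7                               # V: exclusive-OR of the last two carries
    return f, c, s, z, v
```

For example, adding 0x80 + 0x80 (two negative numbers in 2's complement) sets C, Z and V at once.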
Subroutine Call and Return:
A subroutine is a self-contained sequence of instructions that performs a given computational task.
The most common names used are call subroutine, jump to subroutine, branch to subroutine, or
branch and save return address.
A subroutine is executed by performing two operations
(1) The address of the next instruction available in the program counter (the return address) is stored
in a temporary location so the subroutine knows where to return
(2) Control is transferred to the beginning of the subroutine.
The last instruction of every subroutine, commonly called return from subroutine, transfers the return
address from the temporary location into the program counter.
Different computers use a different temporary location for storing the return address.
The most efficient way is to store the return address in a memory stack.
The advantage of using a stack for the return address is that when a succession of subroutines is
called, the sequential return addresses can be pushed into the stack.
A subroutine call is implemented with the following microoperations:
The instruction that returns from the last subroutine is implemented by the microoperations:
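In the usual memory-stack scheme these microoperations are SP ← SP - 1; M[SP] ← PC; PC ← effective address for the call, and PC ← M[SP]; SP ← SP + 1 for the return. A minimal sketch, with an arbitrary initial stack pointer:

```python
# Subroutine call/return with a memory stack. PC is assumed to already
# hold the return address (the next instruction) when the call executes.
class CPU:
    def __init__(self):
        self.M = {}        # memory
        self.PC = 0
        self.SP = 1000     # arbitrary initial stack pointer; stack grows downward

    def call(self, ea):
        self.SP -= 1               # SP <- SP - 1
        self.M[self.SP] = self.PC  # M[SP] <- PC   (save return address)
        self.PC = ea               # PC <- effective address

    def ret(self):
        self.PC = self.M[self.SP]  # PC <- M[SP]
        self.SP += 1               # SP <- SP + 1
```

Because each call pushes its return address, a succession of nested calls unwinds in the correct (last-in, first-out) order.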
Program Interrupt:
Program interrupt refers to the transfer of program control from a currently running program to another service
program as a result of an externally or internally generated request.
The interrupt procedure is similar to a subroutine call except for three variations:
o The interrupt is initiated by an internal or external signal.
o Address of the interrupt service program is determined by the hardware.
o An interrupt procedure usually stores all the information needed to define the state of the CPU rather than storing only the PC content.
Types of interrupts:
There are three major types of interrupts that cause a break in the normal execution of a program.
They can be classified as
o External interrupts:
These come from input-output (I/O) devices, from a timing device, from a circuit monitoring
A computer with a large number of instructions is classified as a complex instruction set computer, abbreviated
CISC.
A computer with fewer instructions is classified as a reduced instruction set computer,
abbreviated RISC.
CISC Characteristics:
RISC Characteristics:
UNIT 3 COMBINATIONAL LOGIC
Analysis Procedure
1. Label all gate outputs that are a function of input variables with arbitrary
symbols. Determine the Boolean functions for each gate output.
2. Label the gates that are a function of input variables and previously labeled
gates with other arbitrary symbols. Find the Boolean functions for these
gates.
3. Repeat the process outlined in step 2 until the outputs of the circuit are
obtained.
Example:
F1 = T3 + T2 = F2'T1 + ABC = A'BC' + A'B'C + AB'C' + ABC
We can derive the truth table in Table 4-1 by using the circuit of Fig.4-2.
Design procedure :
For each symbol of the Excess-3 code, we use 1's to draw the map
for simplifying the Boolean functions:
Circuit implementation:
w = A + BC + BD = A + B(C + D)
Binary Adder-Subtractor:
SUM = AB’+A’B
CARRY = AB
These expressions show that the SUM output is an EX-OR gate and the CARRY output is an AND
gate. The figure shows the implementation of a half-adder with all the combinations,
including the implementation using NAND gates only.
FULL ADDER: A circuit that performs the addition of three bits (two
significant bits and a previous carry) is a full adder.
Simplified Expressions :
S = x’y’z + x’yz’ + xy’z’ + xyz
C = xy + xz + yz
Another implementation :
A full adder can also be implemented with two half-adders and one OR gate.
S = z ⊕ (x ⊕ y)
This is also called a Ripple Carry Adder, because of its construction with full
adders connected in cascade.
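The full-adder equations S = x ⊕ y ⊕ z and C = xy + xz + yz, chained stage by stage, give the ripple carry adder just described. A minimal 4-bit sketch:

```python
# Full-adder equations chained into a 4-bit ripple carry adder.
def full_adder(x, y, z):
    s = x ^ y ^ z                        # S = x XOR y XOR z
    c = (x & y) | (x & z) | (y & z)      # C = xy + xz + yz
    return s, c

def ripple_carry_add(a_bits, b_bits, carry_in=0):
    """a_bits and b_bits are bit lists, least significant bit first."""
    c = carry_in
    sum_bits = []
    for x, y in zip(a_bits, b_bits):     # the carry ripples stage by stage
        s, c = full_adder(x, y, c)
        sum_bits.append(s)
    return sum_bits, c
```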
Carry Propagation :
In Fig. 4-9 the carry bit is the unstable factor, and it produces the longest propagation
delay.
The signal from Ci to the output carry Ci+1 propagates through an AND gate and an OR
gate, so for an n-bit RCA there are 2n gate levels for the carry to propagate
from input to output.
Because the propagation delay affects the output signals at different times, the
signals must be given enough time to reach precise and stable values.
The most widely used technique employs the principle of carry look-ahead to improve
the speed of the algorithm.
Boolean functions :
Si = Pi ⊕ Ci
Ci+1 = Gi + PiCi
C0 = input carry
C1 = G0 + P0C0
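Writing Gi = AiBi (generate) and Pi = Ai ⊕ Bi (propagate), the recurrence Ci+1 = Gi + PiCi and Si = Pi ⊕ Ci can be checked in a short sketch. The loop below mirrors the algebraic unrolling; in the actual hardware all carries are produced in parallel from C0 and the G/P signals.

```python
# Carry look-ahead sketch: Gi = Ai*Bi (generate), Pi = Ai XOR Bi (propagate),
# Ci+1 = Gi + Pi*Ci, Si = Pi XOR Ci.
def cla_add(a_bits, b_bits, c0=0):
    """a_bits and b_bits are bit lists, least significant bit first."""
    G = [a & b for a, b in zip(a_bits, b_bits)]   # generate signals
    P = [a ^ b for a, b in zip(a_bits, b_bits)]   # propagate signals
    C = [c0]
    for i in range(len(a_bits)):                  # computed in parallel in hardware
        C.append(G[i] | (P[i] & C[i]))
    S = [P[i] ^ C[i] for i in range(len(a_bits))]
    return S, C[-1]
```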
M = 1: subtractor; M = 0: adder
The Boolean expression for the sum can be implemented using a two-input EX-OR gate in which
one of the inputs is Carry-in and the other input is the output of another two-input EX-OR
gate with A and B as its inputs. The Carry output, on the other hand, can be implemented by
ORing the outputs of the AND gates.
HALF-SUBTRACTOR
A half-subtractor is used to subtract one binary digit from another
to give a DIFFERENCE output and a BORROW output. The truth table of a half-subtractor
is shown in the figure. The Boolean expressions for the half-subtractor are
D = A'B + AB' and Bo = A'B
Here, the DIFFERENCE output D is an EX-OR gate and the BORROW output Bo is an
AND gate with input A complemented. The figure shows the logic implementation of a half-
subtractor. Comparing a half-subtractor with a half-adder, it can be seen that the
expressions for the SUM and DIFFERENCE outputs are the same. The expression for BORROW
in the half-subtractor is much the same as the CARRY of the half-adder;
however, for the BORROW output the minuend is complemented before the
ANDing is done.
FULL SUBTRACTOR
A full subtractor performs subtraction of two bits, a minuend and a
subtrahend, taking into account a '1' that may have been borrowed by the previous, lower-significant minuend
bit. Hence three bits are considered at the input of a full subtractor. There are
two outputs: the DIFFERENCE output D and the BORROW output Bo. The BORROW
output indicates that the minuend bit requires a borrow '1' from the next minuend bit.
The figure shows the truth table of a full
subtractor. The K-maps for the two outputs are shown in the figure. Comparing the
DIFFERENCE output D and BORROW output Bo with those of a full adder, it can be seen that the
DIFFERENCE output D is the same as the SUM output, and the BORROW
output Bo is similar to the CARRY-OUT. As in the half-subtractor, the A input is
complemented; the same is done in the full subtractor.
From the truth table, the Difference and Borrow can be written as
Difference = A'B'C + A'BC' + AB'C' + ABC
Reducing this as for the adder gives
Difference = A ⊕ B ⊕ C
Borrow = A'B'C + A'BC' + A'BC + ABC
       = A'B'C + A'BC' + A'BC + A'BC + A'BC + ABC      (since A'BC = A'BC + A'BC + A'BC)
       = A'C(B' + B) + A'B(C' + C) + BC(A' + A)
Borrow = A'C + A'B + BC
Block diagram
BINARY ADDER/SUBTRACTOR
Subtraction of binary numbers can be carried out by adding the 2's
complement of the subtrahend to the minuend. In this operation, if the MSB of the result is a
'0', the answer is positive and correct; if the MSB is '1', the answer is negative (in 2's complement form).
Hence, subtraction can be carried out using full adders.
The figure above shows the realization of a 4-bit adder-subtractor. From the figure it can be seen
that the bits of the binary numbers are given to the full adders through XOR gates. The
control input controls the addition or subtraction operation.
When the SUBTRACTION input is logic '0', the bits B3 B2 B1 B0 are passed unchanged to the full
adders. Hence, the output of the full adders is the sum of the two numbers.
Magnitude comparator:
The equality relation of each pair of bits can be expressed logically with an
exclusive-NOR function as:
A = A3A2A1A0 ; B = B3B2B1B0
xi=AiBi+Ai’Bi’ for i = 0, 1, 2, 3
(A = B) = x3x2x1x0
We inspect the relative magnitudes of the pair of most significant bits. If they are equal, we compare the
next lower significant pair of digits, and so on until a pair of unequal digits is reached.
If the corresponding digit of A is 1 and that of B is 0, we conclude that A > B.
(A > B) = A3B3' + x3A2B2' + x3x2A1B1' + x3x2x1A0B0'
(A < B) = A3'B3 + x3A2'B2 + x3x2A1'B1 + x3x2x1A0'B0
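The three comparator equations can be checked exhaustively against integer comparison; a minimal sketch with MSB-first bit lists:

```python
# 4-bit magnitude comparator built from the equations above.
# A and B are bit lists [A3, A2, A1, A0], most significant bit first.
def compare4(A, B):
    x = [1 - (a ^ b) for a, b in zip(A, B)]   # xi = AiBi + Ai'Bi' (XNOR)
    eq = x[0] & x[1] & x[2] & x[3]            # (A = B) = x3 x2 x1 x0
    gt = (A[0] & (1 - B[0])) \
       | (x[0] & A[1] & (1 - B[1])) \
       | (x[0] & x[1] & A[2] & (1 - B[2])) \
       | (x[0] & x[1] & x[2] & A[3] & (1 - B[3]))
    lt = ((1 - A[0]) & B[0]) \
       | (x[0] & (1 - A[1]) & B[1]) \
       | (x[0] & x[1] & (1 - A[2]) & B[2]) \
       | (x[0] & x[1] & x[2] & (1 - A[3]) & B[3])
    return gt, eq, lt
```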
Decoders :
S(x, y, z) = ∑(1, 2, 4, 7)
C(x, y, z) = ∑(3, 5, 6, 7)
Encoders:
z = D1 + D3 + D5 + D7
y = D2 + D3 + D6 + D7
x = D4 + D5 + D6 + D7
Priority encoder :
x = D2 + D3
y = D3 + D1D’2
V = D0 + D1 + D2 + D3
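The three equations above describe a 4-input priority encoder in which D3 has the highest priority; V indicates that at least one input is active. A minimal sketch:

```python
# 4-input priority encoder per the equations above (D3 has highest priority).
def priority_encoder(D0, D1, D2, D3):
    x = D2 | D3                  # x = D2 + D3
    y = D3 | (D1 & (1 - D2))     # y = D3 + D1 D2'
    V = D0 | D1 | D2 | D3        # valid: at least one input is active
    return x, y, V
```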
Multiplexers
A multiplexer is a special type of combinational circuit. There are n data
inputs, one output, and m select inputs with 2^m = n. It is a digital circuit
which selects one of the n data inputs and routes it to the output. The
selection of one of the n inputs is done by the select inputs. Depending
on the digital code applied at the select inputs, one out of the n data
sources is selected and transmitted to the single output Y. E is called the
strobe or enable input, which is useful for cascading. It is generally an
active-low terminal, meaning it performs the required operation
when it is low.
Block diagram
2 to 1 Multiplexer:
S = 0: Y = I0
S = 1: Y = I1
(Y = S'I0 + SI1)
2 : 1 multiplexer
4 : 1 multiplexer
16 : 1 multiplexer
32 : 1 multiplexer
Block Diagram
Truth Table
4-to-1 Line Multiplexer:
Multiplexer circuits can be combined with common selection inputs to provide multiple-bit
selection logic. Compare with Fig4-24.
Boolean function implementation :
A Boolean function of n variables can be implemented more efficiently with a
multiplexer that has n - 1 selection inputs.
F(x, y, z) = ∑(1, 2, 6, 7)
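This function can be realized with a 4-to-1 multiplexer using x and y as the selection inputs. The data inputs I0 = z, I1 = z', I2 = 0, I3 = 1 are read off from the minterm list here (an assumption of this sketch; the text's implementation table is not reproduced above):

```python
# Implementing F(x, y, z) = sum(1, 2, 6, 7) with a 4-to-1 multiplexer,
# using x and y as the selection inputs.
def mux4(I0, I1, I2, I3, s1, s0):
    """4-to-1 multiplexer: routes input number (s1 s0) to the output."""
    return [I0, I1, I2, I3][(s1 << 1) | s0]

def F(x, y, z):
    # Data inputs derived from the minterm residues: I0 = z, I1 = z', I2 = 0, I3 = 1
    return mux4(z, 1 - z, 0, 1, x, y)
```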
1 : 2 demultiplexer
1 : 4 demultiplexer
1 : 16 demultiplexer
1 : 32 demultiplexer
Block diagram
Truth Table
Computer Organization and Architecture Chapter 6 : Memory System
Chapter – 6
Memory System
6.1 Microcomputer Memory
Memory is an essential component of the microcomputer system.
It stores binary instructions and data for the microcomputer.
The memory is the place where the computer holds current programs and data that are in
use.
No single technology is optimal in satisfying the memory requirements of a computer
system.
Computer memory exhibits perhaps the widest range of type, technology, organization,
performance and cost of any feature of a computer system.
The memory unit that communicates directly with the CPU is called main memory.
Devices that provide backup storage are called auxiliary memory or secondary memory.
Location
• Processor memory: The memory like registers is included within the processor and
termed as processor memory.
• Internal memory: It is often termed main memory and is directly accessible to the processor.
• External memory: It consists of peripheral storage devices such as disk and magnetic
tape that are accessible to processor via i/o controllers.
Capacity
• Word size: The natural unit of organisation. Common word lengths are 8, 16, and 32 bits.
• Number of words: Capacity is expressed in terms of the number of words or bytes.
Unit of Transfer
• Internal: For internal memory, the unit of transfer is equal to the number of data lines
into and out of the memory module.
• External: For external memory, data are transferred in blocks, which are larger than a
word.
• Addressable unit
— Smallest location which can be uniquely addressed
— Word internally
— Cluster on Magnetic disks
Compiled By: Er. Hari Aryal [[email protected]] Reference: W. Stallings & M. Mano | 1
Computer Organization and Architecture Chapter 6 : Memory System
Access Method
• Sequential access: Access must start at the beginning and read through a
specific linear sequence. This means the access time of a data unit depends on the position
of the record (unit of data) and the previous location.
— e.g. tape
• Direct Access: Individual blocks of records have unique address based on location.
Access is accomplished by jumping (direct access) to general vicinity plus a
sequential search to reach the final location.
— e.g. disk
• Random access: The time to access a given location is independent of the sequence of
prior accesses and is constant. Thus any location can be selected out randomly and
directly addressed and accessed.
— e.g. RAM
• Associative access: This is random access type of memory that enables one to make a
comparison of desired bit locations within a word for a specified match, and to do this
for all words simultaneously.
— e.g. cache
Performance
• Access time: For random access memory, access time is the time it takes to perform a
read or write operation i.e. time taken to address a memory plus to read / write from
addressed memory location. Whereas for non-random access, it is the time needed to
position read / write mechanism at desired location.
— Time between presenting the address and getting the valid data
• Memory cycle time: The total time from the start of one memory access until the next
memory access can commence.
Memory cycle time = access time + transient time (any additional time required
before a second access can commence).
— Time may be required for the memory to “recover” before next access
— Cycle time is access + recovery
• Transfer Rate: This is the rate at which data can be transferred in and out of a
memory unit.
— Rate at which data can be moved
— For random access, R = 1 / cycle time
— For non-random access, Tn = Ta + N / R; where Tn – average time to read or
write N bits, Ta – average access time, N – number of bits, R – Transfer rate
in bits per second (bps).
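A worked numeric example of the non-random-access formula Tn = Ta + N / R (all values below are assumed purely for illustration):

```python
# Worked example of Tn = Ta + N / R (all numbers assumed for illustration).
Ta = 0.1           # average access time, seconds
N  = 8_000_000     # number of bits to transfer
R  = 10_000_000    # transfer rate, bits per second

Tn = Ta + N / R    # average time to read or write N bits: 0.1 + 0.8 s
```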
Physical Types
• Semiconductor
— RAM
• Magnetic
— Disk & Tape
• Optical
— CD & DVD
• Others
— Bubble
— Hologram
Physical Characteristics
• Decay: Information decays over time, causing data loss.
• Volatility: Information is lost when electrical power is switched off.
• Erasable: Whether the memory can be erased and rewritten.
• Power consumption: How much power the memory consumes.
Organization
• Physical arrangement of bits into words
• Not always obvious
- e.g. interleaved
CPU logic is usually faster than main memory access time, with the result that processing
speed is limited primarily by the speed of main memory
The cache is used for storing segments of programs currently being executed in the CPU
and temporary data frequently needed in the present calculations
The memory hierarchy system consists of all storage devices employed in a computer
system from slow but high capacity auxiliary memory to a relatively faster cache memory
accessible to high speed processing logic. The figure below illustrates memory hierarchy.
Hierarchy List
Registers
L1 Cache
L2 Cache
Main memory
Disk cache
Disk
Optical
Tape
Types of ROM
Programmable ROM (PROM)
o It is non-volatile and may be written into only once. The writing process is
performed electrically and may be performed by a supplier or customer at
a time later than the original chip fabrication.
Erasable Programmable ROM (EPROM)
o It is read and written electrically. However, before a write operation, all
the storage cells must be erased to the same initial state by exposure of the
packaged chip to ultraviolet radiation (UV ray). Erasure is performed by
shining an intense ultraviolet light through a window that is designed into
the memory chip. EPROM is erased optically and is more expensive than
PROM, but it has the advantage of the multiple-update capability.
External Memory
The devices that provide backup storage are called external memory or auxiliary
memory. It includes serial access type such as magnetic tapes and random access
type such as magnetic disks.
Magnetic Tape
A magnetic tape is a strip of plastic coated with a magnetic recording medium.
Data can be recorded and read as a sequence of characters through a read/write
head. The tape can be stopped, started moving forward or in reverse, or rewound.
Data on tapes are structured as a number of parallel tracks running lengthwise.
Earlier tape systems typically used nine tracks. This made it possible to store data
one byte at a time, with an additional parity bit as the 9th track. The recording of data in
this form is referred to as parallel recording.
Magnetic Disk
A magnetic disk is a circular plate constructed of metal or plastic coated with
magnetic material. Often both sides of the disk are used, and several disks may be stacked on one
spindle, with a read/write head available on each surface. All disks rotate together
at high speed. Bits are stored on the magnetized surface in spots along concentric
circles called tracks. The tracks are commonly divided into sections called
sectors. After the read/write head is positioned on the specified track, the system has
to wait until the rotating disk brings the specified sector under the read/write head.
Information transfer is very fast once the beginning of a sector has been reached.
Disks that are permanently attached to the unit assembly and cannot be removed by the
occasional user are called hard disks; a drive with a removable disk is called a floppy disk.
Optical Disk
The huge commercial success of the CD enabled the development of low-cost optical
disk storage technology that has revolutionized computer data storage. The disk is
formed from a resin such as polycarbonate. Digitally recorded information is
imprinted as a series of microscopic pits on the surface of the polycarbonate. This is
done with a finely focused, high-intensity laser. The pitted surface is then
coated with a reflecting layer, usually aluminum or gold. The shiny surface is
protected against dust and scratches by a top coat of acrylic.
Information is retrieved from a CD by a low-power laser. The intensity of the reflected
laser light changes as it encounters a pit. Specifically, if the laser beam falls on a
pit, which has a somewhat rough surface, the light scatters and a low intensity is
reflected back. The areas between pits are called lands. A land is a
smooth surface which reflects at higher intensity. The change between pits
and lands is detected by a photosensor and converted into a digital signal. The sensor
tests the surface at regular intervals.
DVD-Technology
Multi-layer
Very high capacity (4.7G per layer)
Full length movie on single disk
Using MPEG compression
Finally standardized (honest!)
Movies carry regional coding
Players only play correct region films
DVD-Writable
Loads of trouble with standards
First generation DVD drives may not read first generation DVD-W disks
First generation DVD drives may not read CD-RW disks
Cache memory is intended to give memory speed approaching that of the fastest
memories available, and at the same time provide a large memory size at the price of less
expensive types of semiconductor memories. There is a relatively large and slow main
memory together with a smaller, faster cache memory that contains a copy of portions of
main memory.
When the processor attempts to read a word of memory, a check is made to determine if
the word is in the cache. If so, the word is delivered to the processor. If not, a block of
main memory, consisting of a fixed number of words, is read into the cache and then the
word is delivered to the processor.
The locality of reference property states that over a short interval of time, the addresses
generated by a typical program refer to a few localized areas of memory repeatedly. So if
programs and data which are accessed frequently are placed in a fast memory, the
average access time can be reduced. This type of small, fast memory is called cache
memory, and it is placed between the CPU and the main memory.
When the CPU needs to access memory, the cache is examined first. If the word is found in
the cache, it is read from the cache; if the word is not found in the cache, main memory is
accessed to read the word. A block of words containing the one just accessed is then
transferred from main memory to cache memory.
When a cache hit occurs, the data and address buffers are disabled and the
communication is only between processor and cache with no system bus traffic. When a
cache miss occurs, the desired word is first read into the cache and then transferred from
cache to processor. In the latter case, the cache is physically interposed between the
processor and main memory for all data, address and control lines.
Locality of Reference
The reference to memory at any given interval of time tends to be confined within
a few localized areas of memory. This property is called locality of reference. This
is possible because program loops and subroutine calls are encountered
frequently. When a program loop is executed, the CPU executes the same portion of
the program repeatedly. Similarly, when a subroutine is called, the CPU fetches the
starting address of the subroutine and executes the subroutine program. Thus loops
and subroutines localize references to memory.
This principle states that memory references tend to cluster: over a long period of
time the clusters in use change, but over a short period of time the processor works
primarily with fixed clusters of memory references.
Spatial Locality
It refers to the tendency of execution to involve a number of memory locations
that are clustered.
It reflects the tendency of a program to access data locations sequentially, such as
when processing a table of data.
Temporal Locality
It refers to the tendency for a processor to access memory locations that have been
used frequently. For e.g. Iteration loops executes same set of instructions
repeatedly.
Direct Mapping
It is the simplest technique: it maps each block of main memory into only one possible
cache line, i.e. a given main memory block can be placed in one and only one place in the
cache.
i = j modulo m
where i = cache line number, j = main memory block number, and m = number of lines in
the cache.
The mapping function is easily implemented using the address. For purposes of cache
access, each main memory address can be viewed as consisting of three fields.
The least significant w bits identify a unique word or byte within a block of main
memory. The remaining s bits specify one of the 2^s blocks of main memory.
The cache logic interprets these s bits as a tag of (s - r) bits in the most significant positions and a
line field of r bits. The latter field identifies one of the m = 2^r lines of the cache.
24 bit address
2 bit word identifier (4 byte block)
22 bit block identifier
8 bit tag (=22-14), 14 bit slot or line
No two blocks in the same line have the same Tag field
Check contents of cache by finding line and checking Tag
Cache line           0   1   2   3   4
Main memory blocks   0   1   2   3   4
                     5   6   7   8   9
                    10  11  12  13  14
                    15  16  17  18  19
                    20  21  22  23  24
Note that
o all locations in a single block of memory have the same higher order bits (call them the
block number), so the lower order bits can be used to find a particular word in the block.
o within those higher-order bits, their lower-order bits obey the modulo mapping given
above (assuming that the number of cache lines is a power of 2), so they can be used to
get the cache line for that block
o the remaining bits of the block number become a tag, stored with each cache line, and
used to distinguish one block from another that could fit into that same cache
line.
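The field split of the 24-bit example above (8-bit tag, 14-bit line, 2-bit word) can be sketched directly; a minimal sketch:

```python
# Splitting a 24-bit address into tag / line / word fields, using the
# field widths of the example above (8-bit tag, 14-bit line, 2-bit word).
WORD_BITS, LINE_BITS = 2, 14

def split_address(addr):
    word = addr & ((1 << WORD_BITS) - 1)                  # word within the block
    line = (addr >> WORD_BITS) & ((1 << LINE_BITS) - 1)   # cache line (i = j mod 2^14)
    tag  = addr >> (WORD_BITS + LINE_BITS)                # remaining high-order bits
    return tag, line, word
```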
o If a program repeatedly accesses two blocks that map to the same line, cache
misses become very frequent.
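This conflict-miss behavior can be demonstrated with a toy direct-mapped cache (assumed parameters: 4 lines, one block per line). Alternating between two blocks that share a line misses on every single access:

```python
# Toy direct-mapped cache: each line remembers which block it currently holds.
m = 4                      # number of cache lines (assumed)
resident = [None] * m      # block number stored in each line
misses = 0

# Blocks 0 and 4 both map to line 0 (0 % 4 == 4 % 4), so they evict each other.
for block in [0, 4, 0, 4, 0, 4]:
    line = block % m
    if resident[line] != block:
        misses += 1              # miss: fetch the block, evicting the old one
        resident[line] = block

print(misses)  # 6 misses out of 6 accesses
```

An associative or set-associative cache would avoid this pattern, since the two blocks could coexist.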
Associative Mapping
It overcomes the disadvantage of direct mapping by permitting each main memory block
to be loaded into any line of the cache.
The cache control logic interprets a memory address simply as a tag and a word field.
The tag uniquely identifies a block of main memory.
The cache control logic must simultaneously examine every line's tag for a match, which
requires a fully associative memory:
very complex circuitry, whose complexity increases exponentially with size
cache searching gets expensive
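A rough software sketch of the lookup follows; in hardware all tag comparators work in parallel, so the sequential loop here only stands in for that parallel compare. The tags and contents are assumed toy values:

```python
# Fully associative lookup: any line may hold any block, so every stored tag
# must be compared against the requested tag.
cache = [
    (0x1A, "block A data"),   # (tag, block contents) - assumed toy entries
    (0x2B, "block B data"),
]

def lookup(tag: int):
    """Compare the tag against every line (parallel in hardware)."""
    for stored_tag, data in cache:
        if stored_tag == tag:
            return data        # hit
    return None                # miss

print(lookup(0x1A))            # hit  -> "block A data"
print(lookup(0x3C))            # miss -> None
```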
e.g.
Address    Tag   Data       Set number
1FF7FFC    1FF   12345678   1FFF
0017FFC    001   11223344   1FFF
(Two blocks with different tags can occupy the same set number, 1FFF.)
Write Through
All write operations are made to main memory as well as to the cache, so main
memory is always valid and both copies always agree.
Other CPUs can monitor traffic to main memory to keep their own (local) caches
up to date.
Drawbacks: every write to the cache also generates a write to main memory,
producing substantial memory traffic that may create a bottleneck and slow
down writes.
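A minimal sketch of the policy, using assumed toy dictionaries for the cache and main memory: every store updates both copies, so they never disagree.

```python
# Write-through sketch: a store updates the cache line AND main memory.
memory = {0x100: 0}        # main memory (assumed toy contents)
cache = {0x100: 0}         # cached copy of the same word

def write_through(addr: int, value: int) -> None:
    if addr in cache:
        cache[addr] = value    # update the cached copy on a hit
    memory[addr] = value       # ...and always update main memory too

write_through(0x100, 42)
print(cache[0x100], memory[0x100])  # 42 42 - both copies agree
```

The cost is visible in the code: `memory[addr]` is written on every store, which is exactly the traffic the text warns about.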
Write back
During a write, only the contents of the cache are changed; an UPDATE (dirty)
bit associated with that slot is set.
Main memory is updated only when the cache line is to be replaced: if the
UPDATE bit is set, the block is written back first.
Causes “cache coherency” problems: different values for the contents of an
address can exist in the cache and in main memory
Complex circuitry is needed to avoid this problem
Accesses by I/O modules must occur through the cache
Multiple caches can still become invalidated, unless some cache coherency
system is used. Such systems include:
o Bus Watching with Write Through - each cache monitors memory writes
made by other caches (using write through) and invalidates its own cache
line on a match
o Hardware Transparency - additional hardware links multiple caches so
that writes to one cache are made to the others
o Non-cacheable Memory - only a portion of main memory is shared by
more than one processor, and it is non-cacheable
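The write-back mechanism, including the coherency window it opens, can be sketched with an assumed single-line toy cache. Stores only set the UPDATE (dirty) bit; main memory is written when the line is evicted:

```python
# Write-back sketch: memory is updated only on replacement of a dirty line.
memory = {0x100: 0, 0x200: 7}                       # assumed toy memory
line = {"addr": 0x100, "value": 0, "dirty": False}  # single cache line

def write(addr: int, value: int) -> None:
    assert line["addr"] == addr      # assume a cache hit for this sketch
    line["value"] = value
    line["dirty"] = True             # main memory is now stale

def evict_and_load(new_addr: int) -> None:
    if line["dirty"]:                # write the old block back first
        memory[line["addr"]] = line["value"]
    line.update(addr=new_addr, value=memory[new_addr], dirty=False)

write(0x100, 99)
print(memory[0x100])   # 0  - cache and memory disagree (coherency problem)
evict_and_load(0x200)
print(memory[0x100])   # 99 - written back on replacement
```

The window between `write` and `evict_and_load`, where `memory[0x100]` is stale, is exactly why I/O modules and other caches need a coherency scheme.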
Split Cache
o The cache is split into two parts: one for instructions and one for data. A
split cache can outperform a unified cache in systems that support parallel
execution and pipelining, because it reduces cache contention.
o The trend is toward split caches because of superscalar CPUs.
o Better for pipelining, pre-fetching, and other parallel instruction execution
designs.
o Eliminates cache contention between the instruction fetch unit and the
execution unit (which accesses data).