
MODULE 4

Input/Output Organization: Accessing I/O Devices, Interrupts – Interrupt Hardware, Enabling and
Disabling Interrupts, Handling Multiple Devices, Direct Memory Access: Bus Arbitration, Speed, Size
and Cost of Memory Systems. Cache Memories – Mapping Functions.

Session 25

ACCESSING I/O DEVICES

A simple arrangement to connect I/O devices to a computer is to use a single bus


arrangement. The bus enables all the devices connected to it to exchange information.
Typically, it consists of three sets of lines used to carry address, data, and control signals. Each
I/O device is assigned a unique set of addresses. When the processor places a particular address
on the address line, the device that recognizes this address responds to the commands issued
on the control lines. The processor requests either a read or a write operation, and the
requested data are transferred over the data lines. When I/O devices and the memory share
the same address space, the arrangement is called memory-mapped I/O.

Figure 4.1: Single-bus architecture


With memory-mapped I/O, any machine instruction that can access memory can be
used to transfer data to or from an I/O device. For example, if DATAIN is the address of the
input buffer associated with the keyboard, the instruction

Move DATAIN, R0

reads the data from DATAIN and stores them into processor register R0. Similarly, the
instruction

Move R0, DATAOUT


sends the contents of register R0 to location DATAOUT, which may be the output data buffer
of a display unit or a printer.

Most computer systems use memory-mapped I/O. Some processors have special In and Out
instructions to perform I/O transfers. When building a computer system based on these
processors, the designer has the option of connecting I/O devices to the special I/O address
space or simply incorporating them as part of the memory address space. The I/O devices
examine the low-order bits of the address bus to determine whether they should respond.
Figure 4.2 shows the hardware required to connect an I/O device to the bus.
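With memory-mapped I/O, ordinary load and store instructions reach device registers. The following is a minimal runnable sketch that simulates the shared address space with a Python dictionary; the addresses (0x4000, 0x4004) and helper names are illustrative assumptions, not from the text:

```python
# Simulated memory-mapped I/O: device registers live at addresses inside
# the ordinary address space, so plain load/store instructions reach them.
memory = {}          # one flat address space shared by memory and devices
DATAIN  = 0x4000     # keyboard input buffer address (assumed)
DATAOUT = 0x4004     # display output buffer address (assumed)

def move_from(addr):          # Move DATAIN, R0  ->  R0 = [DATAIN]
    return memory[addr]

def move_to(addr, value):     # Move R0, DATAOUT ->  [DATAOUT] = R0
    memory[addr] = value

memory[DATAIN] = ord('A')     # pretend the keyboard latched 'A'
r0 = move_from(DATAIN)        # Move DATAIN, R0
move_to(DATAOUT, r0)          # Move R0, DATAOUT
assert memory[DATAOUT] == ord('A')
```

The point of the sketch is that no special I/O instruction appears anywhere: both transfers are ordinary memory reads and writes.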

I/O INTERFACE FOR AN INPUT DEVICE:

Figure 4.2: I/O Device interface

The address decoder enables the device to recognize its address when this address appears on
the address lines.

The data register holds the data being transferred to or from the processor.

The status register contains information relevant to the operation of the I/O device. Both the
data and status registers are connected to the data bus and assigned unique addresses.

The address decoder, the data and status registers, and the control circuitry required to
coordinate I/O transfers constitute the device’s interface circuit.

I/O devices operate at speeds that are vastly different from that of the processor. When a
human operator is entering characters at a keyboard, the processor is capable of executing
millions of instructions between successive character entries. An instruction that reads a
character from the keyboard should be executed only when a character is available in the input
buffer of the keyboard interface. Also, we must make sure that an input character is read only
once.
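The synchronization requirement above can be sketched as a small simulation. The SIN status flag follows the usual naming for the keyboard interface flag; the class and method names are otherwise invented for illustration:

```python
# Polling sketch: read a character only when the status register says one
# is available, and read it exactly once. SIN = 1 means a character is in
# the input buffer; reading the data register clears SIN.
class KeyboardInterface:
    def __init__(self):
        self.SIN = 0          # status flag: 1 = character available
        self.DATAIN = 0       # data register (input buffer)

    def key_pressed(self, ch):        # device side: latch a character
        self.DATAIN = ord(ch)
        self.SIN = 1

    def read(self):                   # processor side: read clears SIN
        self.SIN = 0
        return self.DATAIN

kbd = KeyboardInterface()
kbd.key_pressed('K')

received = []
while kbd.SIN:                        # busy-wait: poll the status flag
    received.append(chr(kbd.read()))

assert received == ['K']              # the character is read once, and only once
```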

Review Questions

1. What is the difference between memory-mapped I/O and I/O-mapped I/O?

2. What is the function of the address decoder?

3. Which two flags are used to check for the presence of characters in the registers?
Session 26

Interrupts

In program-controlled I/O, the processor repeatedly checks a status flag to achieve the
required synchronization between the processor and an input or output device. We say that
the processor polls the device. There are two other commonly used mechanisms for
implementing I/O operations: interrupts and direct memory access. In the case of
interrupts, synchronization is achieved by having the I/O device send a special signal over the
bus whenever it is ready for a data transfer operation. Direct memory access is a technique
used for high-speed I/O devices. It involves having the device interface transfer data directly
to or from the memory, without continuous involvement by the processor. The routine executed
in response to an interrupt request is called the interrupt-service routine. Interrupts bear
considerable resemblance to subroutine calls. Assume that an interrupt request arrives during
execution of instruction i, as shown in Figure 4.3.

Figure 4.3: Transfer of control through the use of interrupts

The processor first completes execution of instruction i. Then, it loads the program counter
with the address of the first instruction of the interrupt-service routine. For the time being, let
us assume that this address is hardwired in the processor. After execution of the interrupt-
service routine, the processor has to come back to instruction i +1. Therefore, when an interrupt
occurs, the current contents of the PC, which point to instruction i+1, must be put in temporary
storage in a known location. A Return-from-interrupt instruction at the end of the interrupt-
service routine reloads the PC from the temporary storage location, causing execution to
resume at instruction i +1. In many processors, the return address is saved on the processor
stack.
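The save-and-restore of the PC can be sketched as a toy simulation, with the ISR address hardwired as the text assumes; the "program" is just a list of instruction labels, so everything here is illustrative:

```python
# Interrupt entry and return: the processor completes instruction i, saves
# the PC (which already points at i+1) on the stack, runs the interrupt-
# service routine, and Return-from-interrupt restores the PC.
stack = []
trace = []

def interrupt_service_routine():
    trace.append("ISR")

def run(program, interrupt_at):
    pc = 0
    while pc < len(program):
        trace.append(program[pc])      # execute instruction i
        pc += 1                        # PC now points at i+1
        if pc == interrupt_at:         # request arrived during instruction i
            stack.append(pc)           # save return address on the stack
            interrupt_service_routine()
            pc = stack.pop()           # Return-from-interrupt reloads PC

run(["i-1", "i", "i+1"], interrupt_at=2)
assert trace == ["i-1", "i", "ISR", "i+1"]   # resumes exactly at i+1
```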
INTERRUPT HARDWARE

Figure 4.4: Interrupt hardware with multiple I/O devices

We pointed out that an I/O device requests an interrupt by activating a bus line called interrupt-
request. Most computers are likely to have several I/O devices that can request an interrupt. A
single interrupt-request line may be used to serve n devices as depicted. All devices are
connected to the line via switches to ground. To request an interrupt, a device closes its
associated switch. Thus, if all interrupt-request signals INTR1 to INTRn are inactive, that is, if
all switches are open, the voltage on the interrupt-request line will be equal to Vdd. This is the
inactive state of the line. Since the closing of one or more switches will cause the line voltage
to drop to 0, the value of INTR is the logical OR of the requests from individual devices, that is,

INTR = INTR1 + INTR2 + · · · + INTRn

It is customary to use the complemented (active-low) form, INTR, to name the interrupt-request
signal on the common line, because this signal is active when in the low-voltage state.
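The wired-OR behaviour of the common line can be sketched in a few lines of code; the supply value of 5 V for Vdd is an assumption for illustration:

```python
# Wired-OR interrupt-request line: each device's switch pulls the common
# line to 0 when closed; with all switches open the line sits at Vdd.
# The line is active low, and its logical value is INTR1 + ... + INTRn.
def intr_line(requests, vdd=5.0):
    """requests: list of INTR1..INTRn values (1 = switch closed)."""
    intr = int(any(requests))          # logical OR of all device requests
    voltage = 0.0 if intr else vdd     # closing any switch drops the line
    return intr, voltage

assert intr_line([0, 0, 0]) == (0, 5.0)   # inactive state: line at Vdd
assert intr_line([0, 1, 0]) == (1, 0.0)   # one closed switch is enough
```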

Review Questions

1. What is an interrupt?
2. What is an interrupt-service routine?
Session 27

ENABLING AND DISABLING INTERRUPTS

The facilities provided in a computer must give the programmer complete control over
the events that take place during program execution. The arrival of an interrupt request from
an external device causes the processor to suspend the execution of one program and start the
execution of another. Because interrupts can arrive at any time, they may alter the sequence
of events from that envisaged by the programmer. Hence, the interruption of program execution
must be carefully controlled.

Let us consider in detail the specific case of a single interrupt request from one device.
When a device activates the interrupt-request signal, it keeps this signal activated until it learns
that the processor has accepted its request. This means that the interrupt-request signal will
be active during execution of the interrupt-service routine, perhaps until an instruction is
reached that accesses the device in question.

Method 1: The first possibility is to have the processor hardware ignore the interrupt-request
line until the execution of the first instruction of the interrupt-service routine has been
completed. Then, by using an Interrupt-disable instruction as the first instruction in the
interrupt-service routine, the programmer can ensure that no further interruptions will occur
until an Interrupt-enable instruction is executed. Typically, the Interrupt-enable instruction will
be the last instruction in the interrupt-service routine before the Return-from-interrupt
instruction. The processor must guarantee that execution of the Return-from-interrupt
instruction is completed before further interruption can occur.

Method 2: The second option, which is suitable for a simple processor with only one interrupt-
request line, is to have the processor automatically disable interrupts before starting the
execution of the interrupt-service routine. After saving the contents of the PC and the processor
status register (PS) on the stack, the processor performs the equivalent of executing an
Interrupt-disable instruction. It is often the case that one bit in the PS register, called Interrupt-
enable, indicates whether interrupts are enabled.

Method 3: In the third option, the processor has a special interrupt-request line for which the
interrupt-handling circuit responds only to the leading edge of the signal. Such a line is said to
be edge-triggered.
Before proceeding to study more complex aspects of interrupts, let us summarize the
sequence of events involved in handling an interrupt request from a single device. Assuming
that interrupts are enabled, the following is a typical scenario.

1. The device raises an interrupt request.


2. The processor interrupts the program currently being executed.
3. Interrupts are disabled by changing the control bits in the PS (except in the case of edge-
triggered interrupts).
4. The device is informed that its request has been recognized, and in response, it
deactivates the interrupt-request signal.
5. The action requested by the interrupt is performed by the interrupt-service routine.
6. Interrupts are enabled and execution of the interrupted program is resumed.
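The six steps above can be sketched as a simulation; the Device class, the PS dictionary with its Interrupt-enable bit, and the log strings are all illustrative stand-ins for what is really done in hardware:

```python
# The typical interrupt scenario: the device holds its request active
# until step 4 acknowledges it; the Interrupt-enable bit in the PS is
# cleared on entry (as in Method 2) and restored before resuming.
class Device:
    def __init__(self):
        self.irq = False
    def raise_request(self):            # step 1: device raises a request
        self.irq = True
    def acknowledge(self):              # step 4: request is deactivated
        self.irq = False

log = []

def handle_interrupt(dev, ps):
    log.append("suspend program")       # step 2: interrupt the program
    ps["IE"] = 0                        # step 3: disable interrupts via PS
    dev.acknowledge()                   # step 4: inform the device
    log.append("service routine runs")  # step 5: perform requested action
    ps["IE"] = 1                        # step 6: re-enable and resume
    log.append("resume program")

dev, ps = Device(), {"IE": 1}
dev.raise_request()
if ps["IE"] and dev.irq:                # accepted only if interrupts enabled
    handle_interrupt(dev, ps)
assert not dev.irq and ps["IE"] == 1
assert log == ["suspend program", "service routine runs", "resume program"]
```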

Review Questions

1. Why must the interruption of program execution be carefully controlled?


2. What is the role of the interrupt-request signal in program execution?
3. How are interrupts enabled?
4. How are interrupts disabled?
Session 28

HANDLING MULTIPLE DEVICES

Let us now consider the situation where a number of devices capable of initiating interrupts
are connected to the processor. Because these devices are operationally independent, there is
no definite order in which they will generate interrupts. For example, device X may request an
interrupt while an interrupt caused by device Y is being serviced, or several devices may request
interrupts at exactly the same time. This gives rise to a number of questions:

1. How can the processor recognize the device requesting an interrupt?


2. Given that different devices are likely to require different interrupt-service routines, how
can the processor obtain the starting address of the appropriate routine in each case?
3. Should a device be allowed to interrupt the processor while another interrupt is being
serviced?
4. How should two or more simultaneous interrupt requests be handled?

The means by which these problems are resolved vary from one computer to another,
and the approach taken is an important consideration in determining the computer’s suitability
for a given application.
When a request is received over the common interrupt-request line, additional
information is needed to identify the particular device that activated the line. The information
needed to determine whether a device is requesting an interrupt is available in its status
register. When a device raises an interrupt request, it sets to 1 one of the bits in its status
register, which we will call the IRQ bit. For example, bits KIRQ and DIRQ are the interrupt
request bits for the keyboard and the display, respectively. The simplest way to identify the
interrupting device is to have the interrupt-service routine poll all the I/O devices connected to
the bus. The first device encountered with its IRQ bit set is the device that should be serviced.
An appropriate subroutine is called to provide the requested service.

The polling scheme is easy to implement. Its main disadvantage is the time spent interrogating
the IRQ bits of all the devices that may not be requesting any service. An alternative approach
is to use vectored interrupts, which we describe next.
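The polling scheme for identifying the interrupting device can be sketched as follows; the device list and the convention that servicing a device clears its IRQ bit are illustrative assumptions:

```python
# Device identification by polling: the interrupt-service routine scans
# the status register of each device on the bus and services the first
# one whose IRQ bit is set (KIRQ for the keyboard, DIRQ for the display,
# following the text's naming).
devices = [
    {"name": "keyboard", "IRQ": 0},    # KIRQ clear: not requesting
    {"name": "display",  "IRQ": 1},    # DIRQ set: requesting service
    {"name": "printer",  "IRQ": 0},
]

def poll(devs):
    for d in devs:                     # polling order fixes the priority
        if d["IRQ"]:
            d["IRQ"] = 0               # servicing clears the request
            return d["name"]
    return None                        # time wasted if nothing is pending

assert poll(devices) == "display"      # first device found with IRQ set
assert poll(devices) is None           # no pending requests remain
```

The `None` branch is exactly the disadvantage the text describes: the loop interrogates every device even when few, or none, need service.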

Vectored Interrupts:-

To reduce the time involved in the polling process, a device requesting an interrupt may
identify itself directly to the processor. Then, the processor can immediately start executing the
corresponding interrupt-service routine. The term vectored interrupts refers to all interrupt-
handling schemes based on this approach.

A device requesting an interrupt can identify itself by sending a special code to the
processor over the bus. This enables the processor to identify individual devices even if they
share a single interrupt-request line. The code supplied by the device may represent the
starting address of the interrupt-service routine for that device. The code length is typically in
the range of 4 to 8 bits. The remainder of the address is supplied by the processor based on
the area in its memory where the addresses for interrupt-service routines are located.

This arrangement implies that the interrupt-service routine for a given device must always
start at the same location. The programmer can gain some flexibility by storing in this location
an instruction that causes a branch to the appropriate routine.
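A minimal sketch of the vectored approach, using a Python dictionary of callables to stand in for the fixed interrupt-vector area in memory; the device codes and routine names are invented for illustration:

```python
# Vectored interrupts: the device sends a short code over the bus; the
# processor uses it to index a table at a fixed place in memory, whose
# entry branches to that device's interrupt-service routine.
serviced = []

vector_table = {                       # fixed locations in memory
    0x0: lambda: serviced.append("keyboard ISR"),
    0x1: lambda: serviced.append("display ISR"),
}

def take_vectored_interrupt(device_code):
    isr = vector_table[device_code]    # the code selects the table entry
    isr()                              # branch to the routine it names

take_vectored_interrupt(0x1)           # no polling loop needed
assert serviced == ["display ISR"]
```

Storing a branch (here, a callable) rather than the routine itself mirrors the flexibility the text mentions: the fixed location can redirect to any routine the programmer chooses.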

Interrupt Nesting:

Interrupts should be disabled during the execution of an interrupt-service routine, to ensure


that a request from one device will not cause more than one interruption. The same
arrangement is often used when several devices are involved, in which case execution of a
given interrupt-service routine, once started, always continues to completion before the
processor accepts an interrupt request from a second device. Interrupt-service routines are
typically short, and the delay they may cause is acceptable for most simple devices.

For some devices, however, a long delay in responding to an interrupt request may lead
to erroneous operation. Consider, for example, a computer that keeps track of the time of day
using a real-time clock. This is a device that sends interrupt requests to the processor at regular
intervals. For each of these requests, the processor executes a short interrupt-service routine
to increment a set of counters in the memory that keep track of time in seconds, minutes, and
so on. Proper operation requires that the delay in responding to an interrupt request from the
real-time clock be small in comparison with the interval between two successive requests. To
ensure that this requirement is satisfied in the presence of other interrupting devices, it may
be necessary to accept an interrupt request from the clock during the execution of an interrupt-
service routine for another device

A multiple-level priority organization means that during execution of an interrupt-service


routine, interrupt requests will be accepted from some devices but not from others, depending
upon the device’s priority. To implement this scheme, we can assign a priority level to the
processor that can be changed under program control. The priority level of the processor is the
priority of the program that is currently being executed. The processor accepts interrupts only
from devices that have priorities higher than its own.
The processor’s priority is usually encoded in a few bits of the processor status word. It
can be changed by program instructions that write into the PS. These are privileged instructions,
which can be executed only while the processor is running in the supervisor mode. The
processor is in the supervisor mode only when executing operating system routines. It switches
to the user mode before beginning to execute application programs. Thus, a user program
cannot accidentally, or intentionally, change the priority of the processor and disrupt the
system’s operation. An attempt to execute a privileged instruction while in the user mode leads
to a special type of interrupt called a privilege exception.

A multiple-priority scheme can be implemented easily by using separate interrupt-


request and interrupt-acknowledge lines for each device, as shown in figure. Each of the
interrupt-request lines is assigned a different priority level. Interrupt requests received over
these lines are sent to a priority arbitration circuit in the processor. A request is accepted only
if it has a higher priority level than that currently assigned to the processor.
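The acceptance rule used by the priority arbitration circuit can be stated in one line of code; the priority values below are arbitrary illustrative choices:

```python
# Multiple-level priority: a request is accepted only when the requesting
# device's priority is strictly higher than the processor's current
# priority (which equals the priority of the running program).
def accept_interrupt(device_priority, processor_priority):
    return device_priority > processor_priority

# While servicing a level-3 device the processor runs at level 3, so an
# equal- or lower-priority request is held off, but a level-5 request
# (say, the real-time clock) still gets through.
assert accept_interrupt(device_priority=5, processor_priority=3)
assert not accept_interrupt(device_priority=3, processor_priority=3)
assert not accept_interrupt(device_priority=2, processor_priority=3)
```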

Figure 2: Implementation of interrupt priority using individual interrupt-request and
acknowledge lines (requests are resolved by a priority arbitration circuit in the processor).

Simultaneous Requests:-

Let us now consider the problem of simultaneous arrivals of interrupt requests from two
or more devices. The processor must have some means of deciding which requests to service
first. Using a priority scheme such as that of figure, the solution is straightforward. The
processor simply accepts the requests having the highest priority.

Polling the status registers of the I/O devices is the simplest such mechanism. In this
case, priority is determined by the order in which the devices are polled. When vectored
interrupts are used, we must ensure that only one device is selected to send its interrupt vector
code. A widely used scheme is to connect the devices to form a daisy chain, as shown in
Figure 3a. The interrupt-request line INTR is common to all devices. The interrupt-acknowledge
line, INTA, is connected in a daisy-chain fashion, such that the INTA signal propagates serially
through the devices.

Figure 3a: Daisy chain

Figure 3b: Arrangement of priority groups

When several devices raise an interrupt request and the INTR line is activated, the processor
responds by setting the INTA line to 1. This signal is received by device 1. Device 1 passes the
signal on to device 2 only if it does not require any service. If device 1 has a pending request
for interrupt, it blocks the INTA signal and proceeds to put its identifying code on the data lines.
Therefore, in the daisy-chain arrangement, the device that is electrically closest to the
processor has the highest priority. The second device along the chain has second highest
priority, and so on.
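The propagation of INTA along the chain can be sketched as a scan starting at the device closest to the processor; returning the winning position stands in for that device blocking INTA and placing its code on the data lines:

```python
# Daisy-chain acknowledge: INTA travels device by device; the first device
# with a pending request blocks the signal, so electrical closeness to the
# processor fixes the priority order.
def daisy_chain_acknowledge(pending):
    """pending: per-device request flags, index 0 closest to the processor."""
    for position, has_request in enumerate(pending):
        if has_request:
            return position            # this device blocks INTA here
    return None                        # INTA passed through the whole chain

assert daisy_chain_acknowledge([0, 1, 1]) == 1   # closest requester wins
assert daisy_chain_acknowledge([1, 1, 1]) == 0
assert daisy_chain_acknowledge([0, 0, 0]) is None
```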

The scheme in figure 3.a requires considerably fewer wires than the individual
connections in figure 2. The main advantage of the scheme in figure 2 is that it allows the
processor to accept interrupt requests from some devices but not from others, depending upon
their priorities. The two schemes may be combined to produce the more general structure in
figure 3b. Devices are organized in groups, and each group is connected at a different priority
level. Within a group, devices are connected in a daisy chain. This organization is used in many
computer systems.

Review Questions

1. What is the IRQ bit?


2. What is the main disadvantage of the polling scheme for identifying interrupting devices?
3. What is a vectored interrupt?
4. How does a device identify itself to the processor in a vectored interrupt system?
5. What is interrupt nesting?
6. Why should interrupts be disabled during the execution of an interrupt-service routine?
7. What is a multiple-level priority organization in the context of interrupts?
8. In a priority scheme, how does a processor decide which interrupt request to service
first?
9. What is the purpose of the priority arbitration circuit in an interrupt-handling system?
10.What is the daisy chain method used for in handling simultaneous interrupt requests?
Session 29

DIRECT MEMORY ACCESS:-

The discussion in the previous sections concentrates on data transfer between the
processor and I/O devices. Data are transferred by executing instructions such as

Move DATAIN, R0

An instruction to transfer input or output data is executed only after the processor
determines that the I/O device is ready. To do this, the processor either polls a status flag in
the device interface or waits for the device to send an interrupt request. In either case,
considerable overhead is incurred, because several program instructions must be executed for
each data word transferred. In addition to polling the status register of the device, instructions
are needed for incrementing the memory address and keeping track of the word count. When
interrupts are used, there is the additional overhead associated with saving and restoring the
program counter and other state information.

To transfer large blocks of data at high speed, an alternative approach is used. A special
control unit may be provided to allow transfer of a block of data directly between an external
device and the main memory, without continuous intervention by the processor. This approach
is called direct memory access, or DMA.

DMA transfers are performed by a control circuit that is part of the I/O device interface.
We refer to this circuit as a DMA controller. The DMA controller performs the functions that
would normally be carried out by the processor when accessing the main memory. For each
word transferred, it provides the memory address and all the bus signals that control data
transfer. Since it has to transfer blocks of data, the DMA controller must increment the memory
address for successive words and keep track of the number of transfers.

Although a DMA controller can transfer data without intervention by the processor, its
operation must be under the control of a program executed by the processor. To initiate the
transfer of a block of words, the processor sends the starting address, the number of words in
the block, and the direction of the transfer. On receiving this information, the DMA controller
proceeds to perform the requested operation. When the entire block has been transferred, the
controller informs the processor by raising an interrupt signal.
While a DMA transfer is taking place, the program that requested the transfer cannot
continue, and the processor can be used to execute another program. After the
DMA transfer is completed, the processor can return to the program that requested the transfer.
I/O operations are always performed by the operating system of the computer in
response to a request from an application program. The OS is also responsible for suspending
the execution of one program and starting another. Thus, for an I/O operation involving DMA,
the OS puts the program that requested the transfer in the Blocked state, initiates the DMA
operation, and starts the execution of another program. When the transfer is completed, the
DMA controller informs the processor by sending an interrupt request. In response, the OS puts
the suspended program in the Runnable state so that it can be selected by the scheduler to
continue execution.

Figure 4 shows an example of the DMA controller registers that are accessed by the processor
to initiate transfer operations. Two registers are used for storing the starting address and the
word count. The third register contains status and control flags.

Figure 4: Registers in a DMA interface

Figure 5: Use of DMA controllers in a computer system

The R/W bit determines the direction of the transfer. When this bit is set to 1 by a program
instruction, the controller performs a read operation, that is, it transfers data from the memory
to the I/O device. Otherwise, it performs a write operation. When the controller has completed
transferring a block of data and is ready to receive another command, it sets the Done flag to
1. Bit 30 is the Interrupt-enable flag, IE. When this flag is set to 1, it causes the controller to
raise an interrupt after it has completed transferring a block of data. Finally, the controller sets
the IRQ bit to 1 when it has requested an interrupt.
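The flag behaviour can be sketched with bit masks. Only the IE position (bit 30) is stated in the text; the positions assumed here for Done, IRQ, and R/W are illustrative:

```python
# Bit-level sketch of the DMA status/control register described above.
R_W  = 1 << 0    # 1 = read (memory -> I/O device), 0 = write (assumed position)
IRQ  = 1 << 29   # controller has requested an interrupt (assumed position)
IE   = 1 << 30   # Interrupt-enable flag (bit 30, per the text)
DONE = 1 << 31   # block finished, ready for a new command (assumed position)

def finish_block(status):
    """What the controller does at the end of a block transfer."""
    status |= DONE                     # set the Done flag
    if status & IE:                    # raise an interrupt only if enabled
        status |= IRQ
    return status

s = finish_block(R_W | IE)             # a read transfer with interrupts on
assert s & DONE and s & IRQ
s = finish_block(R_W)                  # interrupts disabled: Done but no IRQ
assert s & DONE and not (s & IRQ)
```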

An example of a computer system is given in above figure, showing how DMA controllers
may be used. A DMA controller connects a high-speed network to the computer bus. The disk
controller, which controls two disks, also has DMA capability and provides two DMA channels.
It can perform two independent DMA operations, as if each disk had its own DMA controller.
The registers needed to store the memory address, the word count, and so on are duplicated,
so that one set can be used with each device.

To start a DMA transfer of a block of data from the main memory to one of the disks, a
program writes the address and word count information into the registers of the corresponding
channel of the disk controller. It also provides the disk controller with information to identify
the data for future retrieval. The DMA controller proceeds independently to implement the
specified operation. When the DMA transfer is completed, this fact is recorded in the status
and control register of the DMA channel by setting the Done bit. At the same time, if the IE bit
is set, the controller sends an interrupt request to the processor and sets the IRQ bit. The status
register can also be used to record other information, such as whether the transfer took place
correctly or errors occurred.

Memory accesses by the processor and the DMA controller are interwoven. Requests by
DMA devices for using the bus are always given higher priority than processor requests. Among
different DMA devices, top priority is given to high-speed peripherals such as a disk, a high-
speed network interface, or a graphics display device. Since the processor originates most
memory access cycles, the DMA controller can be said to “steal” memory cycles from the
processor. Hence, the interweaving technique is usually called cycle stealing. Alternatively, the
DMA controller may be given exclusive access to the main memory to transfer a block of data
without interruption. This is known as block or burst mode.

Most DMA controllers incorporate a data storage buffer. In the case of the network
interface in figure 5 for example, the DMA controller reads a block of data from the main
memory and stores it into its input buffer. This transfer takes place using burst mode at a speed
appropriate to the memory and the computer bus. Then, the data in the buffer are transmitted
over the network at the speed of the network.
A conflict may arise if both the processor and a DMA controller or two DMA controllers
try to use the bus at the same time to access the main memory. To resolve these conflicts, an
arbitration procedure is implemented on the bus to coordinate the activities of all devices
requesting memory transfers.

Bus Arbitration:-

The device that is allowed to initiate data transfers on the bus at any given time is called
the bus master. When the current master relinquishes control of the bus, another device can
acquire this status. Bus arbitration is the process by which the next device to become the bus
master is selected and bus mastership is transferred to it. The selection of the bus master must
take into account the needs of various devices by establishing a priority system for gaining
access to the bus.

There are two approaches to bus arbitration: centralized and distributed. In centralized
arbitration, a single bus arbiter performs the required arbitration. In distributed arbitration, all
devices participate in the selection of the next bus master.

Centralized Arbitration:-

The bus arbiter may be the processor or a separate unit connected to the bus. Consider a basic
arrangement in which the processor contains the bus arbitration circuitry. In this case, the
processor is normally the bus master unless it grants bus mastership to one of the DMA
controllers. A DMA controller indicates that it needs to become the bus master by activating the

Bus-Request line, BR. The signal on the Bus-Request line is the logical OR of the bus requests
from all the devices connected to it. When Bus-Request is activated, the processor activates
the Bus-Grant signal, BG1, indicating to the DMA controllers that they may use the bus when
it becomes free. This signal is connected to all DMA controllers using a daisy-chain arrangement.
Thus, if DMA controller 1 is requesting the bus, it blocks the propagation of the grant signal to
other devices. Otherwise, it passes the grant downstream by asserting BG2. The current bus
master indicates to all devices that it is using the bus by activating another open-collector line
called Bus-Busy, BBSY. Hence, after receiving the Bus-Grant signal, a DMA controller waits for
Bus-Busy to become inactive, then assumes mastership of the bus.
Distributed Arbitration:-

Distributed arbitration means that all devices waiting to use the bus have equal responsibility
in carrying out the arbitration process, without using a central arbiter. A simple method for
distributed arbitration is illustrated in Figure 6. Each device on the bus is assigned a 4-bit
identification number. When one or more devices request the bus, they assert the
Start-Arbitration signal and place their 4-bit ID numbers on four open-collector lines, ARB0
through ARB3. A winner is selected as a result of the interaction among the signals transmitted
over these lines by all contenders. The net outcome is that the code on the four lines represents
the request that has the highest ID number.
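A rough simulation of the settling process on the ARB lines, under the assumption (common in distributed-arbitration schemes of this kind) that a device that sees a 1 on a line where its own ID bit is 0 withdraws its lower-order bits, re-asserting them if the conflict later disappears; iteration continues until the lines are stable:

```python
# Distributed arbitration: each contender drives the 1-bits of its 4-bit
# ID onto open-collector lines ARB3..ARB0 (the lines carry the OR of all
# drivers). Losers progressively withdraw low-order bits until the lines
# hold the highest contending ID.
def contribution(dev_id, lines):
    """Bits this device drives, given the current state of the lines."""
    for b in range(3, -1, -1):
        if (lines >> b) & 1 and not (dev_id >> b) & 1:
            # Lost at bit b: withdraw bit b and everything below it.
            return dev_id & ~((1 << (b + 1)) - 1)
    return dev_id                      # no mismatch: drive the full ID

def arbitrate(ids):
    lines = 0
    for i in ids:                      # initial wired-OR of all IDs
        lines |= i
    while True:
        new = 0
        for i in ids:
            new |= contribution(i, lines)
        if new == lines:
            return lines               # settled: the highest ID remains
        lines = new

assert arbitrate([0b0101, 0b0110]) == 0b0110
assert arbitrate([0b1010, 0b0111, 0b1001]) == 0b1010
```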

Review Questions
1. What is the purpose of a DMA controller?
2. What is the role of the R/W bit in a DMA controller?
3. What is cycle stealing in the context of DMA?
4. What is block or burst mode in DMA transfers?
5. What is bus arbitration?
6. What is the bus master?
7. What is the main difference between centralized and distributed bus arbitration?
8. What is the role of the 4-bit ID number in distributed arbitration?
Session 30

Speed, Size and Cost

Ideally, computer memory should be fast, large, and inexpensive. Unfortunately, it is impossible to meet
all three requirements simultaneously: increased speed and size are achieved at increased cost. Very
fast memory systems can be built with SRAM chips, but these chips are expensive, and for cost reasons
it is impracticable to build a large main memory using SRAM chips. The alternative is to use DRAM
chips for large main memories.
The processor fetches code and data from the main memory to execute a program. The DRAMs
which form the main memory are slower devices, so it is necessary to insert wait states in memory
read/write cycles. This reduces the speed of execution. The solution to this problem is to add a small
section of SRAM, referred to as cache memory, alongside the main memory. The program to be
executed is loaded in the main memory, but the parts of the program and data currently in use are
accessed from the cache memory. The cache controller looks after this swapping between main memory
and cache memory; such an external cache is called a secondary cache. Recent processors have a
built-in cache memory called the primary cache. The size of the memory is still small compared to the
demands of large programs with voluminous data. A solution is provided by using secondary storage,
mainly magnetic disks and magnetic tapes, to implement large memory spaces at reasonable prices.

An efficient computer system cannot rely on a single memory component; instead, it employs a
memory hierarchy that combines all of these different types of memory units. A typical memory
hierarchy is illustrated below in the figure:
Memory Access Time: a useful measure of the speed of the memory unit. It is the time that elapses
between the initiation of an operation and the completion of that operation (for example, the time between
READ and MFC).
Memory Cycle Time: an important measure of the memory system. It is the minimum time delay
required between the initiations of two successive memory operations (for example, the time between two
successive READ operations). The cycle time is usually slightly longer than the access time.

Review Questions

1. What type of memory is typically used for large main memory?


2. What is the purpose of cache memory in a computer system?
3. Which type of memory is typically used for secondary storage?
4. Why is a memory hierarchy necessary in computer systems?
5. What is memory access time?
6. What is memory cycle time?
7. Why is memory cycle time usually slightly longer than memory access time?
Session 31

CACHE MEMORIES

The cache is a smaller, faster memory which stores copies of the data from the most frequently used main
memory locations. As long as most memory accesses are to cached memory locations, the average
latency of memory accesses will be closer to the cache latency than to the latency of main memory. In
this unit, we discuss the concepts related to cache memory: the need for it, its architecture, and its operation.

Cache memory is an integral part of every system now. Cache memory is random access memory (RAM)
that a computer's microprocessor can access more quickly than it can access regular RAM. As the
microprocessor processes data, it looks first in the cache memory; if it finds the data there (from a
previous reading of data), it does not have to perform the more time-consuming read from the larger
memory. The effectiveness of the cache mechanism is based on the property of locality of reference.
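The benefit of a cache can be quantified with the standard average access-time calculation. In this hedged sketch, the hit ratio and latency figures are assumed values chosen purely for illustration:

```python
hit_ratio = 0.95        # fraction of accesses satisfied by the cache (assumed)
cache_time_ns = 5       # assumed cache access time
memory_time_ns = 100    # assumed main memory access time

# Average access time = (hit fraction x cache time) + (miss fraction x memory time)
avg_time_ns = hit_ratio * cache_time_ns + (1 - hit_ratio) * memory_time_ns

# With good locality, the average stays close to the cache latency.
assert avg_time_ns < memory_time_ns / 2
```

Here the average comes to 9.75 ns, far closer to the 5 ns cache latency than to the 100 ns main memory latency.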

Locality of Reference:

Analysis of programs shows that most of their execution time is spent in a few localized areas of code,
such as loops; instructions in these areas are executed repeatedly during some time period, and the
remainder of the program is accessed relatively infrequently. Locality of reference manifests itself in
two ways: temporal (a recently executed instruction is likely to be executed again very soon) and
spatial (instructions in close proximity to a recently executed instruction are likely to be executed
soon). If the active segment of the program is placed in cache memory, then the total execution time
can be reduced significantly.

The operation of a cache memory is very simple: the memory control circuitry is designed to take
advantage of the property of locality of reference. The term block refers to a set of contiguous address
locations of some size; the term cache line is also used to refer to a cache block.

Figure 15.1 shows the arrangement of the cache between the processor and main memory. The cache memory
stores a reasonable number of blocks at a given time, but this number is small compared to the total number
of blocks available in main memory. The correspondence between main memory blocks and blocks in the
cache is specified by a mapping function. The cache control hardware decides which block should be
removed to create space for a new block that contains the referenced word; the collection of rules for
making this decision is called the replacement algorithm. The cache control circuit determines whether
the requested word currently exists in the cache. If it does, the read or write operation takes place on the
appropriate cache location; in this case, a read or write hit is said to occur. In a read hit, the main
memory is not involved.

The write operation proceeds in one of two ways:

• Write-through protocol
• Write-back protocol

Write-through protocol: the cache location and the main memory location are updated simultaneously.

Write-back protocol: this technique updates only the cache location and marks it with an associated flag
bit called the dirty (or modified) bit. The word in main memory is updated later, when the block
containing the marked word is removed from the cache to make room for a new block.

If the requested word does not currently exist in the cache during a read operation, a read miss occurs.
To reduce the penalty of a read miss, the load-through (early restart) approach can be used.

Read miss: the block of words that contains the requested word is copied from main memory into the
cache. Normally, after the entire block is loaded into the cache, the requested word is forwarded to the
processor; alternatively, in the load-through approach, the requested word is sent to the processor as soon
as it is read from main memory, without waiting for the whole block to be loaded. If the requested word
does not exist in the cache during a write operation, a write miss occurs. If the write-through protocol is
used, the information is written directly into main memory. If the write-back protocol is used, the block
containing the addressed word is first brought into the cache, and then the desired word in the cache is
overwritten with the new information.
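The two write policies above can be contrasted with a minimal sketch. The dictionary-based single-level "cache", the class names, and the addresses are hypothetical stand-ins for the hardware behavior, not a real cache implementation:

```python
main_memory = {0x10: 5}          # hypothetical memory word at address 0x10

class WriteThroughCache:
    def __init__(self):
        self.data = {}

    def write(self, addr, value):
        self.data[addr] = value
        main_memory[addr] = value        # memory updated simultaneously

class WriteBackCache:
    def __init__(self):
        self.data = {}
        self.dirty = set()               # dirty/modified bits

    def write(self, addr, value):
        self.data[addr] = value
        self.dirty.add(addr)             # only the cache is updated now

    def evict(self, addr):
        if addr in self.dirty:           # write back on removal only
            main_memory[addr] = self.data[addr]
            self.dirty.discard(addr)
        self.data.pop(addr, None)

wt = WriteThroughCache()
wt.write(0x10, 7)
assert main_memory[0x10] == 7            # visible in memory immediately

wb = WriteBackCache()
wb.write(0x10, 9)
assert main_memory[0x10] == 7            # memory is stale until eviction
wb.evict(0x10)
assert main_memory[0x10] == 9            # updated when the block is removed
```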
Review Questions

1. How does cache memory improve the performance of a computer?


2. What is the property of Locality of Reference in cache memory?
3. What are the two types of Locality of Reference?
4. What does Temporal Locality refer to in cache memory?
5. What does Spatial Locality refer to in cache memory?
6. What is a cache hit?
7. What is a cache miss?
8. Why is cache memory faster than main memory?
Session 32

MAPPING FUNCTIONS

A mapping function determines the way in which blocks from the main memory are placed into the
cache memory. There are three main mapping techniques, which decide the cache organization:

1. Direct-mapping technique
2. Associative-mapping technique
3. Set-associative mapping technique

To discuss possible methods for specifying where memory blocks are placed in the cache, we use a
specific small example: a cache consisting of 128 blocks of 16 words each, for a total of 2048 (2K)
words, and a main memory addressable by a 16-bit address. The main memory has 64K words, which
are viewed as 4K blocks of 16 words each; consecutive addresses refer to consecutive words.
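The parameters of this example can be derived directly from the figures given above; the following sketch simply works through that arithmetic:

```python
import math

words_per_block = 16
cache_blocks = 128
address_bits = 16

cache_words = cache_blocks * words_per_block                 # 2048 = 2K words
main_memory_words = 2 ** address_bits                        # 64K words
main_memory_blocks = main_memory_words // words_per_block    # 4K blocks

word_field = int(math.log2(words_per_block))                 # 4 bits to select a word
```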

Direct Mapping Technique


In direct mapping, the 16-bit main memory address is divided into three fields, as shown in the figure.
The low-order 4 bits select one of the 16 words in a block and constitute the word field. The second
field, the 7-bit block field, distinguishes the cache positions: when a new block enters the cache, the
7-bit block field determines the cache position in which this block must be stored. The third field is the
5-bit tag field, which holds the high-order 5 bits of the memory address of the block and identifies
which of the 32 main memory blocks that map to this cache position is currently resident.

Tag   Block   Word
 5      7      4

Figure 8: Main memory address (direct mapping)

Direct mapping is the simplest mapping technique: each block from the main memory has only one
possible location in the cache. Block i of main memory maps onto block (i modulo 128) of the cache.
Therefore, whenever one of the main memory blocks 0, 128, 256, … is loaded into the cache, it is
stored in cache block 0; blocks 1, 129, 257, … are stored in cache block 1; and so on.
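The direct-mapping address split (5-bit tag, 7-bit block, 4-bit word) and the i mod 128 placement rule can be sketched as below; the addresses used are arbitrary examples:

```python
def split_direct(address):
    word = address & 0xF             # low-order 4 bits: word within the block
    block = (address >> 4) & 0x7F    # next 7 bits: cache block position
    tag = address >> 11              # high-order 5 bits: tag
    return tag, block, word

# Main memory blocks 0, 128, 256, ... all map onto cache block 0:
for i in (0, 128, 256):
    _, block, _ = split_direct(i * 16)   # address of the first word of block i
    assert block == i % 128 == 0
```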
Associative Mapping Technique

The figure shows associative mapping, in which a main memory block can be placed into any cache
block position. In this case, 12 tag bits are required to identify a memory block when it is resident in
the cache. The tag bits of an address received from the processor are compared to the tag bits of each
block of the cache to see whether the desired block is present. This is called the associative-mapping
technique; it gives complete freedom in choosing the cache location in which to place a memory block.
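A fully associative lookup can be sketched as follows: the 16-bit address splits into a 12-bit tag (the memory block number) and a 4-bit word field. The dict-keyed "cache" here is a hypothetical stand-in for the hardware's parallel comparison of the tag against every resident block:

```python
def split_associative(address):
    return address >> 4, address & 0xF       # (12-bit tag, 4-bit word)

cache = {}                                    # tag -> block contents

def read(address):
    tag, word = split_associative(address)
    if tag in cache:                          # tag matched against all blocks
        return cache[tag][word]               # read hit
    return None                               # read miss

cache[5] = list(range(16))                    # block 5 may reside anywhere
assert read(5 * 16 + 3) == 3                  # hit: word 3 of block 5
```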
Set-Associative Mapping

Set-associative mapping is a combination of the direct- and associative-mapping techniques. Blocks of
the cache are grouped into sets, and the mapping allows a block of main memory to reside in any block
of one specific set. With two blocks per set, the cache has 64 sets, so memory blocks 0, 64, 128, …,
4032 map into cache set 0, and each can occupy either of the two block positions within this set. The
tag field of the address must then be associatively compared to the tags of the two blocks of the set to
check whether the desired block is present; a two-way associative search of this kind is simple to
implement.
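For the two-way set-associative version of the example (128 cache blocks / 2 blocks per set = 64 sets), the address splits into a 6-bit tag, a 6-bit set field, and a 4-bit word field. A sketch, with arbitrary example addresses:

```python
def split_set_associative(address):
    word = address & 0xF                   # 4-bit word field
    set_index = (address >> 4) & 0x3F      # 6-bit set field (64 sets)
    tag = address >> 10                    # remaining 6 bits: tag field
    return tag, set_index, word

# Memory blocks 0, 64, 128, ..., 4032 all map into set 0:
for block in (0, 64, 4032):
    _, set_index, _ = split_set_associative(block * 16)
    assert set_index == block % 64 == 0
```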

Review Questions

1. What are the three main mapping techniques used in cache memory?
2. In direct mapping, how is a block from the main memory mapped to the cache?
3. How many bits are used for the word field in a 16-word block in direct mapping?
4. Which mapping technique offers the simplest cache organization?
Question bank
1. List out the difference between I/O mapped I/O and memory mapped I/O.

2. Explain with neat diagram I/O interface of an input device.

3. What is an interrupt? Explain its concepts with an example.

4. Explain interrupt hardware with neat diagram.

5. In a situation where multiple devices capable of initiating interrupts are connected to the
processor, explain the implementation of interrupt priority using individual INTR
and INTA lines and a common INTR line to all devices.

6. Define the terms cycle stealing and block mode.

7. What is bus arbitration? Explain the different approaches to bus arbitration.

8. Explain how interrupt requests from several I/O devices can be communicated to a
processor through a single INTR line.

9. What are the different methods of DMA? Explain them in brief


10. What is an interrupt? Explain its concepts and the hardware used to realize it.

11. What is the necessity of DMA? Explain the two modes in which DMA interface operates to
transfer data?

12. With the sketches explain various methods for handling multiple interrupts requests

13. Explain input/output interface circuit.

14. List out the functions of an I/O interface

15. Discuss in detail any one feature of memory design that leads to improve performance of
computer.

16. Explain Bus Arbitration.


17. Explain Speed, size and Cost of memory systems.

18. Explain Cache Memories – Mapping Functions.
