Coa Unit 3 & 4
Prepared by:
Ms.V. SWATHILAKSHMI / AP /CSE
2 Marks
1. What is meant by memory?
• Memory is the process of taking in information from the world around us, processing it, storing it and later recalling that information, sometimes many years later.
• Human memory is often likened to a computer memory system or a filing cabinet.
2. What is meant by main memory?
• Main memory is the primary, internal workspace in the computer, commonly known as RAM (random access
memory).
• Specifications such as 4GB, 8GB, 12GB and 16GB almost always refer to the capacity of RAM. In contrast, disk or
solid-state storage capacities in a computer are typically 128GB or 256GB and higher.
3. Define RAM.
RAM stands for random-access memory. Your computer's RAM is essentially short-term memory where data is stored as the processor needs it. This isn't to be confused with the long-term data stored on your hard drive, which stays there even when your computer is turned off.
4. Define ROM.
• Read-Only Memory (ROM) is a type of electronic storage that comes built into a device during manufacturing.
• You'll find ROM chips in computers and many other types of electronic products; VCRs, game consoles, and car radios all use ROM to complete their functions smoothly.
5. How will you map memory address?
The table, called a memory address map, is a pictorial representation of the assigned address space for each chip in the system.
To demonstrate with a particular example, assume that a computer system needs 512 bytes of RAM and 512 bytes of ROM. The RAM and ROM chips to be used are specified in the figures.
6. What is meant by auxiliary memory?
• Auxiliary memory units are among computer peripheral equipment. They trade slower access rates for greater
storage capacity and data stability.
• Auxiliary memory holds programs and data for future use, and, because it is nonvolatile (like ROM), it is used to
store inactive programs and to archive data.
7. What is meant by magnetic disks?
• A magnetic disk is a storage device that uses a magnetization process to write, rewrite and access data.
• It is covered with a magnetic coating and stores data in the form of tracks, spots and sectors.
• Hard disks, zip disks and floppy disks are common examples of magnetic disks.
8. What is associative memory?
• An associative memory is one in which any stored item can be accessed directly by using partial contents of the
item in question.
• Associative memories are also commonly known as content-addressable memories (CAMs).
• The subfield chosen to address the memory is called the key.
9. What is cache memory?
• Cache memory is a chip-based computer component that makes retrieving data from the computer's memory
more efficient.
• It acts as a temporary storage area that the computer's processor can retrieve data from easily.
10. What is meant by the mapping process?
• The mapping functions are used to map a particular block of main memory to a particular block of cache.
• This mapping function is used to transfer the block from main memory to cache memory.
11. List the types of mapping processes.
There are 3 main types of mapping:
Associative Mapping.
Direct Mapping.
Set Associative Mapping.
12. What are cache algorithms?
In computing, cache algorithms are optimizing instructions, or algorithms, that a computer program or a hardware-maintained structure can utilize to manage a cache of information stored on the computer.
13. Name the cache coherence protocols.
The main protocols are:
• MSI Protocol: a basic cache coherence protocol used in multiprocessor systems.
• MOSI Protocol: an extension of the MSI protocol.
• MESI Protocol: the most widely used cache coherence protocol.
• MOESI Protocol: combines the features of the MESI and MOSI protocols.
14. What is direct memory access?
Direct memory access is a feature of computer systems that allows certain hardware subsystems to access main system memory independently of the central processing unit.
15. What is direct mapping?
• Direct mapping is a procedure used to assign each memory block in the main memory to a particular line in the
cache.
• If a line is already filled with a memory block and a new block needs to be loaded, then the old block is discarded
from the cache.
16. How does direct mapping cache work?
In a direct-mapped cache each addressed location in main memory maps to a single location in cache memory.
Since main memory is much larger than cache memory, there are many addresses in main memory that map to
the same single location in cache memory.
17. How does direct memory work?
Direct Memory Access (DMA) transfers blocks of data between the memory and the peripheral devices of the system, without the participation of the processor. The unit that controls the activity of accessing memory directly is called a DMA controller. The processor relinquishes the system bus for a few clock cycles.
18. What is the disadvantage of direct mapping?
Each block of main memory maps to a fixed location in the cache; therefore, if two different blocks map to the same location in the cache and they are continually referenced, the two blocks will be continually swapped in and out (known as thrashing).
19. What is Direct Memory Access with example?
Typical examples are disk controllers, Ethernet controllers, USB controllers, and video controllers.
Usually the DMA controller built into these devices can only move data between the device itself and main
memory – that is, it's not intended to be used as a general system DMA controller.
20. What are the cache coherence protocols?
In multiprocessor systems with separate caches that share a common memory, the same datum can be stored in more than one cache. A data consistency problem may occur when data is modified in one cache only.
The protocols that maintain coherency for multiple processors are called cache-coherency protocols.
21. What are the 3 types of cache memory?
Types of cache memory
L1 cache, or primary cache, is extremely fast but relatively small, and is usually embedded in the processor chip
as CPU cache.
L2 cache, or secondary cache, is often more capacious than L1.
Level 3 (L3) cache is specialized memory developed to improve the performance of L1 and L2.
22. What are the four cache replacement algorithms?
Traditional cache replacement algorithms include LRU, LFU, Pitkow/Recker and some of their variants. Least Recently Used (LRU) expels the object from the cache that was requested least recently.
23. What are the different types of mapping technique explain?
Three distinct types of mapping are used for cache memory mapping: Direct, Associative and Set-Associative
mapping
25. What is the mapping process in memory organisation?
Memory-mapping is a mechanism that maps a portion of a file, or an entire file, on disk to a range of addresses
within an application's address space. The application can then access files on disk in the same way it accesses
dynamic memory.
26. What are the 3 types of magnetic disk? Describe them.
Types of Magnetic Disks
All magnetic disks are round platters.
Floppy Disks: a floppy disk is a flat, circular piece of flexible plastic coated with magnetic oxide.
Hard Disks: hard disks are the primary online secondary storage device for most computer systems today.
27. What is the use of magnetic disk?
Magnetic storage media, primarily hard disks, are widely used to store computer data as well as audio and video
signals.
In the field of computing, the term magnetic storage is preferred and in the field of audio and video production,
the term magnetic recording is more commonly used.
28. What is associative memory in computer?
Associative memory is also known as content addressable memory (CAM) or associative storage or associative
array. It is a special type of memory that is optimized for performing searches through data, as opposed to
providing a simple direct access to the data based on the address.
29. What are the 3 types of cache organization?
There are three types of cache organization: direct-mapped cache, fully associative cache, and N-way set-associative cache.
30. Is cache a memory or storage?
Memory caching (often simply referred to as caching) is a technique in which computer applications temporarily
store data in a computer's main memory (i.e., random access memory, or RAM) to enable fast retrievals of that
data. The RAM that is used for the temporary storage is known as the cache.
1. Define interface.
In computing, an interface is a shared boundary across which two or more separate components of a computer system exchange information. The exchange can be between software, computer hardware, peripheral devices, humans, and combinations of these.
2. What is programmed I/O?
Programmed input–output (also programmable input/output, programmed input/output, programmed
I/O, PIO) is a method of data transmission, via input/output (I/O), between a central processing unit
(CPU) and a peripheral device, such as a network adapter or a Parallel ATA storage device.
3. Name the types of buses.
Address bus - carries memory addresses from the processor to other components such as primary storage
and input/output devices.
Data bus - carries the data between the processor and other components.
Control bus - carries control signals from the processor to other components.
4. Define Synchronous bus.
A synchronous bus interconnects devices that comprise a computer system where the timing of transactions between devices is under the control of a synchronizing clock signal.
5. Define Asynchronous bus.
An asynchronous bus interconnects devices of a computer system where information transfers between devices are self-timed rather than controlled by a synchronizing clock signal.
13. Define SCSI.
SCSI (pronounced SKUH-zee), the Small Computer System Interface, is a set of American National Standards Institute (ANSI) standard electronic interfaces that allow personal computers (PCs) to communicate with peripheral hardware such as disk drives, tape drives, and CD-ROM drives.
14. What is meant by USB?
Universal Serial Bus is an industry standard that establishes specifications for cables, connectors and protocols for connection, communication and power supply between computers, peripherals and other computers.
5 Marks
1. List the different types of interrupts. Explain briefly about mask-able interrupt.
Data transfer between the CPU and the peripherals is initiated by the CPU. But the CPU cannot start the transfer unless the peripheral is ready to communicate with the CPU. When a device is ready to communicate with the CPU, it generates an interrupt signal. A number of input-output devices are attached to the computer and each device is able to generate an interrupt request.
The main job of the interrupt system is to identify the source of the interrupt. There is also a possibility
that several devices will request simultaneously for CPU communication. Then, the interrupt system has
to decide which device is to be serviced first.
Priority Interrupt
A priority interrupt is a system which decides the priority at which various devices, which generate the interrupt signal at the same time, will be serviced by the CPU. The system has authority to decide which conditions are allowed to interrupt the CPU while some other interrupt is being serviced. Generally, devices with high-speed transfer such as magnetic disks are given high priority and slow devices such as keyboards are given low priority.
When two or more devices interrupt the computer simultaneously, the computer services the device
with the higher priority first.
Hardware interrupts
★ The interrupt signals generated by external devices and I/O devices interrupt the CPU when the instructions are ready.
★ For example, in a keyboard, if we press a key to do some action, this pressing of the keyboard generates a signal that is given to the processor to do that action. Such interrupts are called hardware interrupts.
Daisy chaining priority
In a daisy chaining system all the devices are connected in serial form. The interrupt request line is common to all devices. If any device has its interrupt signal in the low-level state, the interrupt line goes to the low-level state and enables the interrupt input in the CPU. When there is no interrupt, the interrupt line stays in the high-level state. The CPU responds to the interrupt by enabling the interrupt acknowledge line. This signal is received by device 1 at its PI input. The acknowledge signal passes to the next device through the PO output only if device 1 is not requesting an interrupt.
The following figure shows the block diagram for daisy chaining priority system.
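A minimal Python sketch of this priority scheme (an illustration, not part of the notes; devices are numbered from 0, with device 0 electrically closest to the CPU):

```python
def daisy_chain_acknowledge(requests):
    """requests[i] is True when device i has raised an interrupt request.
    Returns the index of the device that captures the acknowledge,
    or None when no device is requesting."""
    # The acknowledge enters device 0 at its PI input and is passed
    # along to the next device (PO = PI) only while devices are idle.
    for device, requesting in enumerate(requests):
        if requesting:       # requesting device blocks the signal (PO = 0)
            return device    # and is the one serviced by the CPU
    return None

# Devices 1 and 2 request simultaneously; the one nearer the CPU wins.
print(daisy_chain_acknowledge([False, True, True]))  # -> 1
```

Because priority is fixed by physical position in the chain, no separate priority logic is needed.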
2. What is DMA? Explain the block diagram of DMA .Also describe how DMA is used to
transfer data from peripherals.
Direct Memory Access (DMA) :
DMA Controller is a hardware device that allows I/O devices to directly access memory with less participation of the processor. A DMA controller needs the same old circuits of an interface to communicate with the CPU and Input/Output devices.
Fig-1 below shows the block diagram of the DMA controller.
The unit communicates with the CPU through the data bus and control lines. Through the use of the address bus and the DS (DMA select) and RS (register select) inputs, the register within the DMA is chosen by the CPU. RD and WR are two-way inputs. When the BG (bus grant) input is 0, the CPU can communicate with the DMA registers. When the BG (bus grant) input is 1, the CPU has relinquished the buses and the DMA can communicate directly with the memory.
DMA controller registers :
The DMA controller has three registers as follows.
Address register – It contains the address to specify the desired location in memory.
Word count register – It contains the number of words to be transferred.
Control register – It specifies the transfer mode.
Note –
All registers in the DMA appear to the CPU as I/O interface registers. Therefore, the CPU can both read and
write into the DMA registers under program control via the data bus.
Explanation :
➔ The CPU initializes the DMA by sending the given information through the data bus.
➔ The starting address of the memory block where the data is available (to read) or where data is to be
stored (to write).
➔ It also sends word count which is the number of words in the memory block to be read or written.
➔ Control to define the mode of transfer such as read or write.
➔ A control to begin the DMA transfer.
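A hedged sketch of this initialize-then-transfer sequence in Python (the register names follow the text; the class, the sample values, and the memory model are illustrative assumptions):

```python
class DMAController:
    def __init__(self):
        self.address_register = 0     # desired location in memory
        self.word_count_register = 0  # number of words to transfer
        self.control_register = 0     # transfer mode (read/write)

    def initialize(self, start_address, word_count, mode):
        # Programmed by the CPU over the data bus while BG = 0.
        self.address_register = start_address
        self.word_count_register = word_count
        self.control_register = mode

    def transfer_word(self, memory, device_buffer):
        # One word moved while BG = 1 (CPU has relinquished the buses).
        device_buffer.append(memory[self.address_register])
        self.address_register += 1
        self.word_count_register -= 1
        return self.word_count_register == 0  # True when transfer is done

memory = list(range(100))
dma = DMAController()
dma.initialize(start_address=10, word_count=3, mode=0)  # 0 = read, say
buffer = []
while not dma.transfer_word(memory, buffer):
    pass
print(buffer)  # -> [10, 11, 12]
```

Note how the word count reaching zero is what ends the transfer, matching the role of the word count register above.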
3. Explain the features of USB, PCI, and SCSI bus.
USB
• Individual USB cables can run as long as 5 meters; with hubs, devices can be up to 30 meters (six cables'
worth) away from the host.
• A USB cable has two wires for power (+5 volts and ground) and a twisted pair of wires to carry the data.
• On the power wires, the computer can supply up to 500 milliamps of power at 5 volts.
• Low-power devices (such as mice) can draw their power directly from the bus. High-power devices
(such as printers) have their own power supplies and draw minimal power from the bus. Hubs can have
their own power supplies to provide power to devices connected to the hub.
• USB devices are hot-swappable, meaning there is no need to shut down and restart the PC to attach or
remove a peripheral. Just plug it in and go! The PC automatically detects the peripheral and configures the
necessary software. This feature is especially useful for users of multi-player games, as well as business
and notebook PC users who want to share peripherals.
• Many USB devices can be put to sleep by the host computer when the computer enters a power-saving
mode.
• USB distributes electrical power to many peripherals. Again, USB lets the PC automatically sense the
power that's required and deliver it to the device. This interesting USB feature eliminates those clunky
power supply boxes.
• USB provides two-way communication between the PC and peripheral devices, making it ideal for many
I/O applications.
• Multiple devices can connect to a system using a series of USB hubs and repeaters. A Root Hub with up to
seven additional ports can be integrated into the main interface, or it can be externally connected with a
cable. Each of the seven hubs on the Root Hub can in turn be connected to seven hubs, etc. to a maximum
of seven tiers and 127 ports.
PCI
• PCI (Peripheral Component Interconnect) is a local bus standard for attaching peripheral devices to a computer. It supports 32-bit and 64-bit data paths and plug-and-play configuration of attached devices.
SCSI
• The original SCSI standard supports up to 7 devices on a single host adapter, but newer standards support high-speed operation with up to 16 devices and bus lengths of up to 12 meters.
• SCSI devices are "smart" devices with their own control circuitry. They can "disconnect" themselves from the host adapter to process tasks on their own, thus freeing up the bus for other transmissions.
• The bus can handle simultaneous reads and writes.
4. Explain about priority interrupt.
Interrupts:
There are many situations where other tasks can be performed while waiting for an I/O device to
become ready. A hardware signal called an Interrupt will alert the processor when an I/O device
becomes ready. Interrupt-request line is usually dedicated for this purpose.
For example, consider the COMPUTE and PRINT routines. The routine executed in response to an interrupt request is called the interrupt-service routine. Transfer of control happens through the use of interrupts. The processor must inform the device that its request has been recognized by sending an interrupt-acknowledge signal. One must therefore know the difference between an interrupt and a subroutine. Interrupt latency is the delay between the time an interrupt request is received and the start of execution of the interrupt-service routine; saving information in registers increases this delay.
Interrupt hardware
Most computers have several I/O devices that can request an interrupt. A single interrupt request line
may be used to serve n devices.
Simultaneous requests
The processor must have some mechanisms to decide which request to service when simultaneous requests
arrive. Here, daisy chain and arrangement of priority groups as the interrupt priority schemes are discussed.
Priority based simultaneous requests are considered in many organizations.
Controlling device requests
At the device end, an interrupt-enable bit determines whether the device is allowed to generate an interrupt request. At the processor end, an enable bit determines whether a given interrupt request will be accepted.
Exceptions
The term exception is used to refer to any event that causes an interruption.
Hence, I/O interrupts are one example of an exception.
Recovery from errors – These are techniques to ensure that all hardware components are operating
properly.
Debugging – find errors in a program, trace and breakpoints (only at specific points selected by the user).
Privilege exception – an attempt to execute a privileged instruction, trapped to protect the OS of the computer.
Use of interrupts in Operating Systems
Operating system is system software, also termed a resource manager, as it manages all varieties of computer peripheral devices efficiently. Different issues addressed by operating systems are: assigning priorities among jobs, security and protection features, incorporating interrupt-service routines for all devices, and multitasking, time slices, processes, program state, context switches and others.
In a typical application, a number of I/O devices are attached to computer, with each device being able
to originate an interrupt request, so to provide services to device which initiate interrupt request, the
task of interrupt system is to identify the source(device) of interrupt and then provide services to them.
But in most cases there is a possibility that several sources will request service simultaneously. So, in this case, the interrupt system must also decide which device to service first. Simple interrupt systems are not capable of that, so another system known as the priority interrupt system is provided.
Priority interrupts are systems that establish a priority over the various sources (interrupting devices) to determine which condition is to be serviced first when two or more requests arrive simultaneously. Such a system may also determine which conditions are permitted to interrupt the computer while another interrupt is being serviced.
Usually, in priority systems, higher-priority interrupt levels are served first, since if they were delayed or interrupted there could be serious consequences. Devices with high-speed transfer such as magnetic disks are given high priority, and slow devices such as keyboards receive low priority.
The priority of simultaneous interrupts can be established either by software or by hardware.
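A small sketch of the software (polling) approach, consistent with the ordering described above (the device names and the request format are illustrative assumptions):

```python
# Devices are polled in fixed priority order: fast-transfer devices first.
PRIORITY_ORDER = ["magnetic_disk", "network", "printer", "keyboard"]

def select_device_to_service(pending_requests):
    """Return the highest-priority device with a pending request."""
    for device in PRIORITY_ORDER:
        if device in pending_requests:
            return device
    return None

# Disk and keyboard interrupt simultaneously; the disk is serviced first.
print(select_device_to_service({"keyboard", "magnetic_disk"}))  # magnetic_disk
```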
5. Explain about programmed I/O.
Programmed I/O is one of the I/O techniques, the others being interrupt-driven I/O and direct memory access (DMA). Programmed I/O is the simplest type of I/O technique for the exchange of data between the processor and the I/O module. With programmed I/O, data is exchanged between the processor and the I/O module. The processor executes a program that gives it direct control of the I/O operation, including sensing device status, sending a read or write command, and transferring the data. When the processor issues a command to the I/O module, it must wait until the I/O operation is complete. If the processor is faster than the I/O module, this is wasteful of processor time. The overall operation of programmed I/O can be summarized as follows:
The processor is executing a program and encounters an instruction relating to I/O operation.
The processor then executes that instruction by issuing a command to the appropriate I/O module.
The I/O module will perform the requested action based on the I/O command issued by the processor
(READ/WRITE) and set the appropriate bits in the I/O status register.
The processor will periodically check the status of the I/O module until it finds that the operation is complete.
Each input is read after first testing whether the device is ready with the input (a state reflected by a bit
in a status register).
The program waits for the ready status by repeatedly testing the status bit, until all targeted bytes are read from the input device.
The program is in a busy (non-waiting) state only after the device gets ready.
1. Each output byte is written after first testing whether the device is ready to accept the byte, i.e. its output register or output buffer is empty.
2. The program waits for the ready status by repeatedly testing the status bit(s), until all the targeted bytes are written to the device.
3. The program is in a busy (non-waiting) state only after the device gets ready; otherwise it is in a wait state.
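The busy-wait behaviour described above can be sketched as follows (the Device class is a stand-in for a real I/O module; its ready probability is a made-up illustration):

```python
import random

class Device:
    """Hypothetical I/O module with a status bit and a data register."""
    def ready(self):
        return random.random() < 0.3   # ready bit set on ~30% of polls

    def read_data(self):
        return random.randint(0, 255)  # one input byte

def programmed_input(device, n_bytes):
    data = []
    for _ in range(n_bytes):
        while not device.ready():      # busy-wait: CPU time is wasted here
            pass                       # keep testing the status bit
        data.append(device.read_data())
    return data

print(programmed_input(Device(), 4))   # e.g. [17, 203, 88, 45]
```

The inner while loop is exactly the repeated status-bit test described above, and it is why programmed I/O wastes processor time when the device is slower than the CPU.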
6. Explain about FIFO.
Consider a ticket counter where people come, take tickets and go.
People enter a line (queue) to get to the ticket counter in an organized manner.
The person who enters the queue first will get the ticket first and leave the queue.
The person entering the queue next will get the ticket after the person in front of them.
Therefore, the first person to enter the queue gets the ticket first and the last person to enter the queue gets the ticket last. This is known as the First-In-First-Out approach, or FIFO.
Where is FIFO used:
Data Structures
Certain data structures like Queue and other variants of Queue use the FIFO approach for processing data.
Disk scheduling
Disk controllers can use the FIFO as a disk scheduling algorithm to determine the order in which to service
disk I/O requests.
Communications and networking
Communication network bridges, switches and routers used in computer networks use FIFOs to hold data
packets en route to their next destination.
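A short illustration of the ticket-counter queue using Python's deque (the names are made up):

```python
from collections import deque

queue = deque()
for person in ["Anu", "Bala", "Charu"]:
    queue.append(person)                     # enqueue at the rear

while queue:
    print(queue.popleft(), "gets a ticket")  # dequeue from the front
# Anu entered first and is served first: First-In-First-Out.
```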
7. Explain about Asynchronous data transfer.
The internal operations in an individual unit of a digital system are synchronized using a clock pulse. It means a clock pulse is given to all registers within a unit, and all data transfers among internal registers occur simultaneously during the occurrence of the clock pulse. Now, suppose any two units of a digital system are designed independently, such as the CPU and an I/O interface.
If the registers in the I/O interface share a common clock with the CPU registers, then transfer between the two units is said to be synchronous. But in most cases, the internal timing in each unit is independent of the other, so each uses its private clock for its internal registers. In this case, the two units are said to be asynchronous to each other, and if data transfer occurs between them, this data transfer is called Asynchronous Data Transfer.
Asynchronous data transfer between two independent units requires that control signals be transmitted between the communicating units to indicate the time at which data is being sent.
These two methods can achieve this asynchronous way of data transfer:
Strobe control: A strobe pulse is supplied by one unit to indicate to the other unit when the transfer has to
occur.
Handshaking: This method accompanies each data item being transferred with a control signal that indicates the presence of data on the bus. The unit receiving the data item responds with another signal to acknowledge receipt of the data.
The strobe pulse and handshaking methods of asynchronous data transfer are not restricted to I/O transfers. They are used extensively on numerous occasions requiring the transfer of data between two independent units. So, here we consider the transmitting unit as the source and the receiving unit as the destination.
For example, the CPU is the source during output or write transfer and the destination unit during input or
read transfer.
Therefore, the control sequence during an asynchronous transfer depends on whether the transfer is initiated by the source or by the destination.
So, for each asynchronous data transfer method, the control sequence can be seen in both cases, when it is initiated by the source and when it is initiated by the destination. In this way, each data transfer method can be further divided into two parts: source initiated and destination initiated.
1. Strobe Control Method
The Strobe Control method of asynchronous data transfer employs a single control line to time each transfer. This control line is also known as a strobe, and it may be activated either by the source or by the destination, depending on which one initiates the transfer.
Source initiated strobe: In the below block diagram, you can see that the strobe is initiated by the source, and as shown in the timing diagram, the source unit first places the data on the data bus.
After a brief delay to ensure that the data resolve to a stable value, the source activates a strobe pulse.
The information on the data bus and strobe control signal remains in the active state for a sufficient
time to allow the destination unit to receive the data.
The destination unit uses a falling edge of strobe control to transfer the contents of a data bus to one of
its internal registers. The source removes the data from the data bus after it disables its strobe pulse.
Thus, new valid data will be available only after the strobe is enabled again.
In this case, the strobe may be a memory-write control signal from the CPU to a memory unit. The CPU
places the word on the data bus and informs the memory unit, which is the destination.
Destination initiated strobe: In the below block diagram, you can see that the strobe is initiated by the destination, and in the timing diagram, the destination unit first activates the strobe pulse, informing the source to provide the data.
The source unit responds by placing the requested binary information on the data bus. The data must be
valid and remain on the bus long enough for the destination unit to accept it.
The falling edge of the strobe pulse can again be used to trigger a destination register. The destination unit then disables the strobe. Finally, the source removes the data from the data bus after a predetermined time interval.
In this case, the strobe may be a memory read control from the CPU to a memory unit. The CPU
initiates the read operation to inform the memory, which is a source unit, to place the selected word
into the data bus.
2. Handshaking Method
The strobe method has the disadvantage that the source unit that initiates the transfer has no way of
knowing whether the destination has received the data that was placed in the bus. Similarly, a
destination unit that initiates the transfer has no way of knowing whether the source unit has placed
data on the bus.
So this problem is solved by the handshaking method. The handshaking method introduces a second control signal line that provides a reply to the unit that initiates the transfer.
In this method, one control line is in the same direction as the data flow in the bus, from the source to the destination. The source unit uses it to inform the destination unit whether there are valid data on the bus. The other control line is in the other direction, from the destination to the source. The destination unit uses it to inform the source whether it can accept data. Here also, the sequence of control depends on the unit that initiates the transfer; that is, it depends on whether the transfer is initiated by the source or by the destination.
Source initiated handshaking: In the below block diagram, you can see the two handshaking lines: "data valid", which is generated by the source unit, and "data accepted", generated by the destination unit.
The timing diagram shows the timing relationship of the exchange of signals between the two units.
The source initiates a transfer by placing data on the bus and enabling its data valid signal. The
destination unit then activates the data accepted signal after it accepts the data from the bus.
The source unit then disables its valid data signal, which invalidates the data on the bus.
After this, the destination unit disables its data accepted signal, and the system goes into its initial state.
The source unit does not send the next data item until after the destination unit shows readiness to
accept new data by disabling the data accepted signal.
The sequence diagram shows the above sequence of states in which the system can be present at any given time.
Destination initiated handshaking: In the below block diagram, you see that the two handshaking lines are
"data valid", generated by the source unit, and "ready for data" generated by the destination unit.
Note that the name of the signal generated by the destination unit has been changed from "data accepted" to "ready for data" to reflect its new meaning.
The transfer is initiated by the destination, so the source unit does not place data on the data bus until it receives a "ready for data" signal from the destination unit. After that, the handshaking process is the same as in the source-initiated case.
The sequence of events is shown in its sequence diagram, and the timing relationship between signals is
shown in its timing diagram. Therefore, the sequence of events in both cases would be identical.
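As a rough sketch, the four-step source-initiated handshake can be simulated sequentially in Python (the boolean variables stand in for the two control lines; the octal data values are arbitrary):

```python
def source_initiated_handshake(words):
    received = []
    for word in words:
        data_bus = word             # 1. source places data on the bus
        data_valid = True           #    and enables "data valid"
        received.append(data_bus)   # 2. destination accepts the data
        data_accepted = True        #    and raises "data accepted"
        data_valid = False          # 3. source invalidates the bus data
        data_bus = None
        data_accepted = False       # 4. destination drops "data accepted",
    return received                 #    signalling readiness for more data

print(source_initiated_handshake([0o1220, 0o4560]))  # -> [656, 2416]
```

Each data item costs a full round trip of the two control signals, which is the price paid for not sharing a clock.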
10 Marks
In Computer System Design, Memory Hierarchy is an enhancement to organize the memory such that it
can minimize the access time. The Memory Hierarchy was developed based on a program behavior
known as locality of references. The figure below clearly demonstrates the different levels of memory
hierarchy:
One of the most significant ways to increase system performance is minimizing how far down the
memory hierarchy one has to go to manipulate data.
Cost per bit:
As we move from bottom to top in the hierarchy, the cost per bit increases, i.e. internal memory is costlier than external memory.
Let us discuss each level in detail:
Level-0 − Registers
The registers are present inside the CPU. As they are present inside the CPU, they have least access
time. Registers are most expensive and smallest in size generally in kilobytes. They are implemented by
using Flip-Flops.
Level-1 − Cache
Cache memory is used to store the segments of a program that are frequently accessed by the processor. It is expensive and smaller in size, generally in megabytes, and is implemented by using static RAM.
Level-2 − Primary or Main Memory
It directly communicates with the CPU and with auxiliary memory devices through an I/O processor.
Main memory is less expensive than cache memory and larger in size generally in Gigabytes. This
memory is implemented by using dynamic RAM.
Level-3 − Secondary storage
Secondary storage devices like Magnetic Disk are present at level 3. They are used as backup storage.
They are cheaper than main memory and larger in size generally in a few TB.
Level-4 − Tertiary storage
Tertiary storage devices like magnetic tape are present at level 4. They are used to store removable files
and are the cheapest and largest in size (1-20 TB).
Let us see the memory levels in terms of size, access time, bandwidth.
★ Memory hierarchy is arranging different kinds of storage present on a computing device based on
speed of access.
★ At the very top, the highest performing storage is CPU registers which are the fastest to read and
write to.
★ Next is cache memory, followed by conventional DRAM memory, followed by disk storage with different levels of performance including SSD, optical and magnetic disk drives.
★ To bridge the processor memory performance gap, hardware designers are increasingly relying on
memory at the top of the memory hierarchy to close / reduce the performance gap.
★ This is done through increasingly larger cache hierarchies (which can be accessed by processors
much faster), reducing the dependency on main memory which is slower.
1. Discuss in detail about Main memory.
★ Main memory is where programs and data are kept when the processor is actively using them. When
programs and data become active, they are copied from secondary memory into main memory
where the processor can interact with them. A copy remains in secondary memory.
★ Main memory is intimately connected to the processor, so moving instructions and data into and out
of the processor is very fast.
★ Main memory is sometimes called RAM. RAM stands for Random Access Memory. "Random" means
that the memory cells can be accessed in any order. However, properly speaking, "RAM" means the
type of silicon chip used to implement main memory.
★ The main memory is the fundamental storage unit in a computer system. It is a relatively large and quick memory and saves programs and information during computer operations. The technology that makes the main memory work is based on semiconductor integrated circuits.
★ RAM is the main memory. Integrated circuit Random Access Memory (RAM) chips operate in two possible modes, as follows −
★ Static − It consists of internal flip-flops, which store the binary information. The stored data remains valid as long as power is provided to the unit. Static RAM is simple to use and has shorter read and write cycles.
★ Dynamic − It stores the binary data in the form of electric charges applied to capacitors. The capacitors are made available inside the chip by Metal Oxide Semiconductor (MOS) transistors. The charge stored on the capacitors tends to discharge with time, so the capacitors must be regularly recharged by refreshing the dynamic memory.
DRAM
SRAMs are faster but their cost is high because their cells require many transistors. RAMs can be obtained at a lower cost if simpler cells are used. A MOS storage cell based on capacitors can be used to replace the SRAM cells. Such a storage cell cannot preserve the charge (that is, data) indefinitely and must be recharged periodically. Therefore, these cells are called dynamic storage cells. RAMs using these cells are referred to as Dynamic RAMs, or simply DRAMs.
The primary technology used for the main memory is based on semiconductor integrated circuits. The
integrated circuits for the main memory are classified into two major units.
❖ RAM (Random Access Memory) integrated circuit chips
❖ ROM (Read Only Memory) integrated circuit chips
1. RAM integrated circuit chips
The RAM integrated circuit chips are further classified into two possible operating modes, static and
dynamic.
The primary composition of a static RAM is flip-flops that store the binary information. The nature of the stored information is volatile, i.e. it remains valid as long as power is applied to the system. Static RAM is easy to use and takes less time to perform read and write operations compared to dynamic RAM.
The block diagram of a RAM chip is shown in Fig. 12-2. The capacity of the memory is 128 words of eight bits (one byte) per word. This requires a 7-bit address and an 8-bit bidirectional data bus. The read and write inputs specify the memory operation, and the two chip select (CS) control inputs enable the chip only when it is selected by the microprocessor. The read and write inputs are sometimes combined into one line labelled R/W.
The function table listed in Fig. 12-2(b) specifies the operation of the RAM chip. The unit is in operation only when CS1=1 and CS2=0. The bar on top of the second select variable indicates that this input is enabled when it is equal to 0. If the chip select inputs are not enabled, or if they are enabled but the read or write inputs are not enabled, the memory is inhibited and its data bus is in a high-impedance state. When CS1=1 and CS2=0, the memory can be placed in a write or read mode. When the WR input is enabled, the memory stores a byte from the data bus into a location specified by the address input lines. When the RD input is enabled, the content of the selected byte is placed on the data bus. The RD and WR signals control the memory operation as well as the bus buffers associated with the bidirectional data bus.
A ROM chip is organized externally in a similar manner. However, since a ROM can only be read, the data bus can only be in an output mode. The block diagram of a ROM chip is shown in Fig. 12-3. The nine address lines in the ROM chip specify any one of the 512 bytes stored in it. The two chip select inputs must be CS1=1 and CS2=0 for the unit to operate. Otherwise, the data bus is in a high-impedance state.
Memory Address Map
The interconnection between memory and processor is then established from knowledge of the size of memory needed and the type of RAM and ROM chips available. The addressing of memory can be established by means of a table that specifies the memory address assigned to each chip. The table, called a memory address map, is a pictorial representation of the assigned address space for each chip in the system.
The memory address map for this configuration is shown in Table 12-1. The component column specifies whether a RAM or a ROM chip is used. The hexadecimal address column assigns a range of hexadecimal equivalent addresses to each chip. The address bus lines are listed in the third column. The RAM chips have 128 bytes and need seven address lines. The ROM chip has 512 bytes and needs 9 address lines.
The connection of memory chips to the CPU is shown in Fig. 12-4. This configuration gives a memory capacity of 512 bytes of RAM and 512 bytes of ROM. Each RAM receives the seven low-order bits of the address bus to select one of 128 possible bytes. The particular RAM chip selected is determined by lines 8 and 9 of the address bus. This is done through a 2 x 4 decoder whose outputs go to the CS1 inputs of each RAM chip. Thus, when address lines 8 and 9 are equal to 00, the first RAM chip is selected; when 01, the second RAM chip is selected, and so on. The RD and WR outputs from the microprocessor are applied to the inputs of each RAM chip. The selection between RAM and ROM is achieved through bus line 10. The RAMs are selected when the bit in this line is 0, and the ROM when the bit is 1. Address bus lines 1 to 9 are applied to the input address of the ROM without going through the decoder. The data bus of the ROM has only an output capability, whereas the data bus connected to the RAMs can transfer information in both directions.
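The decoding just described can be checked with a short Python sketch (bit numbering follows the text: bus lines 1..10, line 1 being the least significant; the function itself is an illustrative assumption):

```python
def decode(address):
    """Map a 10-bit CPU address to a chip and a byte within it."""
    line10 = (address >> 9) & 1       # bus line 10: RAM (0) or ROM (1)
    if line10 == 0:
        chip = (address >> 7) & 0b11  # lines 8-9 feed the 2 x 4 decoder
        offset = address & 0x7F       # lines 1-7: one of 128 bytes
        return f"RAM chip {chip + 1}, byte {offset}"
    offset = address & 0x1FF          # lines 1-9: one of 512 bytes
    return f"ROM, byte {offset}"

print(decode(0x000))   # RAM chip 1, byte 0   (range 0000-007F)
print(decode(0x080))   # RAM chip 2, byte 0   (range 0080-00FF)
print(decode(0x200))   # ROM, byte 0          (range 0200-03FF)
```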
3. Describe in details about Memory chip Organization.
A memory chip is an integrated circuit made out of millions of capacitors and transistors that can store data or can be used to process code. Memory chips can hold memory either temporarily through random access memory (RAM), or permanently through read-only memory (ROM). Read-only memory contains permanently stored data that a processor can read but cannot modify. Memory chips come in different sizes and shapes. Some can be connected directly while some need special drives. Memory chips are essential components in computers and electronic devices in which memory storage plays a key role.
The time required to find an item stored in memory can be reduced considerably if stored data can
be identified for access by the content of the data itself rather than by an address. A memory unit
accessed by content is called an associative memory or content addressable memory (CAM).
• CAM is accessed simultaneously and in parallel on the basis of data content rather than by specific
address or location
• Associative memory is more expensive than a RAM because each cell must have storage capability
as well as logic circuits
• Argument register –holds an external argument for content matching
• Key register –mask for choosing a particular field or key in the argument word
Magnetic Disks
A magnetic disk is a type of memory constructed using a circular plate of metal or plastic coated with magnetized material. Usually, both sides of the disk are used to carry out read/write operations. However, several disks may be stacked on one spindle with a read/write head available on each surface.
The following image shows the structural representation for a magnetic disk.
❖ The memory bits are stored in the magnetized surface in spots along the concentric circles called
tracks.
❖ The concentric circles (tracks) are commonly divided into sections called sectors.
Magnetic Tape
★ Magnetic tape is a storage medium that allows data archiving, collection, and backup for different
kinds of data. The magnetic tape is constructed using a plastic strip coated with a magnetic recording
medium.
★ The bits are recorded as magnetic spots on the tape along several tracks. Usually, seven or nine bits
are recorded simultaneously to form a character together with a parity bit.
★ Magnetic tape units can be halted, started to move forward or in reverse, or can be rewound. However, they cannot be started or stopped fast enough between individual characters. For this reason, information is recorded in blocks referred to as records.
Hardware Organization
It consists of a memory array and logic for m words with n bits per word. The argument register A and key register K each have n bits, one for each bit of a word. The match register M has m bits, one for each memory word. Each word in memory is compared in parallel with the content of the argument register. The words that match the bits of the argument register set a corresponding bit in the match register. After the matching process, those bits in the match register that have been set indicate that their corresponding words have been matched. Reading is accomplished by a sequential access to memory for those words whose corresponding bits in the match register have been set.
The relation between the memory array and external registers in an associative memory is shown in Fig. 12-7. The cells in the array are marked by the letter C with two subscripts. The first subscript gives the word number and the second specifies the bit position in the word. Thus cell Cij is the cell for bit j in word i.
A bit Aj in the argument register is compared with all the bits in column j of the array provided that Kj = 1. This is done for all columns j = 1, 2, ..., n. If a match occurs between all the unmasked bits of the argument and the bits in word i, the corresponding bit Mi in the match register is set to 1. If one or more unmasked bits of the argument and the word do not match, Mi is cleared to 0.
Each cell consists of a flip-flop storage element Fij and the circuits for reading, writing, and matching the cell. The input bit is transferred into the storage cell during a write operation. The bit stored is read out during a read operation. The match logic compares the content of the storage cell with the corresponding unmasked bit of the argument and provides an output for the decision logic that sets the bit in Mi.
Match Logic
The match logic for each word can be derived from the comparison algorithm for two binary numbers. First, neglect the key bits and compare the argument in A with the bits stored in the cells of the words.
Word i is equal to the argument in A if Aj = Fij for j = 1, 2, ..., n. Two bits are equal if they are both 1 or both 0. The equality of two bits can be expressed logically by the Boolean function
xj = Aj Fij + Aj' Fij'
where xj = 1 if the pair of bits in position j are equal; otherwise, xj = 0. For word i to be equal to the argument in A, we must have all xj variables equal to 1. This is the condition for setting the corresponding match bit Mi to 1. The Boolean function for this condition is
Mi = x1 x2 x3 ... xn
Each cell requires two AND gates and one OR gate. The inverters for Aj and Kj are needed once for each column and are used for all bits in the column. The outputs of all OR gates in the cells of the same word go to the input of a common AND gate to generate the match signal Mi. Mi will be logic 1 if a match occurs and 0 if no match occurs.
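A compact Python sketch of this match logic (the bit lists are illustrative; xj is computed as the XNOR of the argument and stored bits, and masked columns always count as matching):

```python
def match_word(A, K, word):
    """A: argument bits, K: key (mask) bits, word: stored bits Fij."""
    Mi = 1
    for Aj, Kj, Fij in zip(A, K, word):
        xj = 1 if Aj == Fij else 0   # xj = Aj Fij + Aj' Fij'
        Mi &= (xj | (1 - Kj))        # columns with Kj = 0 are ignored
    return Mi

memory = [[1, 0, 1], [1, 1, 0], [0, 0, 1]]
A = [1, 0, 0]
K = [1, 1, 0]                        # third column is masked out
print([match_word(A, K, w) for w in memory])  # -> [1, 0, 0]
```

Only word 0 agrees with the argument in the two unmasked positions, so only M0 is set.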
Read and Write Operations
Read Operation
• If more than one word in memory matches the unmasked argument field, all the matched words will have 1's in the corresponding bit position of the match register.
• In a read operation, all matched words are read in sequence by applying a read signal to each word line whose corresponding Mi bit is a logic 1.
• In applications where no two identical items are stored in the memory, only one word may match, in which case we can use the Mi output directly as a read signal for the corresponding word.
Write Operation
A write can take two different forms:
1. The entire memory may be loaded with new information.
2. Unwanted words are deleted and new words are inserted.
1. Entire memory: writing can be done by addressing each location in sequence. This makes it a random-access memory for writing and a content-addressable memory for reading. The number of address lines needed for decoding is d, where m = 2^d and m is the number of words.
2. Unwanted words to be deleted and new words to be inserted:
• A tag register is used which has as many bits as there are words in memory.
• For every active (valid) word in memory, the corresponding bit in the tag register is set to 1.
• When a word is deleted, the corresponding tag bit is reset to 0.
• A new word is stored in the memory by scanning the tag register until the first 0 bit is encountered. After storing the word, the bit is set to 1.
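The insertion scheme in case 2 can be sketched as follows (the data values and memory size are made-up examples):

```python
def cam_insert(memory, tag, new_word):
    """Store new_word in the first slot whose tag bit is 0."""
    for i, valid in enumerate(tag):
        if valid == 0:          # first inactive (empty) slot found
            memory[i] = new_word
            tag[i] = 1          # mark the slot as active (valid)
            return i
    raise MemoryError("associative memory is full")

memory = ["cat", None, "dog", None]
tag = [1, 0, 1, 0]              # words 0 and 2 are currently valid
slot = cam_insert(memory, tag, "owl")
print(slot, memory, tag)        # 1 ['cat', 'owl', 'dog', None] [1, 1, 1, 0]
```

Deletion is just resetting the tag bit to 0; the stale word is simply overwritten by a later insert.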
★ An associative memory can be considered as a memory unit whose stored data can be identified for
access by the content of the data itself rather than by an address or memory location.
★ Associative memory is often referred to as Content Addressable Memory (CAM).
★ When a write operation is performed on associative memory, no address or memory location is given to the word. The memory itself is capable of finding an empty unused location to store the word.
★ On the other hand, when the word is to be read from an associative memory, the content of the word,
or part of the word, is specified. The words which match the specified content are located by the
memory and are marked for reading.
★ Also called as Content-addressable memory (CAM), associative storage, or associative array
★ Content-addressed or associative memory refers to a memory organization in which the memory is
accessed by its content (as opposed to an explicit address).
★ It is a special type of computer memory used in certain very high speed searching applications.
★ In standard computer memory (random access memory or RAM), the user supplies a memory address and the RAM returns the data word stored at that address.
★ In CAM, the user supplies a data word and then CAM searches its entire memory to see if that data
word is stored anywhere in it. If the data word is found, the CAM returns a list of one or more storage
addresses where the word was found.
★ CAM is designed to search its entire memory in a single operation.
★ It is much faster than RAM in virtually all search applications.
★ An associative memory is more expensive than RAM, as each cell must have storage capability as
well as logic circuits for matching its content with an external argument.
★ Associative memories are used in applications where the search time is very critical and short.
★ Associative memories are expensive compared to RAMs because of the added logic associated with each cell.
The following diagram shows the block representation of an Associative memory.
From the block diagram, we can say that an associative memory consists of a memory array and logic
for 'm' words with 'n' bits per word.
The functional registers like the argument register A and key register K each have n bits, one for each
bit of a word. The match register M consists of m bits, one for each memory word.
The words which are kept in the memory are compared in parallel with the content of the argument
register.
The key register (K) provides a mask for choosing a particular field or key in the argument word. If the
key register contains a binary value of all 1's, then the entire argument is compared with each memory
word. Otherwise, only those bits in the argument that have 1's in their corresponding position of the
key register are compared. Thus, the key provides a mask for identifying a piece of information which
specifies how the reference to memory is made.
The following diagram can represent the relation between the memory array and the external registers
in an associative memory.
The cells present inside the memory array are marked by the letter C with two subscripts. The first subscript gives the word number and the second specifies the bit position in the word. For instance, the cell Cij is the cell for bit j in word i.
A bit Aj in the argument register is compared with all the bits in column j of the array provided that Kj = 1. This process is done for all columns j = 1, 2, ..., n.
If a match occurs between all the unmasked bits of the argument and the bits in word i, the corresponding bit Mi in the match register is set to 1. If one or more unmasked bits of the argument and the word do not match, Mi is cleared to 0.
Hardware Organization :
Associative memory consists of a memory array and logic for m words and n bits per word.
The argument register A and key register K each have n bits, one for each bit of a word. The match register M has m bits, one for each memory word.
Each word in memory is compared in parallel with the content of the argument register. The words that match the bits of the argument register set a corresponding bit in the match register.
After the matching process, those bits in the match register that have been set indicate the fact that their corresponding words have been matched.
Reading is accomplished by a sequential access to memory for those words whose corresponding bits in the
match register have been set.
The key register provides a mask for choosing a particular field or key in the argument word. The entire argument is compared with each memory word if the key register contains all 1's. Otherwise, only those bits in the argument that have 1's in their corresponding position of the key register are compared. Thus, the key provides a mask for identifying a piece of information which specifies how the reference to memory is made.
The cache is the fastest component in the memory hierarchy and approaches the speed of CPU components.
➔ When the CPU needs to access memory, the cache is examined. If the word is found in the cache, it is
read from the fast memory.
➔ If the word addressed by the CPU is not found in the cache, the main memory is accessed to read the
word.
➔ A block of words containing the one just accessed is then transferred from main memory to cache memory. The block size may vary from one word (the one just accessed) to about 16 words adjacent to the one just accessed.
➔ The performance of the cache memory is frequently measured in terms of a quantity called hit ratio.
➔ When the CPU refers to memory and finds the word in cache, it is said to produce a hit.
➔ If the word is not found in the cache, it is in main memory and it counts as a miss.
➔ The ratio of the number of hits divided by the total CPU references to memory (hits plus misses) is
the hit ratio.
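A quick worked example of the hit ratio, together with the usual average-access-time estimate built from it (the 10 ns and 100 ns figures are hypothetical):

```python
hits, misses = 900, 100
hit_ratio = hits / (hits + misses)
print(hit_ratio)                          # 0.9

cache_time, memory_time = 10, 100         # access times in ns (assumed)
average_time = hit_ratio * cache_time + (1 - hit_ratio) * memory_time
print(average_time, "ns")                 # 19.0 ns, close to cache speed
```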
Locality of Reference
Locality of reference is manifested in two ways:
1. Temporal − a recently executed instruction is likely to be executed again very soon.
• Information which will be used in the near future is likely to be in use already (e.g. reuse of information in loops).
2. Spatial − instructions in close proximity to a recently executed instruction are also likely to be executed soon.
• If a word is accessed, adjacent (nearby) words are likely to be accessed soon (e.g. related data items (arrays) are usually stored together; instructions are executed sequentially).
3. If the active segments of a program can be placed in a fast (cache) memory, then total execution time can be reduced significantly.
4. The temporal aspect of locality of reference suggests that whenever an item of information (instruction or data) is first needed, it should be brought into the cache.
5. The spatial aspect of locality of reference suggests that instead of bringing just one item from the main memory to the cache, it is wise to bring several items that reside at adjacent addresses as well.
The main memory can store 32K words of 12 bits each. The cache is capable of storing 512 of these words at any given time. For every word stored in cache, there is a duplicate copy in main memory. The CPU communicates with both memories. It first sends a 15-bit address to the cache. If there is a hit, the CPU accepts the 12-bit data from the cache. If there is a miss, the CPU reads the word from main memory and the word is then transferred to the cache.
• When a read request is received from the CPU, the contents of a block of memory words containing the location specified are transferred into the cache.
• When the program references any of the locations in this block, the contents are read from the cache. The number of blocks in the cache is smaller than the number of blocks in main memory.
• Correspondence between main memory blocks and those in the cache is specified by a mapping function.
• Assume the cache is full and a memory word not in the cache is referenced.
• Control hardware decides which block from the cache is to be removed to create space for the new block containing the referenced word from memory.
• The collection of rules for making this decision is called the "replacement algorithm".
Read/Write operations on cache
• Cache Hit Operation
• The CPU issues Read/Write requests using addresses that refer to locations in main memory.
• Cache control circuitry determines whether the requested word currently exists in the cache.
• If it does, the Read/Write operation is performed on the appropriate location in the cache (Read/Write Hit).
In Fig. 12-12, the CPU address of 15 bits is divided into two fields. The nine least significant bits constitute the index field and the remaining six bits form the tag field. The main memory needs an address that includes both the tag and the index bits. The number of bits in the index field is equal to the number of address bits required to access the cache memory.
The direct mapping cache organization uses the n-bit address to access the main memory and the k-bit index to access the cache. Each word in the cache consists of the data word and its associated tag. When a new word is first brought into the cache, the tag bits are stored alongside the data bits. When the CPU generates a memory request, the index field is used as the address to access the cache. The tag field of the CPU address is compared with the tag in the word read from the cache. If the two tags match, there is a hit and the desired data word is in the cache. If there is no match, there is a miss.
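A small Python sketch of this lookup, using the 9-bit index / 6-bit tag split described above; the cache model, the sparse main memory and the data values are assumptions for illustration.

# Sketch of a direct-mapped cache lookup with a 15-bit address split into
# a 9-bit index and a 6-bit tag (512-word cache, 32K main memory).
# Each cache entry is modeled as (tag, data) or None.

CACHE_SIZE = 512          # 2**9 words
INDEX_BITS = 9

cache = [None] * CACHE_SIZE

def access(address, main_memory):
    index = address & (CACHE_SIZE - 1)   # low 9 bits select the cache word
    tag = address >> INDEX_BITS          # remaining 6 bits
    entry = cache[index]
    if entry is not None and entry[0] == tag:
        return entry[1], "hit"
    # Miss: fetch the word from main memory and store it with its tag.
    data = main_memory[address]
    cache[index] = (tag, data)
    return data, "miss"

main_memory = {0o00777: 0o1234, 0o02777: 0o4321}  # sparse, illustrative
print(access(0o00777, main_memory))   # miss the first time
print(access(0o00777, main_memory))   # hit the second time
print(access(0o02777, main_memory))   # same index, different tag: miss, evicts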
In Fig. 12-14, the index field is now divided into two parts: the block field and the word field. In a 512-word cache there are 64 blocks of 8 words each, since 64 × 8 = 512. The block number is specified with a 6-bit field and the word within the block is specified with a 3-bit field. The tag field stored within the cache is common to all eight words of the same block.
Associative mapping:
• In this mapping function, any block of Main memory can potentially reside in any cache block
position. This is much more flexible mapping method.
• In this type of mapping, the associative memory is used to store content and addresses of the
memory word.
• Any block can go into any line of the cache. This means that the word id bits are used to identify
which word in the block is needed, but the tag becomes all of the remaining bits. This enables the
placement of any word at any place in the cache memory.
• It is considered to be the fastest and the most flexible mapping form.
In Fig. 12-11, the associative memory stores both the address and the content (data) of the memory word. This permits any location in the cache to store any word from main memory. The diagram shows three words presently stored in the cache. The address value of 15 bits is shown as a five-digit octal number and its corresponding 12-bit word is shown as a four-digit octal number. A CPU address of 15 bits is placed in the argument register and the associative memory is searched for a matching address. If the address is found, the corresponding 12-bit data is read and sent to the CPU. If no match occurs, the main memory is accessed for the word.
Set-associative mapping:
• In this method, blocks of cache are grouped into sets, and the mapping allows a block of main memory to reside in any block of a specific set. From the flexibility point of view, it lies in between the other two methods.
• This form of mapping is an enhanced form of direct mapping where the drawbacks of direct
mapping are removed. Set associative addresses the problem of possible thrashing in the direct
mapping method.
• It does this by saying that instead of having exactly one line that a block can map to in the cache, we will group a few lines together, creating a set. Then a block in memory can map to any one of the lines of a specific set.
• Set-associative mapping allows that each word that is present in the cache can have two or more
words in the main memory for the same index address.
• Set associative cache mapping combines the best of direct and associative cache mapping
techniques.
In this case, the cache consists of a number of sets, each of which consists of a number of lines. The relationships are
m = v × k
i = j mod v
where
i = cache set number
j = main memory block number
v = number of sets
m = number of lines in the cache
k = number of lines in each set
The octal numbers listed in Fig. 12-15 are with reference to the main memory contents. When the CPU generates a memory request, the index value of the address is used to access the cache. The tag field of the CPU address is then compared with both tags in the cache to determine if a match occurs. The comparison logic is done by an associative search of the tags in the set, similar to an associative memory search; thus the name “set associative”.
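The relationships above can be applied directly; the following Python sketch assumes a 2-way (k = 2) organization of a 512-line cache purely for illustration.

# Sketch applying the set-associative relationships from the text:
# m = v * k (lines = sets * lines-per-set) and i = j mod v.
# The 2-way, 512-line configuration is an assumption for illustration.

k = 2                 # lines per set (2-way set associative), assumed
m = 512               # total cache lines, assumed
v = m // k            # number of sets = 256

def set_for_block(j):
    """Main memory block j maps to cache set i = j mod v."""
    return j % v

for j in (5, 261, 517):
    print(f"block {j} -> set {set_for_block(j)}")
# Blocks 5, 261 and 517 all map to set 5; with k = 2 lines per set,
# two of them can reside in the cache simultaneously.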
Replacement Policies
• When the cache is full and there is a necessity to bring new data into the cache, a decision must be made as to which data in the cache is to be removed.
• The guideline for taking a decision about which data is to be removed is called the replacement policy.
• The replacement policy depends on the mapping.
• There is no specific policy in the case of direct mapping, as we have no choice of block placement in the cache.
In case of associative mapping
• A simple procedure is to replace cells of the cache in round robin order whenever a new word
is requested from memory
• This constitutes a First-in First-out (FIFO) replacement policy
In case of set associative mapping
• Random replacement
• First-in First-out (FIFO) (item chosen is the item that has been in the set longest)
• Least Recently Used (LRU)(item chosen is the item that has been least recently used by CPU)
Application of Cache Memory –
1. Usually, the cache memory can store a reasonable number of blocks at any given time, but this
number is small compared to the total number of blocks in the main memory.
2. The correspondence between the main memory blocks and those in the cache is specified by a
mapping function.
Types of Cache –
• Primary Cache –
A primary cache is always located on the processor chip. This cache is small and its access time is
comparable to that of processor registers.
• Secondary Cache –
Secondary cache is placed between the primary cache and the rest of the memory. It is referred to
as the level 2 (L2) cache. Often, the Level 2 cache is also housed on the processor chip.
Locality of reference –
Since the size of cache memory is small compared to main memory, locality of reference is used to decide which part of main memory should be given priority and loaded into the cache.
Types of Locality of reference
1. Spatial locality of reference
This says that an element is likely to be found in close proximity to the point of reference, and that on the next access the referenced element is likely to be even closer to it.
2. Temporal locality of reference
Here a least recently used algorithm is typically employed. Whenever a miss occurs on a word, not just that word but the complete block (or page) containing it is loaded into main memory, because the locality of reference rule says that if one word is referred to, the nearby words are likely to be referred to next.
7. Discuss in detail about Virtual memory.
Virtual memory is the separation of logical memory from physical memory. This separation supports large virtual memory for programmers when only limited physical memory is available.
Virtual memory can give programmers the illusion that they have a very large memory although the computer has a small main memory. It makes the task of programming easier because the programmer no longer needs to worry about the amount of physical memory available.
• Early days memory was expensive – hence small
• Programmers were using secondary storage for overlaying
• Programmers were responsible for breaking programs into overlays, deciding where to keep them in secondary memory, and arranging for the transfer of overlays between main memory and secondary memory.
Virtual memory works similarly, but at one level up in the memory hierarchy. A memory management unit (MMU) transfers data between physical memory and a slower storage device, generally a disk. This storage area may be defined as a swap disk or swap file, based on its implementation. Retrieving data from physical memory is much faster than accessing data from the swap disk.
Virtual Memory - Background
• Separate concept of address space and memory locations
• Programs reference instructions and data independently of the available physical memory. Addresses issued by the processor for instructions or data are called virtual or logical addresses.
• Virtual addresses are translated into physical addresses by a combination of hardware and software components.
Types of Memory
• Real memory – main memory
• Virtual memory – memory on disk
• Virtual memory allows for effective multiprogramming and relieves the user of the tight constraints of main memory.
Address Space and Memory Space
• Address used by a programmer is called virtual address and set of such addresses is called address
space
• Address in main memory is called a location or physical address and set of such locations is called
the memory space
• The Address Space is allowed to be larger than the memory space in computers with virtual
memory
• In a multiprogram computer system, programs and data are transferred to and from auxiliary memory and main memory based on demands imposed by the CPU. Suppose that program 1 is currently being executed in the CPU. Program 1 and a portion of its associated data are moved from auxiliary memory into main memory, as shown in Fig. 12-16. Portions of programs and data need not be in contiguous locations in memory, since information is being moved in and out, and empty spaces may be available in scattered locations in memory.
• In Fig. 12-17, a virtual address of 20 bits is mapped to a physical address of 15 bits. The mapping is a dynamic operation, which means that every address is translated immediately as a word is referenced by the CPU. The mapping table may be stored in a separate memory or in main memory. In the first case, an additional memory unit is required as well as one extra memory access time. In the second case, the table takes space from main memory and two accesses to memory are required, with the program running at half speed. A third alternative is to use an associative memory.
• Address Mapping Using Pages
• The physical memory is broken down into groups of equal size called blocks, which may range from 64 to 4096 words each. The term page refers to groups of address space of the same size. Portions of programs are moved from auxiliary memory to main memory in records equal to the size of a page. The term “page frame” is sometimes used to denote a block.
In Fig. 12-18, a virtual address has 13 bits. Since each page consists of 1024 words, the high-order three bits of a virtual address specify one of the eight pages and the low-order 10 bits give the line address within the page.
The organization of the memory mapping table in a paged system is shown in Fig. 12-19. The memory page table consists of eight words, one for each page. The address in the page table denotes the page number and the content of the word gives the block number where that page is stored in main memory. The table shows that pages 1, 2, 5 and 6 are now available in main memory in blocks 3, 0, 1 and 2, respectively.
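A Python sketch of this translation, using the page table contents of Figs. 12-18 and 12-19; the function and variable names are illustrative.

# Sketch of paged address mapping: 13-bit virtual address, 1024-word pages,
# pages 1, 2, 5 and 6 resident in main-memory blocks 3, 0, 1 and 2.

PAGE_SIZE = 1024                       # 2**10 words per page
page_table = {1: 3, 2: 0, 5: 1, 6: 2}  # page number -> block number

def translate(virtual_address):
    page = virtual_address >> 10               # high-order 3 bits: page number
    line = virtual_address & (PAGE_SIZE - 1)   # low-order 10 bits: line in page
    if page not in page_table:
        raise LookupError(f"page fault: page {page} not in main memory")
    block = page_table[page]
    return (block << 10) | line                # physical address: block + line

print(oct(translate(0o02534)))   # page 1 -> block 3, prints 0o6534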
Associative Memory Page Table
Replace the random-access memory page table with an associative memory of four words, as shown in Fig. 12-20. Each entry in the associative memory array consists of two fields. The first three bits specify a field for storing the page number. The last two bits constitute a field for storing the block number. The virtual address is placed in the argument register.
Address Translation
A table is needed to map a virtual address to a physical address (a dynamic operation). This table may be kept in
• a separate memory or
• main memory or
• associative memory
There are two primary methods for implementing virtual memory, as follows:
Paging
Paging is a technique of memory management where small fixed-length pages are allocated instead of a
single large variable-length contiguous block in the case of the dynamic allocation technique. In a paged
system, each process is divided into several fixed-size ‘chunks’ called pages, typically 4k bytes in length.
The memory space is also divided into blocks of the equal size known as frames.
Segmentation
• The partition of memory into logical units called segments, according to the user's perspective, is called segmentation. Segmentation allows each segment to grow independently and to be shared. In other words, segmentation is a technique that partitions memory into logically related units called segments; a program is then a collection of segments.
• Unlike pages, segments can vary in size. This requires the MMU to manage segmented memory somewhat differently than it would manage paged memory. A segmented MMU contains a segment table to keep track of the segments resident in memory.
• A segment can start at one of several addresses and can be of any size, so each segment table entry should contain the start address and segment size. Some systems allow a segment to start at any address, while others limit the start address. One such limit is found in the Intel x86 architecture, which requires a segment to start at an address that has 0000 as its four low-order bits.
Advantages of Virtual Memory
❖ The degree of Multiprogramming will be increased.
❖ Users can run large applications with less real RAM.
❖ There is no need to buy more RAM.
Disadvantages of Virtual Memory
➔ The system becomes slower since swapping takes time.
➔ It takes more time to switch between applications.
➔ The user will have less hard disk space available for other use.
8. Write short notes on Cache algorithms.
A cache algorithm is a detailed list of instructions that directs which items should be discarded in a
computing device's cache of information.
When the processor needs to read or write a location in main memory, it first checks for a correspond-
ing entry in the cache.
If the processor finds that the memory location is in the cache, a cache hit has occurred and the data is read from the cache.
If the processor does not find the memory location in the cache, a cache miss has occurred. For a cache
miss, the cache allocates a new entry and copies in data from main memory, then the request is fulfilled
from the contents of the cache.
The performance of cache memory is frequently measured in terms of a quantity called Hit ratio.
Hit ratio = hit / (hit + miss) = no. of hits/total accesses
We can improve cache performance by using a larger cache block size and higher associativity, and by reducing the miss rate, the miss penalty, and the time to hit in the cache.
Examples of cache algorithms include:
Least Frequently Used (LFU): This cache algorithm uses a counter to keep track of how often an entry
is accessed. With the LFU cache algorithm, the entry with the lowest count is removed first. This meth-
od isn't used that often, as it does not account for an item that had an initially high access rate and then
was not accessed for a long time.
Least Recently Used (LRU): This cache algorithm keeps recently used items near the top of cache.
Whenever a new item is accessed, the LRU places it at the top of the cache. When the cache limit has
been reached, items that have been accessed less recently will be removed starting from the bottom of
the cache. This can be an expensive algorithm to use, as it needs to keep "age bits" that show exactly
when the item was accessed. In addition, when a LRU cache algorithm deletes an item, the "age bit"
changes on all the other items.
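A minimal Python sketch of an LRU cache along these lines; the capacity, keys and the backing-store stand-in are illustrative assumptions.

# Minimal LRU cache sketch: an ordered dict serves as the "age" record,
# oldest entry first, most recently used entry last.
from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()

    def access(self, key, load):
        if key in self.entries:
            self.entries.move_to_end(key)      # refresh: move to the "top"
            return self.entries[key], "hit"
        if len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)   # evict the least recently used
        value = load(key)                      # fetch from the backing store
        self.entries[key] = value
        return value, "miss"

cache = LRUCache(capacity=2)
load = lambda k: k * 100                       # stand-in for main memory
print(cache.access("a", load))                 # miss
print(cache.access("b", load))                 # miss
print(cache.access("a", load))                 # hit ("a" refreshed)
print(cache.access("c", load))                 # miss, evicts "b"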
Adaptive Replacement Cache (ARC): Developed at the IBM Almaden Research Center, this cache algo-
rithm keeps track of both LFU and LRU, as well as evicted cache entries to get the best use out of the
available cache.
Most Recently Used (MRU): This cache algorithm removes the most recently used items first. A MRU
algorithm is good in situations in which the older an item is, the more likely it is to be accessed.
9. Explain in detail about Cache Hierarchy.
• Cache hierarchy, or multi-level caches, refers to a memory architecture that uses a hierarchy of memory stores based on varying access speeds to cache data. Highly requested data is cached in high-speed access memory stores, allowing swifter access by central processing unit (CPU) cores.
• Cache hierarchy is a form and part of memory hierarchy and can be considered a form of tiered stor-
age. This design was intended to allow CPU cores to process faster despite the memory latency of
main memory access.
• Accessing main memory can act as a bottleneck for CPU core performance as the CPU waits for data,
while making all of main memory high-speed may be prohibitively expensive. High-speed caches are
a compromise allowing high-speed access to the data most-used by the CPU, permitting a faster CPU
clock.
3. MESI Protocol –
It is the most widely used cache coherence protocol. Every cache line is marked with one of the following states:
Modified –
This indicates that the cache line is present in the current cache only and is dirty, i.e., its value is different from that in main memory. The cache is required to write the data back to main memory at some time in the future, before permitting any other read of the (no longer valid) main memory state.
Exclusive –
This indicates that the cache line is present in the current cache only and is clean, i.e., its value matches the main memory value.
Shared –
It indicates that this cache line may be stored in other caches of the machine.
Invalid –
It indicates that this cache line is invalid.
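A highly simplified Python sketch of some MESI transitions follows; the event names are illustrative assumptions, and a real protocol also handles write-backs, bus responses and other details.

# Simplified sketch of MESI state transitions for a single cache line.
# Only a few common events are modeled; names are illustrative.

MESI_TRANSITIONS = {
    # (current state, event) -> next state
    ("I", "local_read_shared"):    "S",  # another cache also holds the line
    ("I", "local_read_exclusive"): "E",  # no other cache holds the line
    ("I", "local_write"):          "M",
    ("E", "local_write"):          "M",  # silent upgrade: line was exclusive
    ("S", "local_write"):          "M",  # invalidates other copies on the bus
    ("E", "snoop_read"):           "S",  # another processor reads our line
    ("M", "snoop_read"):           "S",  # must supply / write back dirty data
    ("M", "snoop_write"):          "I",
    ("S", "snoop_write"):          "I",
    ("E", "snoop_write"):          "I",
}

def next_state(state, event):
    return MESI_TRANSITIONS.get((state, event), state)

state = "I"
for event in ("local_read_exclusive", "local_write", "snoop_read"):
    state = next_state(state, event)
    print(event, "->", state)   # E, then M, then S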
4. MOESI Protocol:
This is a full cache coherence protocol that encompasses all of the possible states commonly used in
other protocols. Each cache line is in one of the following states:
Modified –
A cache line in this state holds the most recent, correct copy of the data while the copy in the main
memory is incorrect and no other processor holds a copy.
Owned –
A cache line in this state holds the most recent, correct copy of the data. It is similar to the shared state in that other processors can hold a copy of the most recent, correct data; unlike the shared state, however, the copy in main memory can be incorrect. Only one processor can hold the data in the owned state, while all other processors must hold the data in the shared state.
Exclusive –
A cache line in this state holds the most recent, correct copy of the data. The main memory copy is also the most recent, correct copy of the data, and no other processor holds a copy.
Shared –
A cache line in this state holds the most recent, correct copy of the data. Other processors in the system may hold copies of the data in the shared state as well. The main memory copy is also the most recent, correct copy of the data, provided no other processor holds it in the owned state.
Invalid –
A cache line in this state does not hold a valid copy of data. Valid copies of data can be either in main
memory or another processor cache.
Cache Memory Performance
Types of Caches:
L1 Cache: Cache built into the CPU itself is known as L1 or Level 1 cache. This type of cache holds the most recent data, so when the data is required again, the microprocessor inspects this cache first and does not need to go through main memory or the Level 2 cache. The main idea behind this is “locality of reference”, according to which a location just accessed by the CPU has a higher probability of being required again.
L2 Cache: This type of cache resides on a separate chip next to the CPU and is known as the Level 2 cache. This cache stores recently used data that cannot be found in the L1 cache. Some CPUs have both L1 and L2 cache built in and designate the separate cache chip as the Level 3 (L3) cache.
Cache that is built into the CPU is faster than a separate cache. Separate cache is faster than RAM. Built-
in Cache runs at the speed of a microprocessor.
Disk Cache: It contains the most recently read-in data from the hard disk; this cache is much slower than RAM.
Instruction Cache vs Data Cache: An instruction cache (I-cache) stores instructions only, while a data cache (D-cache) stores only data. Distinguishing the stored information in this way recognizes the different access behavior patterns of instructions and data. For example, programs tend to involve few write accesses, and they often exhibit more temporal and spatial locality than the data they process.
Unified Cache vs Split Cache: A cache that stores both instructions and data is referred to as a unified cache. A split cache, on the other hand, consists of two associated but largely independent units: an I-cache and a D-cache. This type of cache can also be designed to deal with the two independent units differently.
The performance of the cache memory is measured in terms of a quantity called the hit ratio. When the CPU refers to memory and finds the word in the cache, a hit is said to have occurred. If the word is not found in the cache, the CPU refers to main memory for the desired word, and this is referred to as a miss.
Hit Ratio (h) :
Hit Ratio (h) = Number of Hits / Total CPU references to memory = Number of hits / ( Number of Hits +
Number of Misses )
The Hit ratio is nothing but a probability of getting hits out of some number of memory references
made by the CPU. So its range is 0 <= h <= 1.
Average Access Time ( tavg ) :
tavg = h × tc + (1 − h) × (tc + tm) = tc + (1 − h) × tm
where tc, h and tm denote the cache access time, the hit ratio and the main memory access time, respectively.
Average memory access time = Hit Time + Miss Rate X Miss Penalty
Miss Rate : It can be defined as the fraction of accesses that are not in the cache (i.e. (1-h)).
Miss Penalty : It can be defined as the additional clock cycles to service the miss, the extra time needed
to carry the favored information into cache from main memory in case of miss in cache.
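A short worked example of the formula above, with assumed timings:

# Worked example of tavg = tc + (1 - h) * tm; all timings are assumed.

tc = 10    # cache access time in ns, assumed
tm = 100   # main memory access time in ns, assumed
h  = 0.9   # hit ratio, assumed

tavg = tc + (1 - h) * tm
print(f"tavg = {tavg} ns")   # 10 + 0.1 * 100 = 20 ns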
Compulsory Miss (Cold start Misses or First reference Misses) : This type of miss occurs when the
first access to a block happens. In this type of miss, the block must be brought into the cache.
Capacity Miss : This type of miss occurs when a program working set is much bigger than the cache
storage capacity. Blocks need to be discarded as keeping all blocks is not possible during program exe-
cution.
Conflict Miss (Collision Misses or Interference Misses): This miss is found mainly with set-associative or direct-mapped block placement strategies; a conflict miss occurs when several blocks are mapped to the same set or block frame.
Coherence Miss (Invalidation) : It occurs when other external processors ( e.g. I/O ) update memory.
Memory interleaving is a technique that divides memory into a number of modules such that successive words in the address space are placed in different modules.
Let us assume 16 data words are to be transferred to four modules, where module 00 is module 1, module 01 is module 2, module 10 is module 3 and module 11 is module 4. The data to be transferred are 10, 20, 30, …, 160.
From the figure above, in module 1 the data 10 is stored first, then 20, 30 and finally 40. That is, the data are placed consecutively in a module until its maximum capacity is reached.
In this arrangement, the most significant bits (MSBs) provide the address of the module and the least significant bits (LSBs) provide the address of the data within the module.
For example, to get 90 (data), the processor provides the address 1000. The 10 indicates that the data is in module 10 (module 3) and 00 is the address of 90 within module 10 (module 3). So,
Now again assume 16 data words are to be transferred to the four modules, but this time consecutive data are placed in consecutive modules: 10 is placed in module 1, 20 in module 2, and so on.
Here the least significant bits (LSBs) provide the address of the module and the most significant bits (MSBs) provide the address of the data within the module.
For example, to get 90 (data), the processor provides the address 1000. The 00 indicates that the data is in module 00 (module 1) and 10 is the address of 90 within module 00 (module 1); a small sketch of this address split follows the module lists below. That is,
Module 1 Contains Data : 10, 50, 90, 130
Module 2 Contains Data : 20, 60, 100, 140
Module 3 Contains Data : 30, 70, 110, 150
Module 4 Contains Data : 40, 80, 120, 160
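A Python sketch of this low-order interleaving scheme, using the data values from the example; the module layout is assumed as listed above.

# Low-order interleaving with four modules: the 2 least significant address
# bits select the module, the remaining bits select the word in the module.

NUM_MODULES = 4
modules = [[10, 50, 90, 130], [20, 60, 100, 140],
           [30, 70, 110, 150], [40, 80, 120, 160]]

def read(address):
    module = address & (NUM_MODULES - 1)   # LSBs: module number
    offset = address >> 2                  # remaining bits: word in module
    return modules[module][offset]

print(read(0b1000))   # address 1000: module 00, offset 10 -> 90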
Advantages:
• Whenever the processor requests data from main memory, a block (chunk) of data is transferred to the cache and then to the processor. So whenever a cache miss occurs, the data has to be fetched from main memory. But main memory is relatively slower than the cache, so interleaving is used to improve the effective access time of main memory.
• We can access all four modules at the same time, thus achieving parallelism. From Figure 2, the data can be acquired from a module using the higher-order bits. This method uses memory effectively.
12. Difference between on-chip and off-chip memories/caches
On-chip components:
• Components built on the chip (IC) itself, i.e., on the same silicon substrate, like transistors, resistors, capacitors, coils, etc.
• Can provide good matching (can achieve accurate ratios between values), do not consume size (not bulky) and reduce the connections required on the PCB.
Off-chip components:
• Discrete components that are not built on the chip.
• Usually used for coils, capacitors and resistors of high values that are impractical to implement on the chip, or for components that are required to operate at higher voltage and power levels than the chip can handle.
10 Marks
Introduction
A general purpose computer should have the ability to exchange information with a wide range of devices
in varying environments. Computers can communicate with other computers over the Internet and
access information around the globe. They are an integral part of home appliances, manufacturing
equipment, transportation systems, banking and point-of- sale terminals. In this chapter, we study the
various ways in which I/O operations are performed.
Peripheral Devices:
The input/output organization of a computer depends upon the size of the computer and the peripherals connected to it. The I/O subsystem of the computer provides an efficient mode of communication between the central system and the outside environment.
The most common input output devices are:
i) Monitor
ii) Keyboard
iii) Mouse
iv) Printer
v) Magnetic tapes
Accessing I/O Devices
A single-bus structure
A simple arrangement to connect I/O devices to a computer is to use a single bus arrangement, as shown in the above figure. Each I/O device is assigned a unique set of addresses. When the processor places a particular address on the address lines, the device that recognizes this address responds to the commands issued on the control lines. The processor requests either a read or a write operation, and the requested data is transferred over the data lines. When I/O devices and the memory share the same address space, the arrangement is called memory-mapped I/O.
An input-output interface is a method which helps in transferring information between the internal storage devices, i.e., memory, and the external peripheral devices. A peripheral device is one that provides input and output for the computer; it is also called an input-output device. For example, a keyboard and mouse provide input to the computer and are called input devices, while a monitor and printer that provide output to the computer are called output devices. Just like external hard drives, some peripheral devices are able to provide both input and output.
The address decoder enables the device to recognize its address when this address appears on the address lines. The data register holds the data being transferred. The status register contains information relevant to the operation of the device. The address decoder, the data and status registers, and the control circuitry required to coordinate I/O transfers constitute the device's interface circuit.
For example, for a keyboard, an instruction that reads a character from the keyboard should be executed only when a character is available in the input buffer of the keyboard interface. The processor repeatedly checks a status flag to achieve the synchronization between the processor and the I/O device; this is called program-controlled I/O.
In addition to communicating with I/O, the processor must also communicate with the memory unit. Like the I/O bus, the memory bus contains data, address and read/write control lines.
I/O Versus Memory Bus
There are 3 ways that computer buses can be used to communicate with memory and I/O:
i. Use two Separate buses , one for memory and other for I/O.
ii. Use one common bus for both memory and I/O but separate control lines for each.
iii. Use one common bus for memory and I/O with common control lines.
I/O BUS and Interface Module:
• In the first method, the computer has independent sets of data, address and control buses, one for accessing memory and the other for I/O. This is done in computers that provide a separate I/O processor (IOP).
• The purpose of the IOP is to provide an independent pathway for the transfer of information between external devices and internal memory.
• In a microcomputer-based system, peripheral devices need special communication links for interfacing them with the CPU. The purpose of these links is to resolve the differences that exist between the CPU and each peripheral.
The major differences are as follows:
• The nature of peripheral devices is electromagnetic and electromechanical, while the nature of the CPU is electronic. There is therefore a large difference in the mode of operation of peripheral devices and the CPU.
• A synchronization mechanism is also needed because the data transfer rate of peripheral devices is slower than that of the CPU.
• In peripheral devices, data code and formats differ from the format in the CPU and memory.
• The operating modes of peripheral devices are different and each may be controlled so as not to disturb
the operation of other peripheral devices connected to the CPU.
• There is a special need for additional hardware to resolve the differences between CPU and peripheral
devices to supervise and synchronize all input and output devices.
The internal operations in an individual unit of a digital system are synchronized using a clock pulse; that is, a clock pulse is given to all registers within a unit, and all data transfers among internal registers occur simultaneously during the occurrence of the clock pulse. Now, suppose any two units of a digital system are designed independently, such as the CPU and an I/O interface.
If the registers in the I/O interface share a common clock with CPU registers, then transfer between the
two units is said to be synchronous. But in most cases, the internal timing in each unit is independent of
each other, so each uses its private clock for its internal registers. In this case, the two units are said to
be asynchronous to each other, and if data transfer occurs between them, this data transfer
is called Asynchronous Data Transfer.
But, the Asynchronous Data Transfer between two independent units requires that control signals be
transmitted between the communicating units so that the time can be indicated at which they send
data. These two methods can achieve this asynchronous way of data transfer:
o Strobe control: A strobe pulse is supplied by one unit to indicate to the other unit when the trans-
fer has to occur.
o Handshaking: This method accompanies each data item being transferred with a control signal that indicates the presence of data on the bus. The unit receiving the data item responds with another signal to acknowledge receipt of the data.
The strobe pulse and handshaking method of asynchronous data transfer is not restricted to I/O
transfer. They are used extensively on numerous occasions requiring the transfer of data between two
independent units. So, here we consider the transmitting unit as a source and receiving unit as a
destination.
For example, the CPU is the source during output or write transfer and the destination unit during input
or read transfer.
Therefore, the control sequence during an asynchronous transfer depends on whether the transfer is
initi- ated by the source or by the destination.
So, while discussing each data transfer method asynchronously, you can see the control sequence in
both terms when it is initiated by source or by destination. In this way, each data transfer method can
be fur- ther divided into parts, source initiated and destination initiated.
Asynchronous Data Transfer Methods
The asynchronous data transfer between two independent units requires that control signals be
transmit- ted between the communicating units to indicate when they send the data. Thus, the two
methods can achieve the asynchronous way of data transfer.
The strobe control method of asynchronous data transfer employs a single control line to time each transfer. This control line is also known as a strobe, and it may be activated by either the source or the destination, depending on which one initiates the transfer.
a. Source initiated strobe: The source unit first places the data on the data bus and then initiates the transfer.
After a brief delay to ensure that the data resolve to a stable value, the source activates a strobe pulse.
The information on the data bus and strobe control signal remains in the active state for a sufficient
time to allow the destination unit to receive the
data. The destination unit uses a falling edge of strobe control to transfer the contents of a
data bus to one of its internal registers. The source removes the data from the data bus after it disables
its strobe pulse. Thus, new valid data will be available only after the strobe
is enabled again. In this case, the strobe may be a memory-write control signal from the CPU to a
memory unit. The CPU places the word on the data bus and informs the memory unit, which is the
destination.
b. Destination initiated strobe: In the below block diagram, you see that the strobe is initiated by the destination, and in the timing diagram, the destination unit first activates the strobe pulse, informing the source to provide the data.
The source unit responds by placing the requested binary information on the data bus. The data must
be valid and remain on the bus long enough for the destination unit to accept
it. The falling edge of the strobe pulse can again be used to trigger a destination register. The destination unit then disables the strobe. Finally, the source removes the data from the data bus after a predetermined time interval.
In this case, the strobe may be a memory read control from the CPU to a memory unit. The CPU initiates
the read operation to inform the memory, which is a source unit, to place the selected word into the
2. Handshaking Method
The strobe method has the disadvantage that the source unit that initiates the transfer has no way of knowing whether the destination has received the data that was placed on the bus. Similarly, a destination unit that initiates the transfer has no way of knowing whether the source unit has placed data on the bus. This problem is solved by the handshaking method, which introduces a second control signal line that provides a reply to the unit that initiates the transfer.
In this method, one control line is in the same direction as the data flow in the bus from the source to
the destination. The source unit uses it to inform the destination unit whether there are valid data in the
bus. The other control line is in the other direction from the destination to the source. This is because
the des- tination unit uses it to inform the source whether it can accept data. And in it also, the sequence
of control depends on the unit that initiates the transfer. So it means the sequence of control depends on
whether the transfer is initiated by source and destination.
Source initiated handshaking: In the below block diagram, you can see that two handshaking lines are
"data valid", which is generated by the source unit, and "data accepted", generated by the destination
unit.
o The timing diagram shows the timing relationship of the exchange of signals between the two units.
The source initiates a transfer by placing data on the bus and enabling its data valid signal. The des-
tination unit then activates the data accepted signal after it accepts the data from the bus.
The source unit then disables its valid data signal, which invalidates the data on the bus.
After this, the destination unit disables its data accepted signal, and the system goes into its initial
state. The source unit does not send the next data item until after the destination unit shows readi-
ness to accept new data by disabling the data accepted signal.
This sequence of events described in its sequence diagram, which shows the above sequence in
which the system is present at any given time.
o Destination initiated handshaking: In the below block diagram, you see that the two handshak-
ing lines are "data valid", generated by the source unit, and "ready for data" generated by the des-
tination unit.
Note that the name of the signal generated by the destination unit has been changed from "data accepted" to "ready for data" to reflect its new meaning.
• The destination transfer is initiated, so the source unit does not place data on the data bus until it re-
ceives a ready data signal from the destination unit. After that, the handshaking process is the same as
that of the source initiated.
The sequence of events is shown in its sequence diagram, and the timing relationship between signals
is shown in its timing diagram. Therefore, the sequence of events in both cases would be identical.
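A Python sketch that prints the source-initiated handshake as a sequence of signal events; the signal names follow the text, but there is no real hardware timing here, only an ordered trace.

# Sketch of source-initiated handshaking as an ordered sequence of events:
# data-valid raised by the source, data-accepted raised by the destination,
# then both dropped in order, returning to the initial state.

def source_initiated_handshake(data):
    trace = []
    trace.append(("source", "place data on bus", data))
    trace.append(("source", "assert data-valid", 1))
    trace.append(("destination", "latch data", data))
    trace.append(("destination", "assert data-accepted", 1))
    trace.append(("source", "disable data-valid", 0))          # invalidates bus data
    trace.append(("destination", "disable data-accepted", 0))  # back to initial state
    return trace

for step in source_initiated_handshake(0o1234):
    print(*step)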
Advantages of Asynchronous Data Transfer
Asynchronous Data Transfer in computer organization has the following advantages, such as:
• It is more flexible, and devices can exchange information at their own pace. In addition, individual data characters are complete in themselves, so that even if one packet is corrupted, its predecessors and successors will not be affected.
• It does not require complex processes by the receiving device. Furthermore, it means that inconsisten-
cy in data transfer does not result in a big crisis since the device can keep up with the data stream. It al-
so makes asynchronous transfers suitable for applications where character data is generated irregular-
ly.
• The idle state of the line is 1; when a character has to be sent, a start bit of 0 is sent first and then the character bits are transferred.
• Difference between serial and parallel transfer –
• Asynchronous communication interface
• The block diagram of the asynchronous communication interface is shown above. It functions both
as a transmitter and receiver.
2. Interrupt-initiated I/O: In the above case, we saw that the CPU is kept busy unnecessarily. This situation can be avoided by using an interrupt-driven method for data transfer: the interface is given special commands to issue an interrupt request signal whenever data is available from a device. In the meantime, the CPU can proceed with the execution of any other program. The interface, meanwhile, keeps monitoring the device. Whenever it determines that the device is ready for data transfer, it initiates an interrupt request signal to the computer. Upon detection of an external interrupt signal, the CPU momentarily stops the task it was performing, branches to the service program to process the I/O transfer, and then returns to the task it was originally performing.
Note: Both programmed I/O and interrupt-driven I/O require the active intervention of the processor to transfer data between memory and the I/O module, and any data transfer must traverse a path through the processor. Thus both of these forms of I/O suffer from two inherent drawbacks.
• The I/O transfer rate is limited by the speed with which the processor can test and service a
device.
• The processor is tied up in managing an I/O transfer; a number of instructions must be executed
for each I/O transfer.
3. Direct Memory Access: The data transfer between fast storage media such as a magnetic disk and the memory unit is limited by the speed of the CPU. Thus we can allow the peripherals to communicate directly with memory using the memory buses, removing the intervention of the CPU. This type of data transfer technique is known as DMA or direct memory access. During DMA the CPU is idle and has no control over the memory buses. The DMA controller takes over the buses to manage the transfer directly between the I/O devices and the memory unit.
Bus Request: It is used by the DMA controller to request the CPU to relinquish control of the buses.
Bus Grant: It is activated by the CPU to inform the external DMA controller that the buses are in a high-impedance state and the requesting DMA can take control of the buses. Once the DMA has taken control of the buses, the transfer can begin: during a write operation, a memory address is sent followed by a sequence of data words, to be written in successive memory locations starting at that address.
The PCI bus supports three independent address spaces: memory, I/O and configuration. The first two are self-explanatory. The I/O address space is intended for use with processors, such as the Pentium, that have a separate I/O address space. However, as noted, the system designer may choose to use memory-mapped I/O even when a separate I/O address space is available. The configuration address space is intended to give PCI its plug-and-play capability. A 4-bit command that accompanies the address identifies which of the three spaces is being used in a given data transfer operation.
The signaling convention on the PCI bus is similar to the one used earlier, where we assumed that the master maintains the address information on the bus until the data transfer is completed. But this is not necessary. The address is needed only long enough for the slave to be selected, and the slave can store the address in its internal buffer. Thus, the address is needed on the bus for one clock cycle only, freeing the address lines to be used for sending data in subsequent clock cycles. The result is a significant cost reduction because the number of wires on a bus is an important cost factor. This approach is used in the PCI bus.
At any given time, one device is the bus master; it has the right to initiate data transfers by issuing read and write commands. A master is called an initiator in PCI terminology. This is either a processor or a DMA controller.
Device Configuration
When an I/O device is connected to a computer, several actions are needed to configure both the device and the software that communicates with it. Each PCI device includes a small configuration ROM memory that stores information about that device. The configuration ROMs of all devices are accessible in the configuration address space.
By reading a device's configuration ROM, the configuration software determines whether the device is a printer, a keyboard, an Ethernet interface, or a disk controller. It can further learn about various device options and characteristics. Devices are assigned addresses during the initialization process. This means that during the bus configuration operation, devices cannot be accessed based on their address, as they have not yet been assigned one. Hence, the configuration address space uses a different mechanism: each device has an input signal called Initialization Device Select, IDSEL#.
The PCI bus is also used in many computers other than PCs, such as SUNs, to benefit from the wide range of I/O devices for which a PCI interface is available. In some cases, the PCI-processor bridge circuit is built on the processor chip itself, further simplifying system design and packaging.
SCSI Bus
It is a standard bus defined by the American National Standards Institute (ANSI). A controller connected
to a SCSI bus is an initiator or a target. The processor sends a command to the SCSI controller, which
causes the following sequence of events to take place:
The SCSI controller contends for control of the bus (initiator).
When the initiator wins the arbitration process, it selects the target controller and hands over con-
trol of the bus to it.
The target starts an output operation. The initiator sends a command specifying the required read
operation.
The target sends a message to the initiator indicating that it will temporarily suspend the connection between them. Then it releases the bus.
The acronym SCSI stands for Small Computer System Interface. It refers to a standard bus defined
by the American National Standards Institute (ANSI) under the designation X3.131 .
In the original specifications of the standard, devices such as disks
are connected to a computer via a 50-wire cable, which can be up to 25 meters in length and can transfer
data at rates up to 5 megabytes/s.
The SCSI bus standard has undergone many revisions, and its data transfer capability has increased
very rapidly, almost doubling every two years. SCSI-2 and SCSI-3 have been defined, and each has several
options.
A SCSI bus may have eight data lines, in which case it is called a narrow bus and transfers data one
byte at a time.
Alternatively, a wide SCSI bus has 16 data lines and transfers data 16 bits at a time.
There are also several options for the electrical signaling scheme used.
Devices connected to the SCSI bus are not part of the address space of the processor in the same
way as devices connected to the processor bus.
The SCSI bus is connected to the processor bus through a SCSI controller. This controller uses DMA
to transfer data packets from the main memory to the device, or vice versa.
A packet may contain a block of data, commands from the processor to the device, or status infor-
mation about the device.
Communication with a disk drive differs substantially from communication with the main memory.
A controller connected to a SCSI bus is one of two types – an initiator or a target.
An initiator has the ability to select a particular target and to send commands specifying the opera-
tions to be performed.
Clearly, the controller on the processor side, such as the SCSI controller, must be able to operate as
an initiator.
The disk controller operates as a target. It carries out the commands it receives from the initiator.
The initiator establishes a logical connection with the intended target. Once this connection has
been established, it can be suspended and restored as needed to transfer commands and bursts of data.
While a particular connection is suspended, other devices can use the bus to transfer information.
This ability to overlap data transfer requests is one of the key features of the SCSI bus that leads to
its high performance.
Data transfers on the SCSI bus are always controlled by the target controller. To send a command to
a target, an initiator requests control of the bus and, after winning arbitration, selects the controller it
wants to communicate with and hands control of the bus over to it.
Then the controller starts a data transfer operation to receive a command from the initiator.
The processor sends a command to the SCSI controller, which causes the following sequence of events to take place:
• The SCSI controller, acting as an initiator, contends for control of the bus.
• When the initiator wins the arbitration process, it selects the target controller and hands over control of the bus to it.
• The target starts an output operation (from initiator to target); in response to this, the initiator sends a command specifying the required read operation.
• The target, realizing that it first needs to perform a disk seek operation, sends a message to the initiator indicating that it will temporarily suspend the connection between them. Then it releases the bus.
• The target controller sends a command to the disk drive to move the read head to the first sector involved in the requested read operation. Then, it reads the data stored in that sector and stores them in a data buffer. When it is ready to begin transferring data to the initiator, the target requests control of the bus. After it wins arbitration, it reselects the initiator controller, thus restoring the suspended connection.
• The target transfers the contents of the data buffer to the initiator and then suspends the connection again. Data are transferred either 8 or 16 bits in parallel, depending on the width of the bus.
• The target controller sends a command to the disk drive to perform another seek operation. Then, it transfers the contents of the second disk sector to the initiator as before. At the end of this transfer, the logical connection between the two controllers is terminated.
• As the initiator controller receives the data, it stores them into the main memory using the DMA approach.
• The SCSI controller sends an interrupt to the processor to inform it that the requested operation has been completed.
This scenario shows that the messages exchanged over the SCSI bus are at a higher level than those exchanged over the processor bus. In this context, a “higher level” means that the messages refer to operations that may require several steps to complete, depending on the device. Neither the processor nor the SCSI controller needs to be aware of the details of operation of the particular device involved in a data transfer. In the preceding example, the processor need not be involved in the disk seek operation.
The bus signals, arbitration, selection, information transfer and reselection are the topics discussed in
addition to the above.
Universal Serial Bus (USB)
The USB has been designed to meet several key objectives such as:
• Provide a simple, low-cost and easy to use interconnection system that overcomes the difficulties
due to the limited number of I/O ports available on a computer
• Accommodate a wide range of data transfer characteristics for I/O devices, including telephone and
Internet connections
• Enhance user convenience through a “plug-and-play” mode of operation.
Port Limitation
Without USB, to add new ports a user must open the computer box to gain access to the internal expansion bus and install a new interface card. The user may also need to know how to configure the device and the software. The USB makes it possible to add many devices to a computer system at any time, without opening the computer box.
Device Characteristics
The kinds of devices that may be connected to a computer cover a wide range of functionality in terms of speed, volume and timing constraints. A variety of simple devices attached to a computer generate data in different asynchronous modes. A signal must be sampled quickly enough to track its highest-frequency components.
Plug-and-play
Whenever a device is introduced, the user should not have to turn the computer off or restart it to connect or disconnect the device.
The system should detect the existence of this new device automatically, identify the appropriate device-
driver software and any other facilities needed to service that device, and establish the appropriate ad-
dresses and logical connections to enable them to communicate.
USB architecture
To accommodate a large number of devices that can be added or removed at any time, the USB has a tree structure. Each node of the tree has a device called a hub. The structure includes the root hub, functions, and split bus operations for high speed (HS) and full/low speed (F/LS) devices.
5. Explain the block diagram of DMA. Also describe how DMA is used to transfer data from peripherals.
Direct Memory Access (DMA) :
• In Direct Memory Access (DMA), the interface transfers data into and out of the memory unit through the memory bus.
• The transfer of data between a fast storage device such as magnetic disk and memory is often limited by
the speed of the CPU.
• Removing the CPU from the path and letting the peripheral device manage the memory buses directly
would improve the speed of transfer. This transfer technique is called Direct Memory Access (DMA).
• During the DMA transfer, the CPU is idle and has no control of the memory buses. A DMA Controller takes
over the buses to manage the transfer directly between the I/O device and memory
• DMA Controller is a hardware device that allows I/O devices to directly access memory with less partici-
pation of the processor. A DMA controller needs the same old circuits of an interface to communicate
with the CPU and Input/Output devices.
Fig-1 below shows the block diagram of the DMA controller.
The unit communicates with the CPU through the data bus and control lines. The CPU selects a register within the DMA through the address bus by enabling the DS (DMA select) and RS (register select) inputs. RD (read) and WR (write) are bidirectional lines. When the BG (bus grant) input is 0, the CPU can communicate with the DMA registers. When BG is 1, the CPU has relinquished the buses and the DMA can communicate directly with the memory.
➢ The CPU may be placed in an idle state in a variety of ways. One common method extensively used in microprocessors is to disable the buses through special control signals such as:
i) Bus Request (BR)
ii) Bus Grant (BG)
• These are the two control signals in the CPU that facilitate the DMA transfer. The Bus Request (BR) input is used by the DMA controller to request the CPU to relinquish the buses. When this input is active, the CPU terminates the execution of the current instruction and places the address bus, data bus and read/write lines into a high-impedance state. High-impedance state means that the output is disconnected.
Direct Memory Access (DMA)
• The CPU activates the Bus Grant (BG) output to inform the external DMA that it can now take control of the buses to conduct memory transfers without processor intervention.
• When the DMA terminates the transfer, it disables the Bus Request (BR) line. The CPU then disables the Bus Grant (BG), takes control of the buses, and returns to its normal operation.
• The transfer can be made in several ways that are:
i). DMA Burst
ii). Cycle Stealing
➢ DMA Burst :- In DMA Burst transfer, a block sequence consisting of a number of memory words is
transferred in continuous burst while the DMA controller is master of the memory buses.
➢ Cycle Stealing :- Cycle stealing allows the DMA controller to transfer one data word at a time, after which it must return control of the buses to the CPU.
DMA controller registers:
The DMA controller has three registers as follows.
Address register – It contains the address to specify the desired location in memory.
Word count register – It contains the number of words to be transferred.
Control register – It specifies the transfer mode.
Note –
All registers in the DMA appear to the CPU as I/O interface registers. Therefore, the CPU can both read and
write into the DMA registers under program control via the data bus.
Explanation:
The CPU initializes the DMA by sending the following information through the data bus:
• the starting address of the memory block where the data is available (for read) or where data is to be stored (for write);
• the word count, which is the number of words in the memory block to be read or written;
• a control to define the mode of transfer, such as read or write;
• a control to begin the DMA transfer.
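A Python sketch of this initialization followed by a cycle-stealing transfer; the register names follow the text above, while the DMAController class, memory dictionary and device list are simulation stand-ins, not real hardware interfaces.

# Sketch: the CPU programs the three DMA registers described above, then
# the controller moves one word per "stolen" bus cycle until the count is 0.

class DMAController:
    def __init__(self):
        self.address_register = 0   # next memory location
        self.word_count = 0         # words remaining
        self.control = None         # 'read' or 'write' mode

    def cycle_steal(self, memory, device):
        """Transfer one word per call, then return the buses to the CPU."""
        if self.word_count == 0:
            return False
        if self.control == "write":              # device -> memory
            memory[self.address_register] = device.pop(0)
        else:                                    # memory -> device
            device.append(memory[self.address_register])
        self.address_register += 1
        self.word_count -= 1
        return True

dma = DMAController()
dma.address_register, dma.word_count, dma.control = 100, 3, "write"
memory, device = {}, [7, 8, 9]
while dma.cycle_steal(memory, device):
    pass                                         # CPU keeps the buses between steals
print(memory)                                    # {100: 7, 101: 8, 102: 9}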
DMA Transfer:
➢ The CPU communicates with the DMA through the address and data buses as with any interface unit.
The DMA has its own address, which activates the DS and RS lines. The CPU initializes the DMA through
the data bus. Once the DMA receives the start control command, it can transfer between the peripheral
and the memory.
➢ When BG = 0 the RD and WR are input lines allowing the CPU to communicate with the internal DMA
registers. When BG=1, the RD and WR are output lines from the DMA controller to the random access
memory to specify the read or write operation of data.
Bus:
➢ A bus is a communication channel shared by many devices and hence rules need to be established in
order for the communication to happen correctly. These rules are called bus protocols.
➢ Design of a bus architecture involves several tradeoffs related to the width of the data bus, data transfer
size, bus protocols, clocking, etc. Depending on whether the bus transactions are controlled by a clock
or not, buses are classified into synchronous and asynchronous buses.
➢ Depending on whether the data bits are sent on parallel wires or multiplexed onto one single wire,
there are parallel and serial buses. Control of the bus communication in the presence of multiple devic-
es necessitates defined procedures called arbitration schemes.
Synchronous Buses:
➢ In synchronous buses, the steps of data transfer take place at fixed clock cycles. Everything is synchronized to the bus clock, and clock signals are made available to both master and slave. The bus clock is a square wave signal.
➢ A cycle starts at one rising edge of the clock and ends at the next rising edge, which is the beginning of the next cycle. A transfer may take multiple bus cycles depending on the speed parameters of the bus and the two ends of the transfer.
➢ One scenario would be that on the first clock cycle, the master puts an address on the address bus, puts data on the data bus, and asserts the appropriate control lines. The slave recognizes its address on the address bus in the first cycle and reads the new value from the bus in the second cycle.
➢ Synchronous buses are simple and easily implemented. However, when connecting devices with varying speeds to a synchronous bus, the slowest device will determine the speed of the bus. Also, the length of a synchronous bus may be limited to avoid clock-skew problems.
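The two-cycle scenario above can be mimicked with a tiny C program, one statement per clock edge (all names are hypothetical; a real bus is hardware, not software):

    #include <stdio.h>
    #include <stdint.h>

    int main(void) {
        uint32_t address_bus, data_bus;
        uint32_t slave_mem[16] = {0};

        /* Cycle 1 (rising edge): master drives address, data, control. */
        address_bus = 0x5;
        data_bus    = 0xABCD;

        /* Cycle 2 (next rising edge): the slave has recognized its
           address and now latches the data from the bus. */
        slave_mem[address_bus] = data_bus;

        printf("slave_mem[0x%X] = 0x%X\n",
               (unsigned)address_bus, (unsigned)slave_mem[address_bus]);
        return 0;
    }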
Asynchronous Buses
➢ There are no fixed clock cycles in asynchronous buses. Handshaking is used instead. Figure 8.11 shows
the handshaking protocol. The master asserts the data-ready line (point 1 in the figure) and holds it until it sees a data-accept signal. When the slave sees the data-ready signal, it asserts the data-accept line (point 2 in the figure).
➢ The rising of the data-accept line triggers the falling of the data-ready line and the removal of data from the bus. The falling of the data-ready line (point 3 in the figure) triggers the falling of the data-accept line (point 4 in the figure).
➢ This handshaking, which is called fully interlocked, is repeated until the data is completely transferred. An asynchronous bus is appropriate for devices of different speeds.
➢ An asynchronous bus has no system clock. Handshaking is done to properly conduct the transmission of data between the sender and the receiver.
➢ The process is illustrated in Fig. 6. For example, in an asynchronous read operation, the bus master puts
the address and control signals on the bus and then asserts a synchronization signal. The synchronization signal from the master prompts the slave to get synchronized, and once it has accessed the data, it asserts its own synchronization signal.
➢ The slave's synchronization signal indicates to the processor that there is valid data on the bus, and the processor reads the data. The master then deasserts its synchronization signal, which indicates to the slave that the master has read the data.
➢ The slave then deasserts its synchronization signal. This method of synchronization is referred to as a full handshake. Note that there is no clock and that the starting and ending of the data transfer are indicated by special synchronization signals. An asynchronous communication protocol can be considered as a pair of finite state machines (FSMs) that operate in such a way that one FSM does not proceed until the other FSM has reached a certain state (see the sketch below).
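A C sketch of one fully interlocked transfer, following the four numbered points (data_ready, data_accept and bus are hypothetical stand-ins for the real signal lines):

    #include <stdbool.h>
    #include <stdio.h>

    static bool data_ready  = false;
    static bool data_accept = false;
    static int  bus         = 0;

    int main(void) {
        int word = 42;

        /* (1) master puts data on the bus and asserts data-ready */
        bus = word;
        data_ready = true;

        /* (2) slave sees data-ready, latches data, asserts data-accept */
        if (data_ready) {
            printf("slave latched %d\n", bus);
            data_accept = true;
        }

        /* (3) master sees data-accept: removes data, drops data-ready */
        if (data_accept) {
            bus = 0;
            data_ready = false;
        }

        /* (4) slave sees data-ready low: drops data-accept */
        if (!data_ready)
            data_accept = false;

        return 0;
    }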
Advantages
➢ DMA speeds up memory operations by bypassing the involvement of the CPU.
➢ The workload on the CPU decreases.
➢ Only a few clock cycles are required for each transfer.
Disadvantages
➢ A cache coherence problem can arise when DMA is used for data transfer.
➢ It increases the price of the system.
6. Explain in detail about USB, PCI, and SCSI bus.
USB
A Universal Serial Bus (USB) is a common interface that enables communication between devices and a
host controller such as a personal computer (PC) or smartphone. It connects peripheral devices such as
digital cameras, mice, keyboards, printers, scanners, media devices, external hard drives and flash
drives. Because of its wide variety of uses, including support for electrical power, the USB has replaced
a wide range of interfaces like the parallel and serial port.
A USB is intended to enhance plug-and-play and allow hot swapping. Plug-and-play enables the
operating system (OS) to spontaneously configure and discover a new peripheral device without
having to restart the computer. Hot swapping allows removal and replacement of a peripheral without
having to reboot.
There are several types of USB connectors. In the past the majority of USB cables were one of two types,
type A and type B. The USB 2.0 standard is type A; it has a flat rectangular interface that inserts into a hub
or USB host, which transmits data and supplies power. Keyboards and mice are common examples of
devices with a type A USB connector. A type B USB connector is square with slanted exterior corners. It is connected to
an upstream port that uses a removable cable, such as a printer. The type B connector also transmits
data and supplies power. Some type B connectors do not have a data connection and are used only as a
power connection.
Today, newer connectors have replaced old ones, such as the Mini-USB (or Mini-B), which has been abandoned in favor of the Micro-USB and USB-C cables. Micro-USB cables are usually used for charging and data transfer between smartphones, video game controllers, and some computer peripherals. Micro-USB is slowly being replaced by type-C connectors, which are becoming the new standard for Android smartphones and tablets.
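Plug-and-play enumeration can be observed from software. Below is a short C sketch using the libusb-1.0 library (an assumption: libusb is a common user-space library, not part of the USB specification itself; compile with -lusb-1.0):

    #include <stdio.h>
    #include <libusb-1.0/libusb.h>

    int main(void) {
        libusb_context *ctx = NULL;
        libusb_device **list;

        if (libusb_init(&ctx) != 0) return 1;

        /* Ask the OS for every USB device it has discovered. */
        ssize_t n = libusb_get_device_list(ctx, &list);
        for (ssize_t i = 0; i < n; i++) {
            struct libusb_device_descriptor desc;
            if (libusb_get_device_descriptor(list[i], &desc) == 0)
                printf("device %04x:%04x\n", desc.idVendor, desc.idProduct);
        }

        libusb_free_device_list(list, 1);
        libusb_exit(ctx);
        return 0;
    }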
PCI
PCI (Peripheral Component Interconnect) is a standard data bus that was common in computers from roughly 1993 to 2007. It was for a long time the standard bus for expansion cards in computers, such as sound cards, network cards, etc. It is a parallel bus that, in its most common form, has a clock speed of 33 or 66 MHz and can be either 32 or 64 bits wide. It has since been replaced by PCI Express, which is a serial bus, as opposed to the parallel PCI. A PCI port or, more precisely, a PCI slot is simply the connector that is used to plug the card into the bus. When empty, it simply sits there and does nothing.
Types of PCI:
Common variants include conventional PCI (32-bit, 33 MHz), 64-bit PCI, the faster PCI-X, Mini PCI for laptops, and the newer serial PCI Express.
Function of PCI:
PCI slots are used to install sound cards, Ethernet and wireless cards, and nowadays solid-state drives
using NVMe technology to provide SSD speeds many times faster than SATA SSD speeds. PCI slots also allow discrete graphics cards to be added to a computer.
PCI slots (and their variants) let you add expansion cards to a motherboard. The expansion cards extend the machine's capabilities beyond what the motherboard could provide alone, such as upgraded graphics, enhanced sound, additional USB and hard drive controllers, and extra network interface options, to name a few.
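Software identifies the cards sitting in those slots by reading PCI configuration space. Below is a C sketch using the legacy 0xCF8/0xCFC port mechanism (Linux/x86 specific and must run as root; an illustration of the mechanism, not production code):

    #include <stdio.h>
    #include <stdint.h>
    #include <sys/io.h>      /* iopl, outl, inl (Linux, x86) */

    /* Read one 32-bit word from PCI configuration space. */
    static uint32_t pci_cfg_read(int bus, int dev, int fn, int off) {
        uint32_t addr = 0x80000000u | (uint32_t)(bus << 16)
                      | (uint32_t)(dev << 11) | (uint32_t)(fn << 8)
                      | (uint32_t)(off & 0xFC);
        outl(addr, 0xCF8);   /* select bus/device/function/register */
        return inl(0xCFC);   /* read the selected config word       */
    }

    int main(void) {
        if (iopl(3) != 0) { perror("iopl"); return 1; }
        uint32_t id = pci_cfg_read(0, 0, 0, 0);   /* vendor/device ID */
        printf("bus 0, dev 0: vendor=0x%04x device=0x%04x\n",
               id & 0xFFFF, id >> 16);
        return 0;
    }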
Advantage of PCI:
• You can connect a maximum of five components to the PCI bus, and you can also replace each of them with fixed devices on the motherboard.
• You can have several PCI buses on the same computer.
• The PCI bus improves transfer speeds, from 33 MHz up to 133 MHz, with a transfer rate of up to about 1 gigabyte per second.
• PCI can handle devices using a maximum of 5 volts, and the pins used can carry more than one signal through one pin.
Disadvantage of PCI:
• A PCI graphics card cannot access system memory.
• PCI does not support pipelining.
SCSI
The Small Computer System Interface (SCSI) is a basic interface for connecting peripheral devices to a PC.
Based on the specification, it can typically support up to 16 devices on a single bus, including the host adapter. SCSI is used to boost performance, deliver faster data transfer rates and provide wider expansion for devices such as CD-ROM drives, scanners, DVD drives and CD writers.
SCSI is most commonly used for RAID, servers, high-performance desktop computers, and storage area networks.
SCSI has a controller, which is responsible for transmitting data between the SCSI bus and the computer. The controller can be built into the motherboard, or a host adapter can be installed through an expansion slot on the computer's motherboard.
The controller also incorporates a SCSI BIOS, a small chip that provides the software needed to access and control the devices. Each device on a parallel SCSI bus is identified by a number, its SCSI ID. Newer serial SCSI interfaces such as Serial Attached SCSI (SAS) use an automatic process that assigns device identifiers.