COA Unit 2
DESIGN OF CONTROL UNIT
•A hardwired control unit is made up of two decoders, a sequence counter, and a number of control logic gates.
•An instruction fetched from the memory unit is stored in the instruction register (IR).
•The instruction register holds the operation code, the I bit, and address bits 0 through 11.
•The operation code in bits 12 through 14 is decoded with a 3 x 8 decoder.
•The decoder’s outputs are denoted by the letters D0 through D7.
•Bit 15 of the instruction is transferred to a flip-flop designated by the symbol I.
•Address bits 0 through 11 are applied to the control logic gates.
•The sequence counter (SC) can count in binary from 0 through 15.
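To make this slicing concrete, here is a minimal Python sketch (an illustrative model, not the actual circuitry) of how a 16-bit instruction register is split into the address field, the opcode fed to the 3 x 8 decoder, and the I flip-flop:

```python
def decode_instruction(ir: int):
    """Split a 16-bit instruction register (IR) the way the hardwired
    control unit does: bits 0-11 address, bits 12-14 opcode, bit 15 I."""
    address = ir & 0x0FFF      # bits 0-11, applied to the control logic gates
    opcode = (ir >> 12) & 0x7  # bits 12-14, input to the 3 x 8 decoder
    i_flag = (ir >> 15) & 0x1  # bit 15, stored in the I flip-flop
    d_outputs = [1 if opcode == k else 0 for k in range(8)]  # decoder outputs D0..D7
    return address, d_outputs, i_flag

addr, d, i = decode_instruction(0xD234)  # I = 1, opcode = 5, address = 0x234
print(hex(addr), d, i)                   # 0x234 [0, 0, 0, 0, 0, 1, 0, 0] 1
```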
Designing of Hardwired Control Unit
Several methods have been proposed for constructing hardwired control logic. A typical organization is as follows:
The basic data needed for generating control signals is contained in the operation code of an instruction. The operation code is decoded in the instruction decoder, which is a collection of decoders that decode the various fields of the instruction opcode.
As a result, only a few of the instruction decoder’s output lines carry active signal values. These output lines are connected to the inputs of the matrix that generates the control signals for the computer’s executive units. This matrix combines the decoded opcode signals with the outputs of a second matrix, which generates signals representing consecutive control unit states, and with signals from the outside world, such as interrupt signals. The matrices are constructed in the same way as programmable logic arrays.
Hardwired control unit
Generation of a Signal
• Control signals for instruction execution must be generated over the whole time range that corresponds to the instruction execution cycle, not just at a single moment in time.
• The control unit organises the appropriate sequence of internal states based on
the structure of this cycle.
• The control signal generator matrix sends a number of signals back to the inputs of the control state generator matrix. That matrix combines these signals with the timing signals produced by the timing unit, which are based on the rectangular patterns typically provided by a quartz oscillator.
• Whenever a new instruction arrives at the control unit, the control unit is in the initial state of new-instruction fetching.
• Instruction decoding permits the control unit to enter the first state relevant to
the new instruction execution, which lasts as long as the computer’s timing
signals as well as other input signals, such as flags and state
information, stay unchanged.
• A change in any of the previously stated signals causes the control unit’s
status to change.
Result
A new corresponding input for the control signal generator matrix is formed as a
result of this. When an external signal (such as an interrupt) comes, the control
unit enters the next control state, which is concerned with the response to the
external signal (for example, interrupt processing). The computer’s flags and
state variables are utilised to choose appropriate states for the cycle of instruction
execution.
The cycle’s last states are control states that begin fetching the program’s next instruction: the program counter’s contents are sent to the main memory address buffer register, and the instruction word is then read into the computer’s instruction register. When the executed instruction is a stop instruction that terminates program execution, the control unit enters an operating-system state, in which it waits for the next user directive.
Advantages of Hardwired Control Unit
•It is faster than a micro-programmed control unit, since control signals are generated directly by combinational logic rather than fetched from a control memory.
•It involves no control-memory access overhead, which suits the simple instruction sets of RISC-style processors.
MICRO-PROGRAMMED CONTROL UNIT
The existence of the control store, which is used to store words containing encoded
control signals required for instruction execution, is the main distinction between
microprogrammed structures and the hardwired control unit structure.
Each bit in the microinstruction is connected to a single control signal. The control signal is active when the bit is set and inactive when it is cleared. A sequence of these microinstructions can be kept in the internal ‘control’ memory. A microprogram-controlled computer’s control unit is a computer within a computer.
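As a minimal illustration of this idea, the sketch below maps each bit of a microinstruction word to one named control signal; the signal names are invented for the example and do not come from any particular machine:

```python
# Hypothetical control-signal layout: one bit per signal.
SIGNALS = ["MEM_READ", "MEM_WRITE", "IR_LOAD", "PC_INC", "ALU_ADD", "ACC_LOAD"]

def active_signals(microinstruction: int):
    """A control signal is active when its bit is set, inactive when cleared."""
    return [name for bit, name in enumerate(SIGNALS)
            if (microinstruction >> bit) & 1]

# 0b001001 sets bit 0 (MEM_READ) and bit 3 (PC_INC)
print(active_signals(0b001001))   # ['MEM_READ', 'PC_INC']
```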
Some Important Terms –
•Control Word: A string of control bits that governs the microoperations carried out in one clock cycle.
•Microoperation: An elementary operation performed on data stored in registers.
•Microinstruction: A control word stored in control memory, specifying one or more microoperations.
•Microprogram: A sequence of microinstructions that implements a machine instruction.
•Control Memory: The memory, usually a ROM, in which the microprogram is stored.
COMPARATIVE STUDY OF HARDWIRED AND MICRO-PROGRAMMED CONTROL UNIT
•To execute an instruction, there are two types of control units: Hardwired
Control unit and Micro-programmed control unit.
•Hardwired control units are generally faster than microprogrammed
designs. In hardwired control, we saw how all the control signals required
inside the CPU can be generated using a state counter and a Programmable
Logic Array (PLA) circuit.
•A microprogrammed control unit is a relatively simple logic circuit that is
capable of sequencing through microinstructions and generating control signals
to execute each microinstruction.
•The main difference between Hardwired and Microprogrammed Control Unit is
that a Hardwired Control Unit is a sequential circuit that generates control
signals while a Microprogrammed Control Unit is a unit with microinstructions
in the control memory to generate control signals.
Following are the further differences:
•Implementation: A hardwired control unit is implemented with a hardware circuit; in other words, it is a circuitry approach. A micro-programmed control unit is implemented with the help of programming.
•Signal generation: The hardwired control unit uses logic circuits to generate the control signals required by the processor. The micro-programmed control unit uses microinstructions to generate the control signals; usually, a control memory is used to store these microinstructions.
•Modification: In a hardwired control unit the control signals are generated in hardwired form, which is why it is very difficult to modify. A micro-programmed control unit is very easy to modify, because the modifications are performed only at the microinstruction level.
•Complex instructions: Complex instructions cannot be handled by a hardwired control unit, because the circuit designed for them becomes too complex. A micro-programmed control unit is able to handle complex instructions.
•Cost: In a hardwired control unit everything has to be realized in the form of logic gates, which is why it is more costly. A micro-programmed control unit is less costly, because it only requires microinstructions to generate the control signals.
•Instruction set size: Because of the hardware implementation, a hardwired control unit supports only a limited number of instructions, whereas a micro-programmed control unit can generate control signals for many instructions.
•Usage: Hardwired control is used in computers based on Reduced Instruction Set Computer (RISC) designs; micro-programmed control is used in computers based on Complex Instruction Set Computer (CISC) designs.
Difference Between Hardwired and Microprogrammed Control Unit
Definition
Hardwired Control Unit is a unit that uses combinational logic units, featuring a
finite number of gates that can generate specific results based on the instructions
that were used to invoke those responses. Microprogrammed Control Unit is a
unit that contains microinstructions in the control memory to produce control
signals.
Speed
The speed of operations in Hardwired Control Unit is fast. The speed of
operations in Microprogrammed Control Unit is slow because it requires
frequent memory accesses.
Modification
To do modifications in a Hardwired Control Unit, the entire unit should be
redesigned. In Microprogrammed Control Unit, modifications can be
implemented by changing the microinstructions in the control memory.
Therefore, Microprogrammed Control Unit is more flexible.
Cost
Furthermore, a Hardwired Control Unit is more costly to implement than a Microprogrammed Control Unit.
Handling Complex Instructions
Also, it is difficult for a Hardwired Control Unit to handle complex instructions, but it is easier for a Microprogrammed Control Unit to handle them.
Instruction Decoding
Moreover, instruction decoding is more difficult in a Hardwired Control Unit than in a Microprogrammed Control Unit.
Instruction set Size
In addition to the above differences, the Hardwired Control Unit uses a small
instruction set while the Microprogrammed Control Unit uses a large instruction
set.
Control Memory
Also, there is no control memory usage in Hardwired Control Unit but, on the
other hand, Microprogrammed Control Unit uses control memory.
Applications
Considering the applications, the Hardwired Control Unit is used in processors
that use a simple instruction set known as the Reduced Instruction Set Computers
(RISC). Microprogrammed Control Unit is used in processors based on a
complex instruction set known as Complex Instruction Set Computer (CISC).
MEMORY HIERARCHY
The computer memory hierarchy is usually pictured as a pyramid structure that describes the differences among memory types. It separates computer storage into levels:
Level 0: CPU registers
Level 1: Cache memory
Level 2: Main memory (primary memory)
Level 3: Magnetic disks (secondary memory)
Level 4: Optical disks or magnetic tapes (tertiary memory)
In the memory hierarchy, capacity increases while speed and cost per bit decrease as we move down the levels. The devices are thus arranged from fast to slow, from registers to tertiary memory.
1. Capacity:
It is the total volume of information the memory can store. As we move from top to bottom in the hierarchy, the capacity increases.
2. Access Time:
It is the time interval between the read/write request and the availability of the
data. As we move from top to bottom in the Hierarchy, the access time
increases.
3. Performance:
Earlier, when computer systems were designed without a memory hierarchy, the speed gap between the CPU registers and main memory grew because of the large difference in access time. This resulted in lower system performance, so an enhancement was required. That enhancement took the form of the memory hierarchy design, which increases system performance. One of the most significant ways to increase system performance is minimizing how far down the memory hierarchy one has to go to manipulate data.
MAIN MEMORY
The main memory acts as the central storage unit in a computer system. It is a
relatively large and fast memory which is used to store programs and data
during the run time operations.
The primary technology used for the main memory is based on
semiconductor integrated circuits.
The integrated circuits for the main memory are classified into two major
units.
1.RAM (Random Access Memory) integrated circuit chips
2.ROM (Read Only Memory) integrated circuit chips
1. RAM integrated circuit chips
The RAM integrated circuit chips are further classified into two
possible operating modes, static and dynamic.
The primary components of a static RAM are flip-flops that store the binary information. The stored information is volatile, i.e. it remains valid as long as power is applied to the system. Static RAM is easy to use and takes less time to perform read and write operations than dynamic RAM.
Dynamic RAM stores the binary information in the form of electric charges on capacitors. The capacitors are provided inside the chip by MOS transistors. Dynamic RAM consumes less power and provides a larger storage capacity in a single memory chip.
RAM chips are available in a variety of sizes and are used as per
the system requirements.
The following block diagram demonstrates the chip interconnection in a 128 * 8
RAM chip.
oA 128 * 8 RAM chip has a memory capacity of 128 words of eight bits (one
byte) per word. This requires a 7-bit address and an 8-bit bidirectional data bus.
o The 8-bit bidirectional data bus allows the transfer of data either from memory
to CPU during a read operation or from CPU to memory during a write
operation.
oThe read and write inputs specify the memory operation, and the two chip
select (CS) control inputs are for enabling the chip only when the
microprocessor selects it.
oThe bidirectional data bus is constructed using three-state buffers.
oThe output generated by three-state buffers can be placed in one of the
three possible states which include a signal equivalent to logic 1, a signal equal
to logic 0, or a high-impedance state.
Note: The logic 1 and 0 are standard digital signals whereas the high-impedance
state behaves like an open circuit, which means that the output does not carry a
signal and has no logic significance.
The following function table specifies the operations of a 128 * 8 RAM chip.
From the functional table above, we can conclude that the unit is in operation only
when CS1 = 1 and CS2 = 0. The bar on top of the second select variable
indicates that this input is enabled when it is equal to 0.
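A toy Python model of this function table (a sketch, not a hardware description) shows the chip responding only when CS1 = 1 and CS2 = 0, and presenting a high-impedance state, modelled here as None, otherwise:

```python
class RAM128x8:
    """Toy model of a 128 x 8 RAM chip with two chip-select inputs."""
    def __init__(self):
        self.words = [0] * 128          # 128 words of 8 bits each

    def access(self, cs1, cs2_bar, rd, wr, addr, data_in=0):
        if not (cs1 == 1 and cs2_bar == 0):
            return None                  # high-impedance: chip not selected
        if wr:
            self.words[addr & 0x7F] = data_in & 0xFF   # 7-bit address, 8-bit word
            return None
        if rd:
            return self.words[addr & 0x7F]
        return None

chip = RAM128x8()
chip.access(1, 0, rd=0, wr=1, addr=0x15, data_in=0xAB)
print(chip.access(1, 0, rd=1, wr=0, addr=0x15))   # 171 (0xAB)
print(chip.access(0, 0, rd=1, wr=0, addr=0x15))   # None (not selected)
```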
2. ROM integrated circuit
The primary component of the main memory is RAM integrated circuit chips,
but a portion of memory may be constructed with ROM chips.
A ROM memory is used for keeping programs and data that are permanently
resident in the computer.
Apart from the permanent storage of data, the ROM portion of main memory is
needed for storing an initial program called a bootstrap loader. The
primary function of the bootstrap loader program is to start the computer
software operating when power is turned on.
ROM chips are also available in a variety of sizes and are also used as per the
system requirement. The following block diagram demonstrates the chip
interconnection in a 512 * 8 ROM chip.
o A ROM chip has a similar organization as a RAM chip. However, a ROM can
only perform read operation; the data bus can only operate in an output mode.
o The nine address lines in the ROM chip specify any one of the 512 bytes
stored in it.
o The two chip select inputs must be CS1=1 and CS2=0 for the unit to
operate. Otherwise, the data bus is said to be in a high-impedance state.
Memory Address Map
The system designer must calculate the amount of memory required for a given
application and assign it to RAM or ROM.
The interconnection between the processor and the memory is established from knowledge of the size of memory required and the types of ROM and RAM chips available. The addressing of memory can be established by means of a table that specifies the memory addresses assigned to each chip. This table, called the memory address map, is a pictorial representation of the assigned address space for each chip in the system.
Component   Hexadecimal Address   Address bus lines (10 9 8 7 6 5 4 3 2 1)
RAM 1       0000-007F             0 0 0 x x x x x x x
RAM 2       0080-00FF             0 0 1 x x x x x x x
RAM 3       0100-017F             0 1 0 x x x x x x x
RAM 4       0180-01FF             0 1 1 x x x x x x x
ROM         0200-03FF             1 x x x x x x x x x
The memory address map for the configuration of 512 bytes RAM and 512 bytes
ROM is shown in table above. The component column specifies whether a RAM
or a ROM chip is used. The hexadecimal address column assigns a range of
hexadecimal equivalent addresses for each chip. The address bus lines are listed in
the third column. The RAM chips have 128 bytes and need seven address lines.
The ROM chip has 512 bytes and needs 9 address lines.
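The decoding implied by this map (and carried out by the decoder described in the next section) can be sketched in a few lines of Python, illustrative only: bus line 10 separates ROM from RAM, lines 9 and 8 drive the 2 x 4 decoder that picks a RAM chip, and lines 7 through 1 address a byte within a chip.

```python
def select_chip(address: int):
    """Decode a 10-bit address per the memory address map above.
    Bus line 10 (bit 9 here) distinguishes ROM from RAM; lines 9-8
    (bits 8-7) pick one of the four RAM chips; lines 7-1 (bits 6-0)
    address a byte inside a 128-byte RAM chip."""
    if (address >> 9) & 1:                  # line 10 = 1 -> ROM
        return "ROM", address & 0x1FF       # 9-bit offset into the 512-byte ROM
    ram_chip = (address >> 7) & 0x3         # lines 9 and 8 -> 2 x 4 decoder
    return f"RAM {ram_chip + 1}", address & 0x7F

print(select_chip(0x007F))   # ('RAM 1', 127)
print(select_chip(0x0080))   # ('RAM 2', 0)
print(select_chip(0x0200))   # ('ROM', 0)
```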
Memory Connection to CPU
The connection of memory chips to the CPU is shown in the figure above. This
configuration gives a memory capacity of 512 bytes of RAM and 512 bytes of
ROM. Each RAM receives the seven low-order bits of the address bus to select
one of 128 possible bytes.
The particular RAM chip selected is determined from lines 8 and 9 in the
address bus. This is done through a 2 X 4 decoder whose outputs go to the CS1
inputs in each RAM chip.
Thus, when address lines 8 and 9 are equal to 00, the first RAM chip is selected; when they are 01, the second RAM chip is selected, and so on.
The RD and WR outputs from the microprocessor are applied to the inputs of
each RAM chip. The selection between RAM and ROM is achieved through bus
line 10. The RAMs are selected when the bit in this line is 0, and the ROM when
the bit is 1.
Address bus lines 1 to 9 are applied to the input address of ROM without going
through the decoder. The data bus of the ROM has only an output
capability, whereas the data bus connected to the RAMs can transfer
information in both directions.
AUXILIARY MEMORY
• Secondary memory is also termed external memory and refers to the various
storage media on which a computer can store data and programs.
• The Secondary storage media can be fixed or removable. Fixed Storage media
is an internal storage medium like a hard disk that is fixed inside the computer.
A storage medium that is portable and can be taken outside the computer is
termed removable storage media.
• Some examples of secondary memory devices include hard disk drives
(HDDs), solid-state drives (SSDs), magnetic tapes, optical discs such as CDs
and DVDs, and flash memory such as USB drives and memory cards. Each of
these devices uses different technologies to store data, but they all share the
common feature of being non-volatile, meaning that they can store data even
when the computer is turned off.
• Secondary memory devices are accessed by the CPU via input/output (I/O)
operations, which involve transferring data between the device and
primary memory.
• The speed of these operations is affected by factors such as the type of
device, the size of the file being accessed, and the type of connection between
the device and the computer.
• Overall, secondary memory is an essential component of modern computing
systems and plays a critical role in the storage and retrieval of data and
programs.
Difference between Primary Memory and Secondary Memory:
•Access: Primary memory is directly accessed by the Central Processing Unit (CPU). Secondary memory is not accessed directly by the CPU; data from secondary memory is first loaded into Random Access Memory (RAM) and then sent to the processing unit.
•Speed: RAM provides much faster access to data than secondary memory. By loading software programs and required files into primary memory (RAM), computers can process data much more quickly; typically, primary memory is about six times faster than secondary memory.
•Volatility: Primary memory, i.e. Random Access Memory (RAM), is volatile and is completely erased when a computer is shut down. Secondary memory is non-volatile, which means it can hold on to its data with or without an electrical power supply.
Uses of Secondary Storage:
Removable Storage-
Removable storage is an external media device that is used by a computer system
to store data, and usually, these are referred to as the Removable Disks drives
or the External Drives. Removable storage is any type of storage device that can
be removed/ejected from a computer system while the system is running. Examples
of external devices include CDs, DVDs, and Blu-ray disk drives, as well as
diskettes and USB drives. Removable storage makes it easier for a user to
transfer data from one computer system to another. In terms of storage, the main benefit of removable disks is that they can provide the fast data transfer rates associated with storage area networks (SANs).
Limitations of Secondary Memory:
1. Slower access times: Accessing data from secondary memory devices typically
takes longer than accessing data from primary memory.
2. Mechanical failures: Some types of secondary memory devices, such as hard
disk drives, are prone to mechanical failures that can result in data loss.
3. Limited lifespan: Secondary memory devices have a limited lifespan, and
can only withstand a certain number of read and write cycles before they fail.
4. Data corruption: Data stored on secondary memory devices can become corrupted
due to factors such as electromagnetic interference, viruses, or physical damage.
Overall, secondary memory is an essential component of modern computing systems,
but it also has its limitations and drawbacks. The choice of a particular
secondary memory device depends on the user’s specific needs and requirements.
ASSOCIATIVE MEMORY
An associative memory is searched by content: the argument register A holds the word being searched for, and the key register K masks the bits that participate in the comparison. A bit Aj in the argument register is compared with all the bits in column j of the array, provided that Kj = 1. This is done for all columns j = 1, 2, ..., n.
If a match occurs between all the unmasked bits of the argument and the bits in word i, the corresponding bit Mi in the match register is set to 1. If one or more unmasked bits of the argument and the word do not match, Mi is cleared to 0.
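A small sketch of this match operation, with words stored as integers (illustrative, not a hardware model): bits where the key register K has 1 participate in the comparison, and Mi is set when all unmasked bits of word i equal the argument.

```python
def match_register(argument, key, words):
    """Return the match-register bits Mi for an associative search.
    Only bit positions where the key register has a 1 are compared."""
    return [1 if (word ^ argument) & key == 0 else 0 for word in words]

words = [0b1010, 0b1100, 0b0010]
# Compare only the two low-order bits (key = 0b0011), argument = 0b0110:
print(match_register(0b0110, 0b0011, words))   # [1, 0, 1]
```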
Applications of Associative memory:
The data or contents of the main memory that are used frequently by CPU are
stored in the cache memory so that the processor can easily access that data in a
shorter time. Whenever the CPU needs to access memory, it first checks the
cache memory. If the data is not found in cache memory, then the CPU moves
into the main memory.
Cache memory is placed between the CPU and the main memory. The block
diagram for a cache memory can be represented as:
The cache is the fastest component in the memory hierarchy and approaches the
speed of CPU components.
L1 or Level 1 Cache: It is the first level of cache memory that is present inside
the processor. It is present in a small amount inside every core of the processor
separately. The size of this memory ranges from 2KB to 64 KB.
L2 or Level 2 Cache: It is the second level of cache memory, which may be present inside or outside the CPU. If not present inside the core, it can be shared between two cores, depending upon the architecture, and is connected to the processor by a high-speed bus. The size of this memory ranges from 256 KB to 512 KB.
L3 or Level 3 Cache: It is the third level of cache memory, present outside the CPU and shared by all the cores of the CPU. Some high-end processors may have this cache. This cache is used to increase the performance of the L2 and L1 caches. The size of this memory ranges from 1 MB to 8 MB.
Locality of Reference
The main memory can store 32k words of 12 bits each. The cache is capable of
storing 512 of these words at any given time. For every word stored, there is a
duplicate copy in main memory. The CPU communicates with both memories. It
first sends a 15-bit address to cache. If there is a hit, the CPU accepts the 12-bit
data from cache. If there is a miss, the CPU reads the word from main memory and
the word is then transferred to cache.
•When a read request is received from the CPU, the contents of a block of memory words containing the specified location are transferred into the cache.
•When the program references any of the locations in this block, the contents are read from the cache.
•The number of blocks in the cache is smaller than the number of blocks in main memory.
•Correspondence between main memory blocks and those in the cache is
specified by a mapping function.
•Assume cache is full and memory word not in cache is referenced.
•Control hardware decides which block from cache is to be removed to create
space for new block containing referenced word from memory.
•Collection of rules for making this decision is called “Replacement algorithm”
Cache performance
●On searching in the cache if data is found, a cache hit has occurred.
●On searching in the cache if data is not found, a cache miss has occurred.
The cache memory bridges the mismatch of speed between the main memory and the processor.
• Whenever a cache hit occurs, the word that is required is present in the cache memory, and it is delivered from the cache memory to the CPU.
• Whenever a cache miss occurs, the word that is required is not present in the cache; the block containing the required word must then be brought in from the main memory.
• Such mappings are performed using various cache-mapping techniques.
Process of Cache Mapping
The process of cache mapping helps us define how a certain block that is present
in the main memory gets mapped to the memory of a cache in the case of any
cache miss.
In simpler words, cache mapping refers to the technique by which blocks of main memory are brought into the cache memory. Here is a diagram that illustrates the actual process of mapping:
Important Note:
• The main memory gets divided into multiple partitions of equal size, known
as the frames or blocks.
• The cache memory is actually divided into various partitions of the same sizes
as that of the blocks, known as lines.
• During the process of cache mapping, the main memory block is simply copied to the cache; the block is not removed from the main memory.
Cache Mapping Functions
Correspondence between main memory blocks and those in the cache is specified
by a memory mapping function.
1. FIFO (First In First Out)
• The block which entered the cache first is replaced first.
• This can lead to a problem known as "Belady's Anomaly", which states that if we increase the number of lines in the cache memory, the number of cache misses may increase.
• Belady's Anomaly: for some cache replacement algorithms, the page fault or miss rate increases as the number of allocated frames increases.
• Example: take the sequence 7, 0, 1, 2, 0, 3, 0, 4, 2, 3 with a cache memory of 4 lines.
There are a total of 6 misses in the FIFO replacement policy.
2. LRU (Least Recently Used)
• The page which has not been used for the longest period of time in the past is replaced first.
• We can think of this strategy as the optimal cache-replacement algorithm looking backward in time, rather than forward.
• LRU is much better than FIFO replacement.
• LRU is also called a stack algorithm and can never exhibit Belady's anomaly.
• The most important problem is how to implement LRU replacement: an LRU page replacement algorithm may require substantial hardware support.
• Example: take the same sequence 7, 0, 1, 2, 0, 3, 0, 4, 2, 3 with a cache memory of 4 lines.
There are a total of 6 misses in the LRU replacement policy.
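Both example miss counts can be verified with a short simulation. This is a sketch using Python's OrderedDict in place of hardware, with the same reference sequence and 4 cache lines:

```python
from collections import OrderedDict

def count_misses(sequence, lines, policy):
    """Simulate FIFO or LRU replacement and count the misses."""
    cache = OrderedDict()
    misses = 0
    for block in sequence:
        if block in cache:
            if policy == "LRU":
                cache.move_to_end(block)    # refresh recency on a hit
            continue
        misses += 1
        if len(cache) == lines:
            cache.popitem(last=False)       # evict oldest (FIFO) / least recent (LRU)
        cache[block] = True
    return misses

seq = [7, 0, 1, 2, 0, 3, 0, 4, 2, 3]
print(count_misses(seq, 4, "FIFO"))   # 6
print(count_misses(seq, 4, "LRU"))    # 6
```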
3. LFU (Least Frequently Used):
This cache algorithm uses a counter to keep track of how often an entry is
accessed. With the LFU cache algorithm, the entry with the lowest count is
removed first. This method isn't used that often, as it does not account for an
item that had an initially high access rate and then was not accessed for a long
time.
This algorithm has been used in ARM processors and the famous Intel i860.
Cache Design Issues
1.Cache Addresses:
-Logical Cache/Virtual Cache stores data using virtual addresses. It accesses
cache directly without going through MMU
-Physical Cache stores data using main memory physical addresses.
One obvious advantage of the logical cache is that cache access speed is
faster than for a physical cache, because the cache can respond before the
MMU performs an address translation.
The disadvantage has to do with the fact that most virtual memory
systems supply each application with the same virtual memory address space.
That is, each application sees a virtual memory that starts at address 0. Thus, the
same virtual address in two different applications refers to two different
physical addresses. The cache memory must therefore be completely flushed
with each application switch, or extra bits must be added to each line of the
cache to identify which virtual address space this address refers to.
2. Cache Size
The larger the cache, the larger the number of gates involved in addressing the
cache. The available chip and board area also limit cache size.
The more cache a system has, the more likely it is to register a hit on memory
access because fewer memory locations are forced to share the same cache line.
Although an increase in cache size will increase the hit ratio, a continuous
increase in cache size will not yield an equivalent increase of the hit ratio.
Note: An increase in cache size from 256K to 512K (an increase of 100%) may yield a 10% improvement in the hit ratio, but a further increase from 512K to 1024K would yield a less than 5% increase in the hit ratio (law of diminishing marginal returns).
3. Replacement Algorithm
Once the cache has been filled, when a new block is brought into the cache, one
of the existing blocks must be replaced.
For direct mapping, there is only one possible line for any particular block, and
no choice is possible.
Direct mapping — No choice, Each block only maps to one line. Replace that
line.
For the associative and set-associative techniques, a replacement algorithm
is needed. To achieve high speed, such an algorithm must be implemented in
hardware.
Least Recently Used (LRU) — Most Effective
For two- way set associative, this is easily implemented. Each line includes a
USE bit. When a line is referenced, its USE bit is set to 1 and the USE bit of the
other line in that set is set to 0. When a block is to be read into the set, the line
whose USE bit is 0 is used.
Because we are assuming that more recently used memory locations are
more likely to be referenced, LRU should give the best hit ratio. LRU is also
relatively easy to implement for a fully associative cache. The cache
mechanism maintains a separate list of indexes to all the lines in the cache.
When a line is referenced, it moves to the front of the list. For replacement, the
line at the back of the list is used. Because of its simplicity of implementation,
LRU is the most popular replacement algorithm.
4. Write Policy
When saving changes to main memory, there are two techniques involved:
Write Through:
• Every time an operation occurs, you store to main memory as well as cache
simultaneously. Although that may take longer, it ensures that main memory is
always up to date and this would decrease the risk of data loss if the system
would shut off due to power loss. This is used for highly sensitive
information.
• One of the central caching policies is known as write-through. This means
that data is stored and written into the cache and to the primary storage device
at the same time.
• One advantage of this policy is that it ensures information will be stored safely
without risk of data loss. If the computer crashes or the power goes out, data
can still be recovered without issue.
• To keep data safe, this policy has to perform every write operation twice. The
program or application that is being used must wait until the data has been
written to both the cache and storage device before it can proceed.
• This comes at the cost of system performance but is highly recommended for
sensitive data that cannot be lost.
• Many businesses that deal with sensitive customer information such as
payment details would most likely choose this method since that data is very
critical to keep intact.
Write Back:
• Saves data to cache only.
• But at certain intervals or under a certain condition you would save data to the
main memory.
• Disadvantage: there is a high probability of data loss.
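The contrast between the two policies can be captured in a small sketch (illustrative, not a production cache): write-through updates main memory on every store, while write-back defers the memory update until the line is evicted.

```python
class Cache:
    """Minimal write-policy demo: write-through updates memory on every
    store; write-back only marks the line dirty and writes on eviction."""
    def __init__(self, memory, write_back=False):
        self.memory = memory            # dict: address -> value
        self.lines = {}                 # address -> (value, dirty)
        self.write_back = write_back

    def write(self, addr, value):
        if self.write_back:
            self.lines[addr] = (value, True)      # dirty; memory is stale
        else:
            self.lines[addr] = (value, False)
            self.memory[addr] = value             # write-through: both updated

    def evict(self, addr):
        value, dirty = self.lines.pop(addr)
        if dirty:
            self.memory[addr] = value             # write-back happens here

mem = {0x10: 0}
wb = Cache(mem, write_back=True)
wb.write(0x10, 99)
print(mem[0x10])     # 0  -> main memory not yet updated (risk on power loss)
wb.evict(0x10)
print(mem[0x10])     # 99
```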
5. Line Size
Another design element is the line size. When a block of data is retrieved and
placed in the cache, not only the desired word but also some number of adjacent
words are retrieved.
As the block size increases from very small to larger sizes, the hit ratio will at
first increase because of the principle of locality, which states that data in the
vicinity of a referenced word are likely to be referenced in the near future.
As the block size increases, more useful data are brought into the cache. The hit
ratio will begin to decrease, however, as the block becomes even bigger and the
probability of using the newly fetched information becomes less than the
probability of reusing the information that has to be replaced.
Two specific effects come into play:
• Larger blocks reduce the number of blocks that fit into a cache. Because each
block fetch overwrites older cache contents, a small number of blocks results
in data being overwritten shortly after they are fetched.
• As a block becomes larger, each additional word is farther from the requested word and therefore less likely to be needed in the near future.
6. Number of Caches
Multilevel Caches:
•On-chip cache accesses are faster than accesses to a cache reachable via an external bus.
•An on-chip cache reduces the processor’s external bus activity and therefore speeds up execution time and overall system performance, since bus accesses are eliminated.
•L1 cache always on chip (fastest level)
•L2 cache could be off the chip in static ram
•L2 cache doesn’t use the system bus as the path for data transfer between the
L2 cache and processor, but it uses a separate data path to reduce the burden on
the system bus. (System bus takes longer to transfer data)
•In modern computer designs the L2 cache may be on the chip, which means that an L3 cache can be added over the external bus. However, some L3 caches can be installed on the microprocessor as well.
•In all of these cases there is a performance advantage to adding a third level
cache.
Unified (One cache for data and instructions) vs Split (two, one for data
and one for instructions)
These two caches both exist at the same level, typically as two L1 caches. When
the processor attempts to fetch an instruction from main memory, it first consults
the instruction L1 cache, and when the processor attempts to fetch data from
main memory, it first consults the data L1 cache.
7. Mapping Function
Because there are fewer cache lines than main memory blocks, an algorithm is
needed for mapping main memory blocks into cache lines
Further, a means is needed for determining which main memory block currently
occupies a cache line. The choice of the mapping function dictates how the
cache is organized. Three techniques can be used: direct, associative, and set-
associative.
Cache vs RAM
Although Cache and RAM both are used to increase the performance of the
system there exists a lot of differences in which they operate to increase the
efficiency of the system.
•Size: RAM is larger in size; it generally ranges from 1 MB to 16 GB. The cache is smaller, ranging from 2 KB to a few MB.
•Contents: RAM stores data that is currently being processed by the processor, while the cache holds frequently accessed data.
•Data path: The OS interacts with secondary memory to get data to be stored in primary memory (RAM), whereas data to be stored in the cache comes from primary memory.
•Misses: Data in RAM is ensured to be loaded before the CPU accesses it, so a "RAM miss" never occurs; the CPU searches for data in the cache, and if it is not found, a cache miss occurs.
Differences between associative and cache memory:
•An associative memory is accessed by the content of the stored data itself, whereas a cache memory is a small, fast buffer accessed by address.
•Associative memory searches all stored words in parallel, while cache memory holds frequently accessed data to reduce the average memory access time; associative logic is used inside caches only for tag comparison in associative and set-associative mapping.
PAGING AND SEGMENTATION
Paging and segmentation are processes by which data is stored to, then
retrieved from, a computer's storage disk.
WHAT IS PAGING?
A page table stores the definition of each page. When an active process requests
data, the MMU retrieves corresponding pages into frames located in physical
memory for faster processing. The process is called paging.
The MMU uses page tables to translate virtual addresses to physical ones. Each table entry indicates where a page is located: in RAM or on disk as virtual memory. Page tables may be single-level or multi-level, for example with separate tables for applications and segments.
However, constant table lookups can slow down the MMU. A memory
cache called the Translation Lookaside Buffer (TLB) stores recent translations
of virtual to physical addresses for rapid retrieval. Many systems have multiple
TLBs, which may reside at different locations, including between the CPU and
RAM, or between multiple page table levels.
Different frame sizes are available for data sets with larger or smaller pages and matching-sized frames. 4KB to 2MB are common sizes, and GB-sized frames are available in high-performance servers.
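As a concrete illustration, with 4KB pages a virtual address splits into a page number and a 12-bit offset. The sketch below assumes a simple single-level page table with invented contents:

```python
PAGE_SIZE = 4096                      # 4KB pages -> 12-bit offset

def translate(virtual_addr, page_table):
    """Translate a virtual address using a single-level page table."""
    page_number = virtual_addr // PAGE_SIZE
    offset = virtual_addr % PAGE_SIZE
    frame = page_table[page_number]   # raises KeyError on a page fault
    return frame * PAGE_SIZE + offset

page_table = {0: 5, 1: 2}             # page -> frame (illustrative values)
print(hex(translate(0x1234, page_table)))   # page 1 -> frame 2: 0x2234
```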
WHAT IS SEGMENTATION?
When a process executes, segmentation assigns related data into segments for faster processing. The segmentation function maintains a segment table that includes the physical address of each segment, its size, and other data.
In segmentation, the CPU generates a logical address that contains the segment number and the segment offset. If the segment offset is less than the segment's limit, the address is valid; otherwise, an error is raised because the address is invalid.
The above figure shows the translation of a logical address to a physical address.
Each segment stores the process's primary function, data structures, and utilities. The CPU keeps a segment map table for every process and a list of memory blocks, along with segment identification and memory locations.
The CPU generates virtual addresses for running processes. Segmentation translates the CPU-generated virtual addresses into physical addresses that refer to a unique physical memory location. The translation is not strictly one-to-one: different virtual addresses can map to the same physical address.
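A minimal sketch of the limit check described above, with an invented segment table holding (base, limit) pairs:

```python
def translate_segment(seg_num, offset, segment_table):
    """Translate (segment number, offset): valid only if offset < limit."""
    base, limit = segment_table[seg_num]
    if offset >= limit:
        raise ValueError("invalid address: offset exceeds segment limit")
    return base + offset

seg_table = {0: (0x4000, 0x1000), 1: (0x8000, 0x0400)}   # segment -> (base, limit)
print(hex(translate_segment(0, 0x0FFF, seg_table)))      # 0x4fff (valid)
# translate_segment(1, 0x0500, seg_table) would raise ValueError (offset > limit)
```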
THE CHALLENGE OF FRAGMENTATION
Some modern computers use a function called segmented paging. Main memory is
divided into variably-sized segments, which are then divided into smaller fixed-size
pages on disk. Each segment contains a page table, and there are multiple page
tables per process.
Each of the tables contains information on every segment page, while the segment
table has information about every segment. Segment tables are mapped to page
tables, and page tables are mapped to individual pages within a segment.
Advantages include less memory usage, more flexibility on page sizes, simplified
memory allocation, and an additional level of data access security over paging. The
process does not cause external fragmentation.
Advantages of Segmentation:
• No internal fragmentation.
• Segment tables consume less space compared to page tables, and so take up less memory.
• Average segment sizes are larger than most page sizes, which allows segments to store more process data.
• Less processing overhead.
• Simpler to relocate segments than to relocate contiguous address spaces on disk.
KEY DIFFERENCES: PAGING AND SEGMENTATION
Size:
• Paging: Fixed block size for pages and frames. Computer hardware determines page/frame sizes.
• Segmentation: Variable-size segments are user-specified.
Fragmentation:
• Paging: Older systems were subject to internal fragmentation by not allocating entire pages to memory. Modern OSs no longer have this problem.
• Segmentation: Segmentation leads to external fragmentation.
Tables:
• Paging: Page tables direct the MMU to page location and status. This is a
slower process than segmentation tables, but TLB memory cache accelerates it.
• Segmentation: Segmentation tables contain segment ID and information, and
are faster than direct paging table lookups.
Availability:
• Paging: Widely available on CPUs and as MMU chips.
• Segmentation: Windows servers may support backwards compatibility, while
Linux has very limited support.
MEMORY MANAGEMENT HARDWARE
The memory management hardware consists of several key components that work
together to ensure efficient memory usage and allocation. These components include:
• Memory Management Unit (MMU): The MMU is a critical component of
memory management hardware. It translates virtual addresses to physical
addresses, enabling the system to access the correct memory location.
• Translation Lookaside Buffer (TLB): The TLB is a cache that stores
recently used virtual-to-physical address translations, speeding up the address
translation process.
• Memory Segmentation Unit: This unit divides the memory into segments to organize and manage memory resources efficiently.
• Memory Protection Unit (MPU): The MPU ensures the security and
protection of memory by enforcing access permissions and preventing
unauthorized access.
Memory Management Unit (MMU)
The Memory Management Unit (MMU) is a key component of memory
management hardware in computer architecture. It performs the essential
task of translating virtual addresses generated by the CPU into physical
addresses, allowing the system to access the correct memory location.
The MMU works in conjunction with the operating system's memory
management software to allocate and manage memory resources effectively.
It uses a technique called address translation, which involves converting
virtual addresses to physical addresses by utilizing page tables or translation
tables.
The MMU also plays a vital role in memory protection by implementing
memory access control and prevention mechanisms. It enforces access
permissions, ensuring that each process can only access its allocated memory
and preventing unauthorized access to sensitive information.
Translation Lookaside Buffer (TLB)
The Translation Lookaside Buffer (TLB) is a cache in the memory management
hardware that stores recently used virtual-to-physical address translations. It acts
as a high-speed memory for address translation, improving the overall
performance of the system.
When the CPU generates a virtual address, the TLB checks if the translation for
that address is available in its cache. If the translation is found, the TLB provides
the corresponding physical address, eliminating the need for a time-consuming
lookup in the page tables or translation tables.
The TLB operates on the principle of locality, which states that recently
accessed memory locations are likely to be accessed again in the near
future. By storing frequently used translations, the TLB reduces the
overhead of address translation, improving system performance.
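The TLB fast path can be sketched as a small lookup cache in front of a page-table walk (illustrative Python, with an invented page table):

```python
class TLB:
    """Tiny lookup cache for virtual-to-physical page translations."""
    def __init__(self, page_table):
        self.page_table = page_table
        self.cache = {}

    def lookup(self, page_number):
        if page_number in self.cache:          # TLB hit: no table walk needed
            return self.cache[page_number]
        frame = self.page_table[page_number]   # TLB miss: slow page-table walk
        self.cache[page_number] = frame        # store for future accesses
        return frame

tlb = TLB(page_table={0: 7, 1: 3})
print(tlb.lookup(1))   # miss: walks the page table, returns 3
print(tlb.lookup(1))   # hit: served directly from the TLB
```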
Functions of Memory Management Hardware
•Translating virtual addresses to physical addresses (MMU).
•Caching recent address translations for fast lookup (TLB).
•Enforcing access permissions and protecting memory from unauthorized access (MPU).
INPUT-OUTPUT ORGANIZATION
Input or output devices attached to the computer are also called peripherals.
•The display terminal can operate in a single-character mode, in which each character entered through the keyboard is transmitted to the computer as soon as it is typed. In the block mode, the edited text is first stored in a local memory inside the terminal and is then transferred to the computer as a block of data.
•Printers provide a permanent record on paper of computer output data.
•Magnetic tapes are used mostly for storing files of data.
•Magnetic disks have high-speed rotational surfaces coated with magnetic
material.
Input-Output Interface
The I/O bus consists of data lines, address lines, and control lines. The magnetic disk, printer, and terminal are employed in practically any general-purpose computer. Each interface decodes the address and control information received from the I/O bus; the interface selected responds to the function code and proceeds to execute it.
The function code is referred to as an I/O command and is in essence an
instruction that is executed in the interface and its attached peripheral unit.
There are three ways that computer buses can be used to communicate with
memory and I/O:
1.Use two separate buses, one for memory and the other for I/O.
2.Use one common bus for both memory and I/O but have separate control lines
for each.
3.Use one common bus for memory and I/O with common control lines.
Asynchronous Data Transfer
•Strobe Control: pulse supplied by one of the units to indicate to the other unit
when the transfer has to occur.
•Handshaking: The unit receiving the data item responds with another control
signal to acknowledge receipt of the data.
1. Strobe Control Method
The Strobe Control method of asynchronous data transfer employs a single
control line to time each transfer. This control line is also known as a strobe, and
it may be achieved either by source or destination, depending on which initiate
the transfer.
•Source initiated strobe: In the below block diagram, you can see that strobe is
initiated by source, and as shown in the timing diagram, the source unit first
places the data on the data bus.
After a brief delay to ensure that the data resolve to a stable value, the source
activates a strobe pulse.
The information on the data bus and strobe control signal remains in the active
state for a sufficient time to allow the destination unit to receive the data.
The destination unit uses a falling edge of strobe control to transfer the contents
of a data bus to one of its internal registers.
The source removes the data from the data bus after it disables its strobe pulse.
Thus, new valid data will be available only after the strobe is enabled
again.
In this case, the strobe may be a memory-write control signal from the
CPU to a memory unit. The CPU places the word on the data bus and informs
the memory unit, which is the destination.
• Destination initiated strobe: In the below block diagram, you can see that the strobe is initiated by the destination, and in the timing diagram, the destination unit first activates the strobe pulse, informing the source to provide the data.
•The source unit responds by placing the requested binary information on the data bus. The data must be valid and remain on the bus long enough for the destination unit to accept it.
•The falling edge of the strobe pulse can again be used to trigger a destination register. The destination unit then disables the strobe, and the source finally removes the data from the data bus after a predetermined time interval. In this case, the strobe may be a memory-read control signal from the CPU to a memory unit. The CPU initiates the read operation to inform the memory, which is the source unit, to place the selected word on the data bus.
2. Handshaking Method
The strobe method has the disadvantage that the source unit that initiates the
transfer has no way of knowing whether the destination has received the
data that was placed in the bus. Similarly, a destination unit that initiates the
transfer has no way of knowing whether the source unit has placed data on the
bus.
So, this problem is solved by the handshaking method. The handshaking method introduces a second control signal line that provides a reply to the unit that initiates the transfer.
In this method, one control line is in the same direction as the data flow in the
bus from the source to the destination. The source unit uses it to inform the
destination unit whether there are valid data in the bus.
The other control line is in the other direction from the destination to the
source. This is because the destination unit uses it to inform the source whether
it can accept data. And in it also, the sequence of control depends on the unit
that initiates the transfer.
Thus, the sequence of control depends on whether the transfer is initiated by the source or by the destination.
• Source initiated handshaking: In the below block diagram, you can see
that two handshaking lines are "data valid", which is generated by the
source unit, and "data accepted", generated by the destination unit.
The timing diagram shows the timing relationship of the exchange of signals
between the two units.
The source initiates a transfer by placing data on the bus and enabling its data
valid signal.
The destination unit then activates the data accepted signal after it accepts the
data from the bus.
The source unit then disables its valid data signal, which invalidates the data on
the bus.
After this, the destination unit disables its data accepted signal, and the system
goes into its initial state.
The source unit does not send the next data item until after the destination unit
shows readiness to accept new data by disabling the data accepted signal.
This sequence of events is described by a sequence diagram, which shows the state in which the system is present at any given time.
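The four steps of the source-initiated handshake can be written out as a simple trace (a sketch of the event sequence, not a timing-accurate model):

```python
def source_initiated_handshake(word):
    """Trace the four steps of a source-initiated handshake."""
    return [
        f"source: place {word!r} on the bus, raise data-valid",
        "destination: accept the data, raise data-accepted",
        "source: drop data-valid, remove the data from the bus",
        "destination: drop data-accepted (system back to initial state)",
    ]

for step in source_initiated_handshake(0x5A):
    print(step)
```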
•Destination initiated handshaking: Here the destination unit initiates the transfer by activating a "ready for data" signal; the source then places the data on the bus and enables its "data valid" signal, and the two signals are disabled in the same handshake fashion once the data has been accepted.
Advantages of Asynchronous Data Transfer
There are some advantages of using asynchronous data transfer in computer organization, such as:
oIt is more flexible, and devices can exchange information at their own pace. In
addition, individual data characters can complete themselves so that even
if one packet is corrupted, its predecessors and successors will not be
affected.
oIt does not require complex processes by the receiving device. Furthermore, it
means that inconsistency in data transfer does not result in a big crisis since
the device can keep up with the data stream. It also makes asynchronous
transfers suitable for applications where character data is generated
irregularly.
Disadvantages of Asynchronous Data Transfer
There are also some disadvantages of using asynchronous data for transfer
in computer organization, such as:
o The success of these transmissions depends on the start bits and their
recognition. Unfortunately, this can be easily susceptible to line interference,
causing these bits to be corrupted or distorted.
o A large portion of the transmitted data is used for control and identification
(start, stop, and header bits) and thus carries no useful information related to
the transmitted data. This invariably means that more data packets need to be
sent.
References
Reference Books:
• J.P. Hayes, “Computer Architecture and Organization”, Third Edition.
• Mano, M., “Computer System Architecture”, Third Edition, Prentice Hall.
• Stallings, W., “Computer Organization and Architecture”, Eighth Edition, Pearson Education.
Text Books:
• Carpinelli, J.D., “Computer Systems Organization & Architecture”, Fourth Edition, Addison Wesley.
• Patterson and Hennessy, “Computer Architecture”, Fifth Edition, Morgan Kaufmann.
Other References
• https://tutorialspoint.dev/computer-science/computer-organization-and-architecture/io-interface-interrupt-dma-mode
• Asynchronous Data Transfer in Computer Organization - Javatpoint
• https://www.studytonight.com/computer-architecture/input-output-processor
• Handshaking in Computer architecture (includehelp.com)
•Handshaking in Computer architecture (includehelp.com)
INTERRUPTS
Before interrupts, the CPU had to wait for a signal to process by continuously
checking the related hardware and software components (a method referred to
as "polling"). This method wastes CPU cycles in waiting, reducing efficiency:
the effective usage time of the CPU falls, response time increases, and power
consumption rises.
The solution proposed to the above problem was the use of interrupts. In this
method, instead of constantly checking for a signal, the CPU receives a signal from
the hardware or software components. The process of sending the signal by
hardware or software components is referred to as sending interrupts or interrupting
the system.
An interrupt diverts the CPU's attention from the running process to the signal's
request. Once the interrupt signal is handled, control is transferred back to the
previous process, which continues from the exact position where it left off.
Types of Interrupts
1. Hardware interrupt
2. Software interrupt
1. Hardware Interrupt: Interrupts generated by the hardware are referred to as
hardware interrupts. The failure of hardware components and the completion of
I/O can trigger hardware interrupts.
2. Software Interrupt: Interrupts generated by software, such as a system-call
instruction executed by a program, are referred to as software interrupts.
After an interrupt has occurred, the CPU must pass control from the current
process to the interrupt-handling program instead of moving to the next
instruction of the current process. Before transferring control, the state of the
currently running process must be stored.
The summary of the process followed during an interrupt is as below:
1. The system sends an acknowledgement for the interrupt, and the interrupt
signal from the source stops on receiving the acknowledgement.
2. The state of the current task is stored (register values, the address of the next
instruction to be executed when control comes back to the process (the
program counter in the PC register), etc.), i.e., moved to the stack.
3. The processor then handles the interrupt and executes the interrupt-handling
program.
4. After handling the interrupt, control is sent back to the point in the original
process using the state information that was saved earlier.
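As a software analogy for this save-and-resume sequence, the sketch below uses a POSIX signal in place of a hardware interrupt: the main loop keeps doing useful work instead of polling, and the handler runs only when the "interrupt" arrives, with the operating system saving and restoring the interrupted context. This is an analogy only, not how hardware interrupts are programmed.

```c
#include <signal.h>
#include <stdio.h>
#include <unistd.h>

/* Software analogy: a POSIX signal stands in for a hardware interrupt.
   The OS saves the interrupted context, runs the handler, then resumes
   the main program exactly where it left off. */
static volatile sig_atomic_t got_interrupt = 0;

static void isr(int signo)
{
    (void)signo;
    got_interrupt = 1;   /* keep the handler minimal, like a real ISR */
}

int main(void)
{
    signal(SIGINT, isr);          /* "register" the interrupt handler */
    while (!got_interrupt) {
        /* CPU does useful work here instead of polling a device */
        sleep(1);
    }
    printf("interrupt serviced, resuming normal flow\n");
    return 0;
}
```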
Interrupt Triggering Methods
The processor samples the interrupt input signal during each instruction
cycle. The processor will recognize the interrupt request if the signal is
asserted when sampling occurs.
• Level-triggered: Level-triggered inputs allow multiple devices to share a
common interrupt signal via wired-OR connections. The processor polls to
determine which devices are requesting service. After servicing a device, the
processor may poll again and, if necessary, service other devices before exiting
the ISR.
• Edge-triggered: An edge-triggered interrupt is an interrupt signaled by a
level transition on the interrupt line, either a falling edge (high to low) or a
rising edge (low to high). A device wishing to signal an interrupt drives a pulse
onto the line and releases it to its inactive state. If the pulse is too short to be
detected by polled I/O, then special hardware may be required to detect it.
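The difference between the two triggering styles can be shown by sampling a scripted waveform once per instruction cycle, as the text describes. The waveform values below are invented for illustration.

```c
#include <stdbool.h>
#include <stdio.h>

/* Scripted waveform standing in for the physical interrupt line. */
static const bool waveform[] = {0, 0, 1, 1, 1, 0, 1, 0};

int main(void)
{
    bool previous = false;
    for (unsigned cycle = 0; cycle < 8; cycle++) {
        bool level = waveform[cycle];  /* one sample per instruction cycle  */
        bool level_pending = level;    /* level-triggered: pending while high */
        bool edge_pending = level && !previous; /* edge: low-to-high only   */
        previous = level;              /* edge detection must remember state */
        printf("cycle %u: level=%d edge=%d\n", cycle, level_pending,
               edge_pending);
    }
    return 0;
}
```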
Advantages of Interrupts
o The CPU no longer wastes cycles polling devices; it performs useful work
until an event actually occurs.
o Devices are serviced promptly when they request attention, improving
response time.
Reference Books:
• J.P. Hayes, “Computer Architecture and Organization”, Third Edition.
• Mano, M., “Computer System Architecture”, Third Edition, Prentice Hall.
• Stallings, W., “Computer Organization and Architecture”, Eighth Edition, Pearson Education.
Text Books:
• Carpinelli, J.D., “Computer Systems Organization & Architecture”, Fourth Edition, Addison Wesley.
• Patterson and Hennessy, “Computer Architecture”, Fifth Edition, Morgan Kaufmann.
Other References
• http://www.ecs.csun.edu/~cputnam/Comp546/Input-Output-Web.pdf
• http://www.ioenotes.edu.np/media/notes/computer-organization-and-architecture-coa/Chapter7-Input-Output-Organization.pdf
• https://www.geeksforgeeks.org/io-interface-interrupt-dma-mode/
INPUT-OUTPUT INTERFACE
•The method that is used to transfer information between internal storage and
external I/O devices is known as I/O interface.
• Peripherals connected to a computer system communicate with the CPU
through special communication links. These links are used to resolve the
differences between the CPU and the peripherals.
•There exist special hardware components between CPU and peripherals to
supervise and synchronize all the input and output transfers that are called
interface units.
Modes of Transfer
The binary information that is received from an external device is usually stored
in the memory unit. The information transferred from the CPU to an external
device originates from the memory unit. The CPU merely processes the
information; the source and destination are always the memory unit. Data transfer
between the CPU and the I/O devices may be done in different modes.
Data transfer to and from the peripherals may be done in any of the three
possible ways
1.Programmed I/O
2.Interrupt-initiated I/O
3.Direct memory access (DMA)
1. Programmed I/O
It is one of the simplest forms of I/O, where the CPU has to do all the work.
Transfers result from the I/O instructions written in the computer program; each
data item transfer is initiated by an instruction in the program. Usually, the
transfer is between a CPU register and the peripheral, and it requires constant
monitoring of the peripheral device by the CPU. This technique is called
programmed I/O.
• The program waits for the ready status by repeatedly testing the status bit(s),
until all the targeted bytes are written to the device.
• The program is in a busy (non-waiting) state only after the device gets ready;
otherwise it remains in a wait state.
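A minimal sketch of this busy-wait discipline is shown below. The register addresses and the READY bit are invented for illustration; in practice they come from the device's datasheet.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical device registers; real addresses come from the datasheet. */
#define DEV_STATUS (*(volatile uint8_t *)0x4000u)
#define DEV_DATA   (*(volatile uint8_t *)0x4001u)
#define READY_BIT  0x01u

/* Programmed I/O: the CPU itself moves every byte and busy-waits on
   the status bit before each transfer. */
void pio_write(const uint8_t *buf, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        while ((DEV_STATUS & READY_BIT) == 0)
            ;              /* constant monitoring: CPU does nothing else */
        DEV_DATA = buf[i]; /* one instruction-driven transfer per byte   */
    }
}
```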
I/O Commands
• Control: Used to activate a peripheral and tell it what to do. For example, a
magnetic-tape unit may be instructed to rewind or to move forward one record.
These commands are tailored to the particular type of peripheral device.
• Test: Used to test various status conditions associated with an I/O module
and its peripherals. The processor will want to know that the peripheral of
interest is powered on and available for use. It will also want to know if the most
recent I/O operation is completed and if any errors occurred.
• Read: Causes the I/O module to obtain an item of data from the peripheral and
place it in an internal buffer. The processor can then obtain the data item by
requesting that the I/O module place it on the data bus.
• Write: Causes the I/O module to take an item of data (byte or word) from the
data bus and subsequently transmit that data item to the peripheral.
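One way to picture these four command classes is as a tiny opcode set that the processor issues to the I/O module. The encoding below is purely illustrative and assumes nothing about a real module's command format.

```c
#include <stdint.h>

/* Illustrative encoding of the four command classes. */
typedef enum { IO_CONTROL, IO_TEST, IO_READ, IO_WRITE } io_command;

typedef struct {
    uint8_t status; /* powered on? last operation done? errors?   */
    uint8_t buffer; /* internal buffer between peripheral and bus */
} io_module;

/* The module interprets a command much like a tiny instruction set. */
uint8_t module_execute(io_module *m, io_command cmd, uint8_t data)
{
    switch (cmd) {
    case IO_CONTROL: /* e.g. tell a tape unit to rewind */   return 0;
    case IO_TEST:    /* report status conditions */          return m->status;
    case IO_READ:    /* fetch from peripheral into buffer */ return m->buffer;
    case IO_WRITE:   /* take data from bus to peripheral */  m->buffer = data;
                                                             return 0;
    }
    return 0;
}
```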
I/O Instruction
With memory mapped I/O, there is a single address space for memory
locations and I/O devices and the processor treats the status and data
registers of I/O modules as memory locations and uses the same machine
instructions to access both memory and I/O devices. So, for example, with 10
address lines, a combined total of 2^10 = 1024 memory locations and I/O
addresses can be supported, in any combination. With memory-mapped I/O, a single read
line and a single write line are needed on the bus.
With isolated I/O, the bus may be equipped with memory read and write plus
input and output command lines. Now, the command line specifies whether
the address refers to a memory location or an I/O device. The full range of
addresses may be available for both. Again, with 10 address lines, the system
may now support both 1024 memory locations and 1024 I/O addresses.
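The memory-mapped case is easy to show in C, because device registers are reached by ordinary loads and stores; isolated I/O needs dedicated instructions (such as x86 IN/OUT) and a command line on the bus, so it cannot be written portably. The addresses and status bit below are assumptions for illustration only.

```c
#include <stdint.h>

/* Memory-mapped I/O: device registers occupy ordinary addresses, so
   ordinary memory instructions (plain C loads and stores) access them.
   The addresses are illustrative only. */
#define UART_STATUS (*(volatile uint8_t *)0x10000000u)
#define UART_TX     (*(volatile uint8_t *)0x10000004u)

void mmio_putc(char c)
{
    while ((UART_STATUS & 0x01u) == 0) /* same load instruction as memory  */
        ;
    UART_TX = (uint8_t)c;              /* same store instruction as memory */
}
```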
Advantages of programmed I/O:
• Simple to implement
• Requires very little hardware support
Reference Books:
• J.P. Hayes, “Computer Architecture and Organization”, Third Edition.
• Mano, M., “Computer System Architecture”, Third Edition, Prentice Hall.
• Stallings, W., “Computer Organization and Architecture”, Eighth Edition, Pearson Education.
Text Books:
• Carpinelli, J.D., “Computer Systems Organization & Architecture”, Fourth Edition, Addison Wesley.
• Patterson and Hennessy, “Computer Architecture”, Fifth Edition, Morgan Kaufmann.
MODES OF TRANSFER
Data transfer to and from the peripherals may be done in any of the three possible
ways
1.Programmed I/O
2.Interrupt-driven I/O
3.Direct memory access (DMA)
2. Interrupt-driven I/O
Interrupt driven I/O is an alternative scheme dealing with I/O. Interrupt I/O is
a way of controlling input/output activity whereby a peripheral or terminal
that needs to make or receive a data transfer sends a signal. This will cause a
program interrupt to be set.
At a time appropriate to the priority level of the I/O interrupt, relative to the
total interrupt system, the processor enters an interrupt service routine. The
function of the routine will depend upon the system of interrupt levels
and priorities that is implemented in the processor. The interrupt
technique requires more complex hardware and software, but makes far
more efficient use of the computer’s time and capacities. Figure below shows
the simple interrupt processing.
Isolated I/O vs. Memory-Mapped I/O:
• Isolated I/O uses a separate address space; memory-mapped I/O uses
addresses from the main memory space.
• With isolated I/O, only limited instructions can be used (IN, OUT, INS,
OUTS); with memory-mapped I/O, any instruction that references memory
can be used.
• The addresses of isolated I/O devices are called ports; memory-mapped I/O
devices are treated as memory locations on the memory map.
For input, the device interrupts the CPU when new data has arrived and is
ready to be retrieved by the system processor. The actual actions to perform
depend on whether the device uses I/O ports or memory mapping.
For output, the device delivers an interrupt either
when it is ready to accept new data or to
acknowledge a successful data transfer.
Memory-mapped and DMA-capable devices
usually generate interrupts to tell the system they
are done with the buffer.
Here the CPU works on its given tasks
continuously. When an input is available, such as
when someone types a key on the keyboard, then
the CPU is interrupted from its work to take care
of the input data. The CPU can work continuously
on a task without checking the input devices,
allowing the devices themselves to interrupt it
as necessary.
Basic Operations of Interrupt
1. A device driver initiates an I/O request on behalf of a process.
2. The device driver signals the I/O controller for the proper
device, which initiates the requested I/O.
Advantages of interrupt-driven I/O:
• Fast
• Efficient
3. Direct Memory Access (DMA)
The transfer of data between the peripheral and memory without the interaction
of the CPU, letting the peripheral device manage the memory bus directly, is
termed Direct Memory Access (DMA).
The two control signals Bus Request (BR) and Bus Grant (BG) are used to
facilitate the DMA transfer. The bus request input is used by the DMA
controller to request the CPU for control of the buses. When the BR signal is
high, the CPU terminates the execution of the current instruction, places the
address, data, read and write lines into the high-impedance state, and sends the
bus grant signal. The DMA controller then takes control of the buses and
transfers the data directly between memory and I/O without processor
interaction.
CPU bus signals for DMA transfer
When the transfer is completed, the DMA controller makes the bus request
signal low; in response, the CPU disables the bus grant and takes back control
of the address, data, read and write lines.
The transfer of data between memory and I/O can take place in two ways: DMA
burst and cycle stealing. In burst mode, a block of data is transferred in one
continuous burst while the DMA controller is master of the buses; in cycle
stealing, the controller transfers one word at a time, stealing an occasional
memory cycle from the CPU.
• The CPU is usually much faster than I/O (DMA), so the CPU uses most of the
memory cycles.
• The DMA controller steals memory cycles from the CPU.
• For those stolen cycles, the CPU remains idle.
• For slow CPUs, the DMA controller may steal most of the memory cycles,
which may cause the CPU to remain idle for a long time.
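A rough worked example of cycle stealing, with illustrative figures not taken from the text: if a memory cycle takes 100 ns and the device delivers one byte every 10 µs, DMA steals only 1% of the memory cycles.

```c
#include <stdio.h>

int main(void)
{
    /* Illustrative figures only. */
    double mem_cycle_ns = 100.0;     /* one memory cycle                 */
    double byte_period_ns = 10000.0; /* device delivers a byte per 10 us */

    /* DMA steals one memory cycle per byte transferred. */
    double stolen_fraction = mem_cycle_ns / byte_period_ns;
    printf("fraction of memory cycles stolen: %.2f%%\n",
           stolen_fraction * 100.0); /* prints 1.00% */
    return 0;
}
```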
DMA CONTROLLER
The DMA controller communicates with the CPU through the data bus and control
lines. The DMA select signal is used for selecting the controller, and the register
select signal is used for selecting a register within it.
When the bus grant signal is zero, the CPU communicates through the data bus to
read or write into the DMA registers. When bus grant is one, the DMA controller
takes control of the buses and transfers the data between the memory and I/O.
Block diagram of DMA controller
The address register specifies the desired location in memory and is incremented
after each word is transferred.
The word count register holds the number of words to be transferred and is
decremented after each transfer; when it reaches zero, it indicates the end of
the transfer.
After this, the bus grant signal from the CPU is made low and the CPU returns to
its normal operation. The control register specifies the mode of transfer, i.e., read
or write.
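The register behaviour just described can be modelled in a few lines of C. This is a software model only; a real controller implements it in hardware, and the names are illustrative assumptions.

```c
#include <stdint.h>
#include <stdbool.h>

/* Software model of the DMA controller's programmable registers. */
typedef struct {
    uint32_t address;    /* desired memory location, incremented per word */
    uint32_t word_count; /* decremented per word; zero ends the transfer  */
    bool     write;      /* control register: mode of transfer            */
} dma_controller;

/* One DMA service: move words between device and memory with no CPU
   involvement, updating the registers as the text describes. */
void dma_run(dma_controller *dma, uint16_t *memory,
             volatile uint16_t *device_data)
{
    while (dma->word_count != 0) {
        if (dma->write)
            memory[dma->address] = *device_data; /* device -> memory */
        else
            *device_data = memory[dma->address]; /* memory -> device */
        dma->address++;    /* address register: incremented per word */
        dma->word_count--; /* word count: zero ends the transfer     */
    }
    /* Here the controller would drop Bus Request; the CPU then disables
       Bus Grant and retakes the address, data, read and write lines. */
}
```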
DMA TRANSFER
Reference Books:
• J.P. Hayes, “Computer Architecture and Organization”, Third Edition.
• Mano, M., “Computer System Architecture”, Third Edition, Prentice Hall.
• Stallings, W., “Computer Organization and Architecture”, Eighth Edition, Pearson Education.
Text Books:
• Carpinelli, J.D., “Computer Systems Organization & Architecture”, Fourth Edition, Addison Wesley.
Other References:
• http://www.ecs.csun.edu/~cputnam/Comp546/Input-Output-Web.pdf
• http://www.ioenotes.edu.np/media/notes/computer-organization-and-architecture-coa/Chapter7-Input-Output-Organization.pdf
• https://www.geeksforgeeks.org/io-interface-interrupt-dma-mode/
I/O PROCESSORS
• Processor with direct memory access capability that communicates with I/O
devices
• Channel can execute a Channel Program
• Stored in the main memory
• Consists of Channel Command Word (CCW)
• Each CCW specifies the parameters needed by the channel to control the I/O
devices and perform data transfer operations
• CPU initiates the channel by executing a channel I/O class instruction
and once initiated, channel operates independently of the CPU
A computer may incorporate one or more external processors and assign
them the task of communicating directly with the I/O devices, so that each
interface need not communicate directly with the CPU.
An I/O processor (IOP) is a processor with direct memory access capability that
communicates with I/O devices.
IOP instructions are specifically designed to facilitate I/O transfers. The IOP
can also perform other processing tasks, such as arithmetic, logic, branching and
code translation.
Block diagram of a computer with I/O Processor
The memory unit occupies a central position and can communicate with each
processor by means of direct memory access. The CPU is responsible for
processing data needed in the solution of computational tasks. The IOP provides
a path for transferring data between various peripheral devices and memory unit.
In most computer systems, the CPU is the master while the IOP is a slave
processor.
The CPU initiates the IOP, after which the IOP operates independently of
the CPU and transfers data between the peripheral and memory.
For example, the IOP may receive 5 bytes from an input device at the device
rate and bit capacity, then pack them into one block of 40 bits and transfer the
block to memory. Similarly, an output word transferred from memory to the
IOP is directed by the IOP to the output device at the device rate and bit
capacity.
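The 5-bytes-into-40-bits packing in this example is easy to make concrete. The sketch below assumes the first byte lands in the most significant position of the block, which is an arbitrary illustrative choice.

```c
#include <stdint.h>

/* Pack 5 input bytes into one 40-bit block, as in the IOP example.
   Placing the first byte in the most significant position is an
   arbitrary, illustrative choice. */
uint64_t iop_pack(const uint8_t bytes[5])
{
    uint64_t block = 0;
    for (int i = 0; i < 5; i++)
        block = (block << 8) | bytes[i]; /* 5 x 8 = 40 bits used    */
    return block;                        /* upper 24 bits stay zero */
}
```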
CPU – IOP COMMUNICATION
The memory unit acts as a message center where each processor leaves
information for the other. The operation of a typical IOP can be appreciated
through an example of the method by which the CPU and the IOP communicate.
DATA COMMUNICATION PROCESSOR
• A data communication processor is an I/O processor that distributes and
collects data from many remote terminals connected through telephone and
other communication lines.
• Transmission:
o Synchronous
o Asynchronous
• Transmission Error (sketched in the example after this list):
o Parity
o Checksum
o Cyclic Redundancy Check
o Longitudinal Redundancy Check
• Transmission Modes:
o Simplex
o Half Duplex
o Full Duplex
• Data Link & Protocol
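Two of the error checks listed above are small enough to sketch directly: an even-parity bit over one character and a simple additive checksum over a block. These are generic textbook forms, not the definitions used by any particular protocol.

```c
#include <stdint.h>
#include <stddef.h>

/* Even parity bit for one character: returns 1 if the byte has an odd
   number of 1-bits, so byte + parity bit always has even parity. */
uint8_t even_parity(uint8_t byte)
{
    uint8_t ones = 0;
    for (int i = 0; i < 8; i++)
        ones ^= (byte >> i) & 1u; /* XOR accumulates the bit count mod 2 */
    return ones;
}

/* Simple additive checksum over a block, taken modulo 256. */
uint8_t checksum(const uint8_t *data, size_t len)
{
    uint8_t sum = 0;
    for (size_t i = 0; i < len; i++)
        sum = (uint8_t)(sum + data[i]);
    return sum;
}
```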
Data can be transmitted between two points in three different modes. The first
is simplex, where data can be transmitted in only one direction, such as TV
broadcasting. The second is half duplex, where data can be transmitted in both
directions but in only one direction at a time, such as a walkie-talkie. The third
is full duplex, where data can be transmitted in both directions simultaneously,
such as a telephone.
The communication lines, modems and other equipment used in the
transmission of information between two or more stations is called data
link. The orderly transfer of information in a data link is accomplished by
means of a protocol.
References
Reference Books:
• J.P. Hayes, “Computer Architecture and Organization”, Third Edition.
• Mano, M., “Computer System Architecture”, Third Edition, Prentice Hall.
• Stallings, W., “Computer Organization and Architecture”, Eighth Edition, Pearson Education.
Text Books:
• Carpinelli, J.D., “Computer Systems Organization & Architecture”, Fourth Edition, Addison Wesley.
• Patterson and Hennessy, “Computer Architecture”, Fifth Edition, Morgan Kaufmann.
Other References:
• http://www.ecs.csun.edu/~cputnam/Comp546/Input-Output-Web.pdf
• http://www.ioenotes.edu.np/media/notes/computer-organization-and-architecture-coa/Chapter7-Input-Output-Organization.pdf
• https://www.geeksforgeeks.org/io-interface-interrupt-dma-mode/