Coa Unit 2

University Institute of Engineering

Department of Computer Science & Engineering

COMPUTER ORGANIZATION & ARCHITECTURE


(23CST-204/23ITT-204)

ER. SHIKHA ATWAL


E11186

ASSISTANT PROFESSOR

BE-CSE
DESIGN OF CONTROL
UNIT

To execute an instruction, the control unit
of the CPU must generate the required
control signals in the proper sequence.
There are two approaches to generating
the control signals in the proper sequence:
the Hardwired Control Unit and the
Micro-programmed Control Unit.

Computer system components


The Control Unit is classified into two major categories:

•Hardwired Control Unit


•Microprogrammed Control Unit

Hardwired Control Unit

• A hardwired control unit is a method of generating control signals with the


help of Finite State Machines (FSM).
• The control signals that are necessary for instruction execution control in the
Hardwired Control Unit are generated by specially built hardware logical
circuits, and we can’t change the signal production mechanism without
physically changing the circuit structure.
Block Diagram of
Hardwired Control Unit
Characteristics of Hardwired Control Unit

•A Hardwired Control Unit is made up of two decoders, a sequence counter, and logic gates.
•An instruction retrieved from the memory unit is stored in the instruction register (IR).
•The instruction register holds the address bits (0 through 11), the operation code (bits 12 through 14), and the I bit (bit 15).
•The operation code in bits 12 through 14 is decoded by a 3 x 8 decoder.
•The decoder’s outputs are denoted by the letters D0 through D7.
•Bit 15 of the instruction is transferred to a flip-flop designated by the symbol I.
•Bits 0 through 11 are applied to the control logic gates.
•The sequence counter (SC) counts in binary from 0 through 15.
Designing of Hardwired Control Unit

The following are some of the ways for constructing hardwired control logic that
have been proposed:

•Sequence Counter Method − It is the most practical way to design a


somewhat complex controller.
•Delay Element Method – For creating the sequence of control signals, this
method relies on the usage of timed delay elements.
•State Table Method − This method uses the standard algorithmic approach of
designing the controller with the classical state table method.
Working of a Hardwired Control Unit

The basic data for control signal creation is contained in the operation code of an
instruction. The operation code is decoded in the instruction decoder. The
instruction decoder is a collection of decoders that decode various fields of the
instruction opcode.

As a result, only a few of the instruction decoder’s output lines carry active signal
values. These output lines are coupled to the inputs of the matrix that provides
control signals for the computer’s executive units. This matrix combines the decoded
signals from the instruction opcode with the outputs of the matrix that
generates signals representing consecutive control unit states, as well as with signals
from the outside world, such as interrupt signals. The matrices are constructed in the
same way as programmable logic arrays.
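To make this concrete, here is a minimal sketch in Python (all states, opcodes, and control-signal names are invented for illustration): a sequence counter steps through fetch/decode/execute states, and a combinational "matrix" function maps the current state and decoded opcode to the set of active control signals.

# Minimal sketch of a hardwired control unit as a finite state machine.
# States, opcodes, and control-signal names are illustrative only.
FETCH, DECODE, EXECUTE = 0, 1, 2      # sequence counter values

def control_matrix(state, opcode):
    """Combinational 'matrix': (state, decoded opcode) -> active control signals."""
    if state == FETCH:
        return {"PC_to_MAR", "MEM_READ", "MDR_to_IR", "PC_INC"}
    if state == DECODE:
        return {"DECODE_IR"}
    if state == EXECUTE:
        if opcode == 0b000:           # e.g. ADD
            return {"ALU_ADD", "ACC_LOAD"}
        if opcode == 0b001:           # e.g. STORE
            return {"ACC_to_MDR", "MEM_WRITE"}
    return set()

# Step the FSM through one instruction cycle for opcode 000:
for state in (FETCH, DECODE, EXECUTE):
    print(state, sorted(control_matrix(state, 0b000)))

Changing the generated signals here means editing the function body, just as changing a real hardwired unit means physically rewiring the circuit.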
Hardwired control unit
Generation of a Signal

• Control signals for instruction execution must be generated throughout the whole
time range that corresponds to the instruction execution cycle, not just at a
single moment in time.
• The control unit organises the appropriate sequence of internal states based on
the structure of this cycle.
• The control signal generator matrix sends a number of signals back to
the inputs of the next control state generator matrix. This matrix
mixes these signals with the timing signals created by the timing unit,
which is typically driven by the rectangular pulses of a quartz generator.
• Whenever a new instruction arrives at the control unit, the unit starts in the
initial state of the cycle: instruction fetch.
Generation of a Signal

• Instruction decoding permits the control unit to enter the first state relevant to
the new instruction execution, which lasts as long as the computer’s timing
signals as well as other input signals, such as flags and state
information, stay unchanged.
• A change in any of the previously stated signals causes the control unit’s
status to change.
Result
A new corresponding input for the control signal generator matrix is formed as a
result. When an external signal (such as an interrupt) arrives, the control unit
enters the next control state, which handles the response to that external signal
(for example, interrupt processing). The computer’s flags and state variables are
used to choose the appropriate states for the instruction execution cycle.

The cycle’s last states are control states that begin fetching the program’s next
instruction: sending the program counter’s contents to the main memory address
register and then reading the instruction word into the computer’s instruction
register. When the running instruction is a stop instruction that terminates
program execution, the control unit enters an OS state, in which it waits for the
next user directive.
Advantages of Hardwired Control Unit

• Hardwired Control Unit is quick due to the usage of combinational circuits to


generate signals.
• The amount of delay that can occur in the creation of control signals is
dependent on the number of gates.
• It can be tweaked to get the fastest mode of operation.
• Quicker than a micro-programmed control unit.
Disadvantages of Hardwired Control Unit

• As we require additional control signals to be created, the design becomes


more complex (need for more encoders or decoders).
• Changes to control signals are challenging since they necessitate rearranging
wires in the hardware circuit.
• It’s difficult and time-consuming to add a new feature.
• It’s difficult to evaluate and fix flaws in the initial design.
• It’s a bit pricey.
MICRO-PROGRAMMED CONTROL UNIT

The programming approach is used to implement a microprogrammed control unit.


A program made up of microinstructions is used to carry out a series of micro-
operations. The control unit’s control memory stores a microprogram composed of
microinstructions. The creation of a set of control signals is dependent on the
execution of a microinstruction.
The block diagram of this type of organization is shown below:

Micro-Programmed Control Unit


Characteristics of Micro-programmed Control Unit

• The microinstruction address is specified in the control memory address register.


• All the control information is saved in the control memory, which is considered
to be a ROM.
• The microinstruction received from memory is stored in the control register.
• A control word in the microinstruction specifies one or multiple micro-operations
for a data processor.
• The next address is calculated in the circuit of the next address generator
and then transferred to the control address register for reading the next
microinstruction when the micro-operations are being executed.
• Because it determines the sequence of addresses received from control memory,
the next address generator is also known as a microprogram sequencer.
Designing of Micro-programmed Control Unit

The existence of the control store, which is used to store words containing encoded
control signals required for instruction execution, is the main distinction between
microprogrammed structures and the hardwired control unit structure.
Each bit in the microinstruction is connected to a single control signal. The control
signal is active when the bit is set and becomes inactive when the bit is cleared. A
sequence of these microinstructions can be kept in the internal ‘control’ memory. A
microprogram-controlled computer’s control unit is a computer within a computer.
Some Important Terms –

1. Control Word: A control word is a word whose individual bits represent
various control signals.
2. Micro-routine: A sequence of control words corresponding to the control
sequence of a machine instruction constitutes the micro-routine for that
instruction.
3. Micro-instruction: Individual control words in this micro-routine are
referred to as microinstructions.
4. Micro-program: A sequence of micro-instructions is called a micro-
program, which is stored in a ROM or RAM called a Control Memory (CM).
5. Control Store: The micro-routines for all instructions in the instruction set of
a computer are stored in a special memory called the Control Store.
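As a toy illustration of these terms, the following Python sketch (control words, signal names, and addresses are all invented) keeps a short micro-routine in a "control memory" and steps a micro-program counter through it; each bit of a fetched control word drives one control signal, as in a horizontal format.

# Toy micro-programmed control unit: control memory holds micro-routines,
# and a micro-program counter (uPC) sequences through micro-instructions.
# Control words, signal names, and addresses are invented for illustration.
SIGNALS = ["PC_to_MAR", "MEM_READ", "MDR_to_IR", "ALU_ADD", "ACC_LOAD", "END"]

CONTROL_MEMORY = {                 # address -> control word (one bit per signal)
    0: 0b110000,                   # PC_to_MAR, MEM_READ   (fetch, step 1)
    1: 0b001000,                   # MDR_to_IR             (fetch, step 2)
    2: 0b000110,                   # ALU_ADD, ACC_LOAD     (execute ADD)
    3: 0b000001,                   # END: go fetch the next instruction
}

def run_microroutine(start_addr):
    upc = start_addr                                   # control address register
    while True:
        word = CONTROL_MEMORY[upc]                     # read into control register
        active = [s for i, s in enumerate(SIGNALS)
                  if word & (1 << (len(SIGNALS) - 1 - i))]
        print(f"uPC={upc}: {active}")
        if "END" in active:
            break
        upc += 1                       # next-address generator (sequential here)

run_microroutine(0)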
Here is the block diagram:

Micro-Programmed Control Unit Organization

Working of Micro-programmed Control Unit


• The instruction words are normally fetched into the instruction register in
micro-programmed control units.
• The operation code of each instruction, on the other hand, is not directly
decoded to enable instant control signal generation; instead, it contains
the initial address of a microprogram in the control store.
• The control store address register receives the instruction opcode from the
instruction register.
• Based on this address, the first microinstruction of the microprogram that
interprets the execution of that instruction is read into the microinstruction
register.
• The operation element of this microinstruction contains encoded control
signals, usually in the form of a few bit fields. The fields are decoded using a
collection of microinstruction field decoders.
• The address of the next microinstruction in the supplied instruction
microprogram is also included in the microinstruction, as well as a
control field for controlling the microinstruction address generator’s actions.
Microprogrammed Control Unit
• The addressing mode or addressing operation to be applied to the address
encoded in the current microinstruction is determined by the last-mentioned
field.
• In microinstructions with a conditional addressing mode, this address is refined
by employing the processor condition flags, which describe the status of
calculations in the current program.
• The microinstruction that fetches the very next instruction to the instruction
register from the main memory is the last microinstruction in the instruction of
the provided microprogram.
Types of Micro-programmed Control Unit
Based on the type of Control Word stored in the Control Memory (CM), it is
classified into two types:
1. Horizontal Micro-programmed Control Unit:
•The control signals are represented in decoded binary format, i.e. 1 bit per
control signal.
•Example: If 53 control signals are present in the processor, then 53 bits are
required. More than one control signal can be enabled at a time.
•It supports longer control words.
•It is used in parallel processing applications.
•It allows a higher degree of parallelism: if the degree is n, then n control signals
can be enabled at a time.
•It requires no additional hardware (decoders), which makes it faster than a
vertical micro-programmed unit.
•It is more flexible than a vertical micro-programmed unit.
2. Vertical Micro-programmed Control Unit:
•The control signals are represented in encoded binary format: for N control
signals, ⌈log2(N)⌉ bits are required.
•It supports shorter control words.
•It supports easy implementation of new control signals, which makes it more
flexible than hardwired control.
•It allows a low degree of parallelism, i.e. the degree of parallelism is either 0 or 1.
•It requires additional hardware (decoders) to generate the control signals,
which makes it slower than a horizontal micro-programmed unit.
•It is less flexible than the horizontal scheme but more flexible than a hardwired
control unit.
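The control-word width trade-off between the two formats is easy to quantify, as the small Python sketch below shows (the signal counts are illustrative):

import math

def horizontal_bits(n_signals):
    # One bit per control signal: any subset can be active simultaneously.
    return n_signals

def vertical_bits(n_signals):
    # Encoded format: ceil(log2(N)) bits, decoded to enable one signal at a time.
    return math.ceil(math.log2(n_signals))

for n in (16, 53, 128):
    print(n, horizontal_bits(n), vertical_bits(n))
# 53 signals -> 53 bits horizontal, but only 6 bits vertical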
Advantages of Micro-Programmed Control Unit

•It allows for a more methodical control unit design.


•It’s easier to troubleshoot and modify.
•It can keep the control function’s fundamental structure.
•It can make the control unit’s design easier. As a result, it is less expensive
and less prone to errors or glitches.
•It has the ability to design in a methodical and ordered manner.
•It is used to control software-based functions rather than hardware-based
functions.
•It’s more adaptable.
•It is used to do complex functions with ease.
Disadvantages of Micro-Programmed Control Unit

•Adaptability comes at a higher price.


•It is comparatively slower than a control unit that is hardwired.
References

Reference Books:
•J.P. Hayes, “Computer Architecture and Organization”, Third Edition.
•Mano, M., “Computer System Architecture”, Third Edition, Prentice Hall.
•Stallings, W., “Computer Organization and Architecture”, Eighth
Edition, Pearson Education.

Text Books:
•Carpinelli J.D,” Computer systems organization &Architecture”,
Fourth Edition, Addison Wesley.
•Patterson and Hennessy, “Computer Architecture” , Fifth Edition Morgaon
Kauffman.
Other References:
• http://www.pvpsiddhartha.ac.in/dep_it/lecturenotes/CSA/unit-3.pdf
• http://www.cs.binghamton.edu/~reckert/hardwire3new.html
• https://www.geeksforgeeks.org/last-minute-notes-computer-organization/
• https://bmsit.ac.in/system/study_materials/documents/000/000/007/origin
al/Comp uter_Organization_Chapter7.pdf?1477057102
• https://www.javatpoint.com/design-of-control-unit

Reference Video Links:


•https://youtu.be/fqe8n_7ln2c?si=hgui5tMiCt-ZPU7g
•https://www.youtube.com/live/tYiC5xqDzBc?si=shs5d4bTt99AO4wL
•https://youtu.be/5Sk5OpoT66E?si=DoCgVZ36iUTQDyB2
University Institute of Engineering
Department of Computer Science & Engineering

COMPUTER ORGANIZATION & ARCHITECTURE


(23CST-204/23ITT-204)

ER. SHIKHA ATWAL


E11186

ASSISTANT PROFESSOR

BE-CSE
COMPARATIVE STUDY OF HARDWIRED AND MICRO-
PROGRAMMED CONTROL UNIT
•To execute an instruction, there are two types of control units: Hardwired
Control unit and Micro-programmed control unit.
•Hardwired control units are generally faster than microprogrammed
designs. In hardwired control, we saw how all the control signals required
inside the CPU can be generated using a state counter and a Programmable
Logic Array (PLA) circuit.
•A microprogrammed control unit is a relatively simple logic circuit that is
capable of sequencing through microinstructions and generating control signals
to execute each microinstruction.
•The main difference between Hardwired and Microprogrammed Control Unit is
that a Hardwired Control Unit is a sequential circuit that generates control
signals while a Microprogrammed Control Unit is a unit with microinstructions
in the control memory to generate control signals.
Following are the further differences in tabular form.
Hardwired Control Unit vs. Microprogrammed Control Unit:

1. Implementation: A hardwired control unit is implemented with a hardware circuit (a circuitry approach), while a micro-programmed control unit is implemented with the help of programming.
2. Signal generation: The hardwired control unit uses logic circuits to generate the control signals required by the processor, while the micro-programmed control unit uses microinstructions, usually stored in control memory, to generate them.
3. Modification: The hardwired control unit is very difficult to modify, because the control signals that need to be generated are hardwired; the micro-programmed control unit is easy to modify, because modifications are performed only at the instruction level.
4. Complex instructions: A hardwired control unit cannot handle complex instructions, because the circuit designed for such instructions becomes too complex; a micro-programmed control unit is able to handle complex instructions.
5. Cost: In a hardwired control unit everything has to be realized in terms of logic gates, so it is more costly; a micro-programmed control unit is less costly, because it only requires microinstructions to generate the control signals.
6. Instruction count: Because of the hardware implementation, a hardwired control unit can support only a limited number of instructions; a micro-programmed control unit can generate control signals for many instructions.
7. Typical use: Hardwired control units are used in computers based on Reduced Instruction Set Computers (RISC); micro-programmed control units are used in computers based on Complex Instruction Set Computers (CISC).
Difference Between Hardwired and Microprogrammed Control Unit

Definition
Hardwired Control Unit is a unit that uses combinational logic units, featuring a
finite number of gates that can generate specific results based on the instructions
that were used to invoke those responses. Microprogrammed Control Unit is a
unit that contains microinstructions in the control memory to produce control
signals.

Speed
The speed of operations in Hardwired Control Unit is fast. The speed of
operations in Microprogrammed Control Unit is slow because it requires
frequent memory accesses.
Difference Between Hardwired and Microprogrammed Control Unit

Modification
To do modifications in a Hardwired Control Unit, the entire unit should be
redesigned. In Microprogrammed Control Unit, modifications can be
implemented by changing the microinstructions in the control memory.
Therefore, Microprogrammed Control Unit is more flexible.
Cost
Furthermore, a Hardwired Control Unit is more costly to implement than a
Microprogrammed Control Unit.
Handling Complex Instructions
Also, it is difficult for a Hardwired Control Unit to handle complex instructions,
but easier for a Microprogrammed Control Unit.
Difference Between Hardwired and Microprogrammed Control Unit

Instruction Decoding
Moreover, it is more difficult to perform instruction decoding in a Hardwired
Control Unit than in a Microprogrammed Control Unit.
Instruction set Size
In addition to the above differences, the Hardwired Control Unit uses a small
instruction set while the Microprogrammed Control Unit uses a large instruction
set.
Control Memory
Also, there is no control memory usage in Hardwired Control Unit but, on the
other hand, Microprogrammed Control Unit uses control memory.
Difference Between Hardwired and Microprogrammed Control Unit

Applications
Considering the applications, the Hardwired Control Unit is used in processors
that use a simple instruction set known as the Reduced Instruction Set Computers
(RISC). Microprogrammed Control Unit is used in processors based on a
complex instruction set known as Complex Instruction Set Computer (CISC).
MEMORY HIERARCHY
The computer memory hierarchy looks like a
pyramid structure which is used to describe the
differences among memory types. It separates
computer storage into levels based on this hierarchy.
Level 0: CPU registers
Level 1: Cache memory
Level 2: Main memory or primary memory
Level 3: Magnetic disks or secondary memory
Level 4: Optical disks or magnetic tapes or tertiary memory
In the memory hierarchy, cost per bit and speed decrease while capacity
increases as we move down the levels: the devices are arranged from fast to
slow, from registers to tertiary memory.

Let us discuss each level in detail:


Level-0 (Registers)
The registers are present inside the CPU. As they are present inside the CPU,
they have the least access time. Registers are the most expensive and smallest in
size, generally totalling a few kilobytes. They are implemented using flip-flops.
Level-1 (Cache Memory)
Cache memory is used to store the segments of a program that are frequently
accessed by the processor. It is expensive and smaller in size generally in
Megabytes and is implemented by using static RAM.
Level-2 (Primary or Main Memory)
It directly communicates with the CPU and with auxiliary memory devices
through an I/O processor. Main memory is less expensive than cache memory
and larger in size generally in Gigabytes. This memory is implemented by using
dynamic RAM.
Level-3 (Secondary storage)
Secondary storage devices like Magnetic Disk are present at level 3. They are
used as backup storage. They are cheaper than main memory and larger in size
generally in a few TB.
Level-4 (Tertiary storage)
Tertiary storage devices like magnetic tape are present at level 4. They are
used to store removable files and are the cheapest and largest in size (1-20 TB).
Let us see the memory levels in terms of size, access time, bandwidth.

Level       | Register        | Cache           | Primary memory   | Secondary memory
Bandwidth   | 4K to 32K MB/s  | 800 to 5K MB/s  | 400 to 2K MB/s   | 4 to 32 MB/s
Size        | Less than 1 KB  | Less than 4 MB  | Less than 2 GB   | Greater than 2 GB
Access time | 2 to 5 ns       | 3 to 10 ns      | 80 to 400 ns     | 5 ms
Managed by  | Compiler        | Hardware        | Operating system | OS or user
Why is Memory Hierarchy used in systems?

Memory hierarchy is arranging different kinds of storage present on a computing


device based on speed of access. At the very top, the highest performing storage is
CPU registers which are the fastest to read and write to. Next is cache
memory followed by conventional DRAM memory, followed by disk storage
with different levels of performance including SSD, optical and magnetic disk
drives.
To bridge the processor memory performance gap, hardware designers are
increasingly relying on memory at the top of the memory hierarchy to
close/reduce the performance gap. This is done through increasingly larger cache
hierarchies (which can be accessed by processors much faster), reducing the
dependency on main memory which is slower.
This Memory Hierarchy Design is divided into 2 main types:

●External Memory or Secondary Memory –
Comprising Magnetic Disk, Optical Disk, and Magnetic Tape, i.e. peripheral
storage devices that are accessible to the processor via an I/O module.

●Internal Memory or Primary Memory –
Comprising Main Memory, Cache Memory & CPU registers. This is
directly accessible by the processor.
These are the following characteristics of Memory Hierarchy Design from above
figure:

1. Capacity:
It is the global volume of information the memory can store. As we move
from top to bottom in the Hierarchy, the capacity increases.

2. Access Time:
It is the time interval between the read/write request and the availability of the
data. As we move from top to bottom in the Hierarchy, the access time
increases.
3. Performance:
Earlier, when computer systems were designed without a memory hierarchy,
the speed gap between the CPU registers and Main Memory grew due to the
large difference in access time. This resulted in lower system performance, and
an enhancement was required. That enhancement was the Memory Hierarchy
Design, which increases the performance of the system. One of the most
significant ways to increase system performance is minimizing how far down
the memory hierarchy one has to go to manipulate data.

4. Cost per bit:


As we move from bottom to top in the Hierarchy, the cost per bit increases
i.e. Internal Memory is costlier than External Memory.
MAIN MEMORY

The main memory acts as the central storage unit in a computer system. It is a
relatively large and fast memory which is used to store programs and data
during the run time operations.
The primary technology used for the main memory is based on
semiconductor integrated circuits.
The integrated circuits for the main memory are classified into two major
units.
1.RAM (Random Access Memory) integrated circuit chips
2.ROM (Read Only Memory) integrated circuit chips
1. RAM integrated circuit chips
The RAM integrated circuit chips are further classified into two
possible operating modes, static and dynamic.
The primary compositions of a static RAM are flip-flops that store
the binary information. The nature of the stored information is volatile, i.e.
it remains valid as long as power is applied to the system. The static RAM
is easy to use and takes less time performing read and write operations as
compared to dynamic RAM.
Dynamic RAM stores the binary information in the form of electric
charges on capacitors, which are provided inside the chip by MOS
transistors. Dynamic RAM consumes less power and provides larger
storage capacity in a single memory chip.
RAM chips are available in a variety of sizes and are used as per
the system requirements.
The following block diagram demonstrates the chip interconnection in a 128 * 8
RAM chip.

oA 128 * 8 RAM chip has a memory capacity of 128 words of eight bits (one
byte) per word. This requires a 7-bit address and an 8-bit bidirectional data bus.
o The 8-bit bidirectional data bus allows the transfer of data either from memory
to CPU during a read operation or from CPU to memory during a write
operation.
oThe read and write inputs specify the memory operation, and the two chip
select (CS) control inputs are for enabling the chip only when the
microprocessor selects it.
oThe bidirectional data bus is constructed using three-state buffers.
oThe output generated by three-state buffers can be placed in one of the
three possible states which include a signal equivalent to logic 1, a signal equal
to logic 0, or a high-impedance state.

Note: The logic 1 and 0 are standard digital signals whereas the high-impedance
state behaves like an open circuit, which means that the output does not carry a
signal and has no logic significance.
The following function table specifies the operations of a 128 * 8 RAM chip.

From the functional table above, we can conclude that the unit is in operation only
when CS1 = 1 and CS2 = 0. The bar on top of the second select variable
indicates that this input is enabled when it is equal to 0.
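A behavioral sketch of this chip-select logic in Python (the returned operation names are illustrative; only the CS1 = 1, CS2 = 0 enabling condition comes from the text above):

def ram_chip(cs1, cs2, rd, wr):
    """Select/operation logic of the 128 x 8 RAM chip (behavioral sketch)."""
    if not (cs1 == 1 and cs2 == 0):            # chip enabled only when CS1=1, CS2=0
        return "high-impedance"                # not selected: data bus floats
    if rd:
        return "read"                          # memory word driven onto the data bus
    if wr:
        return "write"                         # data bus contents stored into memory
    return "high-impedance"                    # selected, but no operation requested

print(ram_chip(1, 0, 1, 0))    # 'read'
print(ram_chip(1, 1, 1, 0))    # 'high-impedance' (chip not selected)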
2. ROM integrated circuit
The primary component of the main memory is RAM integrated circuit chips,
but a portion of memory may be constructed with ROM chips.
A ROM memory is used for keeping programs and data that are permanently
resident in the computer.
Apart from the permanent storage of data, the ROM portion of main memory is
needed for storing an initial program called a bootstrap loader. The
primary function of the bootstrap loader program is to start the computer
software operating when power is turned on.
ROM chips are also available in a variety of sizes and are also used as per the
system requirement. The following block diagram demonstrates the chip
interconnection in a 512 * 8 ROM chip.
o A ROM chip has a similar organization as a RAM chip. However, a ROM can
only perform read operation; the data bus can only operate in an output mode.
o The nine address lines in the ROM chip specify any one of the 512 bytes
stored in it.
o The two chip select inputs must be CS1=1 and CS2=0 for the unit to
operate. Otherwise, the data bus is said to be in a high-impedance state.
Memory Address Map
The system designer must calculate the amount of memory required for a given
application and assign it to RAM or ROM.
The interconnection between the processor and the memory is established from
the knowledge of the size of memory required and the types of ROM
and RAM chips available. The addressing of memory can be established by
means of a table that specifies the memory address assigned to each chip. This
table, called the memory address map, is a pictorial representation of the
assigned address space for each chip in the system.
Component | Hexadecimal Address | Address Bus (lines 10 9 8 7 6 5 4 3 2 1)
RAM 1     | 0000-007F           | 0 0 0 x x x x x x x
RAM 2     | 0080-00FF           | 0 0 1 x x x x x x x
RAM 3     | 0100-017F           | 0 1 0 x x x x x x x
RAM 4     | 0180-01FF           | 0 1 1 x x x x x x x
ROM       | 0200-03FF           | 1 x x x x x x x x x

The memory address map for the configuration of 512 bytes RAM and 512 bytes
ROM is shown in table above. The component column specifies whether a RAM
or a ROM chip is used. The hexadecimal address column assigns a range of
hexadecimal equivalent addresses for each chip. The address bus lines are listed in
the third column. The RAM chips have 128 bytes and need seven address lines.
The ROM chip has 512 bytes and needs 9 address lines.
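The decoding implied by this map can be sketched in Python (a behavioral model only, assuming the 10-bit address space shown above):

def decode_address(addr):
    """Decode a 10-bit address per the memory address map above."""
    if addr & (1 << 9):                        # bus line 10 = 1 -> ROM selected
        return ("ROM", addr & 0x1FF)           # 9 address lines into the ROM
    chip = (addr >> 7) & 0b11                  # bus lines 9 and 8 -> 2 x 4 decoder
    return (f"RAM {chip + 1}", addr & 0x7F)    # 7 low-order lines within the chip

for a in (0x0000, 0x0085, 0x0150, 0x01FF, 0x0200, 0x03FF):
    print(hex(a), decode_address(a))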
Memory Connection to CPU

• The data and address buses are


used to connect RAM and ROM
chips to a CPU.
• The low-order lines in the address
bus choose the byte within the
chips and other lines in the
address bus select a particular chip
through its chip select inputs.

Memory
Connection to
the CPU
The connection of memory chips to the CPU is shown in the figure above. This
configuration gives a memory capacity of 512 bytes of RAM and 512 bytes of
ROM. Each RAM receives the seven low-order bits of the address bus to select
one of 128 possible bytes.
The particular RAM chip selected is determined from lines 8 and 9 in the
address bus. This is done through a 2 X 4 decoder whose outputs go to the CS1
inputs in each RAM chip.
Thus, when address lines 8 and 9 are equal to 00, the first RAM chip is selected.
When they are 01, the second RAM chip is selected, and so on.
The RD and WR outputs from the microprocessor are applied to the inputs of
each RAM chip. The selection between RAM and ROM is achieved through bus
line 10. The RAMs are selected when the bit in this line is 0, and the ROM when
the bit is 1.
Address bus lines 1 to 9 are applied to the input address of ROM without going
through the decoder. The data bus of the ROM has only an output
capability, whereas the data bus connected to the RAMs can transfer
information in both directions.
AUXILIARY MEMORY

• Secondary memory is a type of computer memory that is used for long-term


storage of data and programs. It is also known as auxiliary memory or external
memory, and is distinct from primary memory, which is used for short-term
storage of data and instructions that are currently being processed by the CPU.
• Secondary memory devices are typically larger and slower than primary
memory, but offer a much larger storage capacity. This makes them ideal for
storing large files such as documents, images, videos, and other multimedia
content.
• Primary memory has limited storage capacity and is volatile. Secondary
memory overcomes this limitation by providing permanent storage of data and
in bulk quantity.
AUXILIARY MEMORY

• Secondary memory is also termed external memory and refers to the various
storage media on which a computer can store data and programs.
• The Secondary storage media can be fixed or removable. Fixed Storage media
is an internal storage medium like a hard disk that is fixed inside the computer.
A storage medium that is portable and can be taken outside the computer is
termed removable storage media.
• Some examples of secondary memory devices include hard disk drives
(HDDs), solid-state drives (SSDs), magnetic tapes, optical discs such as CDs
and DVDs, and flash memory such as USB drives and memory cards. Each of
these devices uses different technologies to store data, but they all share the
common feature of being non-volatile, meaning that they can store data even
when the computer is turned off.
AUXILIARY MEMORY

• Secondary memory devices are accessed by the CPU via input/output (I/O)
operations, which involve transferring data between the device and
primary memory.
• The speed of these operations is affected by factors such as the type of
device, the size of the file being accessed, and the type of connection between
the device and the computer.
• Overall, secondary memory is an essential component of modern computing
systems and plays a critical role in the storage and retrieval of data and
programs.
Difference between Primary Memory and Secondary Memory:
• Access: Primary memory is directly accessed by the Central Processing Unit (CPU). Secondary memory is not accessed directly by the CPU; data from secondary memory is first loaded into Random Access Memory (RAM) and then sent to the processing unit.
• Speed: RAM provides a much faster access speed to data than secondary memory; typically, primary memory is six times faster. By loading software programs and required files into primary memory (RAM), computers can process data much more quickly.
• Volatility: Primary memory, i.e. Random Access Memory (RAM), is volatile and is completely erased when a computer is shut down. Secondary memory is non-volatile, which means it can hold on to its data with or without an electrical power supply.
Uses of Secondary Storage:

• Permanent Storage: Primary Memory (RAM) is volatile, i.e. it loses all


information when the electricity is turned off, so in order to secure the data
permanently in the device, Secondary storage devices are needed.
• Portability: Storage mediums, like CDs, flash drives can be used to transfer
the data from one device to another.
Fixed and Removable Storage

Fixed Storage-

• Fixed storage is an internal media device that is used by a computer system
to store data; usually these are referred to as fixed disk drives or hard drives.
• Fixed storage devices are not literally fixed: they can be removed
from the system for repair work, maintenance purposes, and also for
upgrades, etc.
• In general, though, this cannot be done without a proper toolkit to open up the
computer system and provide physical access, and that needs
to be done by an engineer.
• Technically, almost all of the data being processed on a computer system
is stored on some type of built-in fixed storage device.
Types of fixed storage:
• Internal flash memory (rare)
• SSD (solid-state disk) units
• Hard disk drives (HDD)

Removable Storage-
Removable storage is an external media device that is used by a computer system
to store data; usually these are referred to as removable disk drives
or external drives. Removable storage is any type of storage device that can
be removed or ejected from a computer system while the system is running. Examples
of external devices include CD, DVD, and Blu-ray disc drives, as well as
diskettes and USB drives. Removable storage makes it easier for a user to
transfer data from one computer system to another. In storage
terms, the main benefit of removable disks is that they can provide the fast data
transfer rates associated with storage area networks (SANs).
Types of Removable Storage:

• Optical discs (CDs, DVDs, Blu-ray discs)


• Memory cards
• Floppy disks
• Magnetic tapes
• Disk packs
• Paper storage (punched tapes, punched cards)
Advantages:

1. Large storage capacity: Secondary memory devices typically have a


much larger storage capacity than primary memory, allowing users to store
large amounts of data and programs.
2. Non-volatile storage: Data stored on secondary memory devices is
typically non- volatile, meaning it can be retained even when the computer is
turned off.
3. Portability: Many secondary memory devices are portable, making it easy
to transfer data between computers or devices.
4. Cost-effective: Secondary memory devices are generally more cost-
effective than primary memory.
Disadvantages:

1. Slower access times: Accessing data from secondary memory devices typically
takes longer than accessing data from primary memory.
2. Mechanical failures: Some types of secondary memory devices, such as hard
disk drives, are prone to mechanical failures that can result in data loss.
3. Limited lifespan: Secondary memory devices have a limited lifespan, and
can only withstand a certain number of read and write cycles before they fail.
4. Data corruption: Data stored on secondary memory devices can become corrupted
due to factors such as electromagnetic interference, viruses, or physical damage.
Overall, secondary memory is an essential component of modern computing systems,
but it also has its limitations and drawbacks. The choice of a particular
secondary memory device depends on the user’s specific needs and requirements.
ASSOCIATIVE MEMORY

An associative memory can be treated as a memory unit whose stored
information is identified for access by the content of the information
itself rather than by an address or memory location. Associative memory is also
known as Content Addressable Memory (CAM).
The block diagram of associative memory is
shown in the figure above. It consists of a
memory array and logic for m words with n bits
per word.
The argument register A and key register K each
have n bits, one for each bit of a word. The
match register M has m bits, one for each
memory word. Each word in memory is
compared in parallel with the content of the
argument register. The words that match the
bits of the argument register set a corresponding
bit in the match register.
After the matching process, those bits in the match register that have been set
indicate the fact that their corresponding words have been matched.
Reading is accomplished by a sequential access to memory for those
words whose corresponding bits in the match register have been set.
The key register provides a mask for selecting a particular field or key in the
argument word. The entire argument is compared with each memory word if
the key register contains all 1's.
Otherwise, only those bits of the argument that have 1's in the corresponding
positions of the key register are compared. Thus, the key provides a mask for
identifying a piece of data and determines how the reference to memory is
made.
The following figure can define the relation between the memory array and
the external registers in associative memory.
The cells in the array are identified by the letter C with two subscripts. The
first subscript gives the word number and the second the bit
position in the word. Thus, cell Cij is the cell for bit j in word i.

A bit Aj in the argument register is compared with all the bits in column j of the
array, provided that Kj = 1. This is done for all columns j = 1, 2, ..., n.

If a match occurs between all the unmasked bits of the argument and the bits
in word i, the corresponding bit Mi in the match register is set to 1. If one or
more unmasked bits of the argument and the word do not match, Mi is cleared
to 0.
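This masked parallel comparison is easy to model in software; the Python sketch below (with invented 4-bit words) computes the match register M from an argument A and key K:

def cam_match(words, argument, key):
    """Return the match register M: one bit per stored word.

    A word matches when it agrees with the argument register A in every
    bit position where the key register K holds a 1 (masked comparison).
    """
    return [1 if (word ^ argument) & key == 0 else 0 for word in words]

memory = [0b1010, 0b1001, 0b0110, 0b1000]    # m = 4 words, n = 4 bits each
A, K = 0b1011, 0b1100                        # compare only the two high bits
print(cam_match(memory, A, K))               # -> [1, 1, 0, 1]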
Applications of Associative memory:

1. It can be used in memory allocation and management.


2. It is widely used in the database management systems, etc.
3. Networking: Associative memory is used in network routing tables to
quickly find the path to a destination network based on its address.
4. Image processing: Associative memory is used in image processing
applications to search for specific features or patterns within an image.
5. Artificial intelligence: Associative memory is used in artificial intelligence
applications such as expert systems and pattern recognition.
6. Database management: Associative memory can be used in database
management systems to quickly retrieve data based on its content.
Advantages of Associative memory:

1. It is used where the search time needs to be short.
2. It is suitable for parallel searches.
3. It is often used to speed up databases.
4. It is used in the page tables used by virtual memory and in neural
networks.

Disadvantages of Associative memory:

1. It is more expensive than RAM.
2. Each cell must have storage capability as well as logic circuits for matching its
content with an external argument.
CACHE MEMORY

The data or contents of the main memory that are used frequently by CPU are
stored in the cache memory so that the processor can easily access that data in a
shorter time. Whenever the CPU needs to access memory, it first checks the
cache memory. If the data is not found in cache memory, then the CPU moves
into the main memory.
Cache memory is placed between the CPU and the main memory. The block
diagram for a cache memory can be represented as:
The cache is the fastest component in the memory hierarchy and approaches the
speed of CPU components.

Need of cache memory

Data in primary memory can be accessed faster than in secondary memory, but the
access time of primary memory is still long compared with the CPU, which is
capable of performing operations in nanoseconds. Due to this time lag
between requesting data and operating on it, the performance of the system decreases:
the CPU is not utilized properly and may remain idle for some time. To
minimize this time gap, a new segment of memory was introduced, known as Cache
Memory.
Types of Cache Memory

L1 or Level 1 Cache: It is the first level of cache memory that is present inside
the processor. It is present in a small amount inside every core of the processor
separately. The size of this memory ranges from 2KB to 64 KB.
L2 or Level 2 Cache: It is the second level of cache memory, which may be present
inside or outside the CPU. If not present inside the core, it can be shared between
two cores, depending upon the architecture, and is connected to the processor
with a high-speed bus. The size of this memory ranges from 256 KB to 512 KB.
L3 or Level 3 Cache: It is the third level of cache memory that is present outside
the CPU and is shared by all the cores of the CPU. Some high processors may
have this cache. This cache is used to increase the performance of the L2 and L1
cache. The size of this memory ranges from 1 MB to 8MB.
Locality of Reference

•The effectiveness of the cache mechanism is based on a property of computer
programs called "locality of reference".
•The references to memory at any given time interval tend to be confined
within a localized area.
•Analysis of programs shows that most of their execution time is spent on
routines in which instructions are executed repeatedly. These instructions may
be loops, nested loops, or a few procedures that call each other.
•Many instructions in localized areas of a program are executed repeatedly
during some time period, while the remainder of the program is accessed
relatively infrequently.
•This property is called "locality of reference".
Locality of reference is manifested in two ways:

1. Temporal means that a recently executed instruction is likely to be executed


again very soon. The information which will be used in near future is likely to
be in use already (e.g. reuse of information in loops)

2. Spatial means that instructions in close proximity to a recently executed


instruction are also likely to be executed soon. If a word is accessed, adjacent
(near) words are likely to be accessed soon (e.g. related data items
(arrays) are usually stored together; instructions are executed sequentially)
Principles of cache

The main memory can store 32k words of 12 bits each. The cache is capable of
storing 512 of these words at any given time. For every word stored, there is a
duplicate copy in main memory. The CPU communicates with both memories. It
first sends a 15-bit address to cache. If there is a hit, the CPU accepts the 12-bit
data from cache. If there is a miss, the CPU reads the word from main memory and
the word is then transferred to cache.
•When a read request is received from CPU, contents of a block of
memory words containing the location specified are transferred in to cache.
•When the program references any of the locations in this block, the contents are
read from the cache. The number of blocks in the cache is smaller than the
number of blocks in main memory.
•Correspondence between main memory blocks and those in the cache is
specified by a mapping function.
•Assume cache is full and memory word not in cache is referenced.
•Control hardware decides which block from cache is to be removed to create
space for new block containing referenced word from memory.
•Collection of rules for making this decision is called “Replacement algorithm”
Cache performance
●On searching in the cache if data is found, a cache hit has occurred.
●On searching in the cache if data is not found, a cache miss has occurred.

Performance of cache is measured by the number of cache hits to the number of


searches. This parameter of measuring performance is known as the Hit Ratio.

Hit ratio = (Number of cache hits)/(Number of searches)
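A quick worked example of this formula (the numbers are invented):

def hit_ratio(hits, searches):
    """Hit ratio = (number of cache hits) / (number of searches)."""
    return hits / searches

print(hit_ratio(950, 1000))   # 0.95, i.e. 95% of accesses were served by the cache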


Cache Mapping

As we know that the cache memory bridges the mismatch of speed between the
main memory and the processor. Whenever a cache hit occurs,
• The word that is required is present in the memory of the cache. Then the
required word would be delivered from the cache memory to the CPU.
• And, whenever a cache miss occurs, the required word is not present in the
cache memory; the block containing the required word then has to be mapped
from the main memory.
• We can perform such a type of mapping using various different techniques of
cache mapping.
Process of Cache Mapping
The process of cache mapping helps us define how a certain block that is present
in the main memory gets mapped to the memory of a cache in the case of any
cache miss.
In simpler words, cache mapping refers to a technique using which we bring the
main memory into the cache memory. Here is a diagram that illustrates the actual
process of mapping:
Important Note:

• The main memory gets divided into multiple partitions of equal size, known
as the frames or blocks.
• The cache memory is actually divided into various partitions of the same sizes
as that of the blocks, known as lines.
• During the process of cache mapping, the main memory block is simply
copied to the cache; the block is not removed from the main memory.
Cache Mapping Functions

Correspondence between main memory blocks and those in the cache is specified
by a memory mapping function.

There are three techniques in memory mapping


1.Direct Mapping
2.Fully Associative Mapping
3.Set Associative Mapping
Direct Mapping
In direct mapping, the cache consists of normal high-speed random-access
memory. A certain block of the main memory would be able to map a cache only
up to a certain line of the cache. The total line numbers of cache to which any
distinct block can map are given by the following:
Cache line number = (Address of the Main Memory Block) Modulo (Total
number of lines in Cache)
For example,
Let us consider that a particular cache memory is divided into a total of ‘n’
lines.
Then, block ‘j’ of the main memory would be able to map only to line number
(j mod n) of the cache.
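A minimal sketch of this modulo rule in Python (the cache size and block addresses are illustrative):

def direct_map(block_address, num_lines):
    """Direct mapping: each main memory block maps to exactly one cache line."""
    line = block_address % num_lines    # cache line number = block mod lines
    tag = block_address // num_lines    # tag distinguishes blocks sharing a line
    return line, tag

# With a 4-line cache, blocks 0, 4 and 8 all compete for line 0:
for block in (0, 4, 8, 5):
    print(block, direct_map(block, 4))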
The Need for Replacement Algorithm
In the case of direct mapping,
• There is no requirement for a replacement
algorithm.
• It is because the block of the main memory
would be able to map to a certain line of the
cache only.
• Thus, the incoming (new) block always happens
to replace the block that already exists, if any, in
this certain line.
Division of Physical Address
In the case of direct mapping, the physical address is divided into three fields:
the tag, the cache line number, and the word (byte) offset within the block.
Fully Associative Mapping

In the case of fully associative mapping,


• The main memory block is capable of
mapping to any given line of the cache
that’s available freely at that particular
moment.
• It helps us make a fully associative
mapping comparatively more flexible
than direct mapping.
Let us consider the scenario given as
follows:
Here, we can see that,
• Every single line of cache is available freely.
• Thus, any main memory block can map to a line of the cache.
• In case all the cache lines are occupied, one of the blocks that exists already
needs to be replaced.

The Need for Replacement Algorithm


In the case of fully associative mapping,
• The replacement algorithm is always required.
• The replacement algorithm suggests a block that is to be replaced whenever
all the cache lines happen to be occupied.
• So replacement algorithms such as LRU Algorithm, FCFS Algorithm, etc. are
employed.
Division of Physical Address
In the case of fully associative mapping, the physical address is divided into two
fields: the tag and the word (byte) offset within the block.
K-way Set Associative Mapping
In the case of k-way set associative mapping,
• The grouping of the cache lines occurs into various sets where all the sets
consist of k number of lines.
• Any given main memory block can map only to a particular cache set.
• However, within that very set, the block of memory can map any cache line
that is freely available.
• The cache set to which a certain main memory block can map is basically
given as follows:
Cache set number = (Block Address of the Main Memory) Modulo (Total
Number of sets present in the Cache)
Let us consider the example given as follows of a two-way set-associative
mapping:
In this case,
• k = 2 would suggest that every set consists of
two cache lines.
• Since the cache consists of 6 lines, the total
number of sets that are present in the cache =
6/2 = 3 sets.
• Block ‘j’ of the main memory is capable of
mapping only to set number (j mod 3) of the
cache.
• Here, within this very set, the block ‘j’ is
capable of mapping to any cache line that is
freely available at that moment.
• In case all the available cache lines happen to be
occupied, then one of the blocks that already
exist needs to be replaced.
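A minimal sketch of the set-number computation for the two-way example above (Python, illustrative values):

def set_number(block_address, num_sets):
    """k-way set-associative mapping: a block maps to one set, any free line within it."""
    return block_address % num_sets

num_lines, k = 6, 2            # the slide's example: 6 lines, 2-way
num_sets = num_lines // k      # -> 3 sets
for block in range(8):
    print(block, set_number(block, num_sets))   # blocks 0, 3, 6 -> set 0, etc.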
The Need for Replacement Algorithm
In the case of k-way set associative mapping,
• The k-way set associative mapping refers to a combination of the direct
mapping as well as the fully associative mapping.
• It makes use of the fully associative mapping that exists within each set.
• Therefore, the k-way set associative mapping needs a certain type of
replacement algorithm.

Division of Physical Address

In the case of k-way set associative mapping, the physical address is divided into
three fields: the tag, the set number, and the word (byte) offset within the block.
Special Cases

• In case k = 1, k-way set associative mapping becomes direct
mapping. Thus, Direct Mapping = one-way set associative mapping.
• In case k equals the total number of lines present in the cache, k-way
set associative mapping becomes fully associative mapping.
CACHE REPLACEMENT POLICIES
Replacement algorithms are used when there is no available space in a cache in
which to place data.
A replacement algorithm is needed to decide which page needs to be replaced
when a new page comes in. Whenever a new page is referred to and is not
present in memory, a page fault occurs and the Operating System replaces one
of the existing pages with the newly needed page.
Four of the most common cache replacement algorithms are described below:
1. FIFO (First In First Out) Policy
• The block which entered the cache first is replaced first.
• This can lead to a problem known as "Belady's Anomaly", which states that
increasing the number of lines in the cache memory can increase the number
of cache misses.
• Belady's Anomaly: For some cache replacement algorithms, the page fault
or miss rate increases as the number of allocated frames increases.
• Example: Let us take the reference sequence 7, 0, 1, 2, 0, 3, 0, 4, 2, 3, with a
cache memory of 4 lines.
There are a total of 6 misses in the FIFO replacement policy.
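A short simulation confirms the count; this is a minimal sketch of the FIFO
policy, not code from the source:

    from collections import deque

    def fifo_misses(refs, lines):
        cache, order, misses = set(), deque(), 0
        for block in refs:
            if block not in cache:
                misses += 1
                if len(cache) == lines:           # cache full: evict the oldest block
                    cache.remove(order.popleft())
                cache.add(block)
                order.append(block)
        return misses

    print(fifo_misses([7, 0, 1, 2, 0, 3, 0, 4, 2, 3], 4))  # prints 6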
2. LRU (Least Recently Used)
• The page which was not used for the longest period of time in the past is
replaced first.
• We can think of this strategy as the optimal cache-replacement algorithm
looking backward in time, rather than forward.
• LRU generally performs much better than FIFO replacement.
• LRU is also called a stack algorithm and can never exhibit Belady's anomaly.
• The most important practical problem is how to implement LRU replacement.
An LRU replacement algorithm may require substantial hardware support.
• Example: Let us take the reference sequence 7, 0, 1, 2, 0, 3, 0, 4, 2, 3, with a
cache memory of 3 lines.
There are a total of 8 misses (and 2 hits) in the LRU replacement policy.
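Again, a minimal sketch (not from the source) verifies the count:

    from collections import OrderedDict

    def lru_misses(refs, lines):
        cache, misses = OrderedDict(), 0
        for block in refs:
            if block in cache:
                cache.move_to_end(block)       # mark as most recently used
            else:
                misses += 1
                if len(cache) == lines:
                    cache.popitem(last=False)  # evict the least recently used block
                cache[block] = True
        return misses

    print(lru_misses([7, 0, 1, 2, 0, 3, 0, 4, 2, 3], 3))  # prints 8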
3. LFU (Least Frequently Used):
This cache algorithm uses a counter to keep track of how often an entry is
accessed. With the LFU cache algorithm, the entry with the lowest count is
removed first. This method isn't used that often, as it does not account for an
item that had an initially high access rate and then was not accessed for a long
time.
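The counter mechanism behind LFU can be sketched as follows; the entries and
counts are illustrative only:

    from collections import Counter

    counts = Counter({"A": 5, "B": 1, "C": 3})   # access count per cached entry
    counts["B"] += 1                             # each access bumps the entry's counter

    def lfu_victim(counts):
        # The entry with the lowest count is removed first.
        return min(counts, key=counts.get)

    print(lfu_victim(counts))  # prints 'B', still the least frequently used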
4. Random Replacement (RR):
This algorithm randomly selects a block to replace when the cache reaches
maximum capacity. It has the benefit of not keeping any reference or history
of objects while being very simple to implement.
This algorithm has been used in ARM processors and the famous Intel i860.
Cache Design Issues
1.Cache Addresses:
-Logical Cache/Virtual Cache stores data using virtual addresses. It accesses
the cache directly without going through the MMU.
-Physical Cache stores data using main memory physical addresses.
One obvious advantage of the logical cache is that cache access speed is
faster than for a physical cache, because the cache can respond before the
MMU performs an address translation.
The disadvantage has to do with the fact that most virtual memory
systems supply each application with the same virtual memory address space.
That is, each application sees a virtual memory that starts at address 0. Thus, the
same virtual address in two different applications refers to two different
physical addresses. The cache memory must therefore be completely flushed
with each application switch, or extra bits must be added to each line of the
cache to identify which virtual address space this address refers to.
2. Cache Size
The larger the cache, the larger the number of gates involved in addressing the
cache. The available chip and board area also limit cache size.
The more cache a system has, the more likely it is to register a hit on memory
access because fewer memory locations are forced to share the same cache line.
Although an increase in cache size will increase the hit ratio, a continuous
increase in cache size will not yield an equivalent increase of the hit ratio.
Note: An increase in cache size from 256K to 512K (an increase of 100%)
will yield a 10% improvement of the hit ratio, but an additional increase from
512K to 1024K would yield less than a 5% increase of the hit ratio (law of
diminishing marginal returns).
3. Replacement Algorithm
Once the cache has been filled, when a new block is brought into the cache, one
of the existing blocks must be replaced.
For direct mapping, there is only one possible line for any particular block, and
no choice is possible.
Direct mapping — no choice: each block maps to only one line, so that line is
replaced.
For the associative and set-associative techniques, a replacement algorithm
is needed. To achieve high speed, such an algorithm must be implemented in
hardware.
Least Recently Used (LRU) — Most Effective
For two-way set associative, this is easily implemented. Each line includes a
USE bit. When a line is referenced, its USE bit is set to 1 and the USE bit of the
other line in that set is set to 0. When a block is to be read into the set, the line
whose USE bit is 0 is used.
Because we are assuming that more recently used memory locations are
more likely to be referenced, LRU should give the best hit ratio. LRU is also
relatively easy to implement for a fully associative cache. The cache
mechanism maintains a separate list of indexes to all the lines in the cache.
When a line is referenced, it moves to the front of the list. For replacement, the
line at the back of the list is used. Because of its simplicity of implementation,
LRU is the most popular replacement algorithm.
4. Write Policy
When saving changes to main memory, there are two techniques involved:
Write Through:
• Every time an operation occurs, you store to main memory as well as cache
simultaneously. Although that may take longer, it ensures that main memory is
always up to date and this would decrease the risk of data loss if the system
would shut off due to power loss. This is used for highly sensitive
information.
• One of the central caching policies is known as write-through. This means
that data is stored and written into the cache and to the primary storage device
at the same time.
• One advantage of this policy is that it ensures information will be stored safely
without risk of data loss. If the computer crashes or the power goes out, data
can still be recovered without issue.
• To keep data safe, this policy has to perform every write operation twice. The
program or application that is being used must wait until the data has been
written to both the cache and storage device before it can proceed.
• This comes at the cost of system performance but is highly recommended for
sensitive data that cannot be lost.
• Many businesses that deal with sensitive customer information such as
payment details would most likely choose this method since that data is very
critical to keep intact.
Write Back:
• Saves data to cache only.
• At certain intervals, or under certain conditions, the data is then saved to
main memory.
• Disadvantage: there is a higher probability of data loss if the system fails
before the cache contents are written back. (A small sketch contrasting the
two policies follows.)
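The contrast between the two policies can be shown in a few lines of Python; this
is a toy model under assumed names, not a real cache controller:

    cache, memory, dirty = {}, {}, set()

    def write_through(addr, value):
        cache[addr] = value
        memory[addr] = value   # every write also goes to main memory at once

    def write_back(addr, value):
        cache[addr] = value
        dirty.add(addr)        # main memory is updated later, only for dirty lines

    def flush():               # e.g., on eviction or at intervals
        for addr in dirty:
            memory[addr] = cache[addr]
        dirty.clear()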
5. Line Size
Another design element is the line size. When a block of data is retrieved and
placed in the cache, not only the desired word but also some number of adjacent
words are retrieved.
As the block size increases from very small to larger sizes, the hit ratio will at
first increase because of the principle of locality, which states that data in the
vicinity of a referenced word are likely to be referenced in the near future.
As the block size increases, more useful data are brought into the cache. The hit
ratio will begin to decrease, however, as the block becomes even bigger and the
probability of using the newly fetched information becomes less than the
probability of reusing the information that has to be replaced.
Two specific effects come into play:
• Larger blocks reduce the number of blocks that fit into a cache. Because each
block fetch overwrites older cache contents, a small number of blocks results
in data being overwritten shortly after it is fetched.
• As a block becomes larger, each additional word is farther from the requested
word and therefore less likely to be needed in the near future.
6. Number of Caches
Multilevel Caches:
•On-chip cache accesses are faster than accesses to a cache reachable via an
external bus.
•On-chip cache reduces the processor's external bus activity and therefore
speeds up execution time and system performance, since bus access times are
eliminated.
•L1 cache is always on chip (the fastest level).
•L2 cache could be off the chip in static RAM.
•L2 cache doesn't use the system bus as the path for data transfer between the
L2 cache and the processor; it uses a separate data path to reduce the burden
on the system bus. (The system bus takes longer to transfer data.)
•In modern computer designs the L2 cache may be on the chip, which means
that an L3 cache can be added over the external bus. Some L3 caches can be
installed on the microprocessor as well.
•In all of these cases there is a performance advantage to adding a third-level
cache.
Unified (One cache for data and instructions) vs Split (two, one for data
and one for instructions)
These two caches both exist at the same level, typically as two L1 caches. When
the processor attempts to fetch an instruction from main memory, it first consults
the instruction L1 cache, and when the processor attempts to fetch data from
main memory, it first consults the data L1 cache.
7. Mapping Function
Because there are fewer cache lines than main memory blocks, an algorithm is
needed for mapping main memory blocks into cache lines
Further, a means is needed for determining which main memory block currently
occupies a cache line. The choice of the mapping function dictates how the
cache is organized. Three techniques can be used: direct, associative, and set-
associative.
Cache vs RAM
Although cache and RAM are both used to increase the performance of the
system, they differ considerably in how they operate to increase the efficiency
of the system.
• Size: RAM is larger, generally ranging from 1MB to 16GB; the cache is
smaller, ranging from 2KB to a few MB.
• Contents: RAM stores data that is currently being processed by the
processor; the cache holds frequently accessed data.
• Data path: The OS interacts with secondary memory to get data to be stored
in primary memory (RAM); it interacts with primary memory to get data to
be stored in the cache.
• Misses: Data is loaded into RAM before the CPU accesses it, so a RAM miss
never occurs; the CPU searches the cache for data, and a cache miss occurs
if it is not found.
Differences between associative and cache memory:
• A memory unit accessed by content is called associative memory; a fast and
small memory is called cache memory.
• Associative memory reduces the time required to find an item stored in
memory; cache memory reduces the average memory access time.
• In associative memory, data is accessed by its content; in cache memory,
data is accessed by its address.
• Associative memory is used where search time must be very short; cache
memory is used when a particular group of data is accessed repeatedly.
• The basic characteristic of associative memory is its logic circuit for
matching content; the basic characteristic of cache memory is its fast access.
• Associative memory is not as expensive as cache memory.
• Associative memory is suitable for parallel data search mechanisms; cache
memory is useful in increasing the efficiency of data retrieval.
Advantages of Cache Memory
• It is faster than the main memory.
• It provides a path for fast data transfer, so it consumes less access time
compared to main memory.
• It stores frequently accessed data that can be served within a short period
of time.
Disadvantages of Cache Memory
• It is a limited-capacity memory.
• It is very expensive compared to main memory (RAM) and the hard disk.
PAGING AND SEGMENTATION
Paging and segmentation are processes by which data is stored to, and then
retrieved from, a computer's storage disk.
Paging is a computer memory management function that presents storage
locations to the computer's CPU as additional memory, called virtual
memory. Each piece of data needs a storage address.
Segmentation is a virtual process that creates variable-sized address spaces in
computer storage for related data, called segments. This process speeds up
retrieval.
Managing computer memory is a basic operating system function -- both
paging and segmentation are basic functions of the OS. No system can
efficiently rely on limited RAM alone, so the computer's memory
management unit (MMU) uses the storage disk, HDD or SSD, as virtual
memory to supplement RAM.
Let's look in-depth at paging, then at segmentation.
WHAT IS PAGING?
As mentioned above, the memory management function called paging
specifies storage locations to the CPU as additional memory, called virtual
memory. The CPU cannot directly access the storage disk, so the MMU
emulates memory by mapping pages to frames that are in RAM.
Before we launch into a more detailed explanation of pages and frames,
let's define some technical terms.
• Page: A fixed-length contiguous block of virtual memory residing on disk.
• Frame: A fixed-length contiguous block located in RAM whose size is
identical to that of a page.
• Physical memory: The computer’s random access memory (RAM), typically
contained in DIMM cards attached to the computer’s motherboard.
• Virtual memory: Virtual memory is a portion of an HDD or SSD that is
reserved to emulate RAM. The MMU serves up virtual memory from disk to
the CPU to reduce the workload on physical memory.
• Virtual address: The CPU generates a virtual address for each active
process. The MMU maps the virtual address to a physical location in RAM
and passes the address
to the bus. A virtual address space is the range of virtual addresses under CPU
control.
• Physical address: The physical address is a location in RAM. The physical
address space is the set of all physical addresses corresponding to the CPU’s
virtual addresses. A physical address space is the range of physical addresses
under MMU control.
By assigning an address to a piece of data using a "page table" between the
CPU and the computer's physical memory, a computer's MMU enables the
system to retrieve that data whenever needed.
Paging
• External fragmentation is avoided by using the paging technique.
• Paging is a technique in which physical memory is broken into fixed-size
blocks called frames, and logical memory into blocks of the same size called
pages (the size is a power of 2, between 512 bytes and 8192 bytes).
• When a process is to be executed, its corresponding pages are loaded into any
available memory frames.
• The logical address space of a process can be non-contiguous, and a process
is allocated physical memory whenever a free memory frame is available.
• The operating system keeps track of all free frames. The operating system
needs n free frames to run a program of size n pages.
The address generated by the CPU is divided into (a small sketch of the
translation follows the list):
• Page number (p) -- the page number is used as an index into a page table,
which contains the base address of each page in physical memory.
• Page offset (d) -- the page offset is combined with the base address to
define the physical memory address.
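A minimal Python sketch of the translation, assuming a 4 KB page size and a
hypothetical page table (all names and values are illustrative):

    PAGE_SIZE = 4096                    # assumed page size (a power of 2)
    page_table = {0: 5, 1: 9, 2: 3}     # hypothetical page -> frame mapping

    def translate(virtual_address: int) -> int:
        p, d = divmod(virtual_address, PAGE_SIZE)  # page number and page offset
        frame = page_table[p]                      # index the page table
        return frame * PAGE_SIZE + d               # combine frame base with offset

    print(hex(translate(0x1234)))  # page 1, offset 0x234 -> frame 9 -> 0x9234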
THE PAGING PROCESS
A page table stores the definition of each page. When an active process requests
data, the MMU retrieves the corresponding pages into frames located in physical
memory for faster processing. The process is called paging.
The MMU uses page tables to translate virtual addresses to physical ones. Each
table entry indicates where a page is located: in RAM or on disk as virtual
memory. A system may use a single-level or multi-level page table, for example
different tables for applications and segments.
However, constant table lookups can slow down the MMU. A memory
cache called the Translation Lookaside Buffer (TLB) stores recent translations
of virtual to physical addresses for rapid retrieval. Many systems have multiple
TLBs, which may reside at different locations, including between the CPU and
RAM, or between multiple page table levels.
Different frame sizes are available for data sets with larger or smaller pages and
matching-sized frames. 4KB to 2MB are common sizes, and GB-sized frames
are available in high-performance servers.
An issue called hidden fragmentation used to be a problem in older Windows
deployments (95, 98, and Me). The problem was internal (or hidden)
fragmentation. Unlike the serious external fragmentation of segmentation,
internal fragmentation occurred when a page did not fill its frame exactly,
leaving part of the frame unused. However, this is not an issue in modern
Windows OS.
WHAT IS SEGMENTATION?
The process known as segmentation is a virtual process that creates address
spaces of various sizes in a computer system, called segments. Each segment is a
different virtual address space that directly corresponds to process objects.
When a process executes, segmentation assigns related data into segments for
faster processing. The segmentation function maintains a segment table that
includes the physical address of each segment, its size, and other data.
In segmentation, the CPU generates a logical address that contains the segment
number and segment offset. If the segment offset is less than the limit, the
address is valid; otherwise, the address is invalid and an addressing error is
raised.
The above figure shows the translation of a logical address to a physical address.
Segmentation speeds up a computer's information retrieval by assigning related
data into a "segment table" between the CPU and the physical memory.
• Segmentation is a technique to break memory into logical pieces where
each piece represents a group of related information.
• For example, there may be a code segment and a data segment for each
process, a data segment for the operating system, and so on.
• Segmentation can be implemented with or without paging.
• Unlike pages, segments have varying sizes, which eliminates internal
fragmentation.
• External fragmentation still exists, but to a lesser extent.
The address generated by the CPU is divided into (a small sketch of the
translation follows the list):
• Segment number (s) -- the segment number is used as an index into a
segment table, which contains the base address of each segment in physical
memory and the limit of the segment.
• Segment offset (o) -- the segment offset is first checked against the limit and
is then combined with the base address to define the physical memory
address.
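A minimal Python sketch of segment translation with the limit check; the
segment table values are illustrative assumptions:

    # Hypothetical segment table: segment number -> (base, limit).
    segment_table = {0: (1400, 1000), 1: (6300, 400)}

    def translate(s: int, offset: int) -> int:
        base, limit = segment_table[s]
        if offset >= limit:              # the offset must be less than the limit
            raise MemoryError("invalid address: offset exceeds segment limit")
        return base + offset             # valid: combine base and offset

    print(translate(1, 100))  # prints 6400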
THE SEGMENTATION PROCESS
Each segment stores the process's primary function, data structures, and utilities.
The CPU keeps a segment map table for every process and its memory blocks,
along with segment identification and memory locations.
The CPU generates virtual addresses for running processes. Segmentation
translates the CPU-generated virtual addresses into physical addresses that refer
to a unique physical memory location. The translation is not strictly one-to-one:
different virtual addresses can map to the same physical address.
THE CHALLENGE OF FRAGMENTATION
Although segmentation is a high-speed and highly secure memory management
function, external fragmentation proved to be an insurmountable challenge.
Segmentation causes external fragmentation to the point that modern x86-64
servers treat it as a legacy feature and only support it for backwards
compatibility.
External fragmentation occurs when unusable memory is located outside of
allocated memory blocks. The issue is that the system may have enough memory
to satisfy a process request, but the available memory is not in a contiguous
location. In time, the fragmentation worsens and significantly slows the
segmentation process.
SEGMENTED PAGING
Some modern computers use a function called segmented paging. Main memory is
divided into variably-sized segments, which are then divided into smaller fixed-size
pages on disk. Each segment contains a page table, and there are multiple page
tables per process.
Each of the tables contains information on every segment page, while the segment
table has information about every segment. Segment tables are mapped to page
tables, and page tables are mapped to individual pages within a segment.
Advantages include less memory usage, more flexibility on page sizes, simplified
memory allocation, and an additional level of data access security over paging. The
process does not cause external fragmentation.
Advantages of Paging:
• At the programmer level, paging is a transparent function and does not
require intervention.
• No external fragmentation.
• No internal fragmentation on updated OSs.
• Frames do not have to be contiguous.
Disadvantages of Paging:
• Paging causes internal fragmentation on older systems.
• Longer memory lookup times than segmentation; remedied with TLB memory
caches.
Advantages of Segmentation:
• No internal fragmentation.
• Segment tables consume less space than page tables, taking up less memory.
• Average segment sizes are larger than most page sizes, which allows segments
to store more process data.
• Less processing overhead.
• Simpler to relocate segments than to relocate contiguous address spaces on
disk.
Disadvantages of Segmentation:
• Treated as legacy technology in x86-64 servers.
• Linux supports segmentation only on 80x86 microprocessors; its
documentation states that paging simplifies memory management by using the
same set of linear addresses.
• Porting Linux to different architectures is problematic because of limited
segmentation support.
• Requires programmer intervention.
• Subject to serious external fragmentation.
KEY DIFFERENCES: PAGING AND SEGMENTATION
Size:
• Paging: Fixed block size for pages and frames; the computer hardware
determines page/frame sizes.
• Segmentation: Variable-size segments are user-specified.
Fragmentation:
• Paging: Older systems were subject to internal fragmentation by not allocating
entire pages to memory. Modern OSs no longer have this problem.
• Segmentation: Segmentation leads to external fragmentation.
Tables:
• Paging: Page tables direct the MMU to page location and status. This is a
slower process than segmentation tables, but TLB memory caches accelerate it.
• Segmentation: Segmentation tables contain segment ID and information, and
are faster than direct paging table lookups.
Availability:
• Paging: Widely available on CPUs and as MMU chips.
• Segmentation: Windows servers may support it for backwards compatibility,
while Linux has very limited support.
MEMORY MANAGEMENT HARDWARE
Memory management hardware plays a crucial role in computer architecture
by efficiently allocating and managing the available memory resources. It is
responsible for managing and organizing the memory subsystem, ensuring
optimal performance, and facilitating the execution of various programs
and processes. This section explores the various aspects of memory
management hardware, including its components, functions, and techniques.
Components of Memory Management Hardware

The memory management hardware consists of several key components that work
together to ensure efficient memory usage and allocation. These components include:
• Memory Management Unit (MMU): The MMU is a critical component of
memory management hardware. It translates virtual addresses to physical
addresses, enabling the system to access the correct memory location.
• Translation Lookaside Buffer (TLB): The TLB is a cache that stores
recently used virtual-to-physical address translations, speeding up the address
translation process.
• Memory Segmentation Unit: This unit divides the memory into logical
segments to organize and manage memory resources efficiently.
• Memory Protection Unit (MPU): The MPU ensures the security and
protection of memory by enforcing access permissions and preventing
unauthorized access.
Memory Management Unit (MMU)
The Memory Management Unit (MMU) is a key component of memory
management hardware in computer architecture. It performs the essential
task of translating virtual addresses generated by the CPU into physical
addresses, allowing the system to access the correct memory location.
The MMU works in conjunction with the operating system's memory
management software to allocate and manage memory resources effectively.
It uses a technique called address translation, which involves converting
virtual addresses to physical addresses by utilizing page tables or translation
tables.
The MMU also plays a vital role in memory protection by implementing
memory access control and prevention mechanisms. It enforces access
permissions, ensuring that each process can only access its allocated memory
and preventing unauthorized access to sensitive information.
Translation Lookaside Buffer (TLB)
The Translation Lookaside Buffer (TLB) is a cache in the memory management
hardware that stores recently used virtual-to-physical address translations. It acts
as a high-speed memory for address translation, improving the overall
performance of the system.
When the CPU generates a virtual address, the TLB checks if the translation for
that address is available in its cache. If the translation is found, the TLB provides
the corresponding physical address, eliminating the need for a time-consuming
lookup in the page tables or translation tables.
The TLB operates on the principle of locality, which states that recently
accessed memory locations are likely to be accessed again in the near
future. By storing frequently used translations, the TLB reduces the
overhead of address translation, improving system performance.
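The TLB's role as a small, fast cache in front of the page table can be sketched
as follows; this is a toy model with assumed sizes, not a hardware description:

    from collections import OrderedDict

    page_table = {p: p + 100 for p in range(64)}  # stand-in translation table
    tlb, TLB_SLOTS = OrderedDict(), 4             # tiny TLB with 4 entries

    def lookup(page: int) -> int:
        if page in tlb:                 # TLB hit: fast path, no table walk
            tlb.move_to_end(page)       # refresh recency, per locality of reference
            return tlb[page]
        frame = page_table[page]        # TLB miss: slower page-table lookup
        if len(tlb) == TLB_SLOTS:
            tlb.popitem(last=False)     # evict the least recently used translation
        tlb[page] = frame
        return frame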
Functions of Memory Management Hardware
The memory management hardware performs several critical functions to
ensure efficient memory usage and allocation. These functions include:
Address Translation: The memory management hardware translates virtual
addresses into physical addresses, allowing the CPU to access the correct
memory location.
Memory Allocation: It allocates and deallocates memory resources to
processes, ensuring that each process has sufficient memory to execute
efficiently.
Memory Protection: The memory management hardware enforces access
permissions and prevents unauthorized access to memory areas.
Virtual Memory Management: It manages the mapping of virtual addresses to
physical addresses, enabling the efficient use of limited physical memory by
utilizing disk-based virtual memory.
Address Translation
Address translation is one of the primary functions of memory management
hardware. It involves converting virtual addresses generated by the CPU into
physical addresses, allowing the system to access the correct memory location.
During address translation, the memory management hardware uses page tables
or translation tables to map virtual addresses to physical addresses. This process
ensures that each process has its isolated memory space, protecting it from
interference by other processes.
Address translation is essential for enabling the efficient use of physical
memory and facilitating the execution of multiple processes concurrently. By
translating virtual addresses to physical addresses, the memory management
hardware provides each process with a unique memory address space.
Memory Allocation
Memory allocation is another crucial function of memory management hardware.
It involves assigning memory resources to processes and deallocating them when
no longer needed.
The memory management hardware tracks the available memory blocks,
allocating them to processes based on their memory requirements. It ensures that
each process has a sufficient amount of memory to execute efficiently, preventing
resource contention.
Efficient memory allocation allows for the simultaneous execution of
multiple processes, maximizing system productivity. By managing memory
resources effectively, the memory management hardware minimizes wastage
and fragmentation, optimizing overall system performance.
Techniques Used in Memory Management Hardware
The memory management hardware employs various techniques to enhance
memory utilization and performance. Some of the commonly used techniques
include:
• Paging: Paging involves dividing memory into fixed-size blocks called
pages and mapping them into a virtual address space. It allows for efficient
memory allocation and utilization by allocating pages based on demand.
• Segmentation: Segmentation involves dividing memory into logical segments
based on program structure and memory requirements. It provides a flexible
memory allocation scheme but can lead to external fragmentation.
• Virtual Memory: Virtual memory is a technique that allows the execution of
programs larger than the available physical memory. It uses disk-based storage
to supplement the physical memory, transferring data between memory and
disk as needed.
• Memory Protection: The memory management hardware implements various
memory protection mechanisms, such as access permissions and privilege
levels, to ensure the security and integrity of memory.
Virtual Memory
Virtual memory is a memory management technique that allows the execution
of programs larger than the available physical memory. It provides an illusion
of larger memory space by utilizing disk-based storage as an extension of the
physical memory.
In virtual memory, the memory management hardware transparently transfers
data between the physical memory and disk storage as needed. It uses
techniques like demand paging or demand segmentation to efficiently manage
memory resources.
Virtual memory significantly improves system performance by allowing the
execution of larger programs without requiring an equivalent amount of
physical memory. By utilizing disk storage as a supplement to physical
memory, virtual memory enables the efficient execution of memory-intensive
applications.
Role of Memory Management Hardware in Computer Architecture
The memory management hardware plays a critical role in computer
architecture by ensuring efficient memory usage, allocation, and protection.
It facilitates the execution of various programs and processes by managing
the memory subsystem effectively.
Memory management hardware components such as the MMU and TLB
enable address translation, allowing the CPU to access the correct memory
location. They also optimize memory access by caching frequently used
translations, reducing the overhead of address translation.
By allocating memory resources to processes, the memory management
hardware enables the simultaneous execution of multiple programs, maximizing
system productivity. It also enforces memory protection mechanisms to ensure
the security and integrity of memory, preventing unauthorized access and
interference.
Furthermore, memory management hardware techniques like paging,
segmentation, and virtual memory enhance memory utilization and
performance. They allow for efficient memory allocation, utilization, and
the execution of programs larger than the available physical memory.
In conclusion, memory management hardware is a crucial component of
computer architecture. It plays a vital role in managing memory resources,
optimizing performance, and ensuring the smooth execution of programs and
processes.
INPUT-OUTPUT ORGANIZATION
The input-output subsystem of a computer, referred to as I/O, provides an
efficient mode of communication between the central system and the outside
environment.
Programs and data must be entered into computer memory for processing and
results obtained from computations must be recorded or displayed for the user.
Peripheral Devices

Input or output devices attached to the computer are also called peripherals.
•The display terminal can operate in a single-character mode where each
character entered through the keyboard is transmitted to the computer as
soon as it is typed. In the block mode, the edited text is first stored in a local
memory inside the terminal. The text is then transferred to the computer as a
block of data.
•Printers provide a permanent record on paper of computer output data.
•Magnetic tapes are used mostly for storing files of data.
•Magnetic disks have high-speed rotational surfaces coated with magnetic
material.
Input-Output Interface
Input-output interface provides a method for transferring information between
internal storage and external I/O devices. Peripherals connected to a computer
need special communication links for interfacing them with the central
processing unit. The major differences are:
1.Peripherals are electromechanical and electromagnetic devices and their
manner of operation is different from the operation of the CPU and memory,
which are electronic devices. Therefore, a conversion of signal values may
be required.
2.The data transfer rate of peripherals is usually slower than the transfer rate
of the CPU, and consequently, a synchronization mechanism may be needed.
3. Data codes and formats in peripherals differ from the word format in
the CPU and memory.
4. The operating modes of peripherals are different from each other and each
must be controlled so as not to disturb the operation of other peripherals
connected to the CPU.
A typical communication link between the processor and several peripherals is
shown in the figure below.
Connection of I/O bus to input-output devices
The I/O bus consists of data lines, address lines, and control lines. The magnetic
disk, printer, and terminal are employed in practically any general-purpose
computer. The interface selected responds to the function code and proceeds to
execute it.
The function code is referred to as an I/O command and is in essence an
instruction that is executed in the interface and its attached peripheral unit.
There are three ways that computer buses can be used to communicate with
memory and I/O:
1.Use two separate buses, one for memory and the other for I/O.
2.Use one common bus for both memory and I/O but have separate control lines
for each.
3.Use one common bus for memory and I/O with common control lines.
Asynchronous Data Transfer
The internal operations in a digital system are synchronized by means of clock
pulses supplied by a common pulse generator. If the registers in the interface
share a common clock with the CPU registers, the transfer between the two units
is said to be synchronous.
In most cases, the internal timing in each unit is independent from the other in
that each uses its own private clock for internal registers. In that case, the
two units are said to be asynchronous to each other. This approach is
widely used in most computer systems. Asynchronous data transfer between
two independent units requires that control signals be transmitted between
the communicating units to indicate the time at which data is being
transmitted.
There are two ways of achieving this, as mentioned below:
•Strobe Control: pulse supplied by one of the units to indicate to the other unit
when the transfer has to occur.

•Handshaking: The unit receiving the data item responds with another control
signal to acknowledge receipt of the data.
1. Strobe Control Method
The strobe control method of asynchronous data transfer employs a single
control line to time each transfer. This control line is also known as a strobe,
and it may be activated by either the source or the destination, depending on
which one initiates the transfer.
•Source initiated strobe: In the below block diagram, you can see that strobe is
initiated by source, and as shown in the timing diagram, the source unit first
places the data on the data bus.
After a brief delay to ensure that the data resolve to a stable value, the source
activates a strobe pulse.
The information on the data bus and strobe control signal remains in the active
state for a sufficient time to allow the destination unit to receive the data.
The destination unit uses a falling edge of strobe control to transfer the contents
of a data bus to one of its internal registers.
The source removes the data from the data bus after it disables its strobe pulse.
Thus, new valid data will be available only after the strobe is enabled
again.
In this case, the strobe may be a memory-write control signal from the
CPU to a memory unit. The CPU places the word on the data bus and informs
the memory unit, which is the destination.
• Destination initiated strobe: In the below block diagram, you can see that the
strobe is initiated by the destination, and in the timing diagram, the destination
unit first activates the strobe pulse, informing the source to provide the data.
•The source unit responds by placing the requested binary information on the
data bus. The data must be valid and remain on the bus long enough for the
destination unit to accept it.
•The falling edge of the strobe pulse can again be used to trigger a destination
register. The destination unit then disables the strobe. Finally, the source
removes the data from the data bus after a predetermined time interval. In this
case, the strobe may be a memory-read control signal from the CPU to a
memory unit. The CPU initiates the read operation to inform the memory,
which is the source unit, to place the selected word onto the data bus.
2. Handshaking Method
The strobe method has the disadvantage that the source unit that initiates the
transfer has no way of knowing whether the destination has received the
data that was placed in the bus. Similarly, a destination unit that initiates the
transfer has no way of knowing whether the source unit has placed data on the
bus.
So, this problem is solved by the handshaking method. The handshaking
method introduces a second control signal line that provides a reply to the
unit that initiates the transfer.
In this method, one control line is in the same direction as the data flow in the
bus from the source to the destination. The source unit uses it to inform the
destination unit whether there are valid data in the bus.

The other control line is in the other direction from the destination to the
source. This is because the destination unit uses it to inform the source whether
it can accept data. And in it also, the sequence of control depends on the unit
that initiates the transfer.
So, the sequence of control depends on whether the transfer is initiated by the
source or by the destination.
• Source initiated handshaking: In the below block diagram, you can see
that two handshaking lines are "data valid", which is generated by the
source unit, and "data accepted", generated by the destination unit.
The timing diagram shows the timing relationship of the exchange of signals
between the two units.
The source initiates a transfer by placing data on the bus and enabling its data
valid signal.
The destination unit then activates the data accepted signal after it accepts the
data from the bus.
The source unit then disables its valid data signal, which invalidates the data on
the bus.
After this, the destination unit disables its data accepted signal, and the system
goes into its initial state.
The source unit does not send the next data item until after the destination unit
shows readiness to accept new data by disabling the data accepted signal.
This sequence of events is described in a sequence diagram, which shows the
state of the system at any given point in time.
•Destination initiated handshaking:
In the below block diagram, you see that the two handshaking lines are "data
valid", generated by the source unit, and "ready for data", generated by the
destination unit. Note that the name of the signal generated by the destination
unit has been changed from data accepted to ready for data to reflect its new
meaning.
Here the transfer is initiated by the destination, so the source unit does not
place data on the data bus until it receives a ready-for-data signal from the
destination unit. After that, the handshaking process is the same as that of the
source-initiated case. The sequence of events is shown in its sequence diagram,
and the timing relationship between signals is shown in its timing diagram.
Therefore, the sequence of events in both cases would be identical.
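The valid/accepted exchange can be simulated in software. Below is a minimal
Python sketch using two threads; it collapses the four-step sequence into the
setting and clearing of a single "valid" flag, and all names are illustrative:

    import threading

    bus = {"data": None, "valid": False}
    cv = threading.Condition()

    def source(items):
        for item in items:
            with cv:
                bus["data"], bus["valid"] = item, True  # place data, raise "data valid"
                cv.notify_all()
                cv.wait_for(lambda: not bus["valid"])   # wait for "data accepted"

    def destination(n):
        for _ in range(n):
            with cv:
                cv.wait_for(lambda: bus["valid"])       # wait for valid data on the bus
                print("accepted:", bus["data"])         # latch the data
                bus["valid"] = False                    # acknowledge acceptance
                cv.notify_all()

    t = threading.Thread(target=destination, args=(2,))
    t.start()
    source(["A", "B"])
    t.join()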
Advantages of Asynchronous Data Transfer
Asynchronous data transfer in computer organization has the following
advantages:
oIt is more flexible, and devices can exchange information at their own pace. In
addition, individual data characters are self-contained, so even if one
character is corrupted, its predecessors and successors will not be affected.
oIt does not require complex processing by the receiving device. Furthermore,
an irregularity in the data stream does not cause a crisis, since the device can
keep up with the data as it arrives. This also makes asynchronous transfer
suitable for applications where character data is generated irregularly.
Disadvantages of Asynchronous Data Transfer
There are also some disadvantages of using asynchronous data for transfer
in computer organization, such as:

o The success of these transmissions depends on the start bits and their
recognition. Unfortunately, this can be easily susceptible to line interference,
causing these bits to be corrupted or distorted.
o A large portion of the transmitted data is used to control and identify header
bits and thus carries no helpful information related to the transmitted
data. This invariably means that more data packets need to be sent.
INTERRUPTS
Before interrupts, the CPU had to wait for a signal to process by continuously
checking the related hardware and software components (a method used earlier,
referred to as "polling"). This method expends CPU cycles on waiting, which
reduces the effective usage time of the CPU, worsens response time, and
increases power consumption.

The solution proposed to the above problem was the use of interrupts. In this
method, instead of constantly checking for a signal, the CPU receives a signal from
the hardware or software components. The process of sending the signal by
hardware or software components is referred to as sending interrupts or interrupting
the system.
The interrupts divert the CPU's attention to the signal request from the running
process. Once the interrupt signal is handled, the control is transferred back to the
previous process to continue from the exact position where it had left off.

Types of Interrupts

The interrupts are of two types:

1. Hardware interrupt
2. Software interrupt
1. Hardware Interrupt: Interrupts generated by the hardware are referred to as
hardware interrupts. The failure of hardware components and the completion of
I/O can trigger hardware interrupts.

The two subtypes under it are:

•Internal Interrupts: These interrupts occur when there is an error due to
some instruction, for example, overflows (register overflow), incorrect
instruction codes, etc. These types of interrupts are commonly referred to as
traps.
•External Interrupts: These interrupts are issued by hardware
components, for example, when an I/O device completes a data transfer, an
infinite loop occurs in the given code, a power failure happens, etc.
2. Software Interrupt: As the name suggests, these interrupts are caused by
software, mostly in user mode. When a software interrupt occurs, the control is
handed over to an interrupt handler (a part of the Operating System).
Termination of programs or requests of certain services like the output to
screen or using the printer can trigger it. These interrupts also have higher
priority than hardware interrupts.

There are two types of software interrupts (a small illustration follows the list):

•Normal Interrupts: These interrupts are caused by software instructions
and are made intentionally.
•Exception Interrupts: These are unplanned interrupts that occur during
the execution of the program. An example of this is the divide-by-zero
exception.
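As a user-space analogy for a software interrupt, an OS signal diverts control to
a registered handler and then returns to the interrupted flow. A minimal Python
sketch of the idea (it illustrates the concept only; it is not how hardware
interrupts are programmed):

    import signal, time

    def handler(signum, frame):
        # Control is diverted here when the interrupt (signal) arrives.
        print(f"caught signal {signum}")

    signal.signal(signal.SIGINT, handler)  # register the interrupt service routine
    print("press Ctrl+C within 5 seconds...")
    time.sleep(5)                          # the normal flow runs until interrupted
    print("back to the normal flow")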
Handling of Interrupts by CPU
Let's assume there is a program composed of many instructions where we are
processing the instructions one by one. Eventually, we have reached a certain
instruction, and an interrupt occurs. Interrupts can occur for various reasons, for
example, if you have a program that does the 'X' thing when the user presses the
'A' key on the keyboard. If we press the key 'A', the normal flow of the program
has to be interrupted to process this new signal and do the 'X' thing. Later, we
return to executing the instructions from where we left off before the
previous process's interruption.

After the interrupt has occurred, the CPU needs to pass the control from the
current process to service the interrupts instead of moving to the next instruction
in the current process. Before transferring the control over to the interrupt
generated program, we need to store the state of the currently running process.
The summary of the process followed during an interrupt is as below:

1. While executing some instruction of a process, an interrupt is issued.

2. Execution of the current instruction is completed, and the system responds to
the interrupt.

3. The system sends an acknowledgement for the interrupt, and the interrupt
signal from the source stops on receiving the acknowledgement.

4. The state of the current task is stored (register values, the address of the next
instruction to be executed when control comes back to the process (the program
counter in the PC register), etc.), i.e., moved to the stack.
5. The processor now handles the interrupt and executes the interrupt generated
program.

6. After handling the interrupt, the control is sent back to the point in the original
process using the state information of the process that we saved earlier.
Interrupt Triggering Methods

Each interrupt signal input is designed to be triggered by either a logic signal
level or a particular signal edge (level transition).

Level-sensitive inputs continuously request processor service so long as a
particular (high or low) logic level is applied to the input.

Edge-sensitive inputs react to signal edges: a particular (rising or falling) edge
will cause a service request to be latched. The processor resets the latch when
the interrupt handler executes.
•Level-triggered: A level-triggered interrupt is requested by holding the
interrupt signal at its particular (high or low) active logic level. A device
invokes a level-triggered interrupt by driving the signal to and holding it at the
active level. It negates the signal when the processor commands it, typically
after the device has been serviced.

The processor samples the interrupt input signal during each instruction
cycle. The processor will recognize the interrupt request if the signal is
asserted when sampling occurs.
Level-triggered inputs allow multiple devices to share a common interrupt
signal via wired-OR connections. The processor polls to determine which
devices are requesting service. After servicing a device, the processor may again
poll and, if necessary, service other devices before exiting the ISR.
• Edge-triggered: An edge-triggered interrupt is an interrupt signaled by a
level transition on the interrupt line, either a falling edge (high to low) or a
rising edge (low to high). A device wishing to signal an interrupt drives a pulse
onto the line and releases it to its inactive state. If the pulse is too short to be
detected by polled I/O, then special hardware may be required to detect it.
Advantages of Interrupts
Interrupts offer several advantages in system design:
• Efficiency: By using interrupts, the CPU can perform other tasks instead of
constantly polling for events. This increases overall system efficiency and
allows for better utilization of processing resources.
• Responsiveness: Interrupts enable the system to respond quickly to external
events. This is particularly important in real-time applications where delays can
lead to system failures or degraded performance.
• Modularity: Interrupts support modular design by allowing different parts of a
system to handle specific events independently. This separation of concerns
can simplify development and maintenance.
• Power Saving: In low-power systems, interrupts can help conserve energy by
allowing the CPU to enter low-power modes when idle and wake up only
when an interrupt occurs, reducing overall power consumption.
INPUT-OUTPUT INTERFACE
•The method that is used to transfer information between internal storage and
external I/O devices is known as I/O interface.
•The CPU is interfaced using special communication links by the peripherals
connected to any computer system. These communication links are used to
resolve the differences between CPU and peripheral.
•There exist special hardware components between CPU and peripherals to
supervise and synchronize all the input and output transfers that are called
interface units.
Modes of Transfer
The binary information that is received from an external device is usually stored
in the memory unit. The information that is transferred from the CPU to the
external device is originated from the memory unit. CPU merely processes the
information but the source and target are always the memory unit. Data transfer
between CPU and the I/O devices may be done in different modes.
Data transfer to and from the peripherals may be done in any of the three
possible ways
1.Programmed I/O
2.Interrupt-initiated I/O
3.Direct memory access (DMA)
1. Programmed I/O
It is one of the simplest forms of I/O, where the CPU has to do all the work. It
results from the I/O instructions that are written in the computer program.
Each data item transfer is initiated by an instruction in the program. Usually,
the transfer is between a CPU register and memory. This approach requires
constant monitoring of the peripheral devices by the CPU; the technique is
called programmed I/O.

Consider a user process that wants to print the nine-character string
"TUTORIALS" on the printer with the help of a serial interface. The software
first assembles the string in a buffer in user space, as shown in the figure
below.
Step 1 − The user process acquires the printer for writing by using a system
call to open it.
Step 2 − If the printer is currently in use by another process, this system call
fails and returns an error code or it blocks until the printer is available,
depending on the operating system and the parameters of the call.
Step 3 − Once the printer is available, the user process makes a system call
telling the operating system to print the string on the printer.
Step 4 − The operating system generally copies the buffer with the string to an
array.
Step 5 − It then checks to see if the printer is currently available. If not, it waits
until it is. Whenever the printer is available, the operating system copies the
first character to the printer's data register, using memory-mapped I/O in this
example. This action activates the printer. The character may not appear
immediately, because some printers buffer a line or a page before printing
anything.
Step 6 − In the next figure, we see that the first character has been printed and
that the system has marked the ‘‘U’’ as the next character to be printed.
Step 7 − Whenever it has copied the first character to the printer, the operating
system checks to see if the printer is ready to accept another one.
Step 8 − Generally, the printer has a second register, which gives its status. The
act of writing to the data register causes the status to become not ready.
Step 9 − When the printer controller has processed the current character, it
indicates its availability by setting some bit in its status register or putting some
value in it.
Step 10 − At this point the operating system waits for the printer to become
ready again.
Step 11 − It prints the next character, as shown in the third figure.
Step 12 − This loop continues till the entire string has been printed.
Step 13 − Then control returns to the user process.
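The polling loop in Steps 5 through 12 can be made concrete with a short C simulation. This is a minimal sketch, not a real driver: the one-byte status/data register pair and the PRINTER_READY bit are invented for illustration, and a small function stands in for the printer hardware.

#include <stdio.h>

#define PRINTER_READY 0x01 /* assumed "ready" bit in the status register */

struct printer_regs {      /* hypothetical device register pair */
    unsigned char status;
    unsigned char data;
};

/* Stand-in for the printer hardware: "prints" the byte, then becomes ready. */
static void printer_tick(struct printer_regs *p) {
    if (!(p->status & PRINTER_READY)) {
        putchar(p->data);                  /* the character finally appears */
        p->status |= PRINTER_READY;
    }
}

int main(void) {
    struct printer_regs prn = { PRINTER_READY, 0 };
    const char *buf = "TUTORIALS";         /* string assembled in user space */

    for (const char *c = buf; *c != '\0'; c++) {
        while (!(prn.status & PRINTER_READY))    /* Step 10: wait for ready */
            printer_tick(&prn);
        prn.data = (unsigned char)*c;            /* Steps 5/11: copy character */
        prn.status &= ~PRINTER_READY;            /* Step 8: write clears ready */
        printer_tick(&prn);                      /* let the "hardware" run */
    }
    putchar('\n');
    return 0;                                    /* Step 13: back to the user process */
}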
Programmed I/O is one of the three I/O techniques, the others being interrupt-driven I/O and direct memory access (DMA). It is the simplest technique for exchanging data between the processor and external devices. With programmed I/O, data are exchanged between the processor and the I/O module: the processor executes a program that gives it direct control of the I/O operation, including sensing device status, sending a read or write command, and transferring the data. When the processor issues a command to the I/O module, it must wait until the I/O operation is complete. If the processor is faster than the I/O module, this wastes processor time.
The overall operation of programmed I/O can be summarized as follows:

1.The processor is executing a program and encounters an instruction relating to an I/O operation.
2.The processor then executes that instruction by issuing a command to the appropriate I/O module.
3.The I/O module performs the requested action based on the I/O command issued by the processor (READ/WRITE) and sets the appropriate bits in the I/O status register.
4.The processor periodically checks the status of the I/O module until it finds that the operation is complete.
Programmed I/O Mode Input Data Transfer

1. Each input is read only after first testing whether the device is ready with the input (a state reflected by a bit in a status register).

2. The program waits for the ready status by repeatedly testing the status bit, until all targeted bytes have been read from the input device.

3. The program is in a busy (non-waiting) state only after the device becomes ready; otherwise it is in a wait state.
Programmed I/O Mode Output Data Transfer
1.Each output is written only after first testing whether the device is ready to accept the byte, i.e., its output register or output buffer is empty.

2.The program waits for the ready status by repeatedly testing the status bit(s), until all targeted bytes have been written to the device.

3.The program is in a busy (non-waiting) state only after the device becomes ready; otherwise it is in a wait state.
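Both loops follow the same test-then-transfer pattern. Below is a hedged C sketch of the input side, assuming an invented one-bit RX_READY status flag and a simulated device in place of real hardware.

#include <stdio.h>

#define RX_READY 0x01      /* assumed "input ready" bit in the status register */

struct input_regs {        /* hypothetical input device registers */
    unsigned char status;
    unsigned char data;
};

/* Stand-in for the input hardware: delivers the next byte of a test stream. */
static void device_tick(struct input_regs *d, const char **src) {
    if (**src != '\0') {
        d->data = (unsigned char)*(*src)++;
        d->status |= RX_READY;
    }
}

int main(void) {
    struct input_regs dev = { 0, 0 };
    const char *stream = "HELLO";          /* pretend incoming bytes */
    char buf[16];
    int i = 0;

    while (i < 5) {                        /* until all targeted bytes are read */
        while (!(dev.status & RX_READY))   /* wait state: test the status bit */
            device_tick(&dev, &stream);
        buf[i++] = (char)dev.data;         /* busy state: consume the byte */
        dev.status &= ~RX_READY;           /* reading clears readiness */
    }
    buf[i] = '\0';
    printf("received: %s\n", buf);
    return 0;
}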
I/O Commands

To execute an I/O-related instruction, the processor issues an address, specifying the particular I/O module and external device, and an I/O command. There are four types of I/O commands that an I/O module may receive when it is addressed by a processor:

• Control: Used to activate a peripheral and tell it what to do. For example, a magnetic-tape unit may be instructed to rewind or to move forward one record. These commands are tailored to the particular type of peripheral device.
• Test: Used to test various status conditions associated with an I/O module
and its peripherals. The processor will want to know that the peripheral of
interest is powered on and available for use. It will also want to know if the most
recent I/O operation is completed and if any errors occurred.
• Read: Causes the I/O module to obtain an item of data from the peripheral and
place it in an internal buffer. The processor can then obtain the data item by
requesting that the I/O module place it on the data bus.
• Write: Causes the I/O module to take an item of data (byte or word) from the
data bus and subsequently transmit that data item to the peripheral.
I/O Instruction

With programmed I/O, there is a close correspondence between the I/O-related instructions that the processor fetches from memory and the I/O commands that the processor issues to an I/O module to execute those instructions. Typically, there will be many I/O devices connected through I/O modules to the system. Each device is given a unique identifier or address. When the processor issues an I/O command, the command contains the address of the desired device; thus, each I/O module must interpret the address lines to determine whether the command is for itself, and which external device the address refers to.
When the processor, main memory, and I/O share a common bus, two modes of addressing are possible:
1. Memory mapped I/O
2. Isolated I/O

With memory-mapped I/O, there is a single address space for memory locations and I/O devices: the processor treats the status and data registers of I/O modules as memory locations and uses the same machine instructions to access both memory and I/O devices. So, for example, with 10 address lines, a combined total of 2^10 = 1024 memory locations and I/O addresses can be supported, in any combination. With memory-mapped I/O, a single read line and a single write line are needed on the bus.
With isolated I/O, the bus is equipped with memory read and write lines plus input and output command lines. Now the command line specifies whether the address refers to a memory location or an I/O device, so the full range of addresses is available for both. Again, with 10 address lines, the system may now support both 1024 memory locations and 1024 I/O addresses.

For most types of processors, there is a relatively large set of different instructions for referencing memory, whereas with isolated I/O there are only a few I/O instructions. Thus, an advantage of memory-mapped I/O is that this large repertoire of instructions can be used, allowing more efficient programming; a disadvantage is that valuable memory address space is used up.
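To make the contrast concrete, here is a hedged C sketch of the two addressing modes. The register address and port number are invented, and this is bare-metal-style code: on a hosted operating system both accesses would require privileged mode, so treat it as illustration only.

#include <stdint.h>

/* Memory-mapped I/O: the device register is just an address in the single
 * address space, so an ordinary store instruction performs the output. */
#define UART_DATA ((volatile uint8_t *)0x10000000u)  /* hypothetical address */

static void mmio_putc(char c) {
    *UART_DATA = (uint8_t)c;   /* any memory-referencing instruction works */
}

/* Isolated I/O: a separate I/O address space, reachable only through
 * dedicated instructions (IN/OUT on x86), shown here as inline assembly. */
#if defined(__i386__) || defined(__x86_64__)
static void port_putc(uint16_t port, uint8_t c) {
    __asm__ volatile ("outb %0, %1" : : "a"(c), "Nd"(port));
}
#endif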
Differences between Isolated I/O and Memory Mapped I/O:

Isolated I/O:
•Uses a separate address space from memory.
•Only a limited set of instructions can be used: IN, OUT, INS, OUTS.
•The addresses of isolated I/O devices are called ports.

Memory Mapped I/O:
•Uses addresses from the main memory's address space.
•Any instruction that references memory can be used.
•Devices are treated as memory locations on the memory map.
Advantages of Programmed I/O

•Simple to implement
•Requires very little hardware support

Disadvantages of Programmed I/O

•The CPU busy-waits for the task to complete
•Ties up the CPU for long periods with no useful work
References

Reference Books:
•J.P. Hayes, “Computer Architecture and Organization”, Third Edition.
•Mano, M., “Computer System Architecture”, Third Edition, Prentice Hall.
•Stallings, W., “Computer Organization and Architecture”, Eighth Edition,
Pearson Education.

Text Books:
•Carpinelli, J.D., "Computer Systems Organization & Architecture", Fourth Edition, Addison Wesley.
•Patterson and Hennessy, "Computer Architecture", Fifth Edition, Morgan Kaufmann.
Other References

•Programmed I/O (brainkart.com)


•https://youtu.be/CIlIrjPb2bs?si=9I2K5yX3iFCAZ4eF
•https://youtu.be/ZC7CedalOYE?si=6vWuju9FjjeIhMcM
University Institute of Engineering
Department of Computer Science & Engineering

COMPUTER ORGANIZATION & ARCHITECTURE


(23CST-204/23ITT-204)

ER. SHIKHA ATWAL


E11186

ASSISTANT PROFESSOR

BE-CSE
MODES OF TRANSFER

Data transfer to and from the peripherals may be done in any of the three possible
ways
1.Programmed I/O
2.Interrupt-driven I/O
3.Direct memory access (DMA)
2. Interrupt-driven I/O
Interrupt-driven I/O is an alternative scheme for dealing with I/O. Interrupt I/O is a way of controlling input/output activity whereby a peripheral or terminal that needs to make or receive a data transfer sends a signal, which causes a program interrupt to be set.
At a time appropriate to the priority level of the I/O interrupt relative to the total interrupt system, the processor enters an interrupt service routine. The function of the routine depends on the system of interrupt levels and priorities implemented in the processor. The interrupt technique requires more complex hardware and software, but makes far more efficient use of the computer's time and capacity. The figure below shows simple interrupt processing.

For input, the device interrupts the CPU when new data has arrived and is
ready to be retrieved by the system processor. The actual actions to perform
depend on whether the device uses I/O ports or memory mapping.
For output, the device delivers an interrupt either
when it is ready to accept new data or to
acknowledge a successful data transfer.
Memory-mapped and DMA-capable devices
usually generate interrupts to tell the system they
are done with the buffer.
Here the CPU works on its given tasks
continuously. When an input is available, such as
when someone types a key on the keyboard, then
the CPU is interrupted from its work to take care
of the input data. The CPU can work continuously
on a task without checking the input devices,
allowing the devices themselves to interrupt it
as necessary.
Basic Operations of Interrupt

1.CPU issues a read command.
2.I/O module gets data from the peripheral whilst the CPU does other work.
3.I/O module interrupts the CPU.
4.CPU requests the data.
5.I/O module transfers the data.
Interrupt Processing

1.A device driver initiates an I/O request on behalf of a process.

2.The device driver signals the I/O controller for the proper device, which initiates the requested I/O.

3.The device signals the I/O controller that it is ready to deliver input, that the output is complete, or that an error has occurred.

4.The CPU receives the interrupt signal on the interrupt-request line and transfers control to the interrupt handler routine.

5.The interrupt handler determines the cause of the interrupt, performs the necessary processing, and executes a "return from interrupt" instruction.

6.The CPU returns to the execution state prior to the interrupt being signalled.

7.The CPU continues processing until the cycle begins again.
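This flow can be imitated on a hosted system with a POSIX signal standing in for the interrupt-request line. The sketch below is only an analogy made under that assumption: SIGALRM plays the device interrupt, and the handler plays steps 4 and 5, while the main program keeps computing instead of polling.

#include <signal.h>
#include <stdio.h>
#include <unistd.h>

static volatile sig_atomic_t data_ready = 0;

/* Stand-in for the interrupt service routine (steps 4-5). */
static void isr(int sig) {
    (void)sig;
    data_ready = 1;   /* note the cause; a real ISR would service the device */
}

int main(void) {
    signal(SIGALRM, isr);   /* install the "interrupt handler" */
    alarm(1);               /* the "device" will interrupt in about a second */

    unsigned long work = 0;
    while (!data_ready)     /* CPU keeps doing useful work, never polls the device */
        work++;

    printf("interrupted after %lu units of work\n", work);
    return 0;               /* step 6: execution resumes where it left off */
}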
Advantages of Interrupt-Driven I/O

•Fast response to device events
•Efficient use of CPU time, since the CPU does not poll

Disadvantages of Interrupt-Driven I/O

•Can be tricky to write if using a low-level language
•Can be tough to get the various pieces to work well together
•Usually done by the hardware manufacturer / OS maker, e.g. Microsoft
3. Direct Memory Access (DMA)
The data transfer between fast storage media, such as a magnetic disk, and the memory unit is limited by the speed of the CPU. We can therefore let the peripherals communicate with memory directly over the memory buses, removing the intervention of the CPU. This type of data transfer technique is known as DMA, or direct memory access. During DMA the CPU is idle and has no control over the memory buses. The DMA controller takes over the buses to manage the transfer directly between the I/O devices and the memory unit.
DMA

• Large blocks of data are transferred at high speed to or from high-speed devices: magnetic drums, disks, tapes, etc.
• The DMA controller is an interface that provides I/O transfer of data directly to and from the memory and the I/O device.
• The CPU initializes the DMA controller by sending a memory address and the number of words to be transferred.
• The actual transfer of data is done directly between the device and memory through the DMA controller, freeing the CPU for other tasks.

Transferring data between the peripheral and memory without CPU interaction, letting the peripheral device manage the memory bus directly, is termed Direct Memory Access (DMA).
Two control signals, Bus Request (BR) and Bus Grant (BG), are used to facilitate the DMA transfer. The bus request input is used by the DMA controller to request the CPU for control of the buses. When the BR signal is high, the CPU terminates execution of the current instruction, places the address, data, read, and write lines in the high-impedance state, and sends the bus grant signal. The DMA controller then takes control of the buses and transfers the data directly between memory and I/O without processor interaction.
CPU bus signals for DMA transfer
When the transfer is complete, the DMA controller makes the bus request signal low. In response, the CPU disables the bus grant and again takes control of the address, data, read, and write lines.
The transfer of data between memory and I/O can proceed in either of two ways: DMA burst and cycle stealing.

DMA BURST: A block of data consisting of a number of memory words is transferred in one continuous burst.
CYCLE STEALING: The DMA controller transfers one data word at a time, after which it must return control of the buses to the CPU.

• The CPU is usually much faster than I/O (DMA), so the CPU uses most of the memory cycles.
• The DMA controller steals memory cycles from the CPU.
• For those stolen cycles, the CPU remains idle.
• With a slow CPU, the DMA controller may steal most of the memory cycles, which may leave the CPU idle for a long time.
DMA CONTROLLER

The DMA controller communicates with the CPU through the data bus and control lines. The DMA select signal is used for selecting the controller, and the register select signal for selecting a register.
When the bus grant signal is zero, the CPU communicates through the data bus to read from or write into the DMA registers. When bus grant is one, the DMA controller takes control of the buses and transfers data between the memory and I/O.
Block diagram of DMA controller
The address register specifies the desired location of the memory which is
incremented after each word is transferred to the memory.
The word count register holds the number of words to be transferred; it is decremented after each transfer until it reaches zero, which indicates the end of the transfer.
The bus grant signal from the CPU is then made low and the CPU returns to its normal operation. The control register specifies the mode of transfer, Read or Write.
DMA TRANSFER

• A DMA request signal is sent from the I/O device to the DMA controller.
• The DMA controller sends the bus request signal to the CPU, in response to which the CPU suspends execution of its current instructions and initializes the DMA controller by sending the following information:
oThe starting address of the memory block where the data are available (for read) or where data are to be stored (for write)
oThe word count, i.e., the number of words in the memory block
oControl bits to specify the mode of transfer
oA bus grant set to 1 so that the DMA controller can take control of the buses
• The DMA controller sends a DMA acknowledge signal, in response to which the peripheral device puts a word on the data bus (for write) or receives a word from the data bus (for read).
DMA transfer in a computer system
DMA OPERATION

•CPU tells the DMA controller:
oRead/Write
oDevice address
oStarting address of the memory block for the data
oAmount of data to be transferred
•CPU carries on with other work
•DMA controller deals with the transfer
•DMA controller sends an interrupt when finished
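A minimal C simulation of this sequence follows, assuming the register set named in the block diagram (address register, word count register, control register). Real DMA controllers differ in detail, and the "device" here is just an array.

#include <stdio.h>

struct dma_ctrl {
    unsigned addr;     /* address register: next memory location, auto-incremented */
    unsigned count;    /* word count register: decremented toward zero */
    int      write;    /* control register: 1 = transfer from device to memory */
};

static int memory[8];                                    /* pretend main memory */
static const int device_words[4] = { 10, 20, 30, 40 };   /* pretend device data */

/* One stolen memory cycle: the controller moves one word without the CPU. */
static void dma_cycle(struct dma_ctrl *d, const int *word) {
    memory[d->addr++] = *word;   /* address register is incremented per word */
    d->count--;                  /* word count is decremented per word */
}

int main(void) {
    /* CPU side: program the controller (mode, start address, count), then
     * carry on with other work while the transfer proceeds. */
    struct dma_ctrl dma = { .addr = 2, .count = 4, .write = 1 };

    int i = 0;
    while (dma.count > 0)
        dma_cycle(&dma, &device_words[i++]);   /* cycle stealing */

    /* count == 0: the controller would now raise its "finished" interrupt. */
    printf("memory[2..5] = %d %d %d %d\n",
           memory[2], memory[3], memory[4], memory[5]);
    return 0;
}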
References

Reference Books:
•J.P. Hayes, “Computer Architecture and Organization”, Third Edition.
•Mano, M., “Computer System Architecture”, Third Edition, Prentice Hall.
•Stallings, W., “Computer Organization and Architecture”, Eighth Edition,
Pearson Education.

Text Books:
•Carpinelli, J.D., "Computer Systems Organization & Architecture", Fourth Edition, Addison Wesley.
•Patterson and Hennessy, "Computer Architecture", Fifth Edition, Morgan Kaufmann.
Other References

•http://www.ecs.csun.edu/~cputnam/Comp546/Input-Output-Web.pdf
•http://www.ioenotes.edu.np/media/notes/computer-organization-and-
architecture- coa/Chapter7-Input-Output-Organization.pdf
•https://www.geeksforgeeks.org/io-interface-interrupt-dma-mode/
•I/O Interface (Interrupt and DMA Mode) - GeeksforGeeks
University Institute of Engineering
Department of Computer Science & Engineering

COMPUTER ORGANIZATION & ARCHITECTURE


(23CST-204/23ITT-204)

ER. SHIKHA ATWAL


E11186

ASSISTANT PROFESSOR

BE-CSE
I/O PROCESSORS

• Processor with direct memory access capability that communicates with I/O
devices
• Channel can execute a Channel Program
• Stored in the main memory
• Consists of Channel Command Word (CCW)
• Each CCW specifies the parameters needed by the channel to control the I/O
devices and perform data transfer operations
• CPU initiates the channel by executing a channel I/O class instruction
and once initiated, channel operates independently of the CPU
A computer may incorporate one or more external processors and assign them the task of communicating directly with the I/O devices, so that each interface need not communicate with the CPU.

An I/O processor (IOP) is a processor with direct memory access capability that
communicates with I/O devices.

IOP instructions are specifically designed to facilitate I/O transfer. The IOP can also perform other processing tasks such as arithmetic, logic, branching, and code translation.
Block diagram of a computer with I/O Processor

The memory unit occupies a central position and can communicate with each
processor by means of direct memory access. The CPU is responsible for
processing data needed in the solution of computational tasks. The IOP provides
a path for transferring data between various peripheral devices and memory unit.
In most computer systems, the CPU is the master while the IOP is a slave processor. The CPU initiates the IOP, after which the IOP operates independently of the CPU and transfers data between the peripheral and memory.

For example, the IOP receives 5 bytes from an input device at the device's rate and bit capacity, after which it packs them into one block of 40 bits and transfers the block to memory. Similarly, an output word transferred from memory to the IOP is directed from the IOP to the output device at the device's rate and bit capacity.
CPU – IOP COMMUNICATION

The memory unit acts as a message center where each processor leaves information for the other. The operation of a typical IOP is best appreciated through an example of CPU-IOP communication.

CPU – IOP communication

•The CPU sends an instruction to test the IOP path.
•The IOP responds by inserting a status word in memory for the CPU to check.
•The bits of the status word indicate the condition of the IOP and I/O device, such as an IOP overload condition, the device being busy with another transfer, or the device being ready for I/O transfer.
•The CPU refers to the status word in memory to decide what to do next.
•If all is in order up to this point, the CPU sends the instruction to start the I/O transfer.
•The CPU then continues with another program while the IOP is busy with the I/O program.
•When the IOP terminates the execution, it sends an interrupt request to the CPU.
•The CPU responds by issuing an instruction to read the status from the IOP.
•The IOP responds by placing the contents of its status report into a specified memory location.
•The status word indicates whether the transfer completed successfully or with an error.
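The handshake can be sketched as a small C program built around a shared status word. The bit layout and the two IOP stand-in functions are assumptions made purely to illustrate the protocol above, not any real IOP interface.

#include <stdio.h>

#define IOP_READY 0x1   /* device ready for I/O transfer */
#define IOP_BUSY  0x2   /* device busy with another transfer */
#define IOP_ERROR 0x4   /* last transfer ended with an error */

static unsigned status_word;   /* the "message center" location in memory */

/* Stand-ins for the IOP's side of the exchange. */
static void iop_test_path(void)   { status_word = IOP_READY; }
static void iop_run_program(void) { status_word = 0; /* transfer completed OK */ }

int main(void) {
    iop_test_path();                    /* CPU: instruction to test the IOP path */

    if (status_word & IOP_READY) {      /* CPU checks the status word in memory */
        iop_run_program();              /* CPU: start I/O, then runs another
                                           program until the IOP interrupts */
        if (status_word & IOP_ERROR)    /* CPU reads status after the interrupt */
            puts("transfer completed with error");
        else
            puts("transfer completed OK");
    }
    return 0;
}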
DATA COMMUNICATION PROCESSOR

•Distributes and collects data from many remote terminals connected through
telephone and other communication lines.
•Transmission:
oSynchronous
oAsynchronous
•Transmission Error:
oParity
oChecksum
oCyclic Redundancy Check
oLongitudinal Redundancy Check
•Transmission Modes:
oSimplex
oHalf Duplex
oFull Duplex
•Data Link & Protocol

A data communication processor is an I/O processor that distributes and collects data from many remote terminals connected through telephone and other communication lines. In ordinary I/O processor communication, the processor communicates with the I/O devices through a common bus, i.e., data and control lines shared by all peripherals. In data communication, the processor communicates with each terminal through a single pair of wires.
Remote terminals are connected to a data communication processor via telephone lines or other public or private communication facilities. The data communication may be either synchronous or asynchronous. One of the functions of a data communication processor is to check for transmission errors: an error can be detected by checking the parity of each character received, and other methods are checksum, longitudinal redundancy check (LRC), and cyclic redundancy check (CRC).
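As an illustration of the simplest of these checks, the C sketch below appends an even-parity bit to a 7-bit character and shows that a single flipped bit is detected. The framing (parity in bit 7) is an assumption made for the example.

#include <stdio.h>

/* Returns 1 if the 8-bit value contains an odd number of 1 bits. */
static int odd_ones(unsigned char v) {
    int ones = 0;
    for (int i = 0; i < 8; i++)
        ones += (v >> i) & 1;
    return ones & 1;
}

int main(void) {
    /* Transmitter: put an even-parity bit in bit 7 of the 7-bit character. */
    unsigned char sent = (unsigned char)('A' | (odd_ones('A') << 7));

    /* Channel: a single bit is flipped in transit. */
    unsigned char received = sent ^ 0x10;

    /* Receiver: with even parity, a valid character has an even 1-count. */
    printf("sent     passes parity check? %s\n", odd_ones(sent)     ? "no" : "yes");
    printf("received passes parity check? %s\n", odd_ones(received) ? "no" : "yes");
    return 0;
}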

Data can be transmitted between two points in three different modes. The first is simplex, where data can be transmitted in only one direction, such as TV broadcasting. The second is half duplex, where data can be transmitted in both directions but in only one direction at a time, such as a walkie-talkie. The third is full duplex, where data can be transmitted in both directions simultaneously, such as a telephone.
The communication lines, modems, and other equipment used in the transmission of information between two or more stations are called a data link. The orderly transfer of information over a data link is accomplished by means of a protocol.
References

Reference Books:
•J.P. Hayes, “Computer Architecture and Organization”, Third Edition.
•Mano, M., “Computer System Architecture”, Third Edition, Prentice Hall.
•Stallings, W., “Computer Organization and Architecture”, Eighth Edition,
Pearson Education.

Text Books:
•Carpinelli, J.D., "Computer Systems Organization & Architecture", Fourth Edition, Addison Wesley.
•Patterson and Hennessy, "Computer Architecture", Fifth Edition, Morgan Kaufmann.
Other References

•http://www.ecs.csun.edu/~cputnam/Comp546/Input-Output-Web.pdf
•http://www.ioenotes.edu.np/media/notes/computer-organization-and-
architecture- coa/Chapter7-Input-Output-Organization.pdf
•https://www.geeksforgeeks.org/io-interface-interrupt-dma-mode/
•I/O Interface (Interrupt and DMA Mode) - GeeksforGeeks