
COMPUTER ORGANIZATION AND ARCHITECTURE

Introduction
Computer Architecture: refers to those attributes of a system that have a direct impact on the logical execution of a program, such as the instruction set, the number of bits used to represent various data types, I/O mechanisms and techniques for addressing memory.

Computer Organization: refers to the physical organization of the hardware components of a computer system, and to the characteristics of these components, i.e. the operational units and their interconnections that realize the architectural specifications.

For a given architecture, there could be many different models, each representing a different type of
organization, depending on the cost, physical size and technology used.

Internal hardware components


The most important internal components include the processor, main memory, I/O controllers and buses (control bus, address bus and data bus). These components are connected to a single printed circuit board (PCB) referred to as the motherboard or logic board.

Functional Units of a Computer

A STRUCTURE AND ROLE OF THE PROCESSOR AND ITS COMPONENTS

The processor is a device that carries out computation on data by following instructions. It handles all of the instructions that it receives from the user and from the hardware and software.
Components of the CPU
The components of the CPU include: ALU, CU, Registers, Cache Memory, and
Buses.

Simplified Internal Structure of a CPU


Arithmetic Logic Unit (ALU)
This unit performs the arithmetic and logical operations on the available data. Arithmetic operations include addition, subtraction, multiplication and division, while logical operations include comparisons and bitwise logical operations (AND, NOT, OR, etc.). Shift operations are also performed by the ALU to manipulate individual bits.
Control Unit
− The control unit is responsible for coordinating and controlling all the operations of the computer. It does this by
issuing timing and control signals to the other units, instructing them on how to respond to a program’s instructions.
− It locates and retrieves program instructions from memory, interpreting them and ensuring that they are executed
in proper sequence.
Registers:
Registers are high-speed temporary memory units within the CPU that hold (store) instructions and data that the
processor is currently working on.
Each register typically holds a word of data, often 32 or 64 bits, depending on the word size of the computer. The WORD
SIZE refers to the maximum number of bits that the processor can manipulate at the same time. (or can manipulate as
a unit in one machine cycle)

Examples of some registers are the instruction register, program counter, accumulator, status register, memory data
register and memory address register.
− Program Counter
The program counter (PC) holds (stores) the memory location (address) of the next instruction to be executed
− Instruction Register
The instruction register (IR) holds (stores) the instruction that is currently being executed by the processor. PC
holds the address of an instruction while IR holds an actual instruction
− Accumulator
The accumulator (AC) holds (stores) results of computations performed by the ALU.
− Memory Address Register
The memory address register (MAR) holds (stores) the memory address from which data is about to be read (fetched) or to which data is about to be written (stored).
− Memory Data Register
The memory data register (MDR) holds (stores) data that has just been read in from memory or data produced
by the ALU and waiting to be written to memory.
− Status Register
The status register holds(stores) information about the state of the processor such as whether an overflow has
been detected. Status register bits are called flag bits or flags and each flag has a unique purpose. For example,
there are flag bits for negative, zero, overflow, and carry.
The control unit is made up of: PC (Program Counter), IR (Instruction Register), and Instruction Decoder
Registers that INTERFACE the CPU with its memory system.
Buffer registers interface the processor with its memory system. The two standard buffer registers are the memory address register (MAR) and the memory buffer register (MBR). The MBR is also known as the memory data register.
The Size or Length of Each Register
The size or length of each register is determined by its function. For example, the memory address register, which holds
the address of the next location in memory to be accessed, must have the same number of bits as the memory address.
The instruction register holds the instruction being executed and, therefore, should have the same number of bits as the instruction.
The clock
All computers have an internal clock. The clock generates a signal that is used to synchronize the operation of the
processor and the movement of data around the other components of the computer.
The speed of a clock is measured in either megahertz (MHz – millions of cycles per second) or gigahertz (GHz – 1000 million cycles per second). Common clock speeds today range from 2 GHz to 3 GHz. Intel's 5.5 GHz i9-12900KS was, as of 15/10/2022, the world's fastest desktop processor.
Buses
A bus is an electronic path along which data travels from one system component to another. It consists of a set of parallel lines that interconnect computer components, allowing the exchange of data between them.
A bus can be unidirectional (transmission of data can be only in one direction) or bidirectional (transmission of data
can be in both directions).
A bus that connects to all the three system components (CPU, memory and I/O devices) is called a system bus. System
buses can be grouped into data bus, address bus and control bus.
Data Bus
− The data bus is a bidirectional path for moving data and instructions between system components.
− The number of lines present in a data bus is called the width of the data bus. Therefore, bus width is the number of bits that can be sent down a bus in one go. Data bus width limits the maximum number of bits that can be transferred simultaneously between two modules.
Address Bus
− The address bus is a unidirectional bus that carries address information from the CPU to system components.
− The CPU uses the address bus to send the address of the memory location that data is to be written to or read from.

− Also, when the CPU reads data from or writes to a port, it sends the port address out on the address bus.
− The width of the address bus determines the maximum possible memory of the system that can be addressed.
− In other words, the size of the address bus determines the address space of the computer. A computer with a 32-bit address bus can address a maximum of 2^32 (4G) memory locations, i.e. 4 GB in a byte-addressable memory.
Control Bus
The control bus is a bidirectional bus that transmits command, timing and specific status information between system
components. Typical control signals include: memory write, memory read, I/O read, I/O write, bus request, bus grant,
transfer ACK, interrupt request, interrupt ACK, clock and reset.

Internal Components of a traditional shared bus Computer System


Note the directionality of each bus.
− The control bus is bi-directional at all points.
− The address bus is uni-directional away from the processor.
− The data bus is sometimes bi-directional, but data only comes in from the keyboard controller and out to the video
controller.
CPU CYCLE
Once the program and necessary data are in main memory, the CPU executes the program instructions one after the
other in the order specified by the program. The process of fetching, decoding and executing an instruction is called the
CPU cycle, machine instruction cycle or fetch-execute cycle.
Fetch Phase: The instruction is retrieved (fetched) from main memory (RAM)
During the fetch phase, the following activities take place:
− Load content of PC into MAR: MAR ← [PC]
− Increment PC: PC ← [PC] + 1
− Send 'read' signal to memory: MDR ← [Memory[MAR]]
− Load content of MDR into IR: IR ← [MDR]
Decode phase: The fetched instruction is interpreted
During the decode phase, the following activities take place:
− Identify the “Opcode” of instruction in IR
− Identify the “Operands” associated with instruction
− Evaluate effective address [EA] for instructions that require memory access
− Obtain source operands [OP] needed to perform operation
− Generate signals to activate the circuitry to carry out the instruction
Execute phase: The operation specified by the instruction is performed or carried out
− Performs or carries out the operation using the operands
Store: Write result of operation to destination (register or memory)
− Transfer result of operation from ACC into MDR
− Load address of memory location to be written into MAR
− Content of MDR is copied into addressed memory location
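The register transfers above can be made concrete with a small simulation. The sketch below models a minimal accumulator machine with a hypothetical three-instruction set (the LOAD, ADD, STORE and HALT opcodes and their encoding are invented for illustration, not any real ISA) and steps through the fetch, decode and execute phases using the PC, MAR, MDR, IR and ACC registers described earlier.

```python
# Minimal fetch-decode-execute sketch for a hypothetical accumulator machine.
# Opcodes and instruction encoding are invented for illustration only.
LOAD, ADD, STORE, HALT = 1, 2, 3, 0

# Each instruction is stored as an (opcode, address) pair in one memory word.
memory = {
    0: (LOAD, 10),   # ACC <- M[10]
    1: (ADD, 11),    # ACC <- ACC + M[11]
    2: (STORE, 12),  # M[12] <- ACC
    3: (HALT, 0),
    10: 7, 11: 5, 12: 0,
}

pc, acc = 0, 0
while True:
    mar = pc                 # fetch: MAR <- [PC]
    mdr = memory[mar]        #        MDR <- [Memory[MAR]]
    ir = mdr                 #        IR <- [MDR]
    pc += 1                  #        PC <- [PC] + 1
    opcode, address = ir     # decode: split opcode and operand address
    if opcode == LOAD:       # execute the operation the opcode specifies
        acc = memory[address]
    elif opcode == ADD:
        acc += memory[address]
    elif opcode == STORE:
        memory[address] = acc
    elif opcode == HALT:
        break

print(memory[12])  # prints 12, the stored sum 7 + 5
```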
Factors affecting processor performance
There are a number of factors that affect processor performance
− Clock speed: [Number of pulses per second generated by an oscillator]
The clock rate or clock speed refers to the frequency at which the clock generator of a processor can generate pulses,
which are used to synchronize the operations of its components and is used as an indicator of the processor’s speed.
The speed of a clock is measured in either megahertz (MHz – millions of cycles per second) or gigahertz (GHz – 1000
million cycles per second). In theory therefore, increasing the clock speed will increase the speed at which the processor
executes instructions.
We can increase a CPU's clock speed (clock speed is measured in MHz or GHz) to try to make the computer run faster; this is called overclocking. Overclocking can cause the CPU to overheat if it is forced to work faster than it was designed to work, and it can also cause data corruption.
− Bus width:
Bus width represents the number of bits that can be sent down a bus in one go. Increasing the width of the data bus
means that more bits and therefore more data can be passed down it with each pulse of the clock, which in turn means
more data can be processed within a given time interval. Increasing the width of the address bus will increase the
amount of memory that can be addressed and therefore allows more memory to be installed on the computer
− Word length:
Related to the data bus width is the word length. A word is a collection of bits that can be addressed and manipulated as a single unit. Computer systems may have a word length of 32 or 64 bits, indicating that 32 or 64 bits of data can be handled in one pulse of the clock. Word length and bus width are closely related in that a system with a 64-bit word length will need 64-bit buses.
− Multiple cores:
Multicore technology combines two or more processor cores and cache memory on a single integrated circuit. A dual-
core processor has two execution cores and two L2 memory caches. These two “cores” work together simultaneously
to carry out different tasks. More cores can be added to further improve computer response time for increasingly
complex operations. To achieve these computing gains however, the operating system and applications must be
adjusted to take advantage of the multiple cores using a technique called multithreading, or passing tasks
simultaneously to different cores to execute. In theory therefore, increasing the number of cores increases processor
performance.
− A single-core processor can carry out one instruction at a time.
− A dual-core processor can carry out two instructions at a time.
− A quad-core processor can carry out four instructions at a time.
− … and so on.
However, it is not true that a quad-core processor will be four times faster than a single-core because:
− Some tasks cannot be run in parallel
− Other tasks may cause a delay while waiting for the result of another instruction
− There is a processing overhead in splitting the task into separate threads.
That said, a quad-core processor will typically complete a set of tasks more quickly than an equivalent single-core
processor.
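A back-of-the-envelope way to see why four cores are not four times faster is Amdahl's law: if a fraction p of a task can run in parallel on n cores, the best possible speedup is 1 / ((1 - p) + p/n). The sketch below evaluates this for a few values of p; the fractions chosen are illustrative assumptions, not measured workloads.

```python
# Amdahl's law: ideal speedup on n cores when a fraction p of the work
# is parallelizable and the remaining (1 - p) must run serially.
def speedup(p: float, n: int) -> float:
    return 1.0 / ((1.0 - p) + p / n)

for p in (0.5, 0.9, 0.99):
    print(f"p = {p}: quad-core speedup = {speedup(p, 4):.2f}x")
# Even with 90% of the work parallelizable, 4 cores give only ~3.08x,
# before counting thread-management and communication overheads.
```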
− Cache memory:
Caching is a technique where instructions and data that are needed frequently are placed into a temporary area of memory that is separate from main memory. The advantage of this is that the cache can be accessed much more quickly than main memory, so programs run faster. The key to this is ensuring that the most commonly used functions or data used in a program are placed into the cache.

Von Neumann and Harvard architectures
− Von Neumann architecture
In the strictest definition, the term Von Neumann Computer refers to a specific type of computer architecture in which
INSTRUCTIONS AND DATA are stored together in a COMMON SHARED MEMORY. It is a stored-program computer
model based on the following 3 concepts:
− Data and instructions are stored in a SINGLE read-write memory
− The contents of this memory are addressable by location, without regard to the type of data contained there
− Execution occurs in a SEQUENTIAL fashion (unless explicitly modified) from one instruction to the next.
Stored Program Computer
Stored program concept: [Both program and the data on which it performs processing and calculations are stored in memory together]
− The program to be executed is resident in a memory directly accessible to the processor.
− Instructions are fetched one at a time (serially) from this memory and executed by the processor
− Data is resident in a memory directly accessible to the processor which can change it, if instructed to by the
executing program.
− Thus, the same data can be accessed repeatedly if so desired and the same instructions can be executed repeatedly
if so required
A Von-Neumann Computer Consists of Five Major Units:
Input unit, arithmetic/logic unit, control unit, memory unit and output unit.
Block Diagram of a Von Neumann Computer

The Von-Neumann Bottleneck


Instructions and data are stored in the same memory unit and share a common communication pathway to the CPU.
This leads to a limited rate of data transfer (throughput) between the CPU and memory, a situation known as the Von
Neumann Bottleneck. The Von Neumann Bottleneck arises from the fact that CPU speed and memory size have grown
at a much more rapid rate than the throughput between them. Thus, although memory may hold a lot of data that needs
to be processed, and the CPU may be using only a fraction of its computational power, the limited data access speed
prevents the computer from doing its work any faster.
Approaches to overcome the Von-Neumann bottleneck include:
− Caching: The holding of frequently used data in a special area so that it is more readily accessible than if it were
stored in main memory
− Prefetching: The transfer of data into cache before it is requested to speed access in the event of a request
− Multithreading: Managing multiple requests simultaneously in separate threads
− RAMBUS: A memory subsystem consisting of the RAM, the RAM controller, and the bus connecting RAM to the
microprocessor and devices in the computer that use it.

Harvard Architecture
The Harvard computer architecture stores program instructions and data in separate memories. The program memory
and data memory have different communication pathways to the CPU.
The key difference between this and von Neumann is that separate buses are used for data and instructions, both of
which address different parts of memory. So rather than storing data and instructions in the same memory and then
passing them through the same bus, there are two separate buses addressing two different memories

Harvard architecture.
The advantage of this is that the instructions and data are handled more quickly as they do not have to share the same
bus. Therefore, a program running on Harvard architecture can be executed faster and more efficiently. Harvard
architecture is widely used on EMBEDDED COMPUTER SYSTEMS such as mobile phones, burglar alarms etc. where
there is a specific use, rather than being used within general purpose PCs.
B THE STORAGE UNIT

− The storage medium is the surface or substrate (the physical material) that holds the actual data, e.g. hard disks, floppy disks, CDs and DVDs.
− A storage device is the computer hardware that reads data from or writes data onto the storage medium, e.g. floppy disk drives and CD or DVD drives.
Computer storage is classified basically into two: Primary storage and secondary storage.
Primary Storage:
Use:
Primary storage is computer storage used for holding programs and data that the CPU is currently working on.
Description:
It is also called Immediate Access Storage (IAS) as it can be directly accessed by the CPU.
Examples:
Main memory (RAM), Cache memory and ROM
Random Access Memory:
Use:
RAM is used to store data, files, part of an application or part of the operating system CURRENTLY IN USE
Characteristics:
• RAM is volatile (memory contents are lost on powering off the computer).
• RAM provides random access to the stored bytes, words, or larger data units. This means that it requires the same
amount of time to access information from RAM, irrespective of where it is located in it.
• RAM can also be written to or read from, and the data stored can be changed by the user or by the computer
Why RAM with higher storage capacity improves computer performance:
• With more RAM, more of program instructions and data can be loaded into RAM, and there is less need to keep
swapping data in and out to the swap file on the hard disk drive. (Swapping of data slows down the speed at which
applications can run)
• Also, an increase in RAM will improve the multitasking capabilities of the computer as the instructions of several
programs will be able to be stored in RAM at the same time.
Types of RAM: DYNAMIC RAM (DRAM) AND STATIC RAM (SRAM)
DYNAMIC RAM (DRAM):
Characteristics:
• DRAM must be refreshed every few milliseconds to prevent data loss. This is because DRAM is made up of micro
capacitors that slowly leak their charge over time.

• It uses 1 TRANSISTOR AND 1 CAPACITOR PER BIT
Use:
DRAM is mostly used as MAIN MEMORY because DRAMs:
− Are much less expensive to manufacture than SRAMs
− Consume less power than SRAMs
− Have a higher memory capacity than SRAMs.
STATIC RAM (SRAM):
Characteristics:
• It makes use of flip flops (a bistable circuit composed of four or six transistors) which hold each bit of memory. It
does NOT have a capacitor in each cell.
• SRAM is more expensive than DRAM, and it takes up more space.
• SRAM does not need to be constantly refreshed.
• An SRAM memory cell has more parts, so it takes up more space on a chip than a DRAM cell.
• SRAM is much faster than DRAM when it comes to data access
Use (application):
SRAM chip is usually used in CACHE MEMORY due to its HIGH SPEED.
Memory modules can be grouped into SIMM and DIMM.
The principal difference between SIMM (Single Inline Memory Module) and DIMM (Dual Inline Memory Module) is that
pins on opposite sides of a SIMM are ‘tied together’ to form one electrical contact while on a DIMM, opposing pins remain
electrically isolated to form two separate contacts.
Read Only Memory (ROM)
Use
• ROM has specialized uses for the storage of data or programs that are going to be used unchanged over and over
again
• In a general-purpose computer system, the most important use is in storing the bootstrap program. This is a
program that runs immediately when a system is switched on.
• The ROM memory chip stores the Basic Input Output System (BIOS).
Characteristics:
• ROM shares the random-access or direct-access properties of RAM.
• ROMs are non-volatile (the contents are not lost after powering off the computer)
• ROMs are permanent memory devices (the contents cannot be changed)
• ROM, as the name implies, has only read capability and no write capability.
Types of ROM
Programmable ROM: PROM can be programmed with a special tool, but after it has been programmed the contents
cannot be changed. The manufacturer of the chip supplies blank PROM chips to a system builder. The system builder
installs their programs or data into the chips. The program or data once installed cannot be changed.
Erasable Programmable ROM (EPROM): The installed data or program can be ERASED (USING ULTRAVIOLET LIGHT)
and new data or a new program can be installed. However, this reprogramming usually requires the chip to be removed
from the circuit (or from the computer or device using it).
Electrically Erasable PROM (EEPROM). An electrical signal can be used to remove existing data. This has the major
advantage that the chip can remain in the circuit while the contents are changed.
Flash Memory is a kind of semiconductor-based non-volatile, rewritable computer memory that can be
electrically erased and reprogrammed. It is a specific type of EEPROM.
Flash memory is used in devices such as digital camera, mobile phone, printer, laptop computer, and record and play
back sound devices, such as MP3 players.

Cache Memory
Use
The cache is a smaller, faster memory which stores copies of the data and instructions from the MOST FREQUENTLY
used main memory locations so that they are immediately available to the CPU when needed. Cache memory is used by
the central processing unit of a computer to reduce the average time to access memory.
Why Cache improves computer performance:

Cache memory is faster than main memory (RAM). This means that the CPU can access cache memory more quickly
than it can access RAM. Therefore, retrieving frequently requested data and instructions from RAM and storing them
into cache memory, speeds up memory accesses thereby increasing the performance of the computer.
Another advantage of cache memory is that the CPU does not have to use the motherboard’s system bus for data transfer.
Whenever data must be passed through the system bus, the data transfer speed slows to the motherboard’s capability.
The CPU can process data much faster by avoiding the bottleneck created by the system bus
Cache Memory Operation
Cache memory uses the principle of LOCALITY OF REFERENCE, which states that over a short interval of time, the
addresses generated by a typical program refer to a few localized areas of memory repeatedly, while the remainder of
memory is accessed relatively infrequently.
When the processor attempts to read a word of memory, a check is made to determine if the word is in the cache. If so,
the word is delivered to the processor. If not, a block of main memory, consisting of some fixed number of words, is read
into the cache and then the word is delivered to the processor. Because of the phenomenon of LOCALITY OF
REFERENCE, when a block of data is fetched into the cache to satisfy a single memory reference, it is likely that there
will be future references to that same memory location or to other words in the block. If the information is present in
the cache, it is called a CACHE HIT. If the information is not present in cache, then it is called a CACHE MISS.
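The hit/miss behaviour and the benefit of fetching whole blocks can be seen in a toy simulation. In the sketch below, the block size and the address trace are arbitrary illustrative choices; the trace clusters nearby addresses to mimic locality of reference.

```python
# Toy cache sketch: fetch a whole block on a miss, then count the hits
# that locality of reference produces. Block size and trace are arbitrary.
BLOCK_SIZE = 4
cache = set()            # set of block numbers currently held in the cache
hits = misses = 0

trace = [100, 101, 102, 103, 100, 104, 105, 101]  # nearby addresses (locality)
for addr in trace:
    block = addr // BLOCK_SIZE
    if block in cache:
        hits += 1        # CACHE HIT: the word is already in the cache
    else:
        misses += 1      # CACHE MISS: read the whole block into the cache
        cache.add(block)

print(f"hits = {hits}, misses = {misses}")  # hits = 6, misses = 2
```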

Cache Mapping Techniques:


Cache mapping is the method by which the contents of main memory are brought into the cache and referenced by the
CPU. The three main mapping techniques include: Direct mapping, fully associative mapping and set associative
mapping.

Cache memory in a computer system.


Secondary Storage
Use
Secondary storage is computer storage used to hold (store) programs and data for future use or backup purposes.
All applications, the operating system, and general files (for example, documents, photos and music) are stored on
secondary storage.
Characteristics
− Secondary storage is NOT DIRECTLY ACCESSIBLE by the CPU as programs and data from secondary storage must
be transferred to main memory for processing.
− Very high storage capacity
− They are non-volatile devices which allow data to be stored as long as required by the user.
− Relatively slower access
− Stores data and instructions that are not currently being used by CPU but may be required later for processing
− Cheapest among all memory types.
Depending on the type of medium, secondary storage can be classified into magnetic storage, optical storage and solid-
state storage
Magnetic Storage
Examples of magnetic storage devices include: Floppy disks, hard disks, and magnetic tapes.
Optical Storage
Examples of optical storage devices are: Compact Discs (CD-DA, CD-ROM, CD-R, CD-RW), Digital Versatile Disc (DVD-
ROM, DVD-R, DVD-RW) and Blu-ray discs.
Solid-State Storage:
Examples of solid-state devices are USB flash drives, memory cards, and secure digital cards.
What are the main benefits of using an SSD (Solid State Drive) rather than an HDD (Hard Disk Drive)? Solid state drives:
• Are more reliable (no moving parts to go wrong)
• Are considerably lighter (which makes them suitable for laptops)
• Have a lower power consumption
• Access data considerably faster.

Characteristics of Storage Media:


Two most important characteristics of storage media are STORAGE CAPACITY and ACCESS SPEED. Other characteristics
include access method (sequential access e.g. magnetic tape and random access e.g. RAM) and volatility.
Memory Hierarchy
Modern computers manage memory by organizing it into a hierarchy in which large (in terms of storage capacity), cheap but slower memories feed data into smaller, costlier but faster memories for faster processing of data. This organization of computer memory is known as the memory hierarchy. As one goes DOWN the hierarchy, the following occur:
• Decreasing cost per bit;
• Increasing capacity;
• Increasing access time;
• Decreasing frequency of access of the memory by the processor.

ACCESS TIME is the time interval between the read/write request and the availability of data. The smaller the access time, the faster the memory.

Storage Capacity: Registers < Cache Memory < Primary Memory < Magnetic Disk < Magnetic Tape
Access Speed: Magnetic Tape < Magnetic Disk < Primary Memory < Cache Memory <Register
Cost Per Bit: Magnetic Tape < Magnetic Disk < Primary Memory < Cache Memory < Register
Access Time: Registers < Cache Memory < Primary Memory < Magnetic Disk < Magnetic Tape

Memory Organisation and Addressing


Each memory chip can be viewed as a matrix of tiny cells, each of which can hold one bit of data. Each row, implemented by a register, has a length typically equivalent to the addressable unit size of the machine. Each row corresponding to a memory location has a unique address; memory addresses usually start at zero and progress upward. This is shown below.

N 8-Bit Memory Locations

M 16-Bit Memory Locations


Normally, memory is byte addressable, which means that each individual byte has a unique address. Some machines
may have a word size that is larger than a single byte. For example, a computer might handle 32-bit words but still
employ a byte-addressable architecture. In this situation, when a word uses multiple bytes, the byte with the lowest
address determines the address of the entire word.
It is also possible that a computer might be word addressable, which means each word (not necessarily each byte) has
its own address
Memory is often referred to using the notation length × width (L × W), where L indicates the depth of the chip (the number of locations) and W indicates the width of the chip in bits. For example, if there are M words in memory, each consisting of N bits, then the memory is said to be of size M × N.
For example, 4M × 8 means the memory is 4M long (it has 4M = 2^2 × 2^20 = 2^22 items) and each item is 8 bits wide (which means that each item is a byte). In other words, a 4M × 8 chip is a chip with 4 million locations, each of which is 8 bits wide.
To obtain the number of bits required to create M addresses corresponding to the M memory locations, we use the
relation:
2^N = M, where N is the number of bits required and M is the number of memory locations to be addressed.
The number of bits: N = log2(M)
N: Number of Bits or Address Lines or Number of Cables on the Address Bus
M: Number of Memory Locations or Number of Addresses or Number of Words
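As a quick check of this relation, the short sketch below computes N = log2(M) for a few memory sizes; the sizes chosen are arbitrary examples that match the worked examples below.

```python
import math

# Number of address bits needed to address M memory locations: N = log2(M).
def address_bits(locations: int) -> int:
    return int(math.log2(locations))

for m in (2 * 1024, 32 * 1024, 64 * 1024, 4 * 1024**3):
    print(f"{m:>13} locations -> {address_bits(m)} address lines")
# 2K -> 11, 32K -> 15, 64K -> 16, 4G -> 32
```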

Main memory is usually larger than one RAM chip. Consequently, these chips are
combined into a single memory of the desired size. For example, suppose you need
to build a 32K × 8 byte-addressable memory and all you have are 2K × 8 RAM
chips. You could connect 16 rows of chips together as shown in the figure adjacent
Each chip addresses 2K bytes. Addresses for this memory must have 15 bits (there are 32K = 2^5 × 2^10 = 2^15 bytes to access).
But each chip requires only 11 address lines (each chip holds only 2^11 bytes). In
this situation, a decoder is needed to decode either the leftmost or rightmost 4 bits
of the address to determine which chip holds the desired data. Once the proper
chip has been located, the remaining 11 bits are used to determine the offset on
that chip. Whether we use the 4 leftmost or 4 rightmost bits depends on how the
memory is interleaved. (Note: We could also build a 16K × 16 memory using 8
rows of 2 RAM chips each. If this memory were word addressable, assuming 16-
bit words, an address for this machine would have only 14 bits.)
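The arithmetic in this construction generalizes. The sketch below is a hypothetical helper (not any standard library routine) that computes the number of chips, the total and per-chip address lines, and the decoder select bits when a required memory is built from smaller chips.

```python
import math

# Building a (total_locs x width) memory from (chip_locs x width) chips:
# chips needed, total address lines, per-chip lines, and chip-select bits.
def memory_build(total_locs: int, chip_locs: int) -> dict:
    return {
        "chips": total_locs // chip_locs,
        "total_address_lines": int(math.log2(total_locs)),
        "per_chip_lines": int(math.log2(chip_locs)),
        "select_bits": int(math.log2(total_locs // chip_locs)),
    }

# The 32K x 8 memory built from 2K x 8 chips discussed above:
print(memory_build(32 * 1024, 2 * 1024))
# {'chips': 16, 'total_address_lines': 15, 'per_chip_lines': 11, 'select_bits': 4}
```

The same helper reproduces the worked examples that follow, e.g. memory_build(16 * 1024, 2 * 1024) gives 8 chips and 3 select bits for Example 3.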
Examples
Example 1: A RAM chip has a capacity of 32K ×16
1. How many memory addresses does this RAM have?
2. How many address lines will be needed for this RAM?
1. The RAM has 32K memory addresses: 32K = 32 × 1024 = 32768 memory addresses.
2. N = log2(32768) = 15. To address 32K memory locations, we require 15 address lines.

Example 2: A computer memory is composed of 16 chips, each of size 4K × 8.


1. How many memory locations are there in this memory?
2. How many address bits are needed to address this memory?
1. Each chip has 4K memory locations and there are 16 chips: 16 × 4K = 16 × 4 × 1024 = 65536 memory locations.
2. N = log2(65536) = 16. To address 65536 memory locations, we require 16 address bits.

Example 3: The capacity of 2K × 16 PROM is to be expanded to 16 K × 16. Find the number of PROM chips required
and the number of address lines in the expanded memory.
Required capacity = 16K × 16
Available chip (PROM) = 2K × 16
Number of chips = (16K × 16) / (2K × 16) = 8
Thus, the number of address lines required for a single chip = 11 (2K = 2^11).
In the expanded memory, the word capacity is 16K = 2^14, so 14 address lines are required. Among them, 11 will be common to all chips and 3 will be connected to a 3 × 8 decoder whose outputs drive the chip-select inputs.
Example 4: A certain memory has a capacity of 4K×8
1. How many data input and data output lines does it have?
2. How many address lines does it have?
3. What is its capacity in bytes?
1. In 4K × 8, the second number represents the number of bits in each word, so the number of data input lines is 8 (and likewise 8 data output lines).
2. 4K = 4 × 1024 = 4096 = 2^12 locations, so 12 address lines are needed.
3. Each of the 4096 locations stores one byte (8 bits), so the capacity is 4096 bytes (4 KB).

Example 5: A microprocessor uses RAM chips of 1024x1 capacity.
1. How many chips will be required and how many address lines will be connected to provide capacity of 1024
bytes.
2. How many chips will be required to obtain a memory of capacity of 16 K bytes.
1. Available chips = 1024 × 1 capacity
   Required capacity = 1024 × 8
   Number of chips = 8
   Number of address lines required = 10
   As the word capacity is the same (1024), the same address lines will be connected to all chips.
2. Number of chips required = (16K × 8) / (1024 × 1) = 128

Example 6:
1. How many 128× 8 RAM chips are required to provide a memory capacity of 2048 bytes.
2. How many lines of address bus must be used to access 2048 bytes of memory. How many lines of these will be
common to each chip?
3. How many bits must be decoded for chip select? What is the size of decoder?

i. Available RAM chips = 128 × 8
   Required memory capacity = 2048 × 8
   Number of chips required = 2048 / 128 = 16
ii. The available chips are of size 128 × 8, i.e. there are 128 locations in total and each location can store 8 bits. Thus, the total number of address lines required to access 128 locations is 7. These seven lines are common to all chips. To access 2048 locations, we require 11 address lines.
iii. The higher-order address lines will be applied to the decoder input. The number of inputs to the decoder will be 11 − 7 = 4. The size of the decoder will be 4 × 16. These 16 decoder outputs will be connected to the chip-select input of the individual chips.

Example 7: For the following memory units (specified by the number of words × the number of bits per word), determine the number of address lines, input/output lines and the number of bytes that can be stored in the specified memory
i. 64K x 8
ii. 16M x 32
iii. 4G x 64
iv. 2K x 16
i. 64K × 8: input/output lines = 8; address lines = 16; capacity = 64 KB
ii. 16M × 32: input/output lines = 32; address lines = 24; capacity = 64 MB (16M × 4 bytes)
iii. 4G × 64: input/output lines = 64; address lines = 32; capacity = 32 GB (4G × 8 bytes)
iv. 2K × 16: input/output lines = 16; address lines = 11; capacity = 4 KB (2K × 2 bytes)

C THE INPUT/OUTPUT UNIT
Input Devices:
An input device is a hardware device that allows the computer user to enter data and commands into the computer. Examples of input devices are the keyboard, mouse, scanner, joystick, light pen, touch pad, trackball and microphone.
Output Devices:
Output devices are used to communicate the results of computations to the user in a form they understand. Examples
of output devices are monitors, printers, speakers, plotters and projectors.
Monitors:
A monitor is a device that displays computer output on a screen. Another name for monitor is visual display unit (VDU).
Characteristics
Monitors are characterized by the TECHNOLOGY THEY USE, their RESOLUTION, their REFRESH RATE and their SIZE.
Technology
Based on the technology used, monitors are of two main types: Cathode Ray Tube (CRT) and Liquid Crystal Display (LCD) monitors.
Refresh rate:
The refresh rate of a monitor refers to the number of times an image is redrawn on the screen per second. This number
is measured in hertz (Hz)
Screen Size:
The screen size refers to the DIAGONAL DISTANCE from one corner of the display to the opposite corner. It is measured
in INCHES.
Input/ Output Devices
They allow data and commands to be entered into the computer AND at the same time convey information out of the
computer. Examples include touch screen and electronic whiteboard
Expansion Cards:
An expansion card is a circuit board that is inserted into an expansion slot on the motherboard to add functionality to
the computer. Expansion cards are also called expansion boards, plug-in boards, add-on cards, controller cards, adapter
cards, or interface cards. Examples are the graphics card, sound card, and network interface card
Graphics Card
A graphics card, also called video card, or graphics adapter, is an expansion card that controls and produces video on
the monitor. It controls and calculates an image’s appearance on the screen.
Sound Card
A sound card also known as audio card, or audio adapter is an expansion card that enables a computer to manipulate
and output sound. Sound cards enable the computer to output sound through speakers, or headphones connected to
the board, to record sound input from a microphone connected to the computer and to manipulate sound stored on the
disk.
I/O Ports:
A port is a socket on the computer into which an external device can be plugged. Ports provide a pathway for data
exchange between computer and peripheral devices such as keyboards, monitors, and printers. Examples of port
include : PS/2 port, VGA port, Ethernet port, serial ports, parallel port, USB port and firewire port.
THE I/O INTERFACE
The I/O interface provides a method for transferring information between internal and external devices. Peripherals
connected to a computer need special communication links for interfacing them with the Central Processing Unit. The
purpose of the communication link is to resolve the differences that exist between the CPU and each peripheral. The
major differences are:
• Peripherals are electromechanical and electromagnetic devices while the CPU and memory are electronic devices.
Therefore, conversion of signal values may be needed.
• The data transfer rate of peripherals is usually slower than the transfer rate of the CPU and consequently, a
synchronization mechanism may be needed.
• Data codes and formats in the peripherals differ from the word format in the CPU and memory.
• The operating modes of peripherals are different from each other and must be controlled so as not to disturb the
operation of other peripherals connected to the CPU.
To resolve these differences, computer systems include special hardware components between the CPU and peripherals
to supervise and synchronize all input and output transfers.

14
The I/O Bus
The buses that connect peripheral devices to the CPU are called I/O or expansion buses.
On modern PCs, the following types of expansion buses are found: the AGP (Accelerated Graphics Port) bus, ISA (Industry Standard Architecture) bus, PCI (Peripheral Component Interconnect) bus, PCIe (PCI Express) bus, USB (Universal Serial Bus) and FireWire bus. AGP, ISA, PCI, and PCIe are parallel buses while USB and FireWire are serial buses.
The I/O Controller
An I/O controller, also called a device controller, connects input and output (I/O) devices to the bus system of the CPU and manages the exchange of data between the CPU and the device or devices it controls.

Block Diagram of I/O Controller

Motherboard with Memory Hub Controller and Bus Bridge

I/O Control Methods


Data transfer between the CPU and I/O devices can be done in any of three possible ways: programmed I/O, interrupt-driven I/O and direct memory access.
Programmed I/O
Programmed I/O is a way of controlling input/output activity in which the processor is programmed to interrogate a
peripheral device to see if it is ready for a data transfer.
Interrupt-driven I/O
Interrupt driven I/O is a way of controlling input/output activity whereby a peripheral device that needs to make or
receive a data transfer sends a signal to the CPU. The signal sent to the CPU is called an interrupt.
Direct Memory Access
In DMA, the CPU issues a command to the DMA controller including a read or write command, the address of the I/O
device involved, the starting location in memory and number of words to be read or written. The CPU then continues
with other tasks while the transfer is done. When the task is complete, the DMA controller sends an interrupt to the
CPU. Thus, the CPU is involved only at the beginning and end of the transfer. When large volumes of data are to be
moved, direct memory access is used.
D INSTRUCTION SET ARCHITECTURE
The instruction set architecture (ISA) of a processor is the combination of the instruction set and all the resources
needed for their execution, such as the registers, the addressing modes and the memory.
An instruction set is a collection of all possible machine language commands that are understood and can be executed
by a processor.
Types of Basic Processor Instructions
Data Movement Instructions: These are used to transfer or copy data from one location to another either in the registers
or in the main memory. e.g. LOAD, STORE, MOVE
Arithmetic Instructions: These instructions are used to perform operations on numerical data. (e.g. ADD, SUB, SHIFT)
Logical Instructions: These are used to perform Boolean operations on non-numerical data. (e.g. CMP, AND, OR, XOR
instructions)
Program Control Instructions: These are used to change the sequence of a program execution. E.g. CALL, RETURN, HALT
Input–output Instructions: These are used to transfer data from and to I/O devices.
Machine Control Instructions: Machine control instructions include HALT (Stop processing and wait) and NOP (No
operation) instructions.
Instruction Formats:
Every machine instruction is made up of TWO parts: OPCODE and OPERAND. The Opcode (operation code) denotes the
basic machine operations like ADD, STORE or XOR. The Operand (one or more) provides the data which the instruction
manipulates. ADD A, #10 //ADD is the Opcode; A and #10 are Operands.
The number of operands in an instruction format depends on the internal organization of a CPU. The three most
common CPU organizations are accumulator, stack, and general-purpose register (GPR) architectures. General purpose
register organization is subdivided into three: register-memory, load/store and memory-memory organizations.
Accumulator Architecture
In this architecture, one of the operands is IMPLICITLY the accumulator register and need not be specified in the
instructions. E.g. LOAD X // AC←M[X] Meaning copy the value contained in the memory location X to the accumulator
register.
Stack Architecture
In the stack architecture, operands are IMPLICIT. They are at the top of a stack (TOS). Instructions read their operands
from and write their results to the stack.
Memory-Register Architecture
This is a type of general-purpose register architecture in which operands are either memory operands or register
operands. Here, operands are EXPLICIT and any instruction can access memory.
Load/Store Architecture
This is a type of general-purpose register architecture that allows ONLY LOAD and STORE instructions to access
memory. All other instructions use registers. A Load/Store architecture has instructions to do either ALU operations or
to access memory but not both. It is also referred to as register-register architecture

0-address instruction (stack architecture)


1-address instruction (accumulator architecture)

1. Three-address instruction: Computers with a three-address instruction format can use each address field to specify two sources and a destination, which can be either a processor register or a memory operand. It results in short programs but requires many bits to specify three addresses.
   Example: ADD R1, A, B (R1 ← M[A] + M[B])
2. Two-address instruction: Each address field can specify either a processor register or a memory word.
   Example: MOV R1, A (R1 ← M[A]); MUL R1, R2 (R1 ← R1 * R2)
3. One-address instruction: It uses an implied accumulator (AC) register for all data manipulation. The other operand is in a register or in memory.
   Example: LOAD A (AC ← M[A]); ADD B (AC ← AC + M[B])
4. Zero-address instruction: A stack-organized computer does not use an address field for instructions such as ADD and MUL.
Examples
1. Write the code sequence for the statement E←(A+B) * (C-D) in stack architecture (0-address)
PUSH A  // TOS ← M[A]
PUSH B  // TOS ← M[B]
ADD     // TOS ← (A + B)
PUSH C  // TOS ← M[C]
PUSH D  // TOS ← M[D]
SUB     // TOS ← (C - D)
MUL     // TOS ← (C - D) * (A + B)
POP E   // E ← [TOS]
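To make the stack evaluation above concrete, the sketch below simulates a tiny stack machine executing exactly this PUSH/ADD/SUB/MUL/POP sequence; the memory values chosen for A to D are arbitrary illustrative numbers.

```python
# Tiny stack-machine sketch evaluating E <- (A + B) * (C - D).
# Memory values are arbitrary; the trace mirrors the listing above.
memory = {"A": 2, "B": 3, "C": 10, "D": 4, "E": None}
stack = []

program = [("PUSH", "A"), ("PUSH", "B"), ("ADD", None),
           ("PUSH", "C"), ("PUSH", "D"), ("SUB", None),
           ("MUL", None), ("POP", "E")]

for op, addr in program:
    if op == "PUSH":
        stack.append(memory[addr])
    elif op == "POP":
        memory[addr] = stack.pop()
    else:  # ADD / SUB / MUL pop two operands and push the result
        b, a = stack.pop(), stack.pop()
        stack.append({"ADD": a + b, "SUB": a - b, "MUL": a * b}[op])

print(memory["E"])  # (2 + 3) * (10 - 4) = 30
```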
2. Write the 0-address, 1-address, 2-address and 3-address instructions to evaluate X = (P + Q) × (R + S)

2-address instructions:
MOV R1, P
ADD R1, Q
MOV R2, R
ADD R2, S
MUL R1, R2
MOV X, R1

One-address instructions (using an implied accumulator (AC) register for all data manipulations):
LOAD P
ADD Q
STORE T
LOAD R
ADD S
MUL T
STORE X
Here T is a temporary memory location required to store the intermediate result.

Zero-address instructions:
PUSH P
PUSH Q
ADD
PUSH R
PUSH S
ADD
MUL
POP X

3-address instructions:
ADD R1, P, Q
ADD R2, R, S
MUL X, R1, R2
Here R1 and R2 are processor registers.

Addressing Mode
The different ways in which the location of an operand is specified in an instruction are referred to as addressing modes.
An addressing mode therefore indicates to the processor how to locate the operands associated with an instruction.
Computers use various addressing mode techniques to
• Provide programming flexibility to users through use of pointers to memory, counter for loop control, data indexing
and program relocation.
• Reduce the size of the addressing field of the instruction.
Effective address:
The effective address, defined as the memory address obtained from the computation based on the addressing mode, is the ACTUAL ADDRESS of the operand.
Types of Addressing Modes
The most common addressing techniques are:
Implied Addressing mode, Immediate Addressing mode, Direct Addressing Mode, Indirect Addressing Mode, Register
(direct) Addressing Mode, Register Indirect Addressing mode, Auto-increment or Auto-decrement mode, Displacement
addressing mode (PC relative addressing mode, Indexed addressing mode, Base register addressing mode)
Let us suppose [x] means contents at location x for all the addressing modes.
Implied addressing mode
In this mode, the operands are implicitly stated in the instruction. For example, register reference instructions such as
CMA (complement accumulator), CLA (clear accumulator) and zero-address instructions that use stack organization.
Immediate Addressing mode
In immediate addressing mode the value of the operand is held within the instruction itself. The instruction format in immediate mode is: [ Opcode | Operand (data) ]

Example: ADD R2, R3, #10


This instruction adds the number 10 to the content of register R3 and stores the result in R2. The operand 10 has been addressed immediately. The # symbol is used to indicate immediate addressing.
Immediate operands are useful for initializing registers to a constant value or setting the initial values of variables. No memory reference is required other than the instruction fetch. The size of the number is restricted to the size of the address (operand) field.

Register (direct) Addressing Mode
In register addressing mode, the operand is the content of a processor register;
the name of the register is given in the instruction. The effective address (EA)
of the operand is the register and not the content of the register.
For example: ADD R1, R2, R3
The above instruction uses three registers to hold all operands. Registers R2
and R3, hold the two source operands while register R1 holds the result of the
computation. The content of R2 and R3 are added and the result is stored in R1.
Register Indirect addressing mode
In this mode the instruction specifies a register in the CPU whose contents give the address of the operand in memory. The address field of the instruction uses fewer bits to select a register than would have been required to specify a memory address directly.

Direct Addressing Mode


In direct addressing mode, also called ABSOLUTE ADDRESSING
MODE, the operand is in a memory location; the address of this location is given
explicitly in the instruction. The effective address (actual address) of the operand
is given in the instruction.
Example: ADD A
The instruction adds the content of memory cell A to the accumulator, that is, ACC ← [ACC] + M[A].
Only one memory reference is required and no special calculation is needed. Its limitation is a limited address space.

Indirect Addressing Mode


Indirect Addressing gives the address of the address of the data in the
instruction. (The address specified in the instruction is not the address of the
operand. It is the address of a memory location that contains the address of the
operand.)
(Here EA is effective address of operand)
The advantage of this approach is that for a word length of N, an address space of 2^N is available.
The disadvantage is that the instruction execution requires two memory references to fetch the operand.
Auto increment and auto decrement addressing mode: This is similar to register indirect mode except that the register
is incremented or decremented after (or before) its value is used to access memory.
Displacement Addressing
A very powerful mode of addressing combines the capabilities of direct addressing and register indirect addressing.
Displacement addressing requires that the instruction have two address fields, at least one of which is explicit. The value
contained in one address field (value = A) is used directly. The other address field, or an implicit reference based on
opcode, refers to a register whose contents are added to A to produce the effective address. The three displacement addressing modes are: relative addressing, base-register addressing, and index addressing.
Relative addressing
PC Relative address mode: In this mode, the effective address of an operand is
obtained by adding the content of a program counter to the address part of the
instruction. The address part of the instruction is usually a signed number
which can be either positive or negative. The result obtained after adding the
content of the program counter to the address field produces an effective
address whose position in memory is relative to the address of the next
instruction

Effective address = PC + address part

Index address mode: In this mode, the effective address of an operand is obtained by adding the content of an index register to the address
part of the instruction. The index register is a special CPU register that
stores an index value and the address field of the instruction stores the
base address of a data array in the memory. The distance between the
base address and the address of the operand is the index value that is
stored in the index register. The index register can be incremented to
facilitate access to consecutive operands stored in arrays using the
same instruction.
Base register addressing mode:
In this mode, the effective address of an operand is obtained by adding
the content of a base register to the address part of the instruction. Base
register has a base or beginning address and the address field of the
instruction gives a displacement relative to the base address.
This addressing mode is used in computers to facilitate the relocation of
programs in memory, i.e., when programs and data are moved from one
segment of memory to another, as required in multiprogramming systems, the address values of instructions must reflect this change of
position. With a base register, the displacement values of instructions do not have to change. Only the value of the base
register requires updating to reflect the beginning of a new memory segment.
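A short sketch of the relocation idea: logical addresses in the program are displacements, and the loader only changes the base register when the program moves. The base addresses used below are made up for illustration.

```python
# Base-register addressing sketch: effective address = base + displacement.
# Program code keeps the same displacements wherever it is loaded.
def effective_address(base: int, displacement: int) -> int:
    return base + displacement

displacements = [0, 4, 8]          # addresses used inside the program
for base in (1000, 5000):          # program loaded at two different places
    print(base, [effective_address(base, d) for d in displacements])
# 1000 [1000, 1004, 1008]
# 5000 [5000, 5004, 5008]  <- only the base register changed
```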
Examples
1 (a) What is relocatable code?
(b) Explain base register addressing and its role in a multi-programming operating system.

HINT
(a) Code which can be loaded into any area in memory OR into a different area in memory each time it is run;
(b) Effective address = content of base register + offset/displacement. In a multi-programming OS, all address references use base register addressing. When a program is loaded into main memory, the base register is loaded with the address of the base of the memory block containing the program. Every address is relative to the base register.
2 (a) ISA of a processor consists of 64 registers, 125 instructions (opcodes) and 8 bits for immediate mode. In a given program,
30% of the instructions take one input register and have one output register, 30% have two input registers and one output register,
20% have one immediate input, and one output register, and remaining have two immediate input, 1 register input and one output
register. Calculate the number of bits required for each instruction type. Assume that the ISA requires that all instructions be a
multiple of 8 bits in length.
(b) Compare the memory space required with that of variable length instruction set.

(a) Since there are 125 instructions, we need 7 bits to differentiate them, as 64 < 125 ≤ 128. For 64 registers, we need 6 bits, and 8 bits for immediate mode.
For Type 1, 1 reg in, 1 reg out: 7 + 6 + 6 = 19 bits, rounded up to 24 bits
For Type 2, 2 reg in, 1 reg out: 7 + 6 + 6 + 6 = 26 bits, rounded up to 32 bits
For Type 3, 1 imm in, 1 reg out: 7 + 8 + 6 = 21 bits, rounded up to 24 bits
For Type 4, 1 reg in, 2 imm in, 1 reg out: 7 + 6 + 8 + 8 + 6 = 35 bits, rounded up to 40 bits
(b) As the largest instruction type requires 40 bits, the fixed-length instruction format uses 40 bits per instruction. The variable-length format uses 0.3 × 24 + 0.3 × 32 + 0.2 × 24 + 0.2 × 40 = 29.6 bits on average, that is, about 26% less space.
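The sketch below reproduces this calculation; rounding each width up to a multiple of 8 bits follows the constraint stated in the problem.

```python
# Instruction-width calculation for the ISA example above.
OPCODE, REG, IMM = 7, 6, 8             # bits: 125 opcodes, 64 registers, 8-bit imm

def width(fields: list[int], multiple: int = 8) -> int:
    raw = sum(fields)
    return -(-raw // multiple) * multiple   # round up to a multiple of 8

types = {                               # (field widths, fraction of instructions)
    "1 reg in, 1 reg out": ([OPCODE, REG, REG], 0.3),
    "2 reg in, 1 reg out": ([OPCODE, REG, REG, REG], 0.3),
    "1 imm in, 1 reg out": ([OPCODE, IMM, REG], 0.2),
    "1 reg + 2 imm in, 1 reg out": ([OPCODE, REG, IMM, IMM, REG], 0.2),
}

widths = {name: width(f) for name, (f, _) in types.items()}
fixed = max(widths.values())
average = sum(widths[n] * frac for n, (_, frac) in types.items())
print(widths)
print(f"fixed = {fixed} bits, variable average = {average:.1f} bits")
# widths: 24, 32, 24, 40; fixed = 40, variable avg = 29.6 bits (~26% less)
```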

4. A two-word instruction LOAD is stored at location 300 with its address field in the next location. The address field has value
600 and value stored at 600 is 500 and at 500 is 650. The words stored at 900, 901 and 902 are 400, 401 and 402, respectively. A
processor register R contains the number 800 and index register has value 100. Evaluate the effective address and operand if
addressing mode of the instruction is as follows: Direct, Indirect, Relative, Immediate, Register indirect, Index
Memory layout (from the accompanying figure; in particular, location 800 holds 700 and location 700 holds 900):

Addressing Mode Effective Address Operand
Direct 600 500
Indirect 500 650
Relative 902 402
Immediate 301 600
Register indirect 800 700
Index 700 900
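These results can be checked with the short sketch below; the memory contents follow the problem statement and the layout figure (location 800 holds 700, location 700 holds 900).

```python
# Effective-address check for the two-word LOAD instruction at location 300.
memory = {301: 600, 600: 500, 500: 650, 900: 400, 901: 401, 902: 402,
          800: 700, 700: 900}
ADDR_FIELD, PC_NEXT = memory[301], 302   # address field; PC after the 2-word fetch
R, XR = 800, 100                          # processor register and index register

ea = {
    "Direct": ADDR_FIELD,
    "Indirect": memory[ADDR_FIELD],
    "Relative": PC_NEXT + ADDR_FIELD,
    "Immediate": 301,                     # the operand is the address field itself
    "Register indirect": R,
    "Index": XR + ADDR_FIELD,
}
for mode, addr in ea.items():
    print(f"{mode:17} EA = {addr:4} operand = {memory[addr]}")
```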

Flynn’s Taxonomy of Computer Architecture


There are four distinct categories:
Single Instruction, Single Data Stream (SISD)
A SISD machine is a uniprocessor (single processor) machine which receives a single stream of instructions that operate
on a single stream of data. At any given moment, a single instruction is being executed on a given data set. In SISD
machines, instructions are executed sequentially. That is, instructions are executed one after the other, one at a time.
Hence, these machines are also called sequential or serial processor machines. They are typical examples of VON
NEUMANN’S COMPUTER MODEL

Single Instruction, Multiple Data Stream (SIMD)


SIMD uses many processors. Each processor executes the same instruction but uses different data inputs – they are all
doing the same calculations but on different data at the same time. SIMD machines have n identical processors which
operate under the control of a single instruction stream issued by a central control unit on different data sets. SIMD
machines are also called ARRAY PROCESSOR MACHINES.

Applications:
They have a particular application in graphics cards.
Other applications include sound sampling – or any application where a large number of items need to be altered by the
same amount (since each processor is doing the same calculation on each data item).
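As a loose software analogy to SIMD's "same operation on many data items", the NumPy sketch below applies one operation element-wise across a whole array at once (NumPy itself typically uses the CPU's SIMD vector instructions under the hood); the sample values and gain are arbitrary.

```python
import numpy as np

# SIMD-style data parallelism: one instruction (multiply) applied
# element-wise to many data items, as in sound-sample scaling.
samples = np.array([0.1, -0.4, 0.25, 0.9, -0.7])  # arbitrary audio samples
gain = 0.5
print(samples * gain)   # every element scaled by the same operation at once
```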
Multiple Instruction, Single Data Stream (MISD)
MISD machines are multiprocessor machines capable of executing different instructions on different processors, but all
of them operating on the same data set. They have n processors, each with its own control unit, that share a common
memory. Each processor receives a distinct instruction stream but all operate on the same data stream. Such machines
no longer exist
Multiple Instruction, Multiple Data Stream (MIMD)
MIMD machines are multiprocessor machines capable of executing multiple instructions on multiple data sets. They
have n processors, n streams of instructions and n streams of data. Each processor element in this model has a separate
instruction stream and data stream hence such machines are well suited for any kind of application

RISC AND CISC MACHINES
Reduced Instruction Set Computers
RISC is a CPU design with a small number of basic and simple machine language instructions, from which more complex
instructions can be composed.
Implementing a processor with a simplified instruction set design provides several advantages over implementing a
comparable CISC design
Higher Performance:
Since a simplified instruction set allows for pipelined processing, RISC processors often achieve 2 to 4 times the performance of CISC processors using comparable semiconductor technology and the same clock rates.
Lower per-chip cost:
Because the instruction set of a RISC processor is so simple, it uses up much less chip space. Smaller chips allow a
semiconductor manufacturer to place more parts on a single silicon wafer, which can lower the per-chip cost
dramatically.
Shorter Design Cycle:
Since RISC processors are simpler than corresponding CISC processors, they can be designed more quickly, and can take
advantage of other technological developments sooner than corresponding CISC designs, leading to greater leaps in
performance between generations.
Complex Instruction Set Computers
CISC is a CPU design with a large number of different and complex instructions. One key difference between RISC and
CISC is that CISC instruction sets are not constrained to the load/store architecture, in which arithmetic and logic
operations can be performed only on operands that are in processor registers. Another key difference is that instructions do not necessarily have to fit into a single word. Some instructions may occupy a single word, but others may span multiple words (variable-length instructions).

RISC | CISC
Fewer instructions | More instructions
Simpler instructions | More complex instructions
Small number of instruction formats | Many instruction formats
Single-cycle instructions whenever possible | Multi-cycle instructions
Fixed-length instructions | Variable-length instructions
Only load and store instructions address memory | Many types of instructions address memory
Fewer addressing modes | More addressing modes
Multiple register sets | Fewer registers
Hard-wired control unit | Microprogrammed control unit
Pipelining easier | Pipelining more difficult

The following are some points to note.
• For RISC the term ‘reduced’ affects more than just the number of instructions. A reduction in the number of
instructions is not the major driving force for the use of RISC.
• The reduction in the complexity of the instructions is a key feature of RISC.
• The typical CISC architecture contains many specialized instructions.
• The specialized instructions are designed to match the requirement of a high-level programming language.
• The specialized instructions require multiple memory accesses which are very slow compared with register
accesses.
• The simplicity of the instructions for a RISC processor allows data to be stored in registers and manipulated in them with no recourse to memory access other than that necessary for initial loading and possible final storing.
• The simplicity of RISC instructions makes it easier to use hard-wiring inside the control unit.
• The complexity of many of the CISC instructions makes hard-wiring much more difficult, so microprogramming is
the norm.
PARALLELISM
Parallelism can be achieved through parallel processing and pipelining.
PARALLEL PROCESSING
Parallel processing is the use of multiple processors simultaneously to execute a single program or task. Some personal
computers implement parallel processing with multiprocessors while others have multicore processors.
Multicore Processor
A multi-core processor is a computer processor on a single integrated circuit with two (dual) or more separate
processing units, called cores, each of which reads and executes program instructions. The instructions are ordinary
CPU instructions (such as ADD, MOVE etc.) but the single processor can run instructions on separate cores at the same time, increasing overall speed for programs that support multithreading or other parallel computing techniques.
Doubling the number of cores will not simply double a computer’s speed. CPU cores have to communicate with each
other through channels and this uses up some of the extra speed. In addition, improvement in performance gained by
the use of a multi-core processor depends on the software development algorithms used and their implementation.
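One standard way to quantify this limit is Amdahl's law, which the text does not name but which captures the point: if only a fraction p of a program can run in parallel, n cores give a speedup of 1 / ((1 − p) + p/n). The 80% figure below is purely hypothetical.

```python
# Amdahl's law: speedup = 1 / ((1 - p) + p / n), where p is the fraction of
# the program that can run in parallel and n is the number of cores.
def speedup(p, n):
    return 1 / ((1 - p) + p / n)

# Hypothetical program that is 80% parallelisable:
for n in (1, 2, 4, 8):
    print(f"{n} core(s): speedup = {speedup(0.8, n):.2f}")
# 2 cores give ~1.67x, not 2x; 8 cores give only ~3.33x
```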
Multiprocessors
Multiprocessing is the use of two or more central processing units (CPUs) within a single computer system.
Multiprocessor systems (containing many processors, each possibly containing multiple cores) either execute a number of different application tasks in parallel, or they execute subtasks of a single large task in parallel.
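As a minimal sketch of executing subtasks of a single task in parallel, Python's standard multiprocessing module can farm work out to several worker processes. The function and the process count here are invented for illustration.

```python
from multiprocessing import Pool

def square(x):          # one subtask of the larger job
    return x * x

if __name__ == "__main__":
    with Pool(processes=4) as pool:            # four worker processes
        print(pool.map(square, range(10)))     # subtasks run in parallel
```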
Pipeline Processing (instruction-level parallelism)
Pipeline processing (Pipelining) is an implementation technique whereby multiple instructions are overlapped in
execution.
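A minimal sketch of this overlap, assuming an idealised three-stage pipeline with no hazards or stalls: in cycle c, stage s works on instruction c − s, so several instructions are in flight at once.

```python
instructions = ["I1", "I2", "I3", "I4"]
stages = ["Fetch", "Decode", "Execute"]

# Print which instruction occupies each stage in every clock cycle.
for cycle in range(len(instructions) + len(stages) - 1):
    active = [f"{stage}:{instructions[cycle - s]}"
              for s, stage in enumerate(stages)
              if 0 <= cycle - s < len(instructions)]
    print(f"Cycle {cycle + 1}: " + ", ".join(active))
```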
Multiple Issue – Superscalar
A superscalar CPU architecture implements a form of parallelism called instruction level parallelism within a single
processor. It therefore allows faster CPU throughput than would otherwise be possible at a given clock rate.
INTERRUPT
Interrupt:
An interrupt is a signal sent by a hardware device, a program or the internal clock to the processor, requesting its attention.
Interrupts can be caused by, for example
• A timing signal
• Input/output processes (a disk drive is ready to receive more data, for example)
• A hardware fault (an error has occurred such as a paper jam in a printer, for example)
• User interaction (the user pressed a key to interrupt the current process, such as <CTRL><ALT><BREAK>, for
example)
• A software error that cannot be ignored (an .exe file that cannot be found to initiate the execution of a program, or an attempt to divide by zero, for example).
Interrupts are triggered regularly by a timer, to indicate that it is the turn of the next process to have processor time
(Processor Scheduling). It is because a processor can be interrupted that multi-tasking can take place.

Interrupt Service Routines or Interrupt Handler
The interrupt is managed by the interrupt handler.
An interrupt condition alerts the processor and serves as a request for the processor to interrupt the currently executing
code when permitted, so that the event can be processed in a timely manner. If the request is accepted, the processor
responds by suspending its current activities, saving its state, and executing a function called an interrupt handler (or
an interrupt service routine, ISR) to deal with the event. This interruption is temporary, and, unless the interrupt
indicates a fatal error, the processor resumes normal activities after the interrupt handler finishes.
Interrupts are also commonly used to implement COMPUTER MULTITASKING, especially in REAL-TIME COMPUTING. Systems that use interrupts in these ways are said to be interrupt-driven.
Interrupts are assigned PRIORITIES, and lower-priority interrupts may be disabled while a higher-priority interrupt is being serviced.
The table below shows some of the processes that can generate an interrupt, and the priority level that is attached to
that interrupt. Level 1 is the highest priority, 5 the lowest. Interrupts with the same priority level are dealt with on a
first-come first-served basis.
Level | Type | Possible causes
1 | Hardware failure | Power failure – this could have catastrophic consequences if it is not dealt with immediately, so it is allocated the top priority.
2 | Reset interrupt | Some computers have a reset button or routine that literally resets the computer to a start-up position.
3 | Program error | The current application is about to crash, so the OS will attempt to recover the situation. Possible errors include variables called but not defined, division by zero, overflow, misuse of a command word, etc.
4 | Timer | Some computers run in a multitasking or multiprogramming environment. A timer interrupt is used as part of the time-slicing process.
5 | Input/Output | Request from a printer for more data, incoming data from a keyboard or a mouse key press, etc.
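The ordering the table describes can be sketched with a priority queue: lower level numbers are serviced first, with ties broken first-come first-served. The interrupt sources below are drawn from the table; the arrival order is invented.

```python
import heapq

# Tuples sort by level first (1 = highest priority), then by arrival order,
# giving first-come first-served handling within the same level.
pending = []
events = [(4, "timer"), (1, "power failure"), (5, "printer request"), (4, "timer")]
for arrival, (level, source) in enumerate(events):
    heapq.heappush(pending, (level, arrival, source))

while pending:
    level, arrival, source = heapq.heappop(pending)
    print(f"Servicing level-{level} interrupt from {source}")
```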

Once the interrupt has been serviced, the original values of the registers are retrieved from the stack and the process
resumes from the point that it left off. A test for the presence of interrupts is carried out at the end of each fetch-decode-
execute cycle.
How the Interrupt Works
An additional step is added to the fetch–execute cycle. This extra step fits between the completion of one execution and
the start of the next. After each execution the processor checks to see if an interrupt has been sent by looking at the
contents of the interrupt register.

The fetch–execute cycle with interrupts
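A minimal sketch of this modified cycle; the CPU fields and instruction strings are invented for illustration, and the decode/execute work is stubbed out.

```python
class CPU:
    def __init__(self, program):
        self.memory = program            # instructions held in "memory"
        self.pc = 0                      # program counter
        self.interrupt_register = None   # set asynchronously by devices
        self.running = True

def step(cpu):
    instruction = cpu.memory[cpu.pc]     # fetch
    cpu.pc += 1                          # point to the next instruction
    print("executed:", instruction)      # decode + execute (stubbed out)
    if cpu.pc >= len(cpu.memory):
        cpu.running = False
    if cpu.interrupt_register is not None:   # the extra step: check the register
        print("interrupt pending:", cpu.interrupt_register)
        cpu.interrupt_register = None        # a real CPU would invoke the ISR here

cpu = CPU(["LOAD A", "ADD B", "STORE C"])
cpu.interrupt_register = "timer"         # simulate a device raising an interrupt
while cpu.running:
    step(cpu)
```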
When the processor receives an interrupt signal, it suspends execution of the running program or process and disables
all interrupts of a lower priority in order to service the interrupt. It does this using the Interrupt Service Routine (ISR)
which calls the routine required to handle the interrupt. Most interrupts are only temporary, so the processor needs to be able to set aside the current task before it can start on the interrupt. It does this by placing the contents of the registers, such as the PC and CIR, onto the system stack. Once the interrupt has been processed, the CPU retrieves the values from the stack, puts them back in the appropriate registers and carries on.
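A hedged sketch of this save-and-restore behaviour, with a Python list standing in for the system stack, a dictionary for the registers, and a made-up handler:

```python
stack = []   # stands in for the system stack

def service_interrupt(registers, isr):
    stack.append(dict(registers))    # push register contents (e.g. PC, CIR)
    isr(registers)                   # the ISR is free to overwrite the registers
    registers.update(stack.pop())    # pop the saved values and resume

regs = {"PC": 100, "CIR": "ADD R1, R2"}
service_interrupt(regs, lambda r: r.update(PC=0, CIR="ISR code"))
print(regs)   # {'PC': 100, 'CIR': 'ADD R1, R2'} - original state restored
```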
Types of Interrupts
There are three types of interrupts:
Hardware Interrupts are generated by hardware devices to signal that they need some attention from the OS. They may
have just received some data (e.g., keystrokes on the keyboard or data arriving on the Ethernet card); or they have just
completed a task which the operating system previously requested, such as transferring data between the hard drive
and memory.
Software Interrupts are generated by programs when they want to request a system call to be performed by the
operating system. The most common use of a software interrupt is the supervisor call instruction, which provides a means of switching from CPU user mode to supervisor mode. Certain operations in the computer may be assigned to supervisor mode only – for example, a complex input or output transfer procedure.
Traps: Internal interrupts are also called traps. Examples of interrupts caused by internal error conditions are register
overflow, attempt to divide by zero, an invalid operation code, stack overflow, and protection violation.
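A loose user-space analogy is POSIX signal handling from Python's standard library: the operating system asynchronously diverts the program to a registered handler and then lets it resume, mirroring the interrupt-driven pattern. This is a software mechanism, not a true hardware interrupt.

```python
import signal
import time

def handler(signum, frame):
    print(f"Caught signal {signum}; handled, resuming the main program")

signal.signal(signal.SIGINT, handler)   # register a handler for Ctrl+C

print("Running - press Ctrl+C to raise an interrupt")
for _ in range(5):
    time.sleep(1)   # the handler runs asynchronously, then the loop continues
```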
Typical uses of interrupts
• Interrupts are commonly used to service hardware timers, transfer data to and from storage (e.g., disk I/O) and
communication interfaces (e.g. Ethernet), handle keyboard and mouse events, and to respond to any other time-
sensitive events as required by the application system.
• Non-maskable interrupts are typically used to respond to high-priority requests such as watchdog timer timeouts,
power-down signals and traps.
• Hardware timers are often used to generate periodic interrupts. In some applications, such interrupts are counted
by the interrupt handler to keep track of absolute or elapsed time, or used by the OS task scheduler to manage
execution of running processes, or both.
• A disk interrupt signals the completion of a data transfer from or to the disk peripheral; this may cause a process to
run which is waiting to read or write.
• A power-off interrupt predicts imminent loss of power, allowing the computer to perform an orderly shut-down
while there still remains enough power to do so.
An interrupt is said to be masked when it has been disabled, or when the CPU has been instructed to ignore it. A non-
maskable interrupt (NMI) cannot be ignored, and is generally used only for critical hardware errors.