Computer Arch. and Org. USS
Introduction
Computer Architecture: refers to those attributes of a system that have a direct impact on the logical execution of a program, like the instruction set, the number of bits used to represent various data types, I/O mechanisms and techniques for addressing memory.
Computer Organization: refers to the physical organization of the hardware components of a computer system, and to the characteristics of these components, i.e. the operational units and their interconnections that realize the architectural specifications.
For a given architecture, there could be many different models, each representing a different type of
organization, depending on the cost, physical size and technology used.
A STRUCTURE AND ROLE OF THE PROCESSOR AND ITS COMPONENTS
Examples of some registers are the instruction register, program counter, accumulator, status register, memory data
register and memory address register.
− Program Counter
The program counter (PC) holds (stores) the memory location (address) of the next instruction to be executed.
− Instruction Register
The instruction register (IR) holds (stores) the instruction that is currently being executed by the processor. The PC holds the address of an instruction, while the IR holds the actual instruction.
− Accumulator
The accumulator (AC) holds (stores) results of computations performed by the ALU.
− Memory Address Register
The memory address register (MAR) holds (stores) the memory address where data is about to be read
(fetched) or written (stored).
− Memory Data Register
The memory data register (MDR) holds (stores) data that has just been read in from memory or data produced
by the ALU and waiting to be written to memory.
− Status Register
The status register holds (stores) information about the state of the processor, such as whether an overflow has
been detected. Status register bits are called flag bits or flags and each flag has a unique purpose. For example,
there are flag bits for negative, zero, overflow, and carry.
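A minimal sketch (in Python, not part of the original notes) of how these registers cooperate during one fetch-execute step; the memory contents and the instruction encoding below are made up purely for illustration.

# Toy model of the CPU registers described above (illustrative only).
memory = {0: ("LOAD", 10), 1: ("ADD", 11), 10: 7, 11: 5}   # made-up contents

PC, IR, AC, MAR, MDR = 0, None, 0, None, None

for _ in range(2):                 # fetch and execute two instructions
    MAR = PC                       # MAR holds the address to be read
    MDR = memory[MAR]              # MDR receives the word read from memory
    IR = MDR                       # IR now holds the instruction itself
    PC += 1                        # PC already points at the next instruction
    opcode, address = IR           # decode
    MAR = address
    MDR = memory[MAR]              # fetch the operand
    if opcode == "LOAD":
        AC = MDR                   # AC holds the loaded value
    elif opcode == "ADD":
        AC = AC + MDR              # AC accumulates the ALU result

print(PC, AC)                      # -> 2 12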
The control unit is made up of the PC (Program Counter), the IR (Instruction Register), and the Instruction Decoder.
Registers that INTERFACE the CPU with its memory system.
Buffer registers interface the processor with its memory system. The two standard buffer registers are the memory
address register (MAR) and the memory buffer register (MBR). The MBR is also known as the memory data register.
The Size or Length of Each Register
The size or length of each register is determined by its function. For example, the memory address register, which holds
the address of the next location in memory to be accessed, must have the same number of bits as the memory address.
The instruction register holds the instruction being executed and, therefore, should have the same number of bits as the instruction.
The clock
All computers have an internal clock. The clock generates a signal that is used to synchronize the operation of the
processor and the movement of data around the other components of the computer.
The speed of a clock is measured in either megahertz (MHz – millions of cycles per second) or gigahertz (GHz – 1000 million cycles per second). Common clock speeds today range from 2 GHz to 3 GHz. Intel’s 5.5 GHz i9-12900KS is
currently [15/10/2022] the world’s fastest desktop processor.
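As a rough illustration of the units involved (an added sketch, not from the notes): a clock of f hertz completes f cycles per second, so one cycle lasts 1/f seconds.

# Convert a clock frequency to the duration of one clock cycle (illustrative helper).
def cycle_time_ns(frequency_hz):
    return 1e9 / frequency_hz          # nanoseconds per cycle

print(cycle_time_ns(3e9))              # 3 GHz   -> ~0.33 ns per cycle
print(cycle_time_ns(5.5e9))            # 5.5 GHz -> ~0.18 ns per cycle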
Buses
A bus is an electronic path along which data travels from one system component to another. It consists of a set of parallel
lines that interconnects computer components together, allowing the exchange of data between them.
A bus can be unidirectional (transmission of data can be only in one direction) or bidirectional (transmission of data
can be in both directions).
A bus that connects to all the three system components (CPU, memory and I/O devices) is called a system bus. System
buses can be grouped into data bus, address bus and control bus.
Data Bus
− The data bus is a bidirectional path for moving data and instructions between system components.
− The number of lines present in a data bus is called the width of data bus. Therefore, bus width is the number of bits that
can be sent down a bus in one go. Data bus width limits the maximum number of bits, which can be transferred
simultaneously between two modules.
Address Bus
− The address bus is a unidirectional bus that carries address information from the CPU to system components.
− The CPU uses the address bus to send the address of the memory location that data is to be written to or read from.
− Also, when the CPU reads data from or writes to a port, it sends the port address out on the address bus.
− The width of the address bus determines the maximum possible memory of the system that can be addressed.
− In other words, the size of the address bus determines the address space of the computer. A computer with a 32-bit address bus can address a maximum of 2^32 = 4G memory locations (4 GB of byte-addressable memory).
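A quick check of the figure above (an added sketch; the helper name is invented): an n-bit address bus can select 2^n distinct locations.

# Number of addressable locations for a given address-bus width (illustrative helper).
def address_space(width_bits):
    return 2 ** width_bits

print(address_space(32))               # 4294967296 locations
print(address_space(32) // 2 ** 30)    # 4  -> 4 GB when each location holds one byte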
Control Bus
The control bus is a bidirectional bus that transmits command, timing and specific status information between system
components. Typical control signals include: memory write, memory read, I/O read, I/O write, bus request, bus grant,
transfer ACK, interrupt request, interrupt ACK, clock and reset.
Von Neumann and Harvard architectures
− Von Neumann architecture
In the strictest definition, the term Von Neumann Computer refers to a specific type of computer architecture in which
INSTRUCTIONS AND DATA are stored together in a COMMON SHARED MEMORY. It is a stored-program computer
model based on the following 3 concepts:
− Data and instructions are stored in a SINGLE read-write memory
− The contents of this memory are addressable by location, without regard to the type of data contained there
− Execution occurs in a SEQUENTIAL fashion (unless explicitly modified) from one instruction to the next.
Stored Program Computer
Stored program concept: [Both program and the data on which it performs processing and calculations are stored in memory together]
− The program to be executed is resident in a memory directly accessible to the processor.
− Instructions are fetched one at a time (serially) from this memory and executed by the processor
− Data is resident in a memory directly accessible to the processor which can change it, if instructed to by the
executing program.
− Thus, the same data can be accessed repeatedly if so desired and the same instructions can be executed repeatedly
if so required
A Von-Neumann Computer Consists of Five Major Units:
Input unit, arithmetic/logic unit, control unit, memory unit and output unit.
[Figure: block diagram of a Von Neumann computer – the CPU (containing the arithmetic/logic unit and control unit) connected to the memory unit, input unit and output unit]
Harvard Architecture
The Harvard computer architecture stores program instructions and data in separate memories. The program memory
and data memory have different communication pathways to the CPU.
The key difference between this and von Neumann is that separate buses are used for data and instructions, both of which address different parts of memory. So rather than storing data and instructions in the same memory and then passing them through the same bus, there are two separate buses addressing two different memories (Harvard architecture).
The advantage of this is that the instructions and data are handled more quickly as they do not have to share the same
bus. Therefore, a program running on Harvard architecture can be executed faster and more efficiently. Harvard
architecture is widely used on EMBEDDED COMPUTER SYSTEMS such as mobile phones, burglar alarms etc. where
there is a specific use, rather than being used within general purpose PCs.
B THE STORAGE UNIT
− The storage medium is the surface, substrate or physical material that holds the actual data, e.g. hard disks, floppy disks, CDs and DVDs.
− A storage device is the computer hardware that reads data from or writes data onto the storage medium, e.g. floppy disk drives and CD or DVD drives.
Computer storage is classified basically into two: Primary storage and secondary storage.
Primary Storage:
Use:
Primary storage is computer storage used for holding programs and data that the CPU is currently working on.
Description:
It is also called Immediate Access Storage (IAS) as it can be directly accessed by the CPU.
Examples:
Main memory (RAM), Cache memory and ROM
Random Access Memory:
Use:
RAM is used to store data, files, part of an application or part of the operating system CURRENTLY IN USE.
Characteristics:
• RAM is volatile (memory contents are lost on powering off the computer).
• RAM provides random access to the stored bytes, words, or larger data units. This means that it requires the same
amount of time to access information from RAM, irrespective of where it is located in it.
• RAM can also be written to or read from, and the data stored can be changed by the user or by the computer
Why RAMs with higher storage capacity improve computer performance:
• With more RAM, more of program instructions and data can be loaded into RAM, and there is less need to keep
swapping data in and out to the swap file on the hard disk drive. (Swapping of data slows down the speed at which
applications can run)
• Also, an increase in RAM will improve the multitasking capabilities of the computer as the instructions of several
programs will be able to be stored in RAM at the same time.
Types of RAM: DYNAMIC RAM (DRAM) AND STATIC RAM (SRAM)
DYNAMIC RAM (DRAM):
Characteristics:
• DRAM must be refreshed every few milliseconds to prevent data loss. This is because DRAM is made up of micro
capacitors that slowly leak their charge over time.
• It uses 1 TRANSISTOR AND 1 CAPACITOR PER BIT
Use:
DRAM is mostly used as MAIN MEMORY because DRAMs:
− Are much less expensive to manufacture than SRAMs
− Consume less power than SRAMs
− Have a higher memory capacity than SRAMs.
STATIC RAM (SRAM):
Characteristics:
• It makes use of flip flops (a bistable circuit composed of four or six transistors) which hold each bit of memory. It
does NOT have a capacitor in each cell.
• SRAM is more expensive than DRAM, and it takes up more space.
• SRAM does not need to be constantly refreshed.
• An SRAM memory cell has more parts, so it takes up more space on a chip than a DRAM cell.
• SRAM is much faster than DRAM when it comes to data access
Use (application):
SRAM chips are usually used in CACHE MEMORY due to their HIGH SPEED.
Memory modules can be grouped into SIMM and DIMM.
The principal difference between SIMM (Single Inline Memory Module) and DIMM (Dual Inline Memory Module) is that
pins on opposite sides of a SIMM are ‘tied together’ to form one electrical contact while on a DIMM, opposing pins remain
electrically isolated to form two separate contacts.
Read Only Memory (ROM)
Use
• ROM has specialized uses for the storage of data or programs that are going to be used unchanged over and over
again
• In a general-purpose computer system, the most important use is in storing the bootstrap program. This is a
program that runs immediately when a system is switched on.
• The ROM memory chip stores the Basic Input Output System (BIOS).
Characteristics:
• ROM shares the random-access or direct-access properties of RAM.
• ROMs are non-volatile (the contents are not lost after powering off the computer)
• ROM are permanent memory devices (the contents cannot be changed)
• ROM, as the name implies, has only read capability and no write capability.
Types of ROM
Programmable ROM: PROM can be programmed with a special tool, but after it has been programmed the contents
cannot be changed. The manufacturer of the chip supplies blank PROM chips to a system builder. The system builder
installs their programs or data into the chips. The program or data once installed cannot be changed.
Erasable Programmable ROM (EPROM): The installed data or program can be ERASED (USING ULTRAVIOLET LIGHT)
and new data or a new program can be installed. However, this reprogramming usually requires the chip to be removed
from the circuit (or from the computer or device using it).
Electrically Erasable PROM (EEPROM). An electrical signal can be used to remove existing data. This has the major
advantage that the chip can remain in the circuit while the contents are changed.
Flash Memory is a kind of semiconductor-based non-volatile, rewritable computer memory that can be
electrically erased and reprogrammed. It is a specific type of EEPROM.
Flash memory is used in devices such as digital cameras, mobile phones, printers, laptop computers, and sound recording and playback devices, such as MP3 players.
Cache Memory
Use
The cache is a smaller, faster memory which stores copies of the data and instructions from the MOST FREQUENTLY
used main memory locations so that they are immediately available to the CPU when needed. Cache memory is used by
the central processing unit of a computer to reduce the average time to access memory.
Why Cache improves computer performance:
Cache memory is faster than main memory (RAM). This means that the CPU can access cache memory more quickly
than it can access RAM. Therefore, retrieving frequently requested data and instructions from RAM and storing them in cache memory speeds up memory accesses, thereby increasing the performance of the computer.
Another advantage of cache memory is that the CPU does not have to use the motherboard’s system bus for data transfer.
Whenever data must be passed through the system bus, the data transfer speed slows to the motherboard’s capability.
The CPU can process data much faster by avoiding the bottleneck created by the system bus
Cache Memory Operation
Cache memory uses the principle of LOCALITY OF REFERENCE, which states that over a short interval of time, the
addresses generated by a typical program refer to a few localized areas of memory repeatedly, while the remainder of
memory is accessed relatively infrequently.
When the processor attempts to read a word of memory, a check is made to determine if the word is in the cache. If so,
the word is delivered to the processor. If not, a block of main memory, consisting of some fixed number of words, is read
into the cache and then the word is delivered to the processor. Because of the phenomenon of LOCALITY OF
REFERENCE, when a block of data is fetched into the cache to satisfy a single memory reference, it is likely that there
will be future references to that same memory location or to other words in the block. If the information is present in
the cache, it is called a CACHE HIT. If the information is not present in cache, then it is called a CACHE MISS.
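The behaviour described above can be sketched with a tiny direct-mapped cache model (illustrative Python, not from the notes; the cache size, block size and access pattern are made up). Once a block has been loaded on a miss, repeated and nearby references hit – exactly the locality effect.

# Tiny direct-mapped cache model: 8 lines, 4-word blocks (made-up parameters).
NUM_LINES, BLOCK_SIZE = 8, 4
cache = {}                                   # line index -> tag currently stored

def access(address):
    block = address // BLOCK_SIZE            # which memory block the word belongs to
    line, tag = block % NUM_LINES, block // NUM_LINES
    if cache.get(line) == tag:
        return "hit"
    cache[line] = tag                        # miss: load the whole block into the line
    return "miss"

# A loop-like access pattern: the same few addresses referenced repeatedly.
pattern = [0, 1, 2, 3, 0, 1, 2, 3, 40, 41, 0, 1]
print([access(a) for a in pattern])
# ['miss', 'hit', 'hit', 'hit', 'hit', 'hit', 'hit', 'hit', 'miss', 'hit', 'hit', 'hit']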
ACCESS TIME is the time interval between the read/write request and the availability of data. The lower the access time, the faster the memory.
Storage Capacity: Registers < Cache Memory < Primary Memory < Magnetic Disk < Magnetic Tape
Access Speed: Magnetic Tape < Magnetic Disk < Primary Memory < Cache Memory <Register
Cost Per Bit: Magnetic Tape < Magnetic Disk < Primary Memory < Cache Memory < Register
Access Time: Registers < Cache Memory < Primary Memory < Magnetic Disk < Magnetic Tape
Main memory is usually larger than one RAM chip. Consequently, these chips are
combined into a single memory of the desired size. For example, suppose you need
to build a 32K × 8 byte-addressable memory and all you have are 2K × 8 RAM
chips. You could connect 16 rows of chips together as shown in the figure adjacent
Each chip addresses 2K bytes. Addresses for this memory must have 15 bits (there
are 32K = 2^5 × 2^10 bytes to access).
But each chip requires only 11 address lines (each chip holds only 2^11 bytes). In
this situation, a decoder is needed to decode either the leftmost or rightmost 4 bits
of the address to determine which chip holds the desired data. Once the proper
chip has been located, the remaining 11 bits are used to determine the offset on
that chip. Whether we use the 4 leftmost or 4 rightmost bits depends on how the
memory is interleaved. (Note: We could also build a 16K × 16 memory using 8
rows of 2 RAM chips each. If this memory were word addressable, assuming
16-bit words, an address for this machine would have only 14 bits.)
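The address split just described can be sketched in Python (an added illustration): for the 32K × 8 memory built from 2K × 8 chips, the 15-bit address divides into a 4-bit chip-select field and an 11-bit on-chip offset (here the leftmost 4 bits select the chip).

# Split a 15-bit address into chip-select bits and an on-chip offset.
TOTAL_BITS, CHIP_BITS = 15, 11                    # 32K locations overall, 2K per chip

def decode(address):
    chip = address >> CHIP_BITS                   # leftmost 4 bits choose one of 16 chips
    offset = address & ((1 << CHIP_BITS) - 1)     # remaining 11 bits locate the byte on that chip
    return chip, offset

print(decode(0))        # (0, 0)      -> chip 0, first byte
print(decode(2048))     # (1, 0)      -> chip 1, first byte
print(decode(32767))    # (15, 2047)  -> last chip, last byte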
Examples
Example 1: A RAM chip has a capacity of 32K ×16
1. How many memory addresses does this RAM have?
2. How many address lines will be needed for this RAM?
1. The RAM has 32K memory addresses: 32K = 32 × 1024 = 32768 memory addresses.
2. N = log2(32768) = 15. To address 32K memory locations, we require 15 address lines.
Example 3: The capacity of 2K × 16 PROM is to be expanded to 16 K × 16. Find the number of PROM chips required
and the number of address lines in the expanded memory.
Required capacity = 16K × 16
Available chip (PROM) = 2K × 16
Number of chips = (16K × 16) / (2K × 16) = 8
The address lines required for a single chip = 11 (2K = 2^11).
In the expanded memory, the word capacity is 16K = 2^14, so 14 address lines are required. Among them, 11 will be common to all chips and 3 will be connected to a 3-to-8 decoder.
Example 4: A certain memory has a capacity of 4K×8
1. How many data input and data output lines does it have?
2. How many address lines does it have?
3. What is its capacity in bytes?
1. In 4K × 8, the second number represents the number of bits in each word, so there are 8 data input lines (and also 8 data output lines).
2. 4K = 2^12, so the memory has 12 address lines.
3. Its capacity is 4K × 8 bits = 4096 bytes (4 KB).
Example 5: A microprocessor uses RAM chips of 1024x1 capacity.
1. How many chips will be required and how many address lines will be connected to provide capacity of 1024
bytes.
2. How many chips will be required to obtain a memory of capacity of 16 K bytes.
1. Available chips = 1024 × 1 capacity
Required capacity = 1024 × 8
Number of chips = 8
Number of address lines required = 10
As the word capacity is the same (1024), the same address lines will be connected to all chips.
2. Number of chips required = 128
Example 6:
1. How many 128 × 8 RAM chips are required to provide a memory capacity of 2048 bytes?
2. How many lines of address bus must be used to access 2048 bytes of memory. How many lines of these will be
common to each chip?
3. How many bits must be decoded for chip select? What is the size of decoder?
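The notes do not include a worked answer for Example 6; the following Python sketch computes the quantities asked for using the same reasoning as the earlier examples, so the printed values should be read as a suggested working rather than the lecturer's answer.

import math

# Example 6 (suggested working): build 2048 bytes from 128 x 8 RAM chips.
total_bytes, chip_words, chip_bits = 2048, 128, 8

chips = (total_bytes * 8) // (chip_words * chip_bits)   # = 16 chips
address_lines = int(math.log2(total_bytes))             # = 11 lines to address 2048 bytes
common_lines = int(math.log2(chip_words))               # = 7 lines go to every chip
select_bits = address_lines - common_lines              # = 4 bits decoded for chip select

print(chips, address_lines, common_lines, select_bits)  # 16 11 7 4  (a 4-to-16 decoder)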
Example 7: For the following memory units (specified by the number of words the number of bits per word) determine
the number of address lines, input/output lines and the number of bytes that can be stored in the specified memory
i. 64K x 8
ii. 16M x 32
iii. 4G x 64
iv. 2K x 16
i. 64K × 8: input/output lines = 8; address lines = 16; bytes stored = 64 KB
ii. 16M × 32: input/output lines = 32; address lines = 24; bytes stored = 64 MB (16M × 4)
iii. 4G × 64: input/output lines = 64; address lines = 32; bytes stored = 32 GB (4G × 8)
iv. 2K × 16: input/output lines = 16; address lines = 11; bytes stored = 4 KB (2K × 2)
C THE INPUT/OUTPUT UNIT
Input Devices:
An input device is a hardware that allows the computer user to enter data and commands into the computer. Examples
of input devices are keyboard, mouse, scanner, joystick, light pen, touch pad, trackball and microphone.
Output Devices:
Output devices are used to communicate the results of computations to the user in a form they understand. Examples
of output devices are monitors, printers, speakers, plotters and projectors.
Monitors:
A monitor is a device that displays computer output on a screen. Another name for monitor is visual display unit (VDU).
Characteristics
Monitors are characterized by the TECHNOLOGY THEY USE, their RESOLUTION, their REFRESH RATE and their SIZE.
Technology
Based on technology used monitors are of two main types: Cathode Ray Tube (CRT) and Liquid Crystal Display Monitors.
Refresh rate:
The refresh rate of a monitor refers to the number of times an image is redrawn on the screen per second. This number
is measured in hertz (Hz)
Screen Size:
The screen size refers to the DIAGONAL DISTANCE from one corner of the display to the opposite corner. It is measured
in INCHES.
Input/ Output Devices
They allow data and commands to be entered into the computer AND at the same time convey information out of the
computer. Examples include touch screen and electronic whiteboard
Expansion Cards:
An expansion card is a circuit board that is inserted into an expansion slot on the motherboard to add functionality to
the computer. Expansion cards are also called expansion boards, plug-in boards, add-on cards, controller cards, adapter
cards, or interface cards. Examples are the graphics card, sound card, and network interface card
Graphics Card
A graphics card, also called video card, or graphics adapter, is an expansion card that controls and produces video on
the monitor. It controls and calculates an image’s appearance on the screen.
Sound Card
A sound card also known as audio card, or audio adapter is an expansion card that enables a computer to manipulate
and output sound. Sound cards enable the computer to output sound through speakers, or headphones connected to
the board, to record sound input from a microphone connected to the computer and to manipulate sound stored on the
disk.
I/O Ports:
A port is a socket on the computer into which an external device can be plugged. Ports provide a pathway for data
exchange between the computer and peripheral devices such as keyboards, monitors, and printers. Examples of ports include: PS/2 port, VGA port, Ethernet port, serial port, parallel port, USB port and FireWire port.
THE I/O INTERFACE
The I/O interface provides a method for transferring information between internal and external devices. Peripherals
connected to a computer need special communication links for interfacing them with the Central Processing Unit. The
purpose of the communication link is to resolve the differences that exist between the CPU and each peripheral. The
major differences are:
• Peripherals are electromechanical and electromagnetic devices while the CPU and memory are electronic devices.
Therefore, conversion of signal values may be needed.
• The data transfer rate of peripherals is usually slower than the transfer rate of the CPU and consequently, a
synchronization mechanism may be needed.
• Data codes and formats in the peripherals differ from the word format in the CPU and memory.
• The operating modes of peripherals are different from each other and must be controlled so as not to disturb the
operation of other peripherals connected to the CPU.
To resolve these differences, computer systems include special hardware components between the CPU and peripherals
to supervise and synchronize all input and output transfers.
The I/O Bus
The buses that connect peripheral devices to the CPU are called I/O or expansion buses.
On modern PCs, the following types of expansion buses are found: the AGP (Accelerated Graphics Port) bus, ISA (Industry Standard Architecture) bus, PCI (Peripheral Component Interconnect) bus, PCIe (PCI Express) bus, USB (Universal Serial Bus) and FireWire bus. AGP, ISA, PCI, and PCIe are parallel buses while USB and FireWire are serial buses.
The I/O Controller
An I/O controller, also called a device controller, connects input and output (I/O) devices to the bus system of the CPU and manages the exchange of data between the CPU and the device or devices it controls.
[Figure: Motherboard with Memory Hub Controller and Bus Bridge]
1. Three-address instruction: Computers with the three-address instruction format use the address fields to specify two source operands and a destination, each of which can be either a processor register or a memory operand. It results in short programs but requires too many bits to specify three addresses. Example: ADD R1, A, B (R1 ← M[A] + M[B])
2. Two-address instruction: Each address field can specify either a processor register or a memory word.
Example: MOV R1, A (R1 ← M[A]);
MUL R1, R2 (R1 ← R1 * R2)
3. One-address instruction: It uses an implied accumulator (AC) register for all data manipulation. The other operand is in a register or memory.
Example: LOAD A (AC ← M[A]);
ADD B (AC ← AC + M[B])
4. Zero-address instruction: A stack-organized computer does not use an address field for instructions such as ADD and MUL.
Examples
1. Write the code sequence for the statement E←(A+B) * (C-D) in stack architecture (0-address)
PUSH A //TOS ← M[A]
PUSH B // TOS ← M[B]
ADD // TOS ← (A+B)
PUSH C // TOS ← M[C]
PUSH D // TOS ← M[D]
SUB //TOS ← (C-D)
MUL // TOS ← (C-D) *(A+B)
POP E // E←[TOS]
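A small Python simulation of the zero-address sequence above (an added sketch; the variable values are made up) shows how the operand stack evolves.

# Simulate the zero-address (stack) evaluation of E = (A + B) * (C - D).
memory = {"A": 6, "B": 4, "C": 9, "D": 2}     # made-up values
stack = []

def push(name):  stack.append(memory[name])
def add():       b, a = stack.pop(), stack.pop(); stack.append(a + b)
def sub():       b, a = stack.pop(), stack.pop(); stack.append(a - b)
def mul():       b, a = stack.pop(), stack.pop(); stack.append(a * b)
def pop(name):   memory[name] = stack.pop()

push("A"); push("B"); add()       # TOS = A + B
push("C"); push("D"); sub()       # TOS = C - D
mul()                             # TOS = (C - D) * (A + B)
pop("E")

print(memory["E"])                # (6 + 4) * (9 - 2) = 70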
2. Write the 0-address, 1-address, 2-address and 3-address instructions to evaluate X =(P +Q) x (R +S)
2-address instructions:
MOV R1, P
ADD R1, Q
MOV R2, R
ADD R2, S
MUL R1, R2
MOV X, R1

One-address instructions (use an implied accumulator (AC) register for all data manipulations):
LOAD P
ADD Q
STORE T
LOAD R
ADD S
MUL T
STORE X
Here ‘T’ is a temporary memory location required to store the intermediate result.
Addressing Mode
The different ways in which the location of an operand is specified in an instruction are referred to as addressing modes.
An addressing mode therefore indicates to the processor how to locate the operands associated with an instruction.
Computers use various addressing mode techniques to
• Provide programming flexibility to users through use of pointers to memory, counter for loop control, data indexing
and program relocation.
• Reduce the size of the addressing field of the instruction.
Effective address:
The effective address is the memory address obtained from the computation dictated by the addressing mode; it is the ACTUAL ADDRESS of the operand.
Types of Addressing Modes
The most common addressing techniques are:
Implied Addressing mode, Immediate Addressing mode, Direct Addressing Mode, Indirect Addressing Mode, Register
(direct) Addressing Mode, Register Indirect Addressing mode, Auto-increment or Auto-decrement mode, Displacement
addressing mode (PC relative addressing mode, Indexed addressing mode, Base register addressing mode)
Let us suppose [x] means contents at location x for all the addressing modes.
Implied addressing mode
In this mode, the operands are implicitly stated in the instruction. For example, register reference instructions such as
CMA (complement accumulator), CLA (clear accumulator) and zero-address instructions that use stack organization.
Immediate Addressing mode
In immediate addressing mode the value of the operand is held within the instruction itself. The instruction format in immediate mode is: | Opcode | Operand (data) |
Register (direct) Addressing Mode
In register addressing mode, the operand is the content of a processor register;
the name of the register is given in the instruction. The effective address (EA)
of the operand is the register and not the content of the register.
For example: ADD R1, R2, R3
The above instruction uses three registers to hold all operands. Registers R2
and R3, hold the two source operands while register R1 holds the result of the
computation. The content of R2 and R3 are added and the result is stored in R1.
Register Indirect addressing mode
In this mode the instruction specifies a register in the CPU whose
contents give the address of the operand in memory. Address field
of the instruction uses fewer bits to select a register than would be required to specify a memory address directly.
PC relative addressing mode: Effective address = PC + address part
HINT
(a) Code which can be loaded into any area in memory, OR into a different area in memory each time it is run;
(b) Effective address = content of base register + offset/displacement/number. In a multi-programming OS, all address references use base register addressing. When a program is loaded into main memory, the base register is loaded with the address of the base of the memory block containing the program. Every address is relative to the base register.
2 (a) ISA of a processor consists of 64 registers, 125 instructions (opcodes) and 8 bits for immediate mode. In a given program,
30% of the instructions take one input register and have one output register, 30% have two input registers and one output register,
20% have one immediate input, and one output register, and remaining have two immediate input, 1 register input and one output
register. Calculate the number of bits required for each instruction type. Assume that the ISA requires that all instructions be a
multiple of 8 bits in length.
(b) Compare the memory space required with that of variable length instruction set.
(a) Since there are 125 instructions, we need 7 bits to differentiate them (64 < 125 < 128 = 2^7). For 64 registers, we need 6 bits, and 8 bits are used for the immediate mode.
For Type 1, 1 reg in, 1 reg out: 7 + 6 + 6 = 19 bits ~ 32 bits
For Type 2, 2 reg in, 1 reg out: 7+ 6 + 6 + 6 = 26 bits ~ 32 bits
For Type 3, 1 imm in, 1 reg out: 7 + 6 + 8 = 21 bits ~ 32 bits
For Type 4, reg in, 2 imm in, 1 reg out: 7 + 6 + 8 + 8 + 6 = 35 bits ~48 bits
(b) As the largest instruction type requires 48 bits, the fixed-length instruction format uses 48 bits per instruction. The variable-length instruction format uses 0.3 × 32 + 0.3 × 32 + 0.2 × 32 + 0.2 × 48 ≈ 36 bits on average, that is, about 25% less space.
4. A two-word instruction LOAD is stored at location 300 with its address field in the next location. The address field has value
600 and value stored at 600 is 500 and at 500 is 650. The words stored at 900, 901 and 902 are 400, 401 and 402, respectively. A
processor register R contains the number 800 and the index register has the value 100. Evaluate the effective address and the operand if the addressing mode of the instruction is: Direct, Indirect, Relative, Immediate, Register indirect, Index.
The memory layout is as follows (the original layout figure is not reproduced here):
Addressing Mode Effective Address Operand
Direct 600 500
Indirect 500 650
Relative 902 402
Immediate 301 600
Register indirect 800 700
Index 700 900
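The table can be reproduced with a short Python sketch (added for illustration; the contents of locations 700 and 800 are not shown in the surviving notes, so the values 900 and 700 below are inferred from the table itself).

# Example 4 re-computed: a two-word LOAD instruction at 300, address field at 301.
memory = {301: 600, 600: 500, 500: 650, 700: 900, 800: 700,
          900: 400, 901: 401, 902: 402}          # 700/800 inferred from the table
R, INDEX, PC_AFTER_FETCH = 800, 100, 302          # PC points past the two-word instruction
addr_field = memory[301]                          # = 600

effective = {
    "Direct":            addr_field,
    "Indirect":          memory[addr_field],
    "Relative":          PC_AFTER_FETCH + addr_field,
    "Immediate":         301,                     # the operand is the address field itself
    "Register indirect": R,
    "Index":             addr_field + INDEX,
}

for mode, ea in effective.items():
    operand = addr_field if mode == "Immediate" else memory[ea]
    print(f"{mode:18s} EA={ea:4d}  operand={operand}")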
Applications:
SIMD machines have a particular application in graphics cards.
Other applications include sound sampling – or any application where a large number of items need to be altered by the same amount (since each processor performs the same calculation on its own data item).
Multiple Instruction, Single Data Stream (MISD)
MISD machines are multiprocessor machines capable of executing different instructions on different processors, but all
of them operating on the same data set. They have n processors, each with its own control unit, that share a common
memory. Each processor receives a distinct instruction stream but all operate on the same data stream. Such machines
no longer exist.
Multiple Instruction, Multiple Data Stream (MIMD)
MIMD machines are multiprocessor machines capable of executing multiple instructions on multiple data sets. They have n processors, n streams of instructions and n streams of data. Each processing element in this model has a separate instruction stream and data stream; hence such machines are well suited for any kind of application.
RISC AND CISC MACHINES
Reduced Instruction Set Computers
RISC is a CPU design with a small number of basic and simple machine language instructions, from which more complex
instructions can be composed.
Implementing a processor with a simplified instruction set design provides several advantages over implementing a
comparable CISC design
Higher Performance:
Since a simplified instruction set allows for pipelined processing, RISC processors often achieve 2 to 4 times the performance of CISC processors using comparable semiconductor technology and the same clock rates.
Lower per-chip cost:
Because the instruction set of a RISC processor is so simple, it uses up much less chip space. Smaller chips allow a
semiconductor manufacturer to place more parts on a single silicon wafer, which can lower the per-chip cost
dramatically.
Shorter Design Cycle:
Since RISC processors are simpler than corresponding CISC processors, they can be designed more quickly, and can take
advantage of other technological developments sooner than corresponding CISC designs, leading to greater leaps in
performance between generations.
Complex Instruction Set Computers
CISC is a CPU design with a large number of different and complex instructions. One key difference between RISC and
CISC is that CISC instruction sets are not constrained to the load/store architecture, in which arithmetic and logic
operations can be performed only on operands that are in processor registers. Another key difference is that instructions do not necessarily have to fit into a single word. Some instructions may occupy a single word, but others may span multiple words (variable-length instructions).
RISC | CISC
Fewer instructions | More instructions
Simpler instructions | More complex instructions
Small number of instruction formats | Many instruction formats
Single-cycle instructions whenever possible | Multi-cycle instructions
Fixed-length instructions | Variable-length instructions
Only load and store instructions address memory | Many types of instructions address memory
Fewer addressing modes | More addressing modes
Multiple register sets | Fewer registers
Hard-wired control unit | Microprogrammed control unit
Pipelining easier | Pipelining more difficult
PARALLEL PROCESSING
Parallel processing is the use of multiple processors simultaneously to execute a single program or task. Some personal
computers implement parallel processing with multiprocessors while others have multicore processors.
Multicore Processor
A multi-core processor is a computer processor on a single integrated circuit with two (dual) or more separate
processing units, called cores, each of which reads and executes program instructions. The instructions are ordinary
CPU instructions (such as ADD, MOVE etc.) but the single processor can run instructions on separate cores at the same time, increasing overall speed for programs that support multithreading or other parallel computing techniques.
Doubling the number of cores will not simply double a computer’s speed. CPU cores have to communicate with each
other through channels and this uses up some of the extra speed. In addition, improvement in performance gained by
the use of a multi-core processor depends on the software development algorithms used and their implementation.
Multiprocessors
Multiprocessing is the use of two or more central processing units (CPUs) within a single computer system.
Multiprocessor systems (containing many processors, each possibly containing multiple cores) either execute a number of different application tasks in parallel, or they execute subtasks of a single large task in parallel.
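A minimal illustration of the idea (an added Python sketch using the standard multiprocessing module; the work function is made up): subtasks of one large task are farmed out to several processes, each of which may be scheduled on a different core or processor.

from multiprocessing import Pool

def subtask(chunk):
    # Each process works on its own slice of the data in parallel.
    return sum(x * x for x in chunk)

if __name__ == "__main__":
    data = list(range(1_000_000))
    chunks = [data[i::4] for i in range(4)]     # split the task into 4 subtasks
    with Pool(processes=4) as pool:             # run them on up to 4 cores/processors
        partials = pool.map(subtask, chunks)
    print(sum(partials))                        # same result as the sequential version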
Pipeline Processing (instruction-level parallelism)
Pipeline processing (Pipelining) is an implementation technique whereby multiple instructions are overlapped in
execution.
Multiple Issue – Superscalar
A superscalar CPU architecture implements a form of parallelism called instruction level parallelism within a single
processor. It therefore allows faster CPU throughput than would otherwise be possible at a given clock rate.
INTERRUPT
Interrupt:
An interrupt is a signal sent by a hardware device or program or internal clock to the processor requesting its attention.
Interrupts can be caused by, for example
• A timing signal
• Input/output processes (a disk drive is ready to receive more data, for example)
• A hardware fault (an error has occurred such as a paper jam in a printer, for example)
• User interaction (the user pressed a key to interrupt the current process, such as <CTRL><ALT><BREAK>, for
example)
• A software error that cannot be ignored (if an .exe file could not be found to initiate the execution of a program OR
an attempt to divide by zero, for example).
Interrupts are triggered regularly by a timer, to indicate that it is the turn of the next process to have processor time
(Processor Scheduling). It is because a processor can be interrupted that multi-tasking can take place.
Level | Type | Possible causes
1 | Hardware failure | Power failure – this could have catastrophic consequences if it is not dealt with immediately, so it is allocated the top priority.
2 | Reset interrupt | Some computers have a reset button or routine that literally resets the computer to a start-up position.
3 | Program error | The current application is about to crash, so the OS will attempt to recover the situation. Possible errors could be variables called but not defined, division by zero, overflow, misuse of a command word, etc.
4 | Timer | Some computers run in a multitasking or multiprogramming environment. A timer interrupt is used as part of the time-slicing process.
5 | Input/Output | Request from a printer for more data, incoming data from a keyboard or a mouse key press, etc.
Once the interrupt has been serviced, the original values of the registers are retrieved from the stack and the process
resumes from the point that it left off. A test for the presence of interrupts is carried out at the end of each fetch-decode-
execute cycle.
How the Interrupt Works
An additional step is added to the fetch–execute cycle. This extra step fits between the completion of one execution and
the start of the next. After each execution the processor checks to see if an interrupt has been sent by looking at the
contents of the interrupt register.
When the processor receives an interrupt signal, it suspends execution of the running program or process and disables
all interrupts of a lower priority in order to service the interrupt. It does this using the Interrupt Service Routine (ISR)
which calls the routine required to handle the interrupt. Most interrupts are only temporary so the processor needs to
be able to put aside the current task before it can start on the interrupt. It does this by placing the contents of the
registers, such as the PC and CIR on to the system stack. Once the interrupt has been processed the CPU will retrieve the
values from the stack, put them back in the appropriate registers and carry on.
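The extra step can be sketched with a short Python illustration (added, not from the notes; the register names follow the notes, everything else is invented): after each execution the interrupt register is tested, and if an interrupt is pending the register contents are pushed onto the stack before the service routine runs, then restored afterwards.

# Sketch of a fetch-execute loop with the extra interrupt-check step.
stack = []
interrupt_register = []          # pending interrupts (made-up representation)

def fetch_decode_execute(state):
    state["PC"] += 1             # stand-in for a real fetch/decode/execute

def service(interrupt, state):
    stack.append(dict(state))    # save PC, CIR, etc. on the system stack
    print("servicing", interrupt)
    state.update(stack.pop())    # restore the registers and resume where we left off

state = {"PC": 0, "CIR": None}
interrupt_register.append("timer")     # e.g. the timer signals the end of a time slice

for _ in range(3):
    fetch_decode_execute(state)
    while interrupt_register:                  # extra step: test for pending interrupts
        service(interrupt_register.pop(0), state)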
Types of Interrupts
There are three types of interrupts:
Hardware Interrupts are generated by hardware devices to signal that they need some attention from the OS. They may
have just received some data (e.g., keystrokes on the keyboard or data arriving on the Ethernet card); or they have just
completed a task which the operating system previously requested, such as transferring data between the hard drive
and memory.
Software Interrupts are generated by programs when they want to request a system call to be performed by the
operating system. The most common use of software interrupt is associated with a supervisor call instruction. This
instruction provides means for switching from a CPU user mode to the supervisor mode. Certain operations in the
computer may be assigned to the supervisor mode only, as for example, a complex input or output transfer procedure.
Traps: Internal interrupts are also called traps. Examples of interrupts caused by internal error conditions are register
overflow, attempt to divide by zero, an invalid operation code, stack overflow, and protection violation.
Typical uses of interrupts
• Interrupts are commonly used to service hardware timers, transfer data to and from storage (e.g., disk I/O) and
communication interfaces (e.g. Ethernet), handle keyboard and mouse events, and to respond to any other time-
sensitive events as required by the application system.
• Non-maskable interrupts are typically used to respond to high-priority requests such as watchdog timer timeouts,
power-down signals and traps.
• Hardware timers are often used to generate periodic interrupts. In some applications, such interrupts are counted
by the interrupt handler to keep track of absolute or elapsed time, or used by the OS task scheduler to manage
execution of running processes, or both.
• A disk interrupt signals the completion of a data transfer from or to the disk peripheral; this may cause a process to
run which is waiting to read or write.
• A power-off interrupt predicts imminent loss of power, allowing the computer to perform an orderly shut-down
while there still remains enough power to do so.
An interrupt is said to be masked when it has been disabled, or when the CPU has been instructed to ignore it. A non-
maskable interrupt (NMI) cannot be ignored, and is generally used only for critical hardware errors.