StudyMaterial_CSE_3RD_Computer-System-Architecture
LECTURE NOTES
ON
COMPUTER SYSTEM ARCHITECTURE
Compiled by
CHAPTER-1
Basic structure of computer hardware
Computer Architecture
In computer engineering, computer architecture is a set of rules and methods that
describe the functionality, organization, and implementation of computer systems.
Functional unit
• A computer consists of five functionally independent main parts: input, memory,
arithmetic logic unit (ALU), output, and control unit.
• The input device accepts coded information, i.e., a source program written in a high-level
language.
• This is either stored in the memory or immediately used by the processor to perform
the desired operations.
• The program stored in the memory determines the processing steps.
• Basically, the computer converts the source program into an object program, i.e., into
machine language.
• Finally, the results are sent to the outside world through output device. All of these
actions are coordinated by the control unit.
Input unit: -
• The source program/high-level language program/coded information/simply data is fed
to the computer through input devices; the keyboard is the most common type.
• Whenever a key is pressed, the corresponding word or number is translated into its
equivalent binary code and sent over a cable either to the memory or to the processor.
• Examples - joysticks, trackballs, mice, scanners, etc. are other input devices.
Memory unit: -
Its function is to store programs and data.
It is basically of two types: 1. Primary memory 2. Secondary memory
Primary memory: -
• It is the memory exclusively associated with the processor and operates at high speed.
• The memory contains a large number of semiconductor storage cells.
• These are processed in groups of fixed size called words.
• Programs must reside in the memory during execution. Instructions and data can be
written into the memory or read out under the control of processor.
Secondary memory: -
• This type of memory is used where large amounts of data & programs have to be
stored, particularly information that is accessed infrequently.
• Examples: - Magnetic disks & tapes, optical disks (i.e., CD-ROM’s), floppies etc.
Control unit: -
The operations of all the units are coordinated by the control unit i.e., it acts as a nerve centre
that sends signals to other units and senses their states. The actual timing signals that govern
the transfer of data between input unit, processor, memory and output unit are generated
by the control unit.
Output unit:
These are the counterparts of the input unit. Their basic function is to send the
processed results to the outside world. Examples: - Printer, speakers, monitor
etc.
Bus Structure
A bus is a communication system that transfers information (in any form like data, address or
control information) between components, inside a computer, or between computers.
• It is the group of wires carrying a group of bits in parallel.
• There are three kinds of bus according to the type of information they carry like
1. Data Bus
2. Address Bus
3. Control Bus
• A bus which carries a word from or to memory is called a data bus. It carries data
from one system module to another. A data bus may consist of 32, 64, 128 or even more
separate lines; this number of lines decides the width of the data bus.
Each line can carry one bit at a time, so a data bus with 32 lines can carry 32 bits at a
time.
• Address Bus is used to carry the address of source or destination of the data on the
data bus.
• Control Bus is used to control the access, processing and information transferring.
• In this bus architecture, the processor completely supervises and participates
in the transfer.
• The information is first taken to a processor register and then to the memory; such a
transfer is known as a program-controlled transfer.
• The interconnection of the I/O unit, processor and memory by two
independent system buses is known as a two-bus interconnection structure.
• The system bus between the I/O unit and the processor consists of the DAB (device address bus), DB
(data bus) and CB (control bus). Similarly, the system bus between the memory and the processor
consists of the MAB (memory address bus), DB and CB.
• The communication exists between
✓ Memory to processor
✓ Processor to memory
✓ I/o to processor
✓ processor to i/o
✓ I/o to memory
Performance measures: -
Performance is the ability of the computer to quickly execute a program.
• The speed at which the computer executes a program is decided by the design of its
hardware and machine language instruction.
• Computer performance is a broad term when used in the context of a
computer system.
• A system that executes a program in less time is said to have higher performance.
Response Time: -
• Response time is the time spent to complete an event or an operation.
• It is also called as execution time or latency.
Throughput: -
Throughput is the amount of work done per unit of time. i.e., the amount of processing that
can be accomplished during a given interval of time.
• It is also called as bandwidth of the system.
• In general, faster response time leads to better throughput.
Elapsed time: -
• Elapsed time is the time spent from the start of execution of a program to its
completion.
• This performance measure is affected by the clock speed of the processor and the
concerned input output device.
MIPS
• An early measure of computer performance was the rate at which a given machine
executes instructions, usually expressed in millions of instructions per second (MIPS).
• It is calculated by dividing the number of instructions executed by the time required to run
the program.
CPI/IPC
• CPI – clock cycles per instruction, IPC – instructions per cycle.
• These are other measures: CPI is the number of clock cycles required to execute one
instruction, and IPC is the number of instructions executed per clock cycle (IPC = 1/CPI).
Speedup: -
• Computer architects use speedup to describe the performance of architectural
changes as different improvements are made to the system.
• It is defined as the ratio of the execution time before the change to the execution time after the change.
• Speed up = execution time before /Execution time after
Amdahl’s law: -
• This law states that “performance improvement to be gained by using a faster mode
of execution is limited by the fraction of time the faster mode can be used”.
• Amdahl’s law defines the term speed up.
• Speed up = performance of entire task using enhancement/ Performance of entire
task without using enhancement
• Performance = 1 / Execution time
• Speed up = execution time without using enhancement/Execution time with using
enhancement
• Factors affecting speedup are as follows:
1) The fraction of the computation time in the original machine that can be modified to
take advantage of the enhancement. This is called the enhanced fraction, which is
always less than or equal to one.
Fraction enhanced ≤ 1
2) The improvement gained by the enhanced execution mode, i.e., how much
faster the task could run using the enhancement.
Speed up > 1
• Speed up enhanced = time in original mode /Time in enhance mode
• Example: suppose a program takes 5 seconds in the enhanced mode while it took 10 seconds earlier.
So, speedup enhanced = 10/5 = 2. (A short sketch of these measures follows.)
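As an illustration that is not part of the original notes, the following Python sketch computes the measures discussed above (MIPS, CPI and Amdahl's-law speedup); the function names and all numeric values in the demo are assumptions made for the example.

# Minimal sketch of the performance measures above (all numbers are assumed).
def mips(instruction_count, exec_time_seconds):
    # MIPS = instructions executed / (execution time * 10^6)
    return instruction_count / (exec_time_seconds * 1e6)

def cpi(clock_cycles, instruction_count):
    # CPI = total clock cycles / instructions executed; IPC = 1 / CPI
    return clock_cycles / instruction_count

def amdahl_speedup(fraction_enhanced, speedup_enhanced):
    # Overall speedup = 1 / ((1 - f) + f / s), with f <= 1 and s > 1
    return 1.0 / ((1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced)

if __name__ == "__main__":
    print(mips(50_000_000, 2.0))         # 25.0 MIPS
    print(cpi(100_000_000, 50_000_000))  # CPI = 2.0, so IPC = 0.5
    # If 40 % of a program can be made 10x faster:
    print(amdahl_speedup(0.4, 10))       # about 1.56, far below 10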
Memory addressing
The maximum size of the memory that can be used in any computer is determined by the
addressing scheme.
For example, a 16-bit computer that generates 16-bit addresses is capable of addressing up to 2^16 = 64K
memory locations.
The number of addressable locations represents the size of the address space of the computer.
Most modern computers are byte-addressable. (A small illustration follows.)
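A minimal sketch (the function name and the example widths are my own, not from the notes) showing how the address width determines the size of the address space:

# Address-space size for a machine that generates k-bit addresses.
def address_space(address_bits):
    return 2 ** address_bits   # number of distinct addressable locations

print(address_space(16))  # 65536 locations (64K) for a 16-bit address
print(address_space(32))  # 4294967296 locations (4G) for a 32-bit address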
CHAPTER- 2
Instructions & instruction Sequencing
Introduction
The instruction set defines many of the functions performed by the CPU and plays a
significant role in the implementation of the CPU.
The factors to be taken into account while designing an instruction set are:
1. Operation Repertoire
This gives an idea of how many and what kind of operations need to be provided,
and also the complexity of such operations.
2. Data type
Information needs to be provided on the various types of data and the operations
to be performed on them.
3. Format of instruction
This includes the length of the instruction in bits, number of addresses to be
used with the instruction and the size of each field in the instruction.
4. Register
The number of CPU registers that can be accessed by instructions for storage
of data and operands.
5. Addressing mode
The instruction set also specifies addressing methods for accessing operands
either in the memory or in the processor register.
Types of Operands
Machine operations depend on the types of data being processed.
The different formats of data used with assembly and high-level language
programs are as follows:
1. Address
2. Number
3. Character
4. Logical data
Address
• Address is a form of numbers that represent specific location in the memory.
• Address may be considered as unsigned integer.
Number
• Numeric data types are used by all machine languages.
• Three types of numerical data are commonly used:
1. Integer or Fixed point
2. Floating point
3. Decimal
Character
Another common form of data represented in a program is the character string. Since
characters cannot be stored or processed directly as text, these forms of data are encoded
by conversion methods such as IRA, ASCII or EBCDIC.
Logical data
• This is the bit-oriented view of data.
• These types of data can be represented as an array of Boolean or binary data
items (1 for T and 0 for F).
Addressing Modes
• The operand field of an instruction specifies the address from where the data
has to be fetched.
• This may be a memory address, a register, or a direct (immediate) value.
• The operand chosen is dependent on the addressing mode of the instruction.
Immediate Addressing:
The simplest form of addressing is immediate addressing, in which the
operand is actually present in the instruction:
OPERAND = A
The advantage of immediate addressing is that no memory reference other than
the instruction fetch is required to obtain the operand.
Direct Addressing:
• A very simple form of addressing is direct addressing, in which the address
field specifies the address of the memory location: EA = A.
• It requires only one memory reference and no special calculation.
Register Addressing:
• Register addressing is similar to direct addressing.
• The only difference is that the address field refers to a register rather than a
main memory address.
Ex: MOV A, B
Displacement Addressing:
A very powerful mode of addressing combines the capabilities of direct
addressing and register indirect addressing, which is broadly categorized as
displacement addressing:
EA = A + (R)
Three of the most common use of displacement addressing are:
• Relative addressing
• Base-register addressing
• Indexing
Relative Addressing:
• In this mode the instruction specifies the operand address as the relative
position of the current instruction address i.e., content of PC.
• The current instruction address is added to the address field to produce EA.
Base-Register Addressing:
In this mode the referenced register (the base register) contains a memory address, and the address
field contains a displacement from that address.
Index addressing:
• In this mode an index register contains the offset value.
• The instruction contains the address that should be added to the offset in the
index register to get the effective address.
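To make the modes concrete, here is a small illustrative Python sketch; the memory contents, register names and values are assumptions, not taken from the notes.

# Illustrative effective-address calculation for the addressing modes above.
memory    = {100: 55, 200: 77, 355: 99}                 # assumed memory contents
registers = {"R1": 200, "PC": 300, "X": 55, "B": 300}   # assumed register contents

def immediate(a):            # operand is the field itself
    return a

def direct(a):               # EA = A, operand = M[A]
    return memory[a]

def register_mode(r):        # operand is held in a register
    return registers[r]

def displacement(a, r):      # EA = A + (R)
    return a + registers[r]

print(immediate(7))                    # 7
print(direct(100))                     # 55
print(register_mode("R1"))             # 200
print(displacement(55, "B"))           # base-register: EA = 55 + 300 = 355
print(displacement(55, "PC"))          # relative:      EA = 55 + 300 = 355
print(memory[displacement(300, "X")])  # indexed: EA = 300 + 55 = 355 -> operand 99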
Instruction format
An instruction format defines the layout of the bits of an instruction in an
instruction set.
Usually there are several instruction formats in an instruction set.
1. An operation code:-This specifies the operation to be performed.
2. Source operand: - The operation specified by the opcode may involve one or
more source operands; these act as inputs for the operation.
3. Resultant operand: - After processing, the result has to be stored in a destination
operand.
4. Next reference instruction :-This field tells the CPU from where the next
instruction is to be fetched after the execution is completed.
Three address Instruction
Computers with a three-address instruction format can use each address field to
specify either a processor register or a memory operand. The program below evaluates X = (A + B) * (C + D):
ADD R1, A, B      R1 ← M[A] + M[B]
ADD R2, C, D      R2 ← M[C] + M[D]
MUL X, R1, R2     M[X] ← R1 * R2
• The advantage of the three-address format is that it results in short programs
when evaluating arithmetic expressions.
• The disadvantage is that the binary-coded instructions require too many bits
to specify three addresses.
One Address Instruction
• All operations are done between the AC (accumulator) register and a memory operand.
• T is the address of a temporary memory location required for storing the
intermediate result. The program below evaluates X = (A + B) * (C + D):
LOAD A       AC ← M[A]
ADD B        AC ← AC + M[B]
STORE T      M[T] ← AC
LOAD C       AC ← M[C]
ADD D        AC ← AC + M[D]
MUL T        AC ← AC * M[T]
STORE X      M[X] ← AC
Zero Address Instruction
A stack organized computer does not use an address field for the instruction ADD
and MUL. The PUSH & POP instruction, however, need an address field to specify
the operand that communicates with the stack (TOS ® top of the stack)
PUSH A TOS ® A
PUSH B TOS ® B
ADD TOS ® (A + B)
PUSH C TOS ® C
PUSH D TOS ® D
ADD TOS ® (C + D)
MUL TOS ® (C + D) * (A + B)
POP X M [X] TOS
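The following Python sketch (operand values are assumed for illustration) mimics how the zero-address sequence above evaluates X = (A + B) * (C + D) on a stack:

# Stack-machine sketch of the zero-address program above.
memory = {"A": 2, "B": 3, "C": 4, "D": 5}   # assumed operand values
stack = []

def push(name): stack.append(memory[name])
def add():      stack.append(stack.pop() + stack.pop())
def mul():      stack.append(stack.pop() * stack.pop())
def pop(name):  memory[name] = stack.pop()

push("A"); push("B"); add()     # TOS = A + B
push("C"); push("D"); add()     # TOS = C + D
mul()                           # TOS = (C + D) * (A + B)
pop("X")
print(memory["X"])              # (2 + 3) * (4 + 5) = 45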
CHAPTER- 3
Processor System
INTRODUCTION
• The part of the computer that performs the bulk of data processing operation is called the
central processing unit, CPU.
• The CPU is made up of three major parts-
1. ALU: - Arithmetic Logic Unit performs the required microoperation for executing
the instruction.
2. Register Set: - Stores the intermediate data used during the execution of instruction.
3. Control Unit: - Supervises the transfer of information among register and ALU by
sending suitable control signal.
Control unit
The Control Unit (CU) is the heart of the CPU. Every instruction that the CPU supports has to
be decoded by the CU and executed appropriately. Every instruction has a sequence of
microinstructions behind it that carries out the operation of that instruction
• Any digital system consists of 2 units
✓ Data processor
✓ Control logic
• Data processor consists of individual register and all functional unit.
• Control logic initiates all micro-operation in the data processor.
• Control unit generates control signal which initiates sequence of micro-operation.
• In the control word issued, the micro-operations that are activated are represented by
1s and those that are not activated are represented by 0s.
• Traditionally there are two general approaches to implement a control and decode unit
for a CPU :
✓ Hardwired control Unit
✓ Micro programmed control Unit
• Main memory is available for storing programs; the contents of main memory may be altered by
changing the program.
• Each machine instruction in main memory initiates a series of microinstructions in
control memory.
• Microinstructions generate micro-operations such as fetching the instruction from main
memory, calculating the effective address, fetching the operands, executing the operation, etc.
• Each control word in control memory contains within it a microinstruction.
• A sequence of microinstruction constitutes a micro program.
• For each machine instruction a particular microprogram is written; since there is no
need to change the microprograms stored in control memory, the control memory can
be implemented in ROM.
• So, to execute a microprogram stored in control memory and obtain the appropriate control
word for generating the control signals, the following operations are performed.
• The control memory address specifies the address of micro instruction & the control
data register holds the microinstruction read from memory.
• The microinstruction contains a control word that specifies one or more micro-operations
for the data processor; once these operations are executed, the control must determine
the next address.
• The next address may also be a function of external input condition.
• The next address generator is a circuit that generates the next address which is then
transferred into the control address register to read the micro instruction.
• The next address generator is sometimes called the microprogram sequencer. (A minimal sketch of this fetch loop follows.)
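Below is a purely illustrative Python sketch of the fetch loop of a microprogrammed control unit; the control-memory contents, the field layout and the step names are assumptions, not the layout of any real machine.

# Hedged sketch: each control-memory word holds (control signals, next address).
CONTROL_MEMORY = {
    0: ("FETCH_INSTRUCTION", 1),
    1: ("DECODE_AND_CALC_EA", 2),
    2: ("FETCH_OPERAND",      3),
    3: ("EXECUTE",            0),   # wrap around to fetch the next instruction
}

def run_microprogram(steps):
    car = 0                                            # control address register (CAR)
    for _ in range(steps):
        control_word, next_addr = CONTROL_MEMORY[car]  # read into the control data register
        print(f"CAR={car}: issue control signals for {control_word}")
        car = next_addr                                # next-address generator (sequencer)

run_microprogram(8)   # two full machine-instruction cycles of four microinstructions each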
REGISTER FILES: -
• Most modern CPUs have a set of general-purpose registers (GPRs) R0 to Rn-1, called a
register file (RF).
• Each register Rm in RF is individually addressable with address subscript m.
Example: - R2=f (R1, R2)
• This way the processor is able to retain intermediate results in fast and accessible
registers rather than external memory M.
• Accessing the RF requires several ports for simultaneous reading and writing, so
it is often realized as a ‘multiport RAM.’
• A multiport RF is built using a set of registers of proper size and multiplexer-
demultiplexer.
• The read operation can take place through several devices reading from the same
register using different ports though the writing is normally done through one port only.
• The RF described above is a three-port register file, where simultaneous reads can occur from port A and
port B and writing takes place through port C. (A small sketch follows.)
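For illustration only (the class and method names are assumed), a three-port register file with two read ports (A, B) and one write port (C) might be modelled as:

# Sketch of a 3-port register file: read on ports A and B, write on port C.
class RegisterFile:
    def __init__(self, n_registers=8, width_bits=16):
        self.mask = (1 << width_bits) - 1
        self.regs = [0] * n_registers

    def read(self, addr_a, addr_b):           # two simultaneous reads
        return self.regs[addr_a], self.regs[addr_b]

    def write(self, addr_c, value):           # single write port
        self.regs[addr_c] = value & self.mask

rf = RegisterFile()
rf.write(1, 10)
rf.write(2, 32)
a, b = rf.read(1, 2)
rf.write(2, a + b)          # R2 = f(R1, R2), as in the example above
print(rf.read(2, 0))        # (42, 0)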
DESIGN OF ALU: -
• The circuit which carries out the data processing instructions is the ALU.
• The complexity of the ALU depends upon how the instructions are realized.
• An ALU built from combinational circuits can perform fixed-point arithmetic as well as word-
based logical operations.
• Some extra control logic and some extensive data processing circuits, called a co-
processor, are employed to perform floating-point operations.
COMBINATIONAL ALU: -
• A simple ALU combines the features of a 2’s-complement adder/subtractor and a
word-based logic unit.
• The combinational ALU is nothing but a combination of combinational circuits and
multiplexers. (A sketch follows the points below.)
ADVANTAGE: -
This type of ALU is much simpler.
DISADVANTAGE: -
1. It is more expensive.
2. It is much slower.
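A hedged sketch of such a combinational ALU, assuming an 8-bit word and made-up operation select codes:

# Sketch of a word-wide combinational ALU: 2's-complement adder/subtractor plus logic unit.
WIDTH = 8
MASK = (1 << WIDTH) - 1

def alu(a, b, select):
    if select == 0b00:
        return (a + b) & MASK                      # ADD
    if select == 0b01:
        return (a + (~b & MASK) + 1) & MASK        # SUB via 2's complement of b
    if select == 0b10:
        return a & b                               # bitwise AND
    if select == 0b11:
        return a | b                               # bitwise OR
    raise ValueError("unknown operation select code")

print(alu(12, 5, 0b00))  # 17
print(alu(12, 5, 0b01))  # 7
print(alu(12, 5, 0b10))  # 4
print(alu(12, 5, 0b11))  # 13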
CHAPTER- 4
Memory System
Memory characteristics
The memory unit is the essential component in any computer since it is needed for storing the
program and data.
The memory system is classified according to their key characteristics like :
1. Location
2. Capacity
3. Unit of Transfer
4. Access Method
5. Performance
6. Physical type
7. Physical characteristics
1. Location:
It deals with the location of the memory device in the computer system. There are three possible
locations:
• CPU: This is often in the form of CPU registers and a small amount of cache.
• Internal or main: This is the main memory like RAM or ROM. The CPU can directly access the
main memory.
• External or secondary: It comprises of secondary storage devices like hard disks, magnetic
tapes. The CPU doesn’t access these devices directly. It uses device controllers to access
secondary storage devices.
2. Capacity:
The capacity of any memory device is expressed in terms of: i)word size ii)Number of words
Word size: Words are expressed in bytes (8 bits). A word can however mean any
number of bytes. Commonly used word sizes are 1 byte (8 bits), 2bytes (16 bits) and 4
bytes (32 bits).
Number of words: This specifies the number of words available in the particular
memory device. For example, if a memory device is given as 4K × 16, this means the
device has a word size of 16 bits and a total of 4096 (4K) words in memory.
3. Unit of Transfer:
It is the maximum number of bits that can be read or written into the memory at a time. In case
of main memory, it is mostly equal to word size. In case of external memory, unit of transfer
is not limited to the word size; it is often larger and is referred to as blocks.
4. Access Methods:
It is a fundamental characteristic of memory devices. It is the sequence or order in which
memory can be accessed. There are three types of access methods:
6. Physical type: Memory devices can be either semiconductor memory (like RAM) or
magnetic surface memory (like Hard disks).
7. Physical Characteristics:
• Volatile/Non-volatile: If a memory device continues to hold data even when power is turned off,
the memory device is non-volatile; otherwise it is volatile.
• A memory unit supports two operations: read and write. Both of these operations require a memory address. In addition, the write operation requires
specification of the data to be written.
• The address and data of the memory unit are connected to the address and data buses of the
system bus, respectively.
• The read and write signals come from the control bus.
• For controlling the movement of these words, i.e., in and out of the memory, two signals are used:
o Write
o Read, respectively.
• The words to be written into or read out of the memory are first placed in a register called the memory data
register (MDR).
• The location in the memory unit where a word is stored is called the address of the word. To
retrieve a word from the memory unit one has to specify its address in another
special register, which is called the memory address register (MAR).
• If the MAR is k bits wide, the memory size is 2^k words. Similarly, an n-bit MDR determines a
word size of n bits (n being the number of cells present in a word).
Read operation
• For Read operation the address has to be sent to MAR which is being carried out by the
address bus then the READ signal is sent to the memory for read function.
• The memory transfers the corresponding word from the specified location to the MDR
through data bus.
• Drop the memory read control signal to terminate the read cycle.
• After the completion of the read operation the memory asserts the MFC (Memory Function
Completed) signal.
Write operation
• For a write operation the CPU supplies the location to the MAR and the data to the MBR/MDR.
• It then issues the Write control signal, after which the data present in the MDR is
transferred to the memory.
• After completion of the write operation the memory asserts the MFC signal.
Access time
The duration of time between the initiation of read signal and the availability of required word in the
MBR is called as the access time or read time.
Write time
The duration between the write signal and storing of the word in the specified location is called as
write time.
It is necessary that the information should be written back from where it was read. The duration of
read and write operation is called as memory cycle time.
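A small illustrative simulation of the read and write operations described above; the register names MAR/MDR follow the text, but the class interface and sizes are assumptions, and the timing/MFC signalling is only noted in comments.

# Sketch of memory read/write through MAR and MDR, as described above.
class Memory:
    def __init__(self, k_address_bits=4, n_word_bits=8):
        self.words = [0] * (2 ** k_address_bits)   # 2^k words of n bits each
        self.mask = (1 << n_word_bits) - 1
        self.MAR = 0
        self.MDR = 0

    def read(self, address):
        self.MAR = address                 # address placed on the address bus
        self.MDR = self.words[self.MAR]    # READ signal: word copied into the MDR
        return self.MDR                    # MFC would be asserted here

    def write(self, address, data):
        self.MAR, self.MDR = address, data & self.mask
        self.words[self.MAR] = self.MDR    # WRITE signal: MDR stored at address MAR

m = Memory()
m.write(5, 0xAB)
print(hex(m.read(5)))   # 0xab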
SEMICONDUCTOR RAM: -
Semiconductor memories are available in a wide range of speeds; their cycle times range from
100 ns to less than 10 ns.
Memory cells are organized in the form of an array of cells, each cell capable of storing one bit of
information.
• Each row of cells constitutes a memory word of 8bits b0-b7 and cells of a row are connected
to a common signal line called “word line”, which is driven by the address decoder.
• Two ‘bit lines ' connect the cells in each column to a sense /write circuit.
• The sense / write circuit are connected to the data I/O lines.
• During a 'Read' operation , these circuits read the information stored and transmits this
information to the output data line
• During a 'Write' operation , the sense/write circuit receive input information and store it in the
selected cell.
• The organization described here is that of a very small memory chip consisting of 16 words of 8 bits each, i.e., a
16×8 organization.
• The data I/O of each sense/write circuit are connected to a single bidirectional data line that
are connected to the data bus of the computer.
• In addition, there are also two control lines R/W and CS(chip select)
• The R/W signal line specifies the required operation and CS selects a given chip in a multi-
chip memory system.
• This memory circuit stores 128 bits and requires 14 external connections for the address, data and
control lines. It also needs two lines for the power supply and ground connection.
• A larger memory circuit of 1K (1024) memory cells can be organized as a 128×8 memory, which
requires 19 external connections.
• The speed and efficiency of transfer of word or blocks between memory and processor
greatly affect the performance of a computer system.
• Two parameters, latency and bandwidth, give an indication of the performance of the
memory system.
LATENCY: -
• The amount of time it takes to transfer a word of data to or from the memory is referred
to as the LATENCY of the memory.
• Latency provides a complete indication of memory performance in case of a reading or
writing of a single word.
BANDWIDTH: -
When transferring a block of data, as the block size can be variable, the performance is measured
in terms of the number of bits or bytes that can be transferred in one second; this is known as the
BANDWIDTH of the memory unit.
The bandwidth clearly depends upon the speed of access and the number of bits that can be
transferred in parallel, so the bandwidth is the product of the rate at which data are transferred and the width of the
data bus. (A small example follows.)
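For example (all values assumed, not from the notes), the bandwidth can be estimated as the product of the transfer rate and the data-bus width:

# Bandwidth = transfers per second * bits moved per transfer.
def bandwidth_bits_per_second(transfer_rate_hz, bus_width_bits):
    return transfer_rate_hz * bus_width_bits

# An assumed memory doing 100 million transfers/s over a 64-bit data bus:
print(bandwidth_bits_per_second(100_000_000, 64) / 8 / 1e9, "GB/s")  # 0.8 GB/s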
Both SRAM and DRAM chips are volatile in nature as they lose the stored information if the
power is removed.
• Many applications need memory devices to retain the stored information even if power
is removed.
• When a computer is turned on the operating system software has to be loaded from the
disk into the memory.
• The boot program is quite large and most of it is stored on the disk, but the processor
must execute some instructions that load the boot program; so, a small amount of
non-volatile memory is needed to hold the instructions that load the boot program
from the disk.
• Since its normal operation involves only reading of stored data, a memory of this
type is called read-only memory (ROM).
Cache memory: -
• The speed of main memory is very low compared to the speed of the processor. So, for better
performance, a high-speed memory is used between main memory and the CPU; this is known
as cache memory.
• The name comes from the word cache, meaning to hide.
• The basic idea behind a cache is simple i.e., the most heavily used memory words are kept in
the cache, when the CPU needs a word, it will first look in the cache, only if the word is not
there, it goes to main memory.
• Analysis of programs shows that most of the execution time is spent in portions in
which many instructions are executed repeatedly, as in loops. The execution of a
program therefore tends to stay within a localized area of memory, where the instructions are executed
repeatedly while the remainder of the program is executed relatively less frequently; this
property is called locality of reference.
[Figure: Processor – Cache – Main memory hierarchy]
Read operation: -
• When the CPU needs to access memory, the cache is first searched; if the word is found, it is
read from the cache, and this is known as a “cache hit”.
• If the word is not found in the cache, the main memory is accessed; this is referred to as a “cache miss”.
• When a “cache miss” occurs, an access to main memory is initiated to transfer the required byte
or word from main memory to the cache.
• The performance of the cache memory is measured by the hit ratio, the fraction of memory accesses that are hits. (A worked example follows.)
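As a hedged worked example (the counts and access times below are assumed, not from the notes), the hit ratio and the average access time it implies can be computed as follows:

# Hit ratio = hits / total accesses; average access time weights cache and main memory.
def hit_ratio(hits, total_accesses):
    return hits / total_accesses

def average_access_time(h, cache_time_ns, main_memory_time_ns):
    return h * cache_time_ns + (1 - h) * main_memory_time_ns

h = hit_ratio(950, 1000)                 # 950 hits out of 1000 accesses -> 0.95
print(h)
print(average_access_time(h, 10, 100))   # 0.95*10 + 0.05*100 = 14.5 ns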
Write operation: -
During a read operation, when the CPU finds the word in cache memory, the main memory is not
involved in the transfer. But in the case of a write operation there are two ways of writing.
The simplest and most commonly used procedure is to update main memory with every memory write
operation, with cache memory being updated in parallel. This is called the write-through policy.
Advantages: -
Main memory always contains the same data as the cache. This characteristic is important in
systems with direct memory access transfers, since it ensures that the data in main memory are valid.
Write-back policy: -
• If the cache follows this policy, then only the cache is updated during a write operation and the
location is marked by a flag.
• When the block of the cache containing the flagged word is to be replaced, the block is
written back to (updated in) main memory at that time.
Advantages
Whenever a word is updated several times before its block is replaced, it is better to use the
write-back policy, since main memory is written only once.
Mapping
• The correspondence that determines where information from main memory is placed in cache memory is known as
mapping.
• There are three types of mapping function
✓ Direct Mapping
✓ Associative Mapping
✓ Set-associative Mapping
• To explain the mapping procedures, we consider a 2K-word cache consisting of 128 blocks of 16
words each, and a 64K-word main memory, addressable by a 16-bit address, consisting of 4096 blocks of 16
words each.
Direct Mapping
• Block m of the main memory maps onto block c of the cache memory according to the formula c = m
mod (number of blocks in the cache memory), i.e.,
c = m mod 128
• By this formula, blocks 0, 128, 256, … of main memory will be loaded into block 0 of the cache
memory. Similarly, blocks 1, 129, 257, … of main memory will be loaded into block 1 of the CM, and
likewise.
• For this mapping, the address generated by the CPU for the cache memory is 16 bits.
• The address is divided into 3 parts.
✓ Word
✓ Block
✓ Tag
• One block contains 16 words, so 4 bits are required for the word field; the cache
contains 128 blocks, so 7 bits are required for the block field; and the remaining 16 −
(7 + 4) = 5 bits form the tag field.
• When the CPU wants to read or write, the higher-order 5 bits of the address are
compared with the tag bits of the selected cache block.
• If they match, then the desired word is present and a cache hit occurs.
• If not, there is a cache miss, which leads to a main memory access and the required block being written into the cache. (See the sketch below.)
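A short sketch of the field split used in this example (4-bit word, 7-bit block, 5-bit tag for a 16-bit address); the sample address is made up:

# Direct mapping for the example above: 128 cache blocks of 16 words, 16-bit address.
WORD_BITS, BLOCK_BITS = 4, 7       # 16 words/block, 128 blocks -> tag = 16 - 7 - 4 = 5 bits

def split_address(addr16):
    word  =  addr16       & 0xF    # lowest 4 bits
    block = (addr16 >> 4) & 0x7F   # next 7 bits
    tag   =  addr16 >> 11          # remaining 5 bits
    return tag, block, word

def direct_mapped_block(main_memory_block):
    return main_memory_block % 128            # c = m mod 128

print(split_address(0b10110_0000001_0011))    # (22, 1, 3): tag=22, block=1, word=3
print(direct_mapped_block(129))               # main-memory block 129 -> cache block 1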
Advantage
It is the simplest mapping technique: the cache block is determined directly from the address, so it is easy and inexpensive to implement.
Disadvantage
The contention problem may occur even though the cache is not full.
Associative mapping
• A main memory block can be placed into any cache block position.
• The 12 tag bits identify a memory block residing in the cache.
• The lower-order 4 bits select one of 16 words in a block.
• The cost of an associative cache is relatively high because of the need to search all 128 tags to
determine whether a given block is in the cache or not.
Disadvantage: Slow or expensive. A search through all the 128 CM blocks is needed to check
whether the 12 MSBs of the 16-bit address can be matched to any of the tags.
Set-Associative Mapping:
• Blocks of the cache are grouped into sets, and the mapping allows a block of the main memory to
reside in any block of a specific set.
• A cache that has k blocks per set is referred to as a k-way set-associative cache.
• The contention problem of the direct method is eased.
• The hardware cost of the associative method is reduced.
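Assuming the same 128-block cache is organized as a 2-way set-associative cache (64 sets, so a 6-bit set field and a 6-bit tag), a sketch of the address split might look like this; the sample address is made up:

# 2-way set-associative version of the same cache: 64 sets of 2 blocks each.
SETS = 64

def split_address_set_assoc(addr16):
    word = addr16 & 0xF                 # 4-bit word field
    set_index = (addr16 >> 4) % SETS    # 6-bit set field
    tag = addr16 >> 10                  # remaining 6-bit tag field
    return tag, set_index, word

# Main-memory block m maps to set (m mod 64) and may occupy either block of that set.
print(split_address_set_assoc(0b101100_000001_0011))  # (44, 1, 3): tag=44, set=1, word=3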
Interleaved Memory
• The two key factors in the success of a computer are performance and cost.
• Higher performance at reasonable cost can be achieved through parallelism.
• In parallel processing or pipeline environment, the main memory is the prime
system resources, which is normally shared by all processor or stages of the
pipeline.
• In such cases there may be memory interference, which degrades
performance. To avoid this problem, a method is adopted which is
known as “memory interleaving”.
• Memory interleaving means dividing the main memory of the computer into a number
of modules and distributing the addresses among those modules.
• Each memory module has its own Address Buffer Register (ABR)or Memory
Address Register (MAR) and Data Buffer Register (DBR) or Memory Buffer
Register (MBR).
• There are two memory address layouts. In both, the address consists of two fields: (1) bits that
select the memory module, and (2) bits that point to a particular word in
that module. The layouts differ in whether the module field occupies the high-order or the low-order bits of the address.
In the first layout (high-order interleaving), the memory is divided into M modules and
consecutive addresses lie in a single module.
• In this method the higher-order bits of the address indicate the module number and the
lower-order bits indicate the address within the module.
• For example, let us have a memory of 16 words.
• In the above case the higher-order bits are used for indicating the module number and the lower-order
bits are used for the words in the module.
• In this case each memory address is of n bits, out of which the higher-order m bits are used for
interleaving (module selection) and the remaining n−m bits are used for the words in a particular module.
• The m bits are decoded by a decoder which selects the particular module, and the
n−m bits specify the word in the module.
• Every memory module has its own MAR and MBR.
Advantage: -
• It permits easy expansion of memory by the addition of one or more memory modules as needed,
up to the maximum number of modules that the module field can address.
• Better system reliability in case of a failed module, as it affects only a localized area of the address
space.
Disadvantage: -
In the second layout (low-order interleaving), consecutive words are distributed across consecutive modules.
• Here the higher-order n−m bits are used for the address of a word within a module, while the m lower-order
bits are used for the module number.
• This method is an efficient way to address the modules.
• Here any request for accessing consecutive words can keep several modules busy at the same
time; this is faster than the previous layout and so is used frequently.
• Example: as before, a memory of 16 words distributed over the modules. (A short sketch of both layouts follows.)
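A short sketch of the two layouts, assuming a 16-word memory spread over 4 modules (so m = 2 module bits out of an n = 4-bit address); the function names are my own:

# 16-word memory spread over 4 modules (m = 2 module bits, n = 4 address bits).
M_BITS, N_BITS = 2, 4
MODULES = 1 << M_BITS

def high_order_interleave(addr):        # consecutive addresses stay in one module
    module = addr >> (N_BITS - M_BITS)
    word_in_module = addr & ((1 << (N_BITS - M_BITS)) - 1)
    return module, word_in_module

def low_order_interleave(addr):         # consecutive addresses go to consecutive modules
    module = addr % MODULES
    word_in_module = addr >> M_BITS
    return module, word_in_module

for a in range(4):
    print(a, high_order_interleave(a), low_order_interleave(a))
# high-order: addresses 0-3 all land in module 0
# low-order:  addresses 0-3 land in modules 0, 1, 2, 3 and can be accessed in parallel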
CHAPTER- 5
Input – Output System
Introduction
• I/O is an essential & integral part of computer system. Variety of I/O devices on a computer
system - keyboard, mouse, display, magnetic disk, network adaptor, etc.
• I/O devices differ widely in terms of function, size, mode of operation, transfer speed, power
consumption, etc.
• All devices must be connected to the processor and memory using the same basic
architecture.
• Over the years, different mechanisms have been developed to connect I/O devices to
systems, and to program I/O data transfers over the resulting connections.
I/O controllers
• An I/O device is connected to the computer system by using a
device controller.
• I/O devices vary in some or all of the following characteristics:
o Representation of data: voltage, current, magnetic field, etc.
o Speed of operation and data transfer
o Timing and control requirements
o Need to detect physical events – e.g., mouse clicks or keypresses
o Need for error detection and correction.
• To resolve these problems the computer system includes special hardware circuits between the
CPU and the peripherals. These are called interface units. Each device has its own interface
unit.
• The purpose of the communication link is to solve the following problems:
o Peripherals are electromechanical devices, but the CPU and memory are purely electronic
devices.
o The data transfer rate of peripherals is slower than that of the CPU, so a synchronization mechanism is
required.
o Data codes and formats of peripheral devices are different from the word format of the
CPU and memory.
o Operating modes of the peripherals are different from each other. Each must be
controlled so that it does not disturb the operation of the others.
• The above diagram shows the communication link between the processor and
peripherals devices.
• They are connected via the I/O bus. The I/O bus consists of an address bus, a control bus and a
data bus.
• Each peripheral has its own interface. Each interface is attached to the input-
output bus and contains an address decoder that monitors the address lines.
• When an interface detects its own address, it activates the path between the bus
lines and the device it controls. The other devices are disabled.
• The function code that the processor places on the bus is also called an input-output command. The following commands can be
received.
✓ Control command - This is issued to activate the peripheral and to inform
it what to do.
✓ Test command - This is used to test various status conditions in the interface
and the peripheral.
✓ Read command - This is used to receive an item of data from the peripheral
device and place it in the interface's buffer register.
✓ Write command - It causes the interface to respond by transferring the data
from the bus into one of its registers.
• All the above processes are carried out by the control circuits of the I/O interface.
• After placing the command on the control lines, the CPU activates the data bus,
and then the data transfer takes place.
I/O operations can be carried out using three techniques.
1. Programmed I/O
2. Interrupt driven I/O
3. Direct memory access
1. Programmed I/O
• In programmed I/O, data are exchanged between the processor and the I/O module.
• The processor directly controls the I/O operation, including sensing the device status,
sending a read or write command, and transferring the data.
• When the processor issues a command to the I/O module, it must wait until the I/O
operation is completed.
• As the processor is faster than the I/O module, a lot of processor time is wasted. (A small sketch of this busy waiting follows.)
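A hedged sketch of the busy waiting that programmed I/O implies; the device class and its behaviour are simulated assumptions, not a real device interface:

# Programmed I/O sketch: the CPU polls the device status until the operation completes.
import random

class SimulatedDevice:
    """Toy device: becomes ready after a random number of status checks (assumed behaviour)."""
    def __init__(self):
        self._checks_left = random.randint(1, 5)
        self.data = 0x41

    def ready(self):
        self._checks_left -= 1
        return self._checks_left <= 0

def programmed_read(device):
    busy_wait_cycles = 0
    while not device.ready():      # processor time is wasted in this loop
        busy_wait_cycles += 1
    word = device.data             # transfer the data once the device is ready
    return word, busy_wait_cycles

word, wasted = programmed_read(SimulatedDevice())
print(hex(word), "after", wasted, "wasted status checks")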
Processing of interrupt
Interrupt
Hardware processing
The sequence of hardware events that occur when an I/O device completes an
operation is as follows:
Software processing
✓ The stack pointer is updated to point to the new TOS and the PC to the
beginning of the interrupt service routine (ISR).
✓ The interrupt handler will next process the interrupt.
✓ When the interrupt processing is completed, the saved register values
are restored.
✓ The PSW and PC values are finally restored from the stack.
✓ All the state information of the program relevant to the ISR has to be saved for
future reference.
Handling mechanism
✓ Polling (S/W)
✓ Daisy chaining (H/W)
Polling
• Polling is a software method which identifies the highest-priority interrupt source.
• The highest-priority source is tested first, and if its interrupt signal is on, then
control branches to a service routine for servicing it. Otherwise, the next lower-
priority source is tested, and so on.
• But this method is slower. (A small sketch follows.)
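A hedged Python sketch of priority polling; the device list, priorities and flags are assumptions for illustration:

# Polling sketch: test interrupt flags in priority order and serve the first one set.
devices = [                       # ordered from highest to lowest priority (assumed)
    {"name": "disk",     "interrupt": False},
    {"name": "keyboard", "interrupt": True},
    {"name": "printer",  "interrupt": True},
]

def poll_and_service():
    for dev in devices:                       # highest-priority source tested first
        if dev["interrupt"]:
            print("servicing", dev["name"])   # branch to its service routine
            dev["interrupt"] = False
            return dev["name"]
    return None                               # no device requested an interrupt

print(poll_and_service())   # keyboard (higher priority than printer)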
Daisy chaining
• It is a hardware method.
• Daisy chaining consists of a serial connection of all the devices that can request an
interrupt.
• The device with the highest priority is placed in the first position, followed by lower-
priority devices, with the lowest-priority device placed at the end of the chain.
Cycle stealing
In this mode a block of data is transferred by a sequence of DMA cycles. Here the I/O device
withdraws the DMA request after transferring one or several bytes, so the DMA controller "steals" memory cycles from the CPU between transfers.
CHAPTER- 6
I/O Interface & Bus architecture
BUS INTERCONNECTION
• A bus is a communication pathway connecting two or more devices.
• A key characteristic of a bus is that it is a shared transmission medium.
• Multiple devices connect to the bus, and a signal transmitted by any one device is available
for reception by all other devices attached to the bus (broadcast).
• Typically, a bus consists of multiple communication pathways, or lines. Each line is capable
of transmitting signals representing binary 1 and binary 0.
• Taken together, several lines of a bus can be used to transmit binary digits simultaneously
(in parallel). For example, an 8-bit unit of data can be transmitted over eight bus lines.
• Computer systems contain a number of different buses that provide pathways between
components at various levels of the computer system hierarchy.
• A bus that connects major computer components (processor, memory, I/O) is called a
system bus. The most common computer interconnection structures are based on the use
of one or more system buses.
BUS STRUCTURE
A system bus consists, typically, of from about 50 to hundreds of separate lines. Each line is
assigned a particular meaning or function. Although there are many different bus designs, on
any bus the lines can be classified into three functional groups: data, address, and control lines.
In addition, there may be power distribution lines that supply power to the attached modules.
Data Bus
• These lines provide a path for moving data between system modules; collectively, they are called
the data bus.
• The width of the data bus: The data bus may consist of from 32 to hundreds of separate lines,
the number of lines being referred to as the width of the data bus. Because each line can carry
only 1 bit at a time, the number of lines determines how many bits can be transferred at a
time. The width of the data bus is a key factor in determining overall system performance.
For example, if the data bus is 8 bits wide and each instruction is 16 bits long, then the
processor must access the memory module twice during each instruction cycle.
• As data bus carry information from and to the modules, so it is bidirectional in nature.
Address Bus
• Address lines are used to designate the source or destination of the data on the data bus. For
example, if the processor wishes to read a word (8, 16, or 32 bits) of data from memory, it
puts the address of the desired word on the address lines.
• The width of the address bus determines the maximum possible memory capacity of the
system. Furthermore, the address lines are generally also used to address I/O ports.
Control Bus
• Control lines are used to control the access to and the use of the data and address lines.
Because the data and address lines are shared by all components, there must be a
means of controlling their use.
• Control signals transmit both command and timing information between system
modules.
MULTIPLE-BUS ARCHITECTURE
If a great number of devices are connected to the bus, performance will suffer. There are two
main causes:
1. Propagation delay - the time it takes for devices to coordinate the use of the bus
2. The bus may become a bottleneck as the aggregate data transfer demand approaches the
capacity of the bus (in available transfer cycles/second).
• Accordingly, most computer systems use multiple buses, generally laid out in a
hierarchy.
There are mainly two typical architectures: 1) the traditional bus architecture, and 2) the high-performance bus
architecture.
• Due to the increasing need for adding more I/O devices, which the traditional architecture cannot
support, it is necessary to build a high-performance bus.
• Similar to the traditional bus architecture, this too contains a local bus that connects the
processor to a cache controller, which in turn is connected to the main memory through the
system bus.
• The cache controller is integrated into a bridge or buffering device that connects to the
high-speed bus; this arrangement is sometimes referred to as a mezzanine architecture.
• This bus supports high-speed LANs, video and graphics workstation controllers, etc.
• The lower-speed devices are still supported by the expansion bus, with an interface buffering
traffic between the expansion bus and the high-speed bus.
• This way the high-speed devices are more closely integrated with the processor through the
high-speed bus and at the same time remain relatively independent of the processor.
Bus type
• Bus lines can be separated into two generic types: dedicated and multiplexed.
• A dedicated bus line is permanently assigned either to one function or to a physical subset
of computer components.
• Separate data & address lines are used.
• The use of the same lines for multiple purposes is known as Multiplexing.
Shared lines
• Address valid or data valid control lines are used.
Method of arbitration
It determines which device can use the bus at a particular time.
• Centralized - a single hardware device called the bus controller or arbiter allocates time
on the bus
• Distributed - each module contains access control logic and the modules act together to
share the bus
• Both methods designate one device (either the CPU or an I/O module) as master, which may
initiate a data transfer with some other device, which acts as a slave.
Timing
Synchronous Timing
• The bus includes a clock line upon which a clock transmits a regular sequence of alternating 1’s
and 0’s of equal duration
• A single 1-0 transmission is referred to as a clock cycle or bus cycle
• All other devices on the bus can read the clock line, and all events start at the beginning
of a clock cycle
Asynchronous Timing
• The occurrence of one event on a bus follows and depends on the occurrence of a previous
event
Data transfer
A bus can support various type of data transfer such as-
• Read, Write, Read-modify-write, Read-after-write, Block
• All buses must support write (master to slave) and read (slave to master) transfers.
• Read-modify-write: A read followed immediately by a write to the same address.
SCSI
The Small Computer System Interface (SCSI) is a set of parallel interface standards developed
by the American National Standards Institute (ANSI) for attaching printers, disk drives,
scanners and other peripherals to computers. SCSI (pronounced "skuzzy") is supported by all
major operating systems.
It has some versions developed –
SCSI-1 is the original SCSI standard developed back in 1986 as ANSI X3.131-1986.
SCSI-1 uses an 8-bit bus and is capable of transferring data at up to 5 megabytes per second.
SCSI-2 was approved in 1990, added new features such as Fast and Wide SCSI, and
support for additional devices.
SCSI-3 was approved in 1996 as ANSI X3.270-1996.
USB:
• The Universal Serial Bus (USB) is a connector introduced in 1995 to replace serial and parallel ports.
• It is based on a serial architecture. However, it is much quicker than the standard serial ports because
the serial architecture allows the interface a much higher clock rate than a parallel interface, and serial
cables are much cheaper than parallel cables.
• So, since 1995, the USB standard has been developed for connecting a wide range of devices like
scanners, keyboards, mice, joysticks, printers, modems and some CD-ROMs.
• USB is completely hot-swappable, which means we can connect or disconnect any device while
the computer is running.
• The computer can recognize the device as soon as it is plugged in, and the user can use the device
immediately.
o Type B: This type of connector is generally used for high-speed devices like
external hard disks, etc., and its shape is square.
PCI:
• Stands for "Peripheral Component Interconnect." It is a hardware bus designed by Intel around 1992
and is used in both PCs and Macs.
• It is an intermediate bus located between the processor bus (Northbridge) and the I/O bus
(Southbridge).
• Most add-on cards such as SCSI, Firewire, and USB controllers use a PCI connection. Some
graphics cards use PCI, but most new graphics cards connect to the AGP slot.
• PCI slots are found in the back of the computer. The PCI interface exists in 32 bits with a 124-pin
connector or in 64 bits with a 184-pin connector.
• There are also two signaling voltage levels i.e., 3.3V for laptop computers and 5V for desktop
computers. The 64-bit PCI connectors offer additional pins and can accommodate 32-bit PCI cards.
• There are 2 types of 64-bit connectors. They are 64-bit PCI connector, 5V and 64-bitPCI connector,
3.3V.
CHAPTER- 7
Parallel Processing
With the increased use of computers in every sphere of human activity, computer scientists are
faced with two crucial issues today.
▪ Parallel processing or computing is a form of computation in which many instructions are carried
out simultaneously, operating on the principle that large problems can often be divided into smaller
ones, which are then solved concurrently (in parallel).
▪ Instead of processing each instruction sequentially, a parallel processing system is able to perform
concurrent data processing to achieve faster execution time.
▪ For example, to add two 16-bit integers on an 8-bit processor, first add the 8 lower-order bits and
then add the 8 higher-order bits using an add-with-carry instruction and the carry bit from
the lower-order addition.
The instructions given to a computer for processing can be divided into groups, or re-ordered
and then processed without changing the final result. This is known as instruction-level
parallelism i.e., ILP.
1. e = a + b
2. f = c + d
3. g = e * f
Here, instruction 3 is dependent on instructions 1 and 2. However, instructions 1 and 2 can be
processed independently (in parallel).
Task parallelism
Task parallelism focuses on the distribution of tasks across different processors. It is also known
as functional parallelism or control parallelism.
Data Parallelism
Data parallelism focuses on distributing the data across different parallel computing nodes.
It is also called loop-level parallelism.
Linear pipeline
The process of execution of an instruction can be divided into 4 major steps: instruction fetch,
instruction decode, operand fetch, and execution.
In an unpipelined computer all four steps must be completed before starting the next
instruction.
But in a pipelined computer successive instructions are executed in an overlapped
fashion. (A small sketch of this overlap follows.)
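A tiny illustrative sketch of how a linear pipeline overlaps successive instructions; the four stage names (IF, ID, OF, EX) are assumed to correspond to the steps above.

# Print which instruction occupies which stage in each clock cycle (4-stage pipeline).
STAGES = ["IF", "ID", "OF", "EX"]          # fetch, decode, operand fetch, execute

def pipeline_schedule(n_instructions):
    total_cycles = n_instructions + len(STAGES) - 1
    for cycle in range(total_cycles):
        row = []
        for s, stage in enumerate(STAGES):
            i = cycle - s                  # instruction index occupying this stage
            row.append(f"{stage}:I{i+1}" if 0 <= i < n_instructions else f"{stage}:--")
        print(f"cycle {cycle + 1}: " + "  ".join(row))

pipeline_schedule(4)   # 4 instructions finish in 7 cycles instead of 16 stage-times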
FLYNN’S TAXONOMY
• In general, digital computers may be classified into four categories, according to the multiplicity of
instruction and data streams.
• This scheme for classifying computer organizations was introduced by Michael J. Flynn. The
essential computing process is the execution of a sequence of instructions on a set of data.
• The term stream is used here to denote a sequence of items (instructions or data) as executed or
operated upon by a single processor.
• An instruction stream is a sequence of instructions as executed by the machine.
• A data stream is a sequence of data, including input, partial, or temporary results, called for by the
instruction stream. Listed below are Flynn’s four machine organizations:
SISD computer organization This organization represents most serial computers available
today. Instructions are executed sequentially but may be overlapped in their execution
stages.
SIMD computer organization: In this organization, there are multiple processing elements
supervised by the same control unit. All PEs (processing elements) receive the same
instruction broadcast from the control unit but operate on different data sets from distinct
data streams.
MISD computer organization: There are n processor units, each receiving distinct instructions
operating over the same data stream. The results (output) of one processor become the input
(operands) of the next processor. This approach has no practical implementation.
MIMD computer organization: Most multiprocessor systems and multiple-computer systems
can be classified in this category. An intrinsic MIMD computer implies interactions among the n processors
because all memory streams are derived from the same data space shared by all processors.
MULTIPROCESSOR
• A multiprocessor system is an interconnection of two or more CPUs with memory and I/O
equipment. The processors in a multiprocessor can be either CPUs or I/O processors.
• Computers are interconnected with each other by means of communication lines to form a
computer network. A network consists of several autonomous computers that may or may
not communicate with each other.
• A multiprocessor system is controlled by one OS that provides interaction between the processors,
and all the components of the system cooperate in the solution of a problem.
• Multiprocessors are classified by the way their memory is organized:
✓ Tightly coupled
✓ Loosely coupled.
Tightly coupled
These systems contain multiple CPUs that are connected at the bus level. These CPUs may
have access to a central shared memory (SMP or UMA), or may participate in a memory
hierarchy with both local and shared memory (NUMA).
Loosely coupled
These systems are based on multiple standalone single- or dual-processor commodity
computers interconnected via a high-speed communication system.
Each PE has its own private local memory. The processors are tied together by a switching
scheme designed to route information from one processor to another through a message-
passing system.