Computer Architecture Unit1
Unit 1
INTRODUCTION
Computer architecture is a specification detailing how a set of software and
hardware technology standards interact to form a computer system or
platform. It refers to how a computer system is designed and what
technologies it is compatible with.
REVIEW OF BASIC COMPUTER ARCHITECTURE
The main components of a typical computer system are:
Processor : The central processor of a computer is also known as the
CPU, or Central Processing Unit. It executes the instructions of the operating
system and of applications, including basic tasks such as processing mouse
and keyboard input.
Example: Manufacturers include Intel and Advanced Micro Devices (AMD).
Product lines include Celeron, Pentium and Core (Intel) and Sempron, Athlon
and Phenom (AMD).
Memory : It is just like a human brain. It is used to store data and
instructions. Computer memory is the storage space in the computer where
the data to be processed and the instructions required for processing are
stored.
Example: Primary memory [RAM->volatile, ROM->non-volatile], Secondary
memory [hard drive, CD].
Input/Output devices : An input device sends information to a
computer system for processing, and an output device reproduces or
displays the result of that processing.
Example: Input device= mouse, keyboard etc. Output device= printer,
monitor etc.
Communication channels : A communication channel refers either
to a physical transmission medium such as a wire, or to a logical connection
over a multiplexed medium such as a radio channel in telecommunications
and computer networking. Communicating data from one location to
another requires some form of pathway or medium.
The speedup obtained from an enhancement depends on two factors:
1. The fraction of the computation time in the original machine that can
be converted to take advantage of the enhancement.
Example: If 20 seconds of the execution time of a program that takes 60
seconds in total can use an enhancement, the fraction is 20/60. This value,
which we will call Fraction_enhanced, is always less than or equal to 1.
2. The improvement gained by the enhanced execution mode; that is, how
much faster the task would run if the enhanced mode were used for the
entire program. This value is the time of the original mode over the
time of the enhanced mode.
Example: If the enhanced mode takes 2 seconds for some portion of the
program that can completely use the mode, while the original mode took
5 seconds for the same portion, the improvement is 5/2.
We will call this value, which is always greater than 1, Speedup_enhanced.
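The fraction and the enhanced-mode improvement described above are the two inputs to what is commonly called Amdahl's Law. The formula itself is not stated in this part of the notes, so the sketch below assumes the standard form, overall speedup = 1 / ((1 - Fraction_enhanced) + Fraction_enhanced / Speedup_enhanced). The function name is illustrative, and combining the numbers from the two separate examples above is only for demonstration.

```python
def overall_speedup(fraction_enhanced: float, speedup_enhanced: float) -> float:
    """Overall speedup, assuming the standard form of Amdahl's Law.

    fraction_enhanced: share of the original execution time that can use
                       the enhancement (between 0 and 1).
    speedup_enhanced:  how much faster that share runs in enhanced mode.
    """
    if not 0.0 <= fraction_enhanced <= 1.0:
        raise ValueError("fraction_enhanced must be between 0 and 1")
    if speedup_enhanced <= 0.0:
        raise ValueError("speedup_enhanced must be positive")
    # The unenhanced share runs at the old speed; the enhanced share shrinks.
    new_time = (1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced
    return 1.0 / new_time


# Numbers borrowed from the two examples above: 20/60 and 5/2.
print(f"{overall_speedup(20 / 60, 5 / 2):.2f}")  # prints 1.25
```

Even though the enhanced portion runs 2.5 times faster, the whole program gains only 1.25 times, because two thirds of its time is unaffected.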
A clock speed of 3.5 GHz to 4.0 GHz is generally considered a good clock
speed for gaming.
CPI (Clock cycles Per Instruction): It is one aspect of a
processor's performance: the average number of clock cycles per instruction
for a program or program fragment.
CPI = Σ_i (IC_i × CC_i) / IC
Where, IC_i = number of instructions of a given instruction type i.
IC = Σ_i IC_i is the total instruction count.
CC_i = number of clock cycles for an instruction of type i.
Example:
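(Q) A program has 50% load instructions, 25% store instructions, 15%
R-type instructions, 8% branch instructions and 2% jump instructions.
Taking the clock cycles per instruction type used in the solution below
(load = 5, store = 4, R-type = 4, branch = 3, jump = 3), find the CPI.
Soln: CPI = {(5*50) + (4*25) + (4*15) + (3*8) + (3*2)}/100
= (250+100+60+24+6)/100
= 440/100
= 4.4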
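The same weighted average can be checked with a few lines of Python. This is a minimal sketch: the function name and the dictionary layout are illustrative, and the cycle counts are the ones that appear in the solution above.

```python
def average_cpi(mix):
    """Average CPI for an instruction mix.

    mix maps an instruction type to a (count, cycles) pair, where count
    may be an absolute instruction count or a percentage.
    """
    total_count = sum(count for count, _ in mix.values())
    total_cycles = sum(count * cycles for count, cycles in mix.values())
    return total_cycles / total_count


# (percentage of instructions, clock cycles per instruction)
mix = {
    "load": (50, 5),
    "store": (25, 4),
    "r_type": (15, 4),
    "branch": (8, 3),
    "jump": (2, 3),
}
print(average_cpi(mix))  # prints 4.4
```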
Instruction Count: The total number of instructions that get executed for a
particular task, algorithm, workload, or program is referred to as the
instruction count.
The instruction count forms the basis for various performance measures
of the microprocessor, such as Instructions Per Cycle (IPC) or Cycles Per
Instruction (CPI).
Measuring performance
•Response time: how long it takes to execute a certain application or a
certain amount of work.
•Given two platforms X and Y, X is n times faster than Y for a certain
application if
n = Time_Y / Time_X
•Since performance is the reciprocal of execution time, the performance of
X is n times the performance of Y if
n = Time_Y / Time_X
= (1/Perf_Y) / (1/Perf_X)
= Perf_X / Perf_Y
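As a small worked example of the ratio above, the sketch below uses hypothetical execution times of 10 s on platform X and 15 s on platform Y. The function name and the timings are assumptions chosen only for illustration.

```python
def times_faster(time_x: float, time_y: float) -> float:
    """Return n such that X is n times faster than Y (n = Time_Y / Time_X)."""
    if time_x <= 0 or time_y <= 0:
        raise ValueError("execution times must be positive")
    return time_y / time_x


time_x, time_y = 10.0, 15.0           # hypothetical execution times in seconds
n = times_faster(time_x, time_y)
print(n)  # prints 1.5

# Performance is the reciprocal of time, so the performance ratio
# Perf_X / Perf_Y gives the same n (up to floating-point rounding).
perf_x, perf_y = 1 / time_x, 1 / time_y
print(round(perf_x / perf_y, 6))
```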
Timing how long an application takes
SPEC
• The Standard Performance Evaluation Corporation (SPEC) is a non-profit
corporation formed to establish, maintain and endorse (support)
a standardized set of relevant benchmarks that can be applied to the
newest generation of high-performance computers.
PIPELINING
Definition
▪ Pipelining is the arrangement of the hardware elements of the CPU
such that its overall performance is increased.
Basic concepts
• Pipelining is the process of passing instructions through the
processor in a sequence of stages, called a pipeline.
• It allows instructions to be stored and executed in an orderly process.
This is known as pipeline processing.
• Pipelining is a technique where the execution of multiple instructions
is overlapped.
        t0   t1   t2   t3   t4   t5   t6   t7   t8
Ins 1   IF   ID   IE   MEM  WB
Ins 2        IF   ID   IE   MEM  WB
Ins 3             IF   ID   IE   MEM  WB
Ins 4                  IF   ID   IE   MEM  WB
Ins 5                       IF   ID   IE   MEM  WB
IF=Instruction Fetch
ID=Instruction Decode
IE=Instruction Execute
MEM=Memory Access
WB=Write Back
• Pipeline is divided into stages and these stages are connected with one
another to form a pipe like structure. Instructions enter from one end
and exit from another end.
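The overlap shown in the timing diagram above can be generated programmatically. The sketch below assumes an ideal pipeline in which every stage takes exactly one clock cycle and there are no stalls; the function name `pipeline_schedule` is illustrative.

```python
STAGES = ["IF", "ID", "IE", "MEM", "WB"]


def pipeline_schedule(num_instructions, stages=STAGES):
    """Return one row per instruction; row[c] is the stage active in cycle c.

    Assumes an ideal pipeline: one cycle per stage, a new instruction
    enters every cycle, and nothing ever stalls.
    """
    total_cycles = num_instructions + len(stages) - 1
    rows = []
    for i in range(num_instructions):
        row = [""] * total_cycles
        for offset, name in enumerate(stages):
            row[i + offset] = name  # instruction i starts in cycle i
        rows.append(row)
    return rows


rows = pipeline_schedule(5)
print("      " + " ".join(f"t{c:<3}" for c in range(len(rows[0]))))
for number, row in enumerate(rows, start=1):
    print(f"Ins {number} " + " ".join(f"{cell:<4}" for cell in row))
```

With 5 instructions and 5 stages the schedule spans 9 cycles (t0 to t8), compared with 25 cycles if each instruction had to finish before the next one started.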
Stages of Pipeline
A 5-stage instruction pipeline is used to execute all the instructions in
the RISC instruction set.
• Stage 1 (Instruction Fetch): In this stage the CPU fetches the
instruction from the memory address stored in the program counter.
• Stage 2 (Instruction Decode): In this stage, the instruction is decoded
and the register file is accessed to obtain the values of the registers
used in the instruction.
• Stage 3 (Instruction Execute): In this stage, operations such as ALU
operations are performed.
• Stage 4 (Memory Access): In this stage, memory operands are read
from or written to the memory address specified in the instruction.
• Stage 5 (Write Back): In this stage, the computed or fetched value is
written back to the register specified in the instruction.
Types of pipeline
Pipelines are divided into two categories:
1)Arithmetic pipeline:
▪ It is found in most computers.
▪ It is used for floating-point operations, multiplication of fixed-point
numbers etc.
▪ Example: The inputs to the floating-point adder pipeline are:
X = A × 2^a
Y = B × 2^b
where A and B are the mantissas and a and b are the exponents.
2)Instruction pipeline:
▪ In this type, a stream of instructions is executed by overlapping the
fetch, decode and execute phases of the instruction cycle.
▪ This technique is used to increase the throughput of the
computer system.
▪ An instruction pipeline reads instructions from memory while
previous instructions are being executed in other segments of the
pipeline. Thus, we can execute multiple instructions simultaneously.
▪ The pipeline will be more efficient if the instruction cycle is divided
into segments of equal duration.
What is Throughput?
• It measures the number of instructions completed per unit time.
• It represents the overall processing speed of the pipeline.
• Higher throughput indicates a higher processing speed of the pipeline.
• Calculated as: throughput = number of instructions executed / execution
time.
• It can be affected by pipeline length, clock frequency, efficiency of
instruction execution and the presence of pipeline hazards or stalls.
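The throughput formula above can be tried out on an ideal pipeline. The sketch below assumes that n instructions on a k-stage pipeline with no stalls take k + n - 1 clock cycles (k cycles for the first instruction, then one more per remaining instruction). The function name and the 1 ns cycle time are hypothetical.

```python
def pipeline_throughput(num_instructions, num_stages, cycle_time_s):
    """Instructions completed per second on an ideal, stall-free pipeline."""
    total_cycles = num_stages + num_instructions - 1
    execution_time = total_cycles * cycle_time_s
    return num_instructions / execution_time


cycle_time = 1e-9  # hypothetical 1 ns clock cycle (1 GHz)
for n in (5, 100, 10_000):
    rate = pipeline_throughput(n, num_stages=5, cycle_time_s=cycle_time)
    print(f"{n:>6} instructions: {rate / 1e9:.3f} instructions per ns")
```

As the number of instructions grows, the fill time of the pipeline matters less and throughput approaches one instruction per clock cycle.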
What is Latency?
Pipeline conflicts
There are some factors that cause the pipeline to deviate from its normal
performance. Some of these factors are given below:
1)Timing Variations:
All stages cannot take the same amount of time. This problem generally
occurs when instructions have different operand requirements and thus
different processing times.
2)Data Hazards:
When several instructions are in parallel execution and they reference
the same data, a problem arises. We must ensure that the next instruction
does not attempt to access the data before the current instruction has
finished with it, because this will lead to incorrect results.
3)Branching:
In order to fetch and execute the next instruction, we must know what that
instruction is. If the present instruction is a conditional branch whose
result determines the next instruction, then the next instruction may not be
known until the current one is processed.
4)Interrupts:
Interrupts insert unwanted instructions into the instruction stream and so
affect the execution of instructions.
5)Data Dependency:
It arises when an instruction depends upon the result of a previous
instruction but this is not yet available.
Advantages of Pipelining
➢ The cycle time of the processor is reduced.
➢ It increases the throughput of the system.
➢ It makes the system reliable.
Disadvantages of Pipelining
➢ The design of a pipelined processor is complex and costly to
manufacture.
➢ The latency of an individual instruction is higher.
HAZARDS
In an ideal pipeline the CPI is 1, i.e. one instruction completes in every
clock cycle. In practice this is difficult to achieve. The problems that
prevent it are called Hazards.
Data Hazards
Data hazards occur when an instruction depends on the result of a previous
instruction and that result has not yet been computed. Whenever two
different instructions use the same storage location, the results must
appear as if the instructions were executed in sequential order.
Consider the pipelined execution of these instructions:
All the instructions after the ADD use the result of the ADD instruction (in
R1).
The ADD instruction writes the value of R1 in the WB stage, but the SUB
instruction reads the value during its ID stage (ID_SUB), which comes
earlier in time. Unless precautions are taken, SUB reads the old value of
R1. This problem is called a data hazard.
There are four types of data dependencies: Read after Write (RAW), Write
after Read (WAR), Write after Write (WAW), and Read after Read (RAR).
These are explained below.
• Read after Write (RAW) :
It is also known as True dependency or Flow dependency. It occurs
when the value produced by an instruction is required by a subsequent
instruction.
• Write after Read (WAR) :
It is also known as anti-dependency. It occurs when an instruction
writes to a register that is read by a previous instruction.
• Write after Write (WAW) :
It is also known as output dependency. It occurs when two instructions
write to the same register.
• Read after Read (RAR) :
It occurs when both instructions read from the same register. Since
reading does not change the value, it does not cause a hazard.
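The dependency types named above can be told apart by comparing the destination and source registers of two instructions. The sketch below is a simplified model: the `Instr` type, the function name and the ADD/SUB operands are hypothetical, and each instruction is assumed to have at most one destination register.

```python
from typing import NamedTuple, Optional, Set, Tuple


class Instr(NamedTuple):
    """Hypothetical instruction model: one destination, several sources."""
    dest: Optional[str]
    srcs: Tuple[str, ...]


def dependencies(first: Instr, second: Instr) -> Set[str]:
    """Classify the register dependencies of `second` on the earlier `first`."""
    found = set()
    if first.dest is not None and first.dest in second.srcs:
        found.add("RAW")  # second reads what first writes
    if second.dest is not None and second.dest in first.srcs:
        found.add("WAR")  # second writes what first reads
    if first.dest is not None and first.dest == second.dest:
        found.add("WAW")  # both write the same register
    if set(first.srcs) & set(second.srcs):
        found.add("RAR")  # both read the same register (harmless)
    return found


add = Instr(dest="R1", srcs=("R2", "R3"))  # illustrative: ADD R1, R2, R3
sub = Instr(dest="R4", srcs=("R1", "R5"))  # illustrative: SUB R4, R1, R5
print(dependencies(add, sub))  # prints {'RAW'}
```

A real pipeline also has to consider how far apart the two instructions are, since a dependency only becomes a hazard when the stages involved overlap in time.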
Structural Hazards
Multiple instructions but limited resource.
A structural hazard occurs when two (or more) instructions that are already
in pipeline need the same resource. The result is that instruction must be
EXCEPTION HANDLING
Resume VS Terminate
❖ Some exceptions allow the program to be resumed after the exception
is handled, and some lead to termination. Things are much more
complicated if we have to restart.
❖ Exceptions that lead to termination are easier, since we just have to
terminate and need not restore the original state.
➢ Pipeline optimization is the process of maximizing the rendering speed
and then allowing the stages that are not bottlenecks to consume as
much time as the bottleneck.
➢ Pipelining is a technique used to improve the execution throughput of a
CPU by using the processor resources in a more efficient manner. The
basic idea is to split the processing of an instruction into a series of
small independent stages. Each stage is designed to perform a certain
part of the instruction.
➢ The optimizing technique can greatly reduce the conflict of shared data
bus and improve the performance of applications with inherent data
pipeline characteristics.
Pipeline Optimization
▪ Stages execute in parallel.
▪ Always the slowest stage is the bottleneck of the pipeline.
▪ The bottleneck determines throughput (i.e. maximum speed).
▪ The bottleneck is the average bottleneck over a frame.
▪ Cannot measure intra-frame bottlenecks easily.
▪ Bottlenecks can change over a frame.
▪ Most important: Find bottleneck, then optimize that stage.
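The "find the bottleneck" step above can be sketched in a few lines. The stage names and per-frame times below are hypothetical, and the model assumes the stages run fully in parallel, so the slowest stage alone sets the maximum throughput.

```python
def find_bottleneck(stage_times):
    """Return (slowest stage name, maximum throughput in items per time unit).

    stage_times maps a stage name to the time that stage needs per item.
    With all stages running in parallel, the pipeline can deliver at most
    one item per `slowest stage time`.
    """
    slowest = max(stage_times, key=stage_times.get)
    return slowest, 1.0 / stage_times[slowest]


# Hypothetical per-frame stage times in milliseconds.
stage_times_ms = {"application": 4.0, "geometry": 6.0, "rasterizer": 10.0}
stage, frames_per_ms = find_bottleneck(stage_times_ms)
print(stage, frames_per_ms * 1000, "frames per second")
```

Speeding up any stage other than the reported one leaves the throughput unchanged, which is why the bottleneck stage is the one to optimize first.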