Computer Architecture Unit 1


Unit 1
INTRODUCTION
Computer architecture is a specification detailing how a set of software and
hardware technology standards interact to form a computer system or
platform. It refers to how a computer system is designed and what
technologies it is compatible with.
REVIEW OF BASIC COMPUTER ARCHITECTURE
The main components in a typical computer system are:
Processor : The central processor of a computer is also known as the
CPU, or Central Processing Unit. This processor handles all the basic system
instructions, such as processing mouse and keyboard input and running
applications.
Example: Intel, Advanced Micro Devices (AMD), Celeron, Pentium, Core,
Sempron, Athlon, Phenom.
Memory : Memory stores data and instructions, much like a human
brain. It is the storage space in the computer where the data to be
processed and the instructions required for processing are stored.
Example: Primary memory [RAM->volatile, ROM->non-volatile], Secondary
memory [hard drive, CD].
Input/Output devices : An input device sends information to a
computer system for processing, and an output device reproduces or
displays the result of that processing.
Example: Input device= mouse, keyboard etc. Output device= printer,
monitor etc.
Communication channels : A communication channel refers either
to a physical transmission medium such as a wire, or to a logical connection
over a multiplexed medium such as a radio channel in telecommunications
and computer networking. Communicating data from one location to
another requires some form of pathway or medium.

Von Neumann Architecture : The Von Neumann design consists of a
Control Unit, an Arithmetic & Logic Unit, Registers and Input/Output.
It is based on the stored-program concept, where instruction
data and program data are stored in the same memory.

QUANTITATIVE TECHNIQUES IN COMPUTER DESIGN


The most important and pervasive principle of computer design
is to make the common case fast. In applying this simple principle, we have
to decide what the frequent case is and how much performance can be
improved by making the case faster.
A fundamental law, called Amdahl’s Law can be used to quantify this
principle.
Amdahl’s Law: The performance gain that can be obtained by improving
some portion of a computer can be calculated using Amdahl’s Law.
Amdahl’s Law states that the performance improvement to be gained
from using some faster mode of execution is limited by the fraction of the
time the faster mode can be used.

Amdahl’s Law defines the speedup that can be gained by using a
particular feature.
Speedup= Performance for entire task using enhancement when
possible / Performance for entire task without using enhancement
Speedup=Serial execution time/Parallel execution time
Example:
(Q) If a serial application executes in 6720 seconds and the corresponding
parallel application runs in 126.7 seconds (using 64 threads and cores), find
the speedup of the parallel application.
Soln: Speedup=serial execution time/Parallel execution time
=6720/126.7
=53.038
=53x
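The calculation above can be sketched in a few lines of Python (a minimal illustration; the function name is ours, not from the text):

```python
# Speedup of a parallel run relative to a serial run.
def speedup(serial_time, parallel_time):
    return serial_time / parallel_time

# Values from the example above: 6720 s serial, 126.7 s parallel.
print(speedup(6720.0, 126.7))  # roughly 53x
```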
Speedup tells us how much faster a task will run using the machine with
the enhancement as opposed to the original machine. Amdahl’s Law gives us
a quick way to find the speedup from some enhancement, which depends on
two factors:

1. The fraction of the computation time in the original machine that can
be converted to take advantage of the enhancement
Example: If 20 seconds of the execution time of a program that takes 60
seconds in total can use an enhancement, the fraction is 20/60. This value,
which we will call Fraction_enhanced, is always less than or equal to 1.

2. The improvement gained by the enhanced execution mode; that is, how
much faster the task would run if the enhanced mode were used for the
entire program. This value is the time of the original mode over the
time of the enhanced mode.

Example: If the enhanced mode takes 2 seconds for some portion of the
program that can completely use the mode, while the original mode took
5 seconds for the same portion, the improvement is 5/2.
We will call this value, which is always greater than 1, Speedup_enhanced.
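Putting the two factors together, Amdahl's Law gives the overall speedup as 1 / ((1 − Fraction_enhanced) + Fraction_enhanced / Speedup_enhanced). A small Python sketch using the fractions from the two examples above (20/60 and 5/2):

```python
def amdahl_speedup(fraction_enhanced, speedup_enhanced):
    # Overall speedup = 1 / ((1 - F) + F / S), per Amdahl's Law.
    return 1.0 / ((1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced)

# Fraction 20/60 of the program enhanced, enhanced mode 5/2 = 2.5x faster.
print(amdahl_speedup(20 / 60, 5 / 2))  # approximately 1.25 overall
```

Note that even though the enhanced portion runs 2.5x faster, the overall speedup is only about 1.25x, because two-thirds of the program is untouched.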

CPU Performance Equation: Essentially all computers are
constructed using a clock running at a constant rate. These discrete time
events are called ticks, clock ticks, clock periods, clocks, cycles, or clock
cycles. Computer designers refer to the time of a clock period by its duration
(e.g., 1 ns) or by its rate (e.g., 1 GHz). CPU time for a program can then be
expressed as:
CPU Time = CPU Clock Cycles for a Program X Clock Cycle Time
In addition to the number of clock cycles needed to execute a program, we
can also count the number of instructions executed—the instruction path
length or instruction count (IC). If we know the number of clock cycles and
the instruction count we can calculate the average number of clock cycles
per instruction (CPI).
CPI = CPU Clock Cycles for a Program / Instruction Count
CPU time = Instruction Count X Cycles per Instruction X Clock Cycle
Time
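The CPU performance equation can be expressed directly in code (a small sketch; the values below are hypothetical, chosen only to illustrate the units):

```python
def cpu_time(instruction_count, cpi, clock_cycle_time):
    # CPU time = Instruction Count x CPI x Clock Cycle Time
    return instruction_count * cpi * clock_cycle_time

# Hypothetical program: 1 billion instructions, CPI of 2, 1 ns clock (1 GHz).
print(cpu_time(10**9, 2.0, 1e-9))  # 2.0 seconds
```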
Clock cycle time: The speed of a computer processor, or CPU, is
determined by the clock cycle, which is the amount of time between
two pulses of an oscillator. Generally, the higher the number of pulses per
second, the faster the computer processor is able to process
information.
• Clock period or Cycle time (Tc): The time between successive rising
edges of the clock signal.
• Clock Frequency (Fc): Reciprocal of clock time.
Fc =1/Tc
Increasing the clock frequency increases the work that a digital
system can accomplish per unit time.

A clock speed of 3.5 GHz to 4.0 GHz is generally considered a good clock
speed for gaming.
CPI(Clock cycles Per Instruction): It is one aspect of a
processor’s performance: the average number of clock cycles per instruction
for a program or program fragment.
CPI = ∑i (ICi × CCi) / IC
where ICi = number of instructions of a given instruction type i,
IC = ∑i ICi is the total instruction count, and
CCi = clock cycles for instruction type i.

There are five types of instructions in multi-cycle MIPS:
i. Load (5 cycles)
ii. Store (4 cycles)
iii. R-type (4 cycles)
iv. Branch (3 cycles)
v. Jump (3 cycles)

Example:
(Q) If a program has 50% load instructions, 25% store instructions, 15% R-
type instructions, 8% branch instructions and 2% jump instructions, find the CPI.
Soln: CPI= {(5*50) +(4*25) +(4*15) +(3*8) +(3*2)}/100
= (250+100+60+24+6)/100
= 440/100
= 4.4
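The same weighted-average calculation in Python (the instruction mix and cycle counts are taken from the example above):

```python
def weighted_cpi(mix):
    # mix: list of (percentage_of_instructions, cycles_per_instruction) pairs
    total_share = sum(share for share, _ in mix)
    return sum(share * cycles for share, cycles in mix) / total_share

# 50% loads (5 cycles), 25% stores (4), 15% R-type (4), 8% branches (3), 2% jumps (3)
mips_mix = [(50, 5), (25, 4), (15, 4), (8, 3), (2, 3)]
print(weighted_cpi(mips_mix))  # 4.4
```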
Instruction Count: The total number of instructions that get executed for a
particular task, algorithm, workload, or program is referred to as the
instruction count.
The instruction count forms the basis for various performance metrics
of the microprocessor, such as Instructions Per Cycle (IPC) and Cycles Per
Instruction (CPI).

MEASURING & REPORTING PERFORMANCE


Two ways to measure performance are:
1. The speed measure: which measures how fast a computer
completes a single task.
Ex: SPECint95 is used for comparing the ability of a computer to complete
single tasks.
2. The throughput measure: which measures how many tasks a
computer can complete in a certain amount of time.
The computer user is interested in reducing response time (the time
between the start and the completion of an event), also referred to as
execution time. The manager of a large data processing center may be
interested in increasing throughput (the total amount of work done in a
given time).
Even execution time can be defined in different ways depending on what we
count. The most straightforward definition of time is called wall-clock time,
response time, or elapsed time, which is the latency to complete a task,
including disk accesses, memory accesses, input/output activities, and
operating system overhead.

Measuring performances
•Response time: how long does it take to execute a certain application / a
certain amount of work.
•Given two platforms X and Y, X is n times faster than Y for a certain
application, if
n=Timey/Timex
•Performance of X is n times faster than the performance of Y, if
n=Timey/Timex
=(1/Perfy)/(1/Perfx)
=Perfx/Perfy
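The relative-performance definition above can be sketched as follows (the times are hypothetical, chosen only to illustrate the ratio):

```python
def times_faster(time_y, time_x):
    # X is n times faster than Y when n = Time_Y / Time_X = Perf_X / Perf_Y.
    return time_y / time_x

# Hypothetical: Y takes 30 s, X takes 10 s for the same application.
print(times_faster(30.0, 10.0))  # 3.0 -> X is 3 times faster than Y
```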
Timing how long an application takes

• Wall-clock time / elapsed time: the time to complete a task as seen by
the user. It might include operating system overhead or interference
from other applications. [If other applications start running on their
own while we work on one application, the measured time will vary.]
• CPU time: does not include time slices introduced by external
sources (e.g. running other applications). CPU time can be further
divided:
➢ User CPU time: CPU time spent in the program.
➢ System CPU time: CPU time spent in the OS performing tasks
requested by the program.

Choosing the right programs to test a system


• Real application: Use the target application for the machine in
order to evaluate its performance. (Best solution if application
available)
• Modified application: Real application has been modified in order
to measure a certain feature. (Remove I/O parts of an application
in order to focus on the CPU performance)
• Application Kernels: Focus on the most time-consuming parts of
an application. (E.g.: Extract the matrix-vector multiply of an
application, since this uses 80% of the user CPU time.)
• Toy benchmarks: Very small code segments which produce a
predictable result. (Eg: Sieve of Eratosthenes, quicksort)
• Synthetic benchmarks: Try to match the average frequency of
operations and operands for a certain program. (The code does not
do any useful work.) [When more than one operation is used in a
program, the average frequency of the operations is calculated.]

SPEC
• The Standard Performance Evaluation Corporation (SPEC) is a non-
profit corporation formed to establish, maintain and endorse(support)
a standardized set of relevant benchmarks that can be applied to the
newest generation of high- performance computers.

• SPEC develops suites of benchmarks and also reviews and publishes
submitted results from its member organizations and other
benchmark licensees.

Why do we need benchmarks?


• Identify problems: Measure machine properties.
• Time evaluation: Verify that we make progress.
• Coverage: Help vendors to have representative codes, increase
competition by transparency, and drive future development.
• Relevance: Help consumers to choose the right computer.
Reporting results
• SPEC produces a minimal set of representative numbers:
➢ Reduces complexity to understand correlations.
➢ Eases comparison of different systems.
➢ Loss of information.
• Results have to be compliant to the SPEC benchmarking rules in order
to be approved as an official SPEC report.
➢ All components have to be available within 3 months of the
publication. (including a runtime environment for C/C++/Fortran
applications)
➢ Usage of SPEC tools for compiling and reporting.
➢ Each individual benchmark has to be executed at least three
times.
➢ Verification of the benchmark output.
➢ A maximum of four optimization flags are allowed for the base
run. (including preprocessor and link directives)
➢ Disclosure report containing all relevant data has to be available.

PIPELINING
Definition
▪ Pipelining is the process of arrangement of hardware elements of CPU
such that its overall performance is increased.

▪ Simultaneous execution of more than one instruction takes place in a
pipelined processor.
▪ In pipelining multiple instructions are overlapped in execution.

Basic concepts
• Pipelining is the process of feeding instructions to the
processor through a pipeline.
• It allows storing and executing instructions in an orderly process. It is
known as pipeline processing.
• Pipelining is a technique where multiple instructions are overlapped
during execution.

        t0   t1   t2   t3   t4   t5   t6   t7   t8
Ins 1   IF   ID   IE   MEM  WB
Ins 2        IF   ID   IE   MEM  WB
Ins 3             IF   ID   IE   MEM  WB
Ins 4                  IF   ID   IE   MEM  WB
Ins 5                       IF   ID   IE   MEM  WB
IF=Instruction Fetch
ID=Instruction Decode
IE=Instruction Execute
MEM=Memory Access
WB=Write Back

• Pipeline is divided into stages and these stages are connected with one
another to form a pipe like structure. Instructions enter from one end
and exit from another end.
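The space-time diagram above implies a simple cycle count: with k stages and one new instruction entering per cycle, n instructions finish in k + (n − 1) cycles instead of n × k. A small sketch of this idealized model (it deliberately ignores hazards and stalls):

```python
def pipelined_cycles(n_instructions, n_stages):
    # The first instruction takes n_stages cycles to fill the pipe;
    # each later instruction completes one cycle after the previous one.
    return n_stages + (n_instructions - 1)

def unpipelined_cycles(n_instructions, n_stages):
    # Without pipelining, every instruction uses all stages serially.
    return n_instructions * n_stages

# The 5-instruction, 5-stage example above:
print(pipelined_cycles(5, 5), unpipelined_cycles(5, 5))  # 9 vs 25 cycles
```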

• In a pipelined system, each segment consists of an input register
followed by a combinational circuit.
• The register is used to hold data, and the combinational circuit performs
operations on it.
• The output of combinational circuit is applied to the input register of
the next segment.

Stages of Pipeline
There is a 5-stage instruction pipeline to execute all the instructions in the
RISC instruction set.
• Stage 1 (Instruction Fetch): In this stage the CPU fetches the
instructions from the address present in the memory location whose
value is stored in the program counter.
• Stage 2 (Instruction Decode): In this stage, the instruction is decoded
and register file is accessed to obtain the values of registers used in the
instruction.
• Stage 3 (Instruction Execute): In this stage, activities such as ALU
operations are performed.
• Stage 4 (Memory Access): In this stage, memory operands are read
and written from/to the memory that is present in the instruction.
• Stage 5 (Write Back): In this stage, computed/fetched value is written
back to the register present in the instructions.

Types of pipeline
It is divided into two categories:
1)Arithmetic pipeline:
▪ It is usually found in most of the computers.
▪ They are used for floating point operation, multiplication of fixed-point
numbers etc.
▪ Example: The inputs to the floating-point adder pipeline are:
X = A × 2^a
Y = B × 2^b
where A and B are mantissas and a and b are exponents.
2)Instruction pipeline:
▪ Here, a stream of instructions can be executed by overlapping the
fetch, decode and execute phases of an instruction cycle.
▪ This type of technique is used to increase the throughput of the
computer system.
▪ An instruction pipeline reads instructions from memory while
previous instructions are being executed in other segments of the
pipeline. Thus, we can execute multiple instructions simultaneously.
▪ The pipeline will be more efficient if the instruction cycle is divided
into segments of equal duration.

What is Throughput?
• It measures the number of instructions completed per unit time.
• It represents the overall processing speed of the pipeline.
• Higher throughput indicates better processing speed of the pipeline.
• Calculated as: throughput = number of instructions executed / execution
time.
• It can be affected by pipeline length, clock frequency, efficiency of
instruction execution and the presence of pipeline hazards or stalls.
What is Latency?

• It measures the time taken for a single instruction to complete its
execution.
• It represents delay or time it takes for an instruction to pass through
pipeline stages.
• Lower latency indicates better performance.
• It is calculated as: Latency = Execution time / Number of instructions
executed.
• It is influenced by pipeline length and depth, clock cycle time, instruction
dependencies and pipeline hazards.
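Both formulas above can be written directly in code (the instruction count and time below are hypothetical, chosen only to illustrate the units):

```python
def throughput(n_instructions, execution_time):
    # Instructions completed per unit time.
    return n_instructions / execution_time

def latency(execution_time, n_instructions):
    # Average time per instruction.
    return execution_time / n_instructions

# Hypothetical: 1000 instructions complete in 2 microseconds.
print(throughput(1000, 2e-6))  # 5e8 instructions per second
print(latency(2e-6, 1000))     # about 2 ns per instruction
```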

Pipeline conflicts
There are some factors that cause the pipeline to deviate from its normal
performance. Some of these factors are given below:
1)Timing Variations:
All stages cannot take the same amount of time. This problem generally
occurs when instructions have different operand requirements and thus
different processing times.
2)Data Hazards:
When several instructions are in parallel execution and they
reference the same data, a problem arises. We must ensure that the next
instruction does not attempt to access the data before the current instruction
has finished with it, because this will lead to incorrect results.
3)Branching:
In order to fetch and execute the next instruction, the processor must know
what that instruction is. If the present instruction is a conditional branch
whose result determines the next instruction, then the next instruction may
not be known until the current one is processed.
4)Interrupts:
Interrupts insert unwanted instructions into the instruction stream and
affect the execution of instructions.

5)Data Dependency:
It arises when an instruction depends upon the result of a previous
instruction, but that result is not yet available.

Advantages of Pipelining
➢ The cycle time of the processor is reduced.
➢ It increases the throughput of the system.
➢ It makes the system reliable.

Disadvantages of Pipelining
➢ The design of pipelined processor is complex and costly to
manufacture.
➢ The instruction latency is more.

HAZARDS
In pipelining, CPI should ideally be 1, i.e. one instruction should complete
every clock cycle. But this is difficult to achieve, and the problems that
arise in achieving it are called hazards.
Data Hazards
Data hazards occur when an instruction depends on the result of a previous
instruction and that result has not yet been computed.
Whenever two different instructions use the same storage location, it must
appear as if the instructions execute in sequential order.
Consider the pipelined execution of a sequence like the following (a
representative example, since the original listing is not reproduced here):
ADD R1, R2, R3
SUB R4, R1, R5
AND R6, R1, R7
All the instructions after the ADD use the result of the ADD instruction (in
R1).
The ADD instruction writes the value of R1 in the WB stage, but the SUB
instruction reads the value during its ID stage. This problem is called a
data hazard.

There are four types of data dependencies: Read After Write (RAW), Write
After Read (WAR), Write After Write (WAW), and Read After Read (RAR).
These are explained as follows.
• Read After Write (RAW) :
It is also known as a true dependency or flow dependency. It occurs
when the value produced by an instruction is required by a subsequent
instruction.
• Write After Read (WAR) :
It is also known as an anti-dependency. It occurs when an instruction
writes to a register that a previous instruction still needs to read.
• Write After Write (WAW) :
It is also known as an output dependency. It occurs when two
instructions write to the same register.
• Read After Read (RAR) :
It occurs when two instructions both read from the same register. This
is not a hazard, since reading does not change the value.
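These orderings can be detected mechanically from each instruction's read and write register sets. A small sketch (the dict encoding is our own illustration, not a standard API):

```python
def classify_dependencies(earlier, later):
    # earlier/later: dicts with 'reads' and 'writes' register-name sets.
    deps = []
    if later['reads'] & earlier['writes']:
        deps.append('RAW')   # true / flow dependency
    if later['writes'] & earlier['reads']:
        deps.append('WAR')   # anti-dependency
    if later['writes'] & earlier['writes']:
        deps.append('WAW')   # output dependency
    return deps             # RAR (read/read overlap) is not a hazard

add_insn = {'reads': {'R2', 'R3'}, 'writes': {'R1'}}  # ADD R1, R2, R3
sub_insn = {'reads': {'R1', 'R5'}, 'writes': {'R4'}}  # SUB R4, R1, R5
print(classify_dependencies(add_insn, sub_insn))  # ['RAW']
```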
Structural Hazards
Multiple instructions, but limited resources.
A structural hazard occurs when two (or more) instructions that are already
in the pipeline need the same resource. The result is that the instructions
must be executed in series rather than in parallel for a portion of the
pipeline. Structural hazards are sometimes referred to as resource hazards.
Solution for Structural Hazards:
1.Resource Duplication: Increase the number of resources / use multiple
copies of the resource.
2.Resource Pipelining: Pipeline the resource itself, just as instructions are
pipelined; however, this increases complexity.
3.Change the ordering: Reorder the instructions so that the instruction that
needs the busy resource executes later.
Control Hazards
A control hazard in pipelining occurs when the processor cannot decide in
time which instruction to fetch next. This can lead to delays in instruction
fetching. Control hazards are also known as branch hazards; they occur
when the pipeline makes the wrong decision about which instruction to fetch.
Branch prediction:
The most common approach to handle control hazards is using a branch
prediction unit that tries to guess whether a branch will be taken or not,
minimizing the need for flushing or stalling.
Solution for Control Hazards:
1.Flushing: Happens when a branch prediction is completely wrong, causing
instructions that were fetched based on the wrong prediction to be
discarded.
2.Stalling: Occurs when the pipeline pauses execution until the branch
decision is made, allowing the processor to fetch the correct instructions
based on the actual branch outcome.
Generally preferred over flushing as it only delays the pipeline for a few
cycles, not completely restarting it.

Techniques for Handling Hazards



1. Stalling: Stalling involves delaying the execution of an instruction until
the hazard is resolved. This can be done by inserting bubbles into the
pipeline or by stalling the entire pipeline.
2. Forwarding: Forwarding involves bypassing the result of an instruction
from one stage to another stage, rather than waiting for the result to be
written back to the register file.
3. Register Renaming: Register renaming involves assigning a new register
name to an instruction that is dependent on a previous instruction, thereby
avoiding the hazard.
4. Reordering: Reordering involves reordering the instructions in the
pipeline to avoid hazards. This can be done using techniques such as
instruction-level parallelism (ILP) or out-of-order execution (OoOE).
5. Hazard Detection and Resolution: Hazard detection and resolution
involves detecting hazards and resolving them using techniques such as
stalling, forwarding, or register renaming.
6. Pipeline Flush: Pipeline flush involves flushing the entire pipeline when a
hazard is detected, and restarting the pipeline from the beginning.
7. Branch Prediction: Branch prediction involves predicting the outcome of
a branch instruction and speculatively executing the instructions following
the branch. If the prediction is incorrect, the pipeline is flushed and the
correct instructions are executed.
8. Delayed Branch: Delayed branch fills the slot(s) immediately after a
branch with instructions that are executed regardless of the branch
outcome, giving the branch time to resolve without stalling.
9. Speculative Execution: Speculative execution involves speculatively
executing instructions that are dependent on a previous instruction, and
discarding the results if the speculation is incorrect.
10. Tomasulo's Algorithm: Tomasulo's algorithm involves using a
combination of register renaming, forwarding, and stalling to handle hazards
in a pipelined processor.
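To see why forwarding (technique 2 above) matters, here is a toy stall-count model for the classic 5-stage pipeline. This is our own simplification; it assumes that without forwarding a register value only becomes readable the cycle after write-back:

```python
# Stages: IF=0, ID=1, EX=2, MEM=3, WB=4; instruction k occupies stage s in cycle k + s.
def raw_stalls(produce_stage, consume_stage, distance, forwarding):
    # produce_stage: stage whose end makes the value available (EX=2 for ALU, MEM=3 for load).
    # consume_stage: stage whose start needs the value (EX=2 for an ALU input).
    # distance: instructions between producer and consumer (1 = back to back).
    if not forwarding:
        produce_stage = 4  # value only visible after write-back
        consume_stage = 1  # registers are read in ID
    available = produce_stage + 1      # first cycle the value can be used
    needed = distance + consume_stage  # cycle the consumer enters its stage
    return max(0, available - needed)

print(raw_stalls(2, 2, 1, forwarding=True))   # ALU -> ALU, adjacent: 0 stalls
print(raw_stalls(3, 2, 1, forwarding=True))   # load -> use, adjacent: 1 stall
print(raw_stalls(2, 2, 1, forwarding=False))  # no forwarding: 3 stalls
```

With a split-cycle register file (write in the first half of a cycle, read in the second), the no-forwarding case would drop to 2 stalls; the model keeps the pessimistic assumption for simplicity.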

EXCEPTION HANDLING

➢ Exceptions and interrupts are unexpected events that disrupt the
normal flow of instruction execution.
➢ An exception is an unexpected event within the processor.
➢ An interrupt is an unexpected event from outside the processor.
➢ Exceptions generally refer to events that arise within the CPU.
Example: undefined opcode, overflow, system call etc.
➢ Interrupts point to requests coming from an external I/O controller or
device to the processor.

Some examples of exceptions are:


• I/O device request.
• Invoking an OS service from a user program.
• Tracing instruction execution.
• Breakpoint.
• Integer arithmetic overflow.
• FP arithmetic anomaly.
• Page fault.
• Misaligned memory access.
• Memory protection violation.
• Using an undefined or unimplemented instruction.
• Hardware malfunctions.
• Power failure.

There are different characteristics for exceptions. They are as follows:


Synchronous VS Asynchronous
❖ Some exceptions may be synchronous, whereas others may be
asynchronous. If the same exception occurs in the same place with the
same data and memory allocation, then it is a synchronous exception.
They are more difficult to handle.
❖ Devices external to the CPU and memory cause asynchronous
exceptions. They can be handled after the current instructions and
hence easier than synchronous exceptions.

User requested VS Coerced


❖ Some exceptions may be user requested and not automatic. Such
exceptions are predictable and can be handled after the current
instruction.
❖ Coerced exceptions are generally raised by hardware and not under the
control of the user program. They are harder to handle.

User maskable VS unmaskable


❖ Exceptions can be maskable or unmaskable. A maskable exception can
be masked or unmasked by a user task, which decides whether the
hardware responds to the exception or not. We may have instructions
that enable or disable exceptions.
Within VS Between instructions
❖ Exceptions may have to be handled within the instruction or between
the instruction. Within exceptions are normally synchronous and are
harder since the instruction has to be stopped and restarted.
Catastrophic exceptions like hardware malfunction will normally cause
termination.
❖ Exceptions that can be handled between two instructions are easier to
handle.

Resume VS Terminate
❖ Some exceptions may allow the program to be continued after the
exception, while others may lead to termination. Things are much
more complicated if we have to restart.
❖ Exceptions that lead to termination are easier, since we just have to
terminate and need not restore the original status.

PIPELINE OPTIMIZATION TECHNIQUES



➢ The goal is to maximize the rendering speed, and then allow stages that
are not bottlenecks to consume as much time as the bottleneck stage.
➢ Pipelining is a technique used to improve the execution throughput of a
CPU by using the processor resources in a more efficient manner. The
basic idea is to split the processor instruction into a series of small
independent stages. Each stage is designed to perform a certain part of
the instructions.
➢ The optimizing technique can greatly reduce contention on a shared data
bus and improve the performance of applications with inherent data-
pipeline characteristics.

Pipeline Optimization
▪ Stages execute in parallel.
▪ The slowest stage is always the bottleneck of the pipeline.
▪ The bottleneck determines throughput (i.e. maximum speed).
▪ The bottleneck is the average bottleneck over a frame.
▪ Intra-frame bottlenecks cannot be measured easily.
▪ Bottlenecks can change over a frame.
▪ Most important: find the bottleneck, then optimize that stage.

Locating the Bottleneck


Two bottleneck location techniques:
❖ Technique 1:
• Make a certain stage work less.
• If performance is better, then that stage is the bottleneck.
❖ Technique 2:
• Make the other two stages work less or (better) not at all.
• If the performance is the same, then the stage not included above
is the bottleneck.

COMPILER TECHNIQUES FOR IMPROVING PERFORMANCE


1. Instruction Level Parallelism (ILP): execute multiple
instructions simultaneously
2. Pipeline Optimization: minimize pipeline stalls
3. Loop Unrolling for Cache: reduce cache misses in loops
4. Register Allocation for Renaming: eliminate register
hazards
5. Branch Prediction Assistance: help CPU predict branches
6. Data Alignment for SIMD: optimize data for Single Instruction
Multiple Data
7. Dead Store Elimination for Cache: remove unnecessary
cache stores
