SEN307 Lecture 5

The document outlines a lecture on computer performance, covering key topics such as performance metrics, benchmarks, and calculation techniques. It emphasizes the importance of performance in user experience, productivity, and cost efficiency, and introduces various performance metrics like CPI, MIPS, and FLOPS. Additionally, it discusses pipeline performance, hazards, and cache performance, providing examples and calculations to illustrate these concepts.

Uploaded by

hauwafaruk81


Lecture 5

Introduction to Computer
Performance

Computer Architecture- NUN 2024 Austin Olom Ogar

MODULE OUTLINE

Introduction to Computer Architecture Performance

Performance Metrics and Benchmarks

Performance Calculation Techniques

Advanced Performance Modeling

Power and Energy Efficiency

Case Studies and Real-world Applications

Hands-On Performance Analysis

Trade-offs in Memory System Design

Computer Architecture- NUN 2024 Austin Olom Ogar

MODULE OBJECTIVE

By the end of this course, students will be able to:

Understand the fundamentals of computer performance.

Learn about CPU performance metrics and optimization techniques.

Gain skills in performance measurement and benchmarking.

Explore real-world case studies on performance enhancement.

Computer Architecture- NUN 2024 Austin Olom Ogar

What is Performance in Computer Architecture?
Performance in computer architecture refers to the measure of how effectively a computer system
executes tasks or processes. It is often quantified by how quickly and efficiently a system can
perform a given workload, such as executing instructions or running applications.
Key Considerations:
Speed: How fast the system can complete tasks.
Efficiency: How well the system utilizes its resources (CPU, memory, etc.).
Scalability: The system’s ability to maintain performance under increased workloads.

Importance: Why Performance Matters in Computing:
1. User Experience: Faster systems lead to better user experiences, especially in applications requiring real-time
processing (e.g., gaming, video editing).
2. Productivity: High-performance systems can handle more tasks in less time, increasing overall productivity in business
and research environments.
3. Cost Efficiency: Systems that perform well are more cost-effective, reducing the need for additional hardware or
resources.
4. Competitiveness: In industries like cloud computing or high-performance computing (HPC), superior performance can
provide a competitive edge.
5. Energy Consumption: Better performance often correlates with more efficient energy usage, important in mobile
devices and large-scale data centers.

Computer Architecture - NUN 2024 Austin Olom Ogar

Common Metrics in Performance Evaluation:
Clock Speed
•Definition: The number of clock cycles a processor completes per second, measured in hertz (Hz).
•Importance: A higher clock speed usually indicates a faster processor, but it is not the only factor: CPI and instruction count matter as well.
•Example: A processor with a clock speed of 3.5 GHz performs 3.5 billion cycles per second.

CPI (Cycles Per Instruction)

•Definition: The average number of clock cycles each instruction takes to execute.
•Formula: CPI = Total Clock Cycles / Total Instructions Executed
•Importance: Lower CPI values typically indicate better performance, as fewer cycles are needed per instruction.
•Example: If a program takes 500 million cycles to execute 200 million instructions, CPI = 500M / 200M = 2.5.

MIPS (Million Instructions Per Second)
•Definition: A measure of a computer's processor speed, indicating how many millions of instructions a CPU can process per
second.
•Formula: MIPS = (Instruction Count / Execution Time) / 10^6
•Importance: Useful for comparing the performance of different processors when running the same instruction set.
•Example: A CPU executing 1 billion instructions in 2 seconds has a MIPS rating of 500.

FLOPS (Floating Point Operations Per Second)

•Definition: A metric used to measure the performance of a computer in executing floating-point calculations, essential for tasks
involving complex mathematical computations.
•Importance: FLOPS is crucial in scientific computing, machine learning, and other areas requiring high precision arithmetic.
•Example: A supercomputer performing at 1 petaflop can handle one quadrillion (10^15) floating-point operations per second.
Latency vs. Throughput
Latency
•Definition: Latency is the time delay between the initiation of a task and its completion. It
represents the time taken to process a single task from start to finish.
• In Computing, Latency is often associated with the delay in data transfer, memory access,
or instruction execution.
• Example: If a processor takes 5 milliseconds to retrieve data from memory, this 5 ms is the
latency of the memory access.
•Key Concept: Lower latency is generally better, as it means tasks are completed faster.
Throughput
•Definition: Throughput is the rate at which tasks are completed over a specific period of time. It
measures the number of tasks that can be processed or executed within a given timeframe.
• In Computing, Throughput is typically used to measure how much data or how many
instructions a system can process per unit of time.
• Example: If a server can handle 1000 requests per second, its throughput is 1000
requests/second.
•Key Concept: Higher throughput is generally better, as it means more tasks are completed in
less time.
Amdahl's Law
Amdahl's Law is used to predict the theoretical maximum speedup that can be achieved by
improving a specific part of a system or program, given that not all parts can be improved
equally.
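
The formula itself is not present in the text at this point. The standard statement of Amdahl's Law, which the examples that follow appear to rely on, is given here. The symbols f and s are notation chosen for this note, not taken from the lecture.

```latex
\[
  S_{\text{overall}} = \frac{1}{(1 - f) + \dfrac{f}{s}}
\]
% f : fraction of the original execution time that is enhanced (0 <= f <= 1)
% s : speedup of the enhanced portion
% As s grows without bound, S_overall approaches 1 / (1 - f).
```

The unenhanced fraction (1 - f) sets a hard ceiling on the overall speedup, no matter how fast the enhanced portion becomes.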

Amdahl's Law cont..
Example 1: If 20% of a program is enhanced and that portion is sped up by a factor of 5, what is
the overall speedup according to Amdahl's Law?
Solution
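One way to work this example is sketched below in Python, assuming the standard form of Amdahl's Law. The function name `amdahl_speedup` is illustrative and does not come from the lecture.

```python
def amdahl_speedup(fraction_enhanced, speedup_enhanced):
    """Overall speedup when a fraction of the run time is sped up by a given factor."""
    return 1.0 / ((1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced)

# 20% of the program is enhanced and runs 5x faster.
overall = amdahl_speedup(0.20, 5)
print(f"Overall speedup: {overall:.3f}")
```

The denominator is 0.80 + 0.20/5 = 0.84, so the overall speedup is about 1.19 even though the enhanced part is five times faster.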

Amdahl's Law cont..
Example 2: A program is enhanced by speeding up 60% of the code by a factor of 8. Calculate the
overall speedup. Then, determine the theoretical maximum speedup if the entire program could
be enhanced by the same factor.
Solution
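A minimal Python sketch of this calculation, assuming the standard form of Amdahl's Law (the helper name is illustrative):

```python
def amdahl_speedup(fraction_enhanced, speedup_enhanced):
    """Overall speedup when a fraction of the run time is sped up by a given factor."""
    return 1.0 / ((1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced)

partial = amdahl_speedup(0.60, 8)   # 60% of the code is 8x faster
maximum = amdahl_speedup(1.00, 8)   # the whole program is 8x faster
print(f"60% enhanced:  {partial:.3f}")
print(f"100% enhanced: {maximum:.3f}")
```

With 60% enhanced the denominator is 0.40 + 0.60/8 = 0.475, giving roughly 2.11. Only when the whole program is enhanced does the overall speedup reach the full factor of 8.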

Amdahl's Law cont..
Example 3: You have three programs, A, B, and C, each with different portions enhanced: 10%,
50%, and 90%, respectively. The speedup for the enhanced portion is 5x in all cases. Calculate
the overall speedup for each program and discuss the results.
Solution
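The three cases can be compared with a short Python sketch, again assuming the standard form of Amdahl's Law (names are illustrative):

```python
def amdahl_speedup(fraction_enhanced, speedup_enhanced):
    """Overall speedup when a fraction of the run time is sped up by a given factor."""
    return 1.0 / ((1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced)

enhanced_fraction = {"A": 0.10, "B": 0.50, "C": 0.90}
results = {name: amdahl_speedup(f, 5) for name, f in enhanced_fraction.items()}

for name, speedup in results.items():
    print(f"Program {name}: {speedup:.3f}x")
```

The results are about 1.09x for A, 1.67x for B, and 3.57x for C. The larger the enhanced fraction, the closer the overall speedup gets to 5x, but even at 90% the remaining 10% keeps it well below that limit.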

Amdahl's Law cont..
Example 4: A program is 60% parallelizable and 40% sequential. If it runs on a system with 8
processors, calculate the theoretical speedup using Amdahl’s Law. Then, consider an overhead
factor due to communication between processors that reduces efficiency by 10% and recalculate
the effective speedup.
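
A sketch of one possible reading of this example, in Python. The phrase "reduces efficiency by 10%" is ambiguous; the code below assumes it means the achieved speedup is 90% of the theoretical value. Other readings (for example, treating the 8 processors as 7.2 effective processors) give a slightly different number. All names are illustrative.

```python
def amdahl_parallel(parallel_fraction, processors):
    """Theoretical speedup when the parallel fraction is spread over N processors."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / processors)

theoretical = amdahl_parallel(0.60, 8)

# ASSUMPTION: a 10% efficiency loss scales the achieved speedup by 0.9.
overhead_efficiency = 0.90
effective = theoretical * overhead_efficiency

print(f"Theoretical speedup: {theoretical:.3f}")
print(f"Effective speedup:   {effective:.3f}")
```

Under this assumption the theoretical speedup of about 2.11 drops to about 1.89, far below the processor count, because the 40% sequential part dominates.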

CPI (Cycles Per Instruction)
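CPI is a crucial metric in evaluating the efficiency of a CPU. A lower CPI means the CPU needs fewer
clock cycles per instruction, which, for a fixed clock rate and instruction count, leads to better performance.
CPI = Total Clock Cycles / Total Instructions Executed
Execution Time = Instruction Count × CPI × Clock Cycle Time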

CPI cont..
Example 1: A processor executes a program consisting of 200,000 instructions, and it takes
500,000 clock cycles to complete. Calculate the CPI for this program.
Solution
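This follows directly from CPI = Total Clock Cycles / Total Instructions Executed. A minimal Python sketch (variable names are illustrative):

```python
total_cycles = 500_000
total_instructions = 200_000

cpi = total_cycles / total_instructions
print(f"CPI = {cpi}")
```

The program averages 2.5 clock cycles per instruction.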

CPI cont..
Example 2: A processor executes three types of instructions: Type A, Type B, and Type C. The instruction counts and
their respective CPI values are as follows:
•Type A: 100,000 instructions, CPI = 2
•Type B: 50,000 instructions, CPI = 4
•Type C: 50,000 instructions, CPI = 3
Solution
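The question for this example is not stated in the text. Assuming it asks for the average CPI of the whole program, the value is the instruction-count-weighted mean of the per-type CPIs. A Python sketch with illustrative names:

```python
# (instruction count, CPI) for each instruction type
instruction_mix = {
    "A": (100_000, 2),
    "B": (50_000, 4),
    "C": (50_000, 3),
}

total_cycles = sum(count * cpi for count, cpi in instruction_mix.values())
total_instructions = sum(count for count, _ in instruction_mix.values())
average_cpi = total_cycles / total_instructions

print(f"Total cycles: {total_cycles}")
print(f"Average CPI:  {average_cpi}")
```

The program needs 550,000 cycles for 200,000 instructions, an average CPI of 2.75.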

CPI cont..
Example 3: A processor has a CPI of 4 and a clock cycle time of 250 ps. If a program consists of 500,000 instructions,
calculate the total execution time in seconds.
Solution
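Execution time is the product of instruction count, CPI, and clock cycle time. A minimal Python sketch (names are illustrative):

```python
instruction_count = 500_000
cpi = 4
cycle_time_s = 250e-12  # 250 ps expressed in seconds

total_cycles = instruction_count * cpi
execution_time_s = total_cycles * cycle_time_s

print(f"Total cycles:   {total_cycles}")
print(f"Execution time: {execution_time_s:.6f} s")
```

Two million cycles at 250 ps each take 0.0005 s, that is 0.5 ms.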

CPI cont..
Example 4: A processor executes a mix of three types of instructions in a workload:
• 30% arithmetic instructions with a CPI of 1
• 50% memory instructions with a CPI of 2
• 20% branch instructions with a CPI of 3
Calculate the overall CPI of the workload. Then, if the memory CPI can be reduced to 1.5 by improving the cache,
recalculate the overall CPI and discuss the performance impact.
Solution
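The overall CPI is the frequency-weighted sum of the per-class CPIs. A Python sketch of the before and after calculation (names are illustrative):

```python
def overall_cpi(mix):
    """mix maps an instruction class to (fraction of instructions, CPI)."""
    return sum(fraction * cpi for fraction, cpi in mix.values())

before = overall_cpi({"arithmetic": (0.30, 1), "memory": (0.50, 2.0), "branch": (0.20, 3)})
after = overall_cpi({"arithmetic": (0.30, 1), "memory": (0.50, 1.5), "branch": (0.20, 3)})

speedup = before / after  # same clock rate and instruction count assumed
print(f"CPI before: {before:.2f}")
print(f"CPI after:  {after:.2f}")
print(f"Speedup:    {speedup:.3f}")
```

The CPI falls from 1.90 to 1.65, a reduction of about 13%. With the clock rate and instruction count unchanged, the program runs roughly 1.15 times faster. Memory instructions make up half the workload, so a modest cache improvement has a visible effect.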

MIPS (Million Instructions Per Second)
A measure of a computer's processor speed, indicating how many millions of instructions a CPU
can process per second.

Relation to CPI and Clock Speed
MIPS can be derived from the clock rate and the CPI: MIPS = Clock Rate / (CPI × 10^6). A higher
clock rate or a lower CPI therefore gives a higher MIPS rating.

MIPS cont..
Example 1: A processor executes a program with 2,000,000 instructions in 1 second. Calculate
the MIPS for this processor.
Solution
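Using MIPS = (Instruction Count / Execution Time) / 10^6, a minimal Python sketch (names are illustrative):

```python
instruction_count = 2_000_000
execution_time_s = 1.0

mips = instruction_count / execution_time_s / 1e6
print(f"MIPS = {mips}")
```

The processor is rated at 2 MIPS for this program.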

MIPS cont..
Example 2: A processor has a clock speed of 2 GHz and a CPI of 4. Calculate the MIPS rating of the processor.

Solution
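Here the MIPS rating comes from the clock rate and CPI, using MIPS = Clock Rate / (CPI × 10^6). A minimal Python sketch (names are illustrative):

```python
clock_rate_hz = 2e9  # 2 GHz
cpi = 4

mips = clock_rate_hz / (cpi * 1e6)
print(f"MIPS = {mips}")
```

At 2 GHz with 4 cycles per instruction, the processor completes 500 million instructions per second, a rating of 500 MIPS.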

MIPS cont..
Example 3: A processor has a MIPS rating of 10. If a program consists of 2,500,000 instructions, how long will it take
to execute the program?
Solution
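Rearranging the MIPS formula gives Execution Time = Instruction Count / (MIPS × 10^6). A minimal Python sketch (names are illustrative):

```python
instruction_count = 2_500_000
mips = 10

execution_time_s = instruction_count / (mips * 1e6)
print(f"Execution time = {execution_time_s} s")
```

The program takes 0.25 seconds.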

MIPS cont..
Example 4: A processor with a clock rate of 4 GHz and a base CPI of 1.5 executes 1 billion instructions. However, due
to pipeline stalls, the CPI increases by 20%. Calculate the MIPS rating before and after the pipeline stalls and analyze
the percentage decrease in MIPS performance.
Solution
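A Python sketch of the before and after comparison, using MIPS = Clock Rate / (CPI × 10^6). The instruction count does not affect the MIPS rating, so it is not used. Names are illustrative.

```python
def mips_rating(clock_rate_hz, cpi):
    """MIPS derived from clock rate and average CPI."""
    return clock_rate_hz / (cpi * 1e6)

clock_rate_hz = 4e9            # 4 GHz
base_cpi = 1.5
stalled_cpi = base_cpi * 1.20  # CPI increases by 20% due to stalls

mips_before = mips_rating(clock_rate_hz, base_cpi)
mips_after = mips_rating(clock_rate_hz, stalled_cpi)
decrease_pct = (mips_before - mips_after) / mips_before * 100

print(f"MIPS before stalls: {mips_before:.2f}")
print(f"MIPS after stalls:  {mips_after:.2f}")
print(f"Decrease:           {decrease_pct:.2f}%")
```

The rating falls from about 2666.67 to about 2222.22 MIPS. The decrease is about 16.67%, not 20%, because MIPS is inversely proportional to CPI: 1 - 1/1.2 ≈ 0.1667.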

Pipeline Performance
In pipelined processors, multiple instructions are overlapped during execution. Each stage in the
pipeline processes a different instruction simultaneously, improving the overall throughput of the
processor.

Pipeline Stages:
• Fetch: Retrieves the instruction from memory.

• Decode: Interprets the fetched instruction and prepares the necessary signals for execution.

• Execute: Performs the operation specified by the instruction (e.g., arithmetic operation).

• Memory: Accesses memory for load or store operations.

• Write-back: Writes the result of the execution back to the register file.
Pipeline Hazards
Pipeline hazards are situations that prevent the next instruction in the pipeline from executing
during its designated clock cycle. These hazards can reduce the efficiency of the pipeline and
introduce delays (stalls).

Types of Pipeline Hazards:

• Data Hazards: Occur when instructions that exhibit data dependencies modify data in different
stages of the pipeline.
• Example: If one instruction is reading a value that another instruction is writing to, a data
hazard occurs.
• Control Hazards: Arise from the need to make a decision based on the outcome of a previous
instruction (e.g., branches).
• Example: A branch instruction that changes the flow of control can cause a delay in
fetching the correct instruction.
• Structural Hazards: Occur when hardware resources are insufficient to support all concurrent
operations in the pipeline.
• Example: If two instructions need to access memory simultaneously, but the system only has
one memory port, a structural hazard occurs.
Pipeline cont..
Example 1: A simple 5-stage pipeline (Fetch, Decode, Execute, Memory, Write-back) has a cycle
time of 1 ns. How long will it take to execute 50 instructions in the pipeline without any stalls?
Solution
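In an ideal k-stage pipeline, the first instruction needs k cycles and each later instruction completes one cycle after the previous one, so n instructions take k + (n - 1) cycles. A minimal Python sketch under that standard model (names are illustrative):

```python
def pipeline_time_ns(stages, instructions, cycle_time_ns, stall_cycles=0):
    """Total time for an in-order pipeline: fill time, one cycle per extra instruction, plus stalls."""
    cycles = stages + (instructions - 1) + stall_cycles
    return cycles * cycle_time_ns

total_ns = pipeline_time_ns(stages=5, instructions=50, cycle_time_ns=1)
print(f"Total time: {total_ns} ns")
```

That is 5 + 49 = 54 cycles, or 54 ns at 1 ns per cycle.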

Pipeline cont..
Example 2: In a 4-stage pipeline (Fetch, Decode, Execute, Write-back) with a cycle time of 2 ns, 30 instructions need
to be executed. If 5 stalls occur due to data hazards, calculate the total time to execute all instructions.

Solution
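A Python sketch of this calculation, assuming each stall costs exactly one extra cycle (names are illustrative):

```python
stages = 4
instructions = 30
stall_cycles = 5       # ASSUMPTION: each stall adds one cycle
cycle_time_ns = 2

total_cycles = stages + (instructions - 1) + stall_cycles
total_time_ns = total_cycles * cycle_time_ns

print(f"Total cycles: {total_cycles}")
print(f"Total time:   {total_time_ns} ns")
```

The pipeline needs 4 + 29 + 5 = 38 cycles, which is 76 ns at 2 ns per cycle.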

Pipeline cont..
Example 3: Compare the execution time for 40 instructions on a non-pipelined processor and a pipelined processor
with 5 stages and a cycle time of 2 ns. Assume no stalls occur in the pipelined processor, and each instruction takes
5 cycles in the non-pipelined processor.
Solution
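A Python sketch of the comparison, assuming both processors use the same 2 ns cycle time (names are illustrative):

```python
instructions = 40
stages = 5
cycle_time_ns = 2

# Non-pipelined: every instruction occupies the processor for all 5 cycles.
non_pipelined_ns = instructions * stages * cycle_time_ns

# Pipelined, no stalls: 5 cycles to fill, then one instruction completes per cycle.
pipelined_ns = (stages + instructions - 1) * cycle_time_ns

speedup = non_pipelined_ns / pipelined_ns
print(f"Non-pipelined: {non_pipelined_ns} ns")
print(f"Pipelined:     {pipelined_ns} ns")
print(f"Speedup:       {speedup:.2f}x")
```

The non-pipelined processor takes 400 ns and the pipelined one 88 ns, a speedup of about 4.55. The speedup approaches the number of stages (5) as the instruction count grows, because the fill time is amortized.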

Pipeline cont..
Example 3: In a 5-stage pipeline (Fetch, Decode, Execute, Memory, Write-back), an instruction A is followed by
instruction B, where B depends on the result of A. Explain how a data hazard could occur and suggest one method
to resolve it.
Solution
• Data Hazard Explanation:
• A data hazard occurs if instruction B needs the result of instruction A before A has written it back. Since A is not
finished when B reads its operands, B might use a stale or incorrect value.

• Resolution Method:
• Forwarding (Data Bypassing): Pass the result of instruction A directly to instruction B from the execution
stage without waiting for it to go through the rest of the pipeline stages.
Pipeline cont..
Example 4: A 5-stage pipeline processor has a base CPI of 1. However, data hazards introduce an
average of 0.5 stalls per instruction, and branch hazards add an additional 1.5 stalls for every
branch instruction. In a workload where 30% of instructions are branches, calculate the effective
CPI and the pipeline speedup relative to a non-pipelined processor with a CPI of 5.
Solution
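The effective CPI is the base CPI plus the average stall cycles per instruction. A Python sketch, assuming both processors have the same clock cycle time and instruction count so that the speedup is simply the CPI ratio (names are illustrative):

```python
base_cpi = 1.0
data_stalls_per_instruction = 0.5
branch_fraction = 0.30
stalls_per_branch = 1.5
non_pipelined_cpi = 5.0

effective_cpi = (base_cpi
                 + data_stalls_per_instruction
                 + branch_fraction * stalls_per_branch)

# ASSUMPTION: equal cycle time and instruction count, so speedup = CPI ratio.
speedup = non_pipelined_cpi / effective_cpi

print(f"Effective CPI: {effective_cpi:.2f}")
print(f"Speedup:       {speedup:.2f}x")
```

The effective CPI is 1 + 0.5 + 0.45 = 1.95, so the pipeline is about 2.56 times faster than the non-pipelined design, roughly half the ideal 5x.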

Cache Performance
Cache memory is a small, high-speed storage area located close to the CPU that stores copies of
frequently accessed data from main memory (RAM). The primary purpose of cache memory is to
reduce the time needed to access data, thereby improving overall system performance.

Key Concepts:
• Cache Hit: Occurs when the data requested by the CPU is found in the cache. This allows the
CPU to access the data quickly.
• Cache Miss: Occurs when the data requested by the CPU is not found in the cache,
requiring the CPU to retrieve the data from the slower main memory.
Cache Performance Metrics
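
No formulas appear under this heading in the text. The definitions below are the common textbook ones and are offered as an assumption about what the slide covered. The symbols are notation for this note. Two conventions for Effective Access Time (EAT) are in circulation, and they give slightly different numbers.

```latex
\[
  \text{Hit rate} = \frac{\text{cache hits}}{\text{total accesses}},
  \qquad
  \text{Miss rate} = 1 - \text{Hit rate}
\]
% Convention 1: the cache is always accessed; the penalty is added on a miss.
\[
  \text{EAT} = t_{\text{cache}} + \text{Miss rate} \times t_{\text{penalty}}
\]
% Convention 2: the miss penalty is treated as the total time of a missed access.
\[
  \text{EAT}_{\text{weighted}} = \text{Hit rate} \times t_{\text{cache}}
                               + \text{Miss rate} \times t_{\text{penalty}}
\]
```

The two conventions differ by Miss rate × t_cache, which is small when the hit rate is high.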

Cache cont..
Example 1: A CPU has a cache with a hit rate of 90%. The access time for the cache is 2 cycles,
and the miss penalty (time to access data from main memory) is 40 cycles. Calculate the Effective
Access Time (EAT) for this cache.
Solution
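The EAT formula intended by the lecture is not given in the text, so the Python sketch below evaluates both common conventions. Which one the lecture uses is an assumption either way. Function names are illustrative.

```python
def eat_additive(hit_rate, cache_time, miss_penalty):
    """Cache access is always paid; the miss penalty is added on a miss."""
    return cache_time + (1.0 - hit_rate) * miss_penalty

def eat_weighted(hit_rate, cache_time, miss_penalty):
    """Hits cost cache_time; misses cost miss_penalty in total."""
    return hit_rate * cache_time + (1.0 - hit_rate) * miss_penalty

additive = eat_additive(0.90, 2, 40)
weighted = eat_weighted(0.90, 2, 40)
print(f"EAT (additive): {additive:.2f} cycles")
print(f"EAT (weighted): {weighted:.2f} cycles")
```

The additive form gives 2 + 0.10 × 40 = 6 cycles. The weighted form gives 0.90 × 2 + 0.10 × 40 = 5.8 cycles.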

Cache cont..
Example 2: Suppose a CPU's cache has an initial hit rate of 85% with a cache access time of 2 cycles and a miss
penalty of 60 cycles. If an optimization improves the hit rate to 95%, calculate the difference in Effective Access Time
(EAT) before and after the optimization.
Solution
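A Python sketch of the before and after comparison. It assumes the additive convention, EAT = cache time + miss rate × miss penalty, which may differ from the formula used in the lecture. Names are illustrative.

```python
def eat_additive(hit_rate, cache_time, miss_penalty):
    """ASSUMED convention: cache access always paid, penalty added on a miss."""
    return cache_time + (1.0 - hit_rate) * miss_penalty

before = eat_additive(0.85, 2, 60)
after = eat_additive(0.95, 2, 60)
improvement = before - after

print(f"EAT before: {before:.2f} cycles")
print(f"EAT after:  {after:.2f} cycles")
print(f"Difference: {improvement:.2f} cycles")
```

Under this convention the EAT drops from 11 to 5 cycles, a saving of 6 cycles per access. With the weighted convention (hit rate × cache time + miss rate × miss penalty) the values are 10.7 and 4.9 cycles, a saving of 5.8 cycles. Either way, a 10-point gain in hit rate roughly halves the average access time.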

Cache cont..
Example 3: A cache memory system is designed with a hit rate of 97%. If the cache access time is 1 cycle and the
miss penalty is 100 cycles, calculate the Effective Access Time (EAT). Additionally, discuss the importance of
maintaining a high hit rate in real-world applications.
Solution
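A Python sketch for this example, assuming the additive convention EAT = cache time + miss rate × miss penalty (the lecture's own formula is not given in the text). The second loop shows how sensitive EAT is to the hit rate. Names are illustrative.

```python
def eat_additive(hit_rate, cache_time, miss_penalty):
    """ASSUMED convention: cache access always paid, penalty added on a miss."""
    return cache_time + (1.0 - hit_rate) * miss_penalty

eat = eat_additive(0.97, 1, 100)
print(f"EAT at 97% hit rate: {eat:.2f} cycles")

# Sensitivity: small drops in hit rate are costly when the miss penalty is large.
for hit_rate in (0.99, 0.97, 0.95, 0.90):
    print(f"hit rate {hit_rate:.0%}: {eat_additive(hit_rate, 1, 100):.2f} cycles")
```

The EAT is 1 + 0.03 × 100 = 4 cycles (3.97 under the weighted convention). Although only 3% of accesses miss, they account for three quarters of the average access time. Dropping the hit rate from 97% to 90% raises the EAT from 4 to 11 cycles, which is why real systems invest heavily in keeping hit rates high.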
