Performance

This document discusses performance metrics for measuring computer systems. It covers execution time, throughput, and component metrics, and how to measure and analyze performance properly. Key points include what makes a good metric, how to ensure reproducible experiments, which metrics to avoid (such as MIPS), and how averages of times and rates should be calculated.

Uploaded by

bijan shrestha
Copyright © All Rights Reserved

Performance Metrics

Why study performance metrics?


• determine the benefit/lack of benefit of designs
• computer design is too complex to intuit performance &
performance bottlenecks
• have to be careful about what you mean to measure & how
you measure it

What you should get out of this discussion


• good metrics for measuring computer performance
• what they should be used for
• what metrics you shouldn’t use & how metrics are misused

perf
Performance of Computer Systems
Many different factors to take into account when determining
performance:
• Technology
• circuit speed (clock, MHz)
• processor technology (how many transistors on a chip)
• Organization
• type of processor (ILP)
• configuration of the memory hierarchy
• type of I/O devices
• number of processors in the system
• Software
• quality of the compilers
• organization & quality of OS, databases, etc.

“Principles” of Experimentation

Meaningful metrics
execution time & component metrics that explain it

Reproducibility
machine configuration, compiler & optimization level, OS, input

Real programs
no toy programs, kernels, or synthetic benchmarks
SPEC is the norm (integer, floating point, graphics, webserver)
TPC-B, TPC-C & TPC-D for database transactions

Simulation
long executions, warm start to mimic steady-state behavior
usually applications only; some OS simulation
simulator “validation” & internal checks for accuracy

Metrics that Measure Performance
Raw speed: peak performance (never attained)

Execution time: time to execute one program from beginning to end
• the “performance bottom line”
• wall clock time, response time
• Unix time function: 13.7u 23.6s 18:27 3%

Throughput: total amount of work completed in a given time
• transactions (database) or packets (web servers) / second
• an indication of how well hardware resources are being used
• good metric for chip designers or managers of computer systems

(Often improving execution time will improve throughput & vice versa.)

Component metrics: subsystem performance, e.g., memory behavior
• help explain how execution time was obtained
• pinpoint performance bottlenecks

Execution Time

PerformanceA = 1 / ExecutionTimeA

Processor A is faster than processor B, i.e.,

ExecutionTimeA < ExecutionTimeB
PerformanceA > PerformanceB

Relative Performance

n = PerformanceA / PerformanceB = ExecutionTimeB / ExecutionTimeA

performance of A is n times greater than that of B
execution time on B is n times longer than on A
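A quick numeric sketch of this relationship, using hypothetical execution times:

```python
# Relative performance from execution times (hypothetical values).
time_a = 10.0   # seconds on processor A
time_b = 15.0   # seconds on processor B

perf_a = 1 / time_a
perf_b = 1 / time_b

n = perf_a / perf_b   # equals time_b / time_a
assert abs(n - time_b / time_a) < 1e-12
print(f"A is {n:.1f}x faster than B")
```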

CPU Execution Time
The time the CPU spends executing an application
• no memory effects
• no I/O
• no effects of multiprogramming

CPUExecutionTime = CPUClockCycles * ClockCycleTime

Cycle time (clock period) can be expressed as a time or as a rate
• clock cycle time = 1 / clock cycle rate

CPUExecutionTime = CPUClockCycles / ClockCycleRate

• clock cycle rate of 1 MHz = cycle time of 1 µs
• clock cycle rate of 1 GHz = cycle time of 1 ns
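The reciprocal relationship above can be checked directly; a minimal sketch:

```python
# Clock cycle time (seconds) is the reciprocal of clock rate (Hz).
def cycle_time(rate_hz: float) -> float:
    return 1.0 / rate_hz

assert cycle_time(1e6) == 1e-6   # 1 MHz -> 1 microsecond
assert cycle_time(1e9) == 1e-9   # 1 GHz -> 1 nanosecond
```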

CPI

CPUClockCycles = NumberOfInstructions * CPI

CPI = average number of clock cycles per instruction
• a throughput metric
• a component metric, not a measure of overall performance
• used for processor organization studies, given a fixed compiler & ISA

Can have different CPIs for different classes of instructions,
e.g., floating point instructions take longer than integer instructions

                  n
CPUClockCycles =  Σ (CPIi × Ci)
                 i=1

where CPIi = the CPI for the ith class of instructions
and Ci = the number of instructions of the ith class that have been executed

Improving part of the architecture can improve a CPIi
• we then talk about the contribution of an instruction class to the overall CPI
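The per-class sum can be sketched in Python, using a hypothetical instruction mix:

```python
# CPUClockCycles = sum over classes of CPIi * Ci (hypothetical mix).
mix = [
    ("integer",        1.0, 500_000),   # (class, CPIi, Ci)
    ("load/store",     2.0, 300_000),
    ("floating point", 4.0, 200_000),
]

cycles = sum(cpi * count for _, cpi, count in mix)
instructions = sum(count for _, _, count in mix)
avg_cpi = cycles / instructions   # overall CPI for this mix

assert cycles == 1_900_000
assert abs(avg_cpi - 1.9) < 1e-12
```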

CPU Execution Time

CPUExecutionTime = NumberOfInstructions * CPI * ClockCycleTime

To measure:
• execution time: depends on all 3 factors
• time the program
• number of instructions: determined by the ISA
• programmable hardware counters
• profiling
• count number of times each basic block is executed
• instruction sampling
• CPI: determined by the ISA & implementation
• simulator: interpret (in software) every instruction &
calculate the number of cycles it takes to simulate it
• clock cycle time: determined by the implementation & process
technology

Factors are interdependent:
• RISC: increases instructions/program, but decreases CPI &
clock cycle time because the instructions are simple
• CISC: decreases instructions/program, but increases CPI &
clock cycle time because many instructions are more complex
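The interdependence can be illustrated with hypothetical numbers for the same program compiled for a RISC-like and a CISC-like machine:

```python
# CPUExecutionTime = instructions * CPI * clock cycle time
def exec_time(instructions: int, cpi: float, cycle_time_s: float) -> float:
    return instructions * cpi * cycle_time_s

risc = exec_time(1_200_000, 1.2, 1e-9)   # more, but simpler, instructions
cisc = exec_time(  800_000, 2.5, 2e-9)   # fewer, but more complex, instructions

# Neither factor alone decides the winner; only the product matters.
assert risc < cisc
```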

Metrics Not to Use
MIPS (millions of instructions per second)
MIPS = instruction count / (execution time × 10^6)
     = clock rate / (CPI × 10^6)
- instruction set-dependent (even for similar architectures)
- implementation-dependent
- compiler technology-dependent
- program-dependent
+ intuitive: the higher, the better

MFLOPS (millions of floating point operations per second)
MFLOPS = floating point operations / (execution time × 10^6)
+ FP operations are independent of the FP instruction implementation
- different machines implement different FP operations
- different FP operations take different amounts of time
- only measures FP code

static metrics (code size)
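A sketch of why MIPS can mislead: with two hypothetical machines X and Y running the same program, the machine with the higher MIPS rating can still take longer.

```python
# MIPS = clock rate / (CPI * 10^6); time = instructions * CPI / clock rate
def mips(clock_hz: float, cpi: float) -> float:
    return clock_hz / (cpi * 1e6)

# Hypothetical: X executes simple instructions quickly (low CPI)
# but needs more of them for the same program.
x_mips = mips(1e9, 1.0)                 # 1000 MIPS
y_mips = mips(1e9, 2.0)                 #  500 MIPS

x_time = 2_000_000 * 1.0 / 1e9          # 2.0 ms (2M instructions)
y_time =   800_000 * 2.0 / 1e9          # 1.6 ms (0.8M instructions)

assert x_mips > y_mips and x_time > y_time   # higher MIPS, yet slower
```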

Means
Measuring the performance of a workload
• arithmetic mean: used for averaging execution times

  AM = (1/n) × Σ(i=1..n) timei

• harmonic mean: used for averaging rates ("the average of", as
  opposed to "the average statistic of")

  HM = p / Σ(i=1..p) (1 / ratei)

• weighted arithmetic mean: used when the programs are executed with
  different frequencies, for example:

  WM = (1/n) × Σ(i=1..n) (timei × weighti)
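A small sketch with hypothetical data, showing why the harmonic mean is the right average for rates:

```python
# Two programs, each doing the same amount of work (hypothetical numbers).
times = [2.0, 8.0]       # seconds per program
work  = 100.0            # operations per program

arith_time = sum(times) / len(times)                 # 5.0 s (arithmetic mean)
rates = [work / t for t in times]                    # [50.0, 12.5] ops/s
harm_rate = len(rates) / sum(1 / r for r in rates)   # harmonic mean of rates

# The harmonic mean of the rates equals total work / total time:
assert abs(harm_rate - 2 * work / sum(times)) < 1e-9
```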

Means

Time (secs)

            FP Ops   Computer A   Computer B   Computer C
program 1      100            1           10           20
program 2      100         1000          100           20
total                      1001          110           40
arith mean                  500.5         55           20

Rate (FLOPS)

            FP Ops   Computer A   Computer B   Computer C
program 1      100          100           10            5
program 2      100            0.1          1            5
harm mean                     0.2          1.8          5
arith mean                   50.1          5.5          5

Computer C is ~25 times faster than A when measuring execution time
(1001 / 40 ≈ 25).

Still true when measuring MFLOPS (a rate) with the harmonic mean
(5 / 0.2 = 25).
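The table's means can be recomputed programmatically from the per-program times:

```python
# Recompute the table's means from per-program times (100 FP ops each).
fp_ops = 100.0
times = {"A": [1.0, 1000.0], "B": [10.0, 100.0], "C": [20.0, 20.0]}

def harmonic_mean(xs):
    return len(xs) / sum(1.0 / x for x in xs)

for name, ts in times.items():
    rates = [fp_ops / t for t in ts]
    # Harmonic mean of rates == total work / total time, so it ranks
    # machines the same way execution time does.
    assert abs(harmonic_mean(rates) - len(ts) * fp_ops / sum(ts)) < 1e-9

speedup = sum(times["A"]) / sum(times["C"])   # 1001 / 40
assert round(speedup) == 25                   # C is ~25x faster than A
```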

Speedup

Speedup = ExecutionTimeBeforeImprovement / ExecutionTimeAfterImprovement

Amdahl’s Law:
Performance improvement from speeding up one part of a computer
system is limited by the proportion of time the enhancement is used.
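Amdahl’s Law is commonly written as Speedup = 1 / ((1 − f) + f/s), where f is the fraction of execution time affected and s is the speedup of that part. A sketch with hypothetical numbers:

```python
# Amdahl's Law: overall speedup when fraction f of execution time
# is sped up by a factor s.
def amdahl(f: float, s: float) -> float:
    return 1.0 / ((1.0 - f) + f / s)

# Hypothetical: the FP unit is made 10x faster, but FP work is only
# 40% of total execution time.
assert abs(amdahl(0.4, 10.0) - 1.5625) < 1e-12

# Even an infinite speedup of that 40% caps the overall gain at 1/0.6:
assert abs(amdahl(0.4, 1e12) - 1.0 / 0.6) < 1e-3
```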
