4 Performance

Computer performance is affected by many factors including technology, organization, software, and number of processors. Performance can be measured in execution time, throughput, and CPU execution time. Execution time includes all time spent on a task while CPU execution time only includes time spent computing. Performance is defined as the inverse of execution time. Comparing performance between machines requires using the same program or benchmark. The clock rate, clock cycle time, instructions per program, and cycles per instruction all factor into calculating a computer's execution time and performance.

Uploaded by

3sfr3sfr

Computer Performance

Performance of Computer Systems

 Performance depends on many different factors, among them:
   Technology
     Raw speed of the circuits (clock, switching time)
     Process technology (how many transistors on a chip)
   Organization
     What type of processor (e.g., RISC vs. CISC)
     What type of memory hierarchy
     What types of I/O devices
     How many processors in the system
   Software
     O.S., compilers, database drivers, etc.
Definitions of Time
 Time can be defined in different ways, depending on
what we are measuring:
 Execution time : The time between the start and
completion of a task. It includes time spent executing on
the CPU, accessing disk and memory, waiting for I/O and
other processes, and operating system overhead.
 Throughput : The total amount of work done in a given
time.
 CPU execution time : Total time a CPU spends
computing on a given task (excludes time for I/O or
running other programs). This is also referred to as simply
CPU time.
Performance Definition
 For some program running on machine X,
Performance_X = 1 / Execution time_X
 "X is n times faster than Y" means
Performance_X / Performance_Y = n
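Problem:
 machine A runs a program in 20 seconds
 machine B runs the same program in 25 seconds
 how many times faster is machine A?

Performance_A / Performance_B = Execution time_B / Execution time_A = 25 / 20 = 1.25
Machine A is 1.25 times faster than machine B.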
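
The ratio above can be checked with a few lines of Python. This is only an illustrative sketch; the function name `times_faster` is hypothetical and does not come from the slides.

```python
def times_faster(exec_time_x: float, exec_time_y: float) -> float:
    """Return n such that machine X is n times faster than machine Y.

    Performance is 1 / execution time, so
    Performance_X / Performance_Y = exec_time_y / exec_time_x.
    """
    if exec_time_x <= 0 or exec_time_y <= 0:
        raise ValueError("execution times must be positive")
    return exec_time_y / exec_time_x


# Machine A takes 20 s and machine B takes 25 s for the same program.
print(times_faster(20.0, 25.0))  # 1.25
```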
Basic Measurement Metrics
 Comparing Machines
   Metrics
     Execution time
     Throughput
     CPU time
     MIPS – millions of instructions per second
     MFLOPS – millions of floating point operations per second
   Comparing Machines Using Sets of Programs
     Arithmetic mean, weighted arithmetic mean
     Benchmarks
 When discussing processor performance, we will focus primarily on
execution time for a single job. Why?
Because different programs have different characteristics (tasks),
processors should only be compared using the same task.
Computer Clock
 A computer clock runs at a constant rate and determines
when events take place in hardware.
[Figure: clock signal (Clk) with one clock period marked]
 The clock cycle time is the amount of time for one
clock period to elapse (e.g. 5 ns).
 The clock rate is the inverse of the clock cycle time.
 For example, if a computer has a clock cycle time of 5
ns, the clock rate is:
1 / (5 x 10^-9 sec) = 200 MHz
How Many Cycles are Required for a Program?
 Could assume that # of cycles = # of instructions
[Figure: timeline with one clock cycle per instruction (1st instruction, 2nd instruction, 3rd instruction, 4th, 5th, 6th, ...); horizontal axis: time]
 This assumption is incorrect: different instructions take different
amounts of time on different machines.
Different Numbers of Cycles for Different Instructions
[Figure: instruction timeline; horizontal axis: time]
 Division takes more time than addition
 Floating point operations take longer than integer ones
 Accessing memory takes more time than accessing registers
Now That We Understand Cycles

 A given program will require

 some number of instructions (machine instructions)
 some number of clock cycles
 some number of seconds
 We have a vocabulary that relates these quantities:
 clock cycle time (seconds per cycle)
 clock rate (cycles per second)
 CPI (cycles per instruction)
 a floating point intensive application might have a higher CPI
Computing CPU Time
 The time to execute a given program can be computed as
CPU time = CPU clock cycles x clock cycle time
 Since clock cycle time and clock rate are reciprocals
CPU time = CPU clock cycles / clock rate
 The number of CPU clock cycles can be determined by
CPU clock cycles = (instructions/program) x (clock cycles/instruction)
= Instruction count x CPI
which gives
CPU time = Instruction count x CPI x clock cycle time
CPU time = Instruction count x CPI / clock rate
 The units for CPU time are
CPU time = (instructions / program) x (clock cycles / instruction) x (seconds / clock cycle)
= seconds / program
Which factors are affected by each of the following?

                    instr. count   CPI   clock rate
Program                  X
Compiler                 X          X
Instr. Set Arch.         X          X
Organization                        X        X
Technology                                   X

CPU time = Seconds / Program = (Instructions / Program) x (Cycles / Instruction) x (Seconds / Cycle)
CPU Time Example
 Example 1:
 CPU clock rate is 1 MHz
 Program takes 45 million cycles to execute
 What’s the CPU time?

45,000,000 * (1 / 1,000,000) = 45 seconds

 Example 2:
 CPU clock rate is 500 MHz
 Program takes 45 million cycles to execute
 What’s the CPU time?

45,000,000 * (1 / 500,000,000) = 0.09 seconds
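
Both examples apply the relation CPU time = CPU clock cycles / clock rate. A minimal Python sketch of that relation follows; the function name `cpu_time_from_cycles` is an illustrative assumption, not something defined in the slides.

```python
def cpu_time_from_cycles(clock_cycles: float, clock_rate_hz: float) -> float:
    """CPU time in seconds = CPU clock cycles / clock rate (in Hz)."""
    if clock_rate_hz <= 0:
        raise ValueError("clock rate must be positive")
    return clock_cycles / clock_rate_hz


# Example 1: 45 million cycles at 1 MHz.
print(cpu_time_from_cycles(45_000_000, 1_000_000))    # 45.0
# Example 2: the same 45 million cycles at 500 MHz.
print(cpu_time_from_cycles(45_000_000, 500_000_000))  # 0.09
```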

Example
 Example: Let us assume that a benchmark has 100 instructions:
25 instructions are loads/stores (each takes 2 cycles)
50 instructions are adds (each takes 1 cycle)
25 instructions are square roots (each takes 50 cycles)

The total number of cycles executed
is 2 * 25 + 50 * 1 + 25 * 50 = 1350 cycles.
Or we can compute the average CPI, which is

Average CPI = ((0.25 * 2) + (0.50 * 1) + (0.25 * 50)) = 13.5

Then the total number of cycles is Instruction count * CPI

13.5 * 100 = 1350 cycles.

If the clock rate is 1 kHz, then the execution time is 1350 / 1000 = 1.35
seconds.
Computing CPI
 The CPI is the average number of cycles per instruction.
 If for each instruction type, we know its frequency and
number of cycles needed to execute it, we can compute the
overall average CPI as follows:

Average CPI = Σ (CPI_i x F_i)

 For example

Op       F      CPI   CPI x F   % Time
ALU      50%    1     0.5       23%
Load     20%    5     1.0       45%
Store    10%    3     0.3       14%
Branch   20%    2     0.4       18%
Total    100%         2.2       100%
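
The table can be reproduced with a short Python sketch. The names `MIX`, `average_cpi` and `time_share` are hypothetical helpers chosen for this illustration; the frequencies and cycle counts are the ones in the table.

```python
# Instruction mix from the table: op -> (frequency F, cycles per instruction).
MIX = {
    "ALU": (0.50, 1),
    "Load": (0.20, 5),
    "Store": (0.10, 3),
    "Branch": (0.20, 2),
}


def average_cpi(mix):
    """Weighted average CPI: sum of CPI_i x F_i over all instruction types."""
    return sum(freq * cpi for freq, cpi in mix.values())


def time_share(mix):
    """Fraction of execution time per type: (CPI_i x F_i) / average CPI."""
    total = average_cpi(mix)
    return {op: freq * cpi / total for op, (freq, cpi) in mix.items()}


print(f"Average CPI = {average_cpi(MIX):.1f}")
for op, share in time_share(MIX).items():
    print(f"{op:<7}{share:.0%}")
```

The "% Time" column is each row's CPI x F divided by the average CPI, which is why loads account for the largest share of the time even though they are only 20% of the instructions.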
Performance
 Performance is determined by execution time
 Is any one of the following variables sufficient by itself to
determine computer performance?
 # of cycles to execute program?
 # of instructions in program?
 # of cycles per second?
 average # of cycles per instruction?
 average # of instructions per second?
 No. It is wrong to think that any one of these variables is
indicative of performance on its own.
Performance Example
 Suppose we have two implementations of the same
instruction set architecture (ISA).
For some program,
Machine A has a clock cycle time of 10 ns and a CPI of 2.0
Machine B has a clock cycle time of 20 ns and a CPI of 1.2
 Which machine is faster for this program, and by how much?
Assume that # of instructions in the program is 1,000,000,000 (10^9).
CPU Time_A = 10^9 * 2.0 * 10 * 10^-9 = 20 seconds
CPU Time_B = 10^9 * 1.2 * 20 * 10^-9 = 24 seconds
Machine A is faster, by 24 / 20 = 1.2 times
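
The same comparison, sketched in Python using CPU time = Instruction count x CPI x clock cycle time. The function name `cpu_time_from_cpi` is a hypothetical choice for this illustration; the instruction count of 10^9 is the one assumed in the example.

```python
def cpu_time_from_cpi(instruction_count: float, cpi: float, cycle_time_s: float) -> float:
    """CPU time in seconds = instruction count x CPI x clock cycle time."""
    return instruction_count * cpi * cycle_time_s


INSTRUCTIONS = 1_000_000_000  # assumed instruction count from the example

time_a = cpu_time_from_cpi(INSTRUCTIONS, 2.0, 10e-9)  # machine A: 10 ns cycle
time_b = cpu_time_from_cpi(INSTRUCTIONS, 1.2, 20e-9)  # machine B: 20 ns cycle

print(f"CPU time A = {time_a:.1f} s")
print(f"CPU time B = {time_b:.1f} s")
print(f"A is {time_b / time_a:.2f} times faster than B")
```

Because both machines run the same instruction count, the ratio does not depend on the assumed 10^9: it reduces to (1.2 x 20) / (2.0 x 10).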
Number of Instructions Example
 A compiler designer is trying to decide between two code sequences
for a particular machine. Based on the hardware implementation,
there are three different classes of instructions: Class A, Class B,
and Class C, and they require one, two, and three cycles
(respectively).
The first code sequence has 5 instructions: 2 of A, 1 of B, and 2 of C
The second sequence has 6 instructions: 4 of A, 1 of B, and 1 of C.
 Which sequence will be faster? How much?
 What is the CPI for each sequence?
# of cycles for first code = (2 * 1) + (1 * 2) + (2 * 3) = 10 cycles
# of cycles for second code = (4 * 1) + (1 * 2) + (1 * 3) = 9 cycles
The second sequence is faster, by 10 / 9 = 1.11 times
CPI for first code = 10 / 5 = 2
CPI for second code = 9 / 6 = 1.5
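
A small Python sketch of the same calculation follows. The dictionary layout and the helper names `total_cycles` and `cpi` are assumptions made for the illustration.

```python
# Cycles required by each instruction class, as given in the example.
CYCLES_PER_CLASS = {"A": 1, "B": 2, "C": 3}


def total_cycles(counts):
    """Total clock cycles for a sequence given its per-class instruction counts."""
    return sum(CYCLES_PER_CLASS[cls] * n for cls, n in counts.items())


def cpi(counts):
    """Average cycles per instruction for the sequence."""
    return total_cycles(counts) / sum(counts.values())


first = {"A": 2, "B": 1, "C": 2}   # 5 instructions
second = {"A": 4, "B": 1, "C": 1}  # 6 instructions

print(total_cycles(first), cpi(first))    # 10 2.0
print(total_cycles(second), cpi(second))  # 9 1.5
print(f"{total_cycles(first) / total_cycles(second):.2f}")  # 1.11
```

The second sequence executes more instructions yet needs fewer cycles, so instruction count alone does not decide which sequence is faster.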
Poor Performance Metrics
 Marketing metrics for computer performance included MIPS and
MFLOPS
 MIPS : millions of instructions per second
 MIPS = instruction count / (execution time x 10^6)
 = clock rate / (CPI x 10^6)
 For example, a program that executes 3 million instructions in 2
seconds has a MIPS rating of 1.5
 Advantage : Easy to understand and measure
 Disadvantages : May not reflect actual performance, since code built
from many simple instructions gets a higher rating without necessarily
finishing sooner.
 MFLOPS : millions of floating point operations per second
 MFLOPS = floating point operations / (execution time x 10^6)
 For example, a program that executes 4 million fp. instructions in 5
seconds has a MFLOPS rating of 0.8
 Advantage : Easy to understand and measure
 Disadvantages : Same as MIPS, and it only measures floating point
MIPS Example
 Two different compilers are being tested for a 500 MHz
machine with three different classes of instructions:
Class A, Class B, and Class C, which require one, two,
and three cycles (respectively). Both compilers are
used to produce code for a large piece of software.
The first compiler's code uses 5 billion Class A
instructions, 1 billion Class B instructions, and 1 billion
Class C instructions.
The second compiler's code uses 10 billion Class A
instructions, 1 billion Class B instructions, and 1 billion
Class C instructions.
 Which sequence will be faster according to MIPS?
 Which sequence will be faster according to execution
time?
MIPS Example (Cont'd)
Instruction counts (in billions) for each instruction class

Code from     A    B    C
Compiler 1    5    1    1
Compiler 2    10   1    1

CPU clock cycles_1 = (5 x 1 + 1 x 2 + 1 x 3) x 10^9 = 10 x 10^9

CPU clock cycles_2 = (10 x 1 + 1 x 2 + 1 x 3) x 10^9 = 15 x 10^9

CPU time_1 = (10 x 10^9) / (500 x 10^6) = 20 seconds

CPU time_2 = (15 x 10^9) / (500 x 10^6) = 30 seconds

MIPS_1 = ((5 + 1 + 1) x 10^9) / (20 x 10^6) = 350

MIPS_2 = ((10 + 1 + 1) x 10^9) / (30 x 10^6) = 400

Compiler 2 has the higher MIPS rating, but compiler 1's code has the shorter execution time.
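
The disagreement between the two metrics can be reproduced with a short Python sketch. The helper names `cpu_time` and `mips` and the dictionary layout are illustrative assumptions; the clock rate, cycle counts and instruction counts are taken from the example.

```python
CLOCK_RATE_HZ = 500_000_000                  # 500 MHz machine
CYCLES_PER_CLASS = {"A": 1, "B": 2, "C": 3}  # cycles per instruction class


def cpu_time(counts):
    """CPU time in seconds = total clock cycles / clock rate."""
    cycles = sum(CYCLES_PER_CLASS[cls] * n for cls, n in counts.items())
    return cycles / CLOCK_RATE_HZ


def mips(counts):
    """MIPS = instruction count / (execution time x 10^6)."""
    return sum(counts.values()) / (cpu_time(counts) * 1_000_000)


compiler1 = {"A": 5_000_000_000, "B": 1_000_000_000, "C": 1_000_000_000}
compiler2 = {"A": 10_000_000_000, "B": 1_000_000_000, "C": 1_000_000_000}

for name, counts in (("Compiler 1", compiler1), ("Compiler 2", compiler2)):
    print(f"{name}: {cpu_time(counts):.0f} s, {mips(counts):.0f} MIPS")
```

Ranking by MIPS picks compiler 2, while ranking by execution time picks compiler 1, which is the reason MIPS is listed among the poor metrics.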
Another Performance Example
 Computers M1 and M2 are two implementations of the same
instruction set.
 M1 has a clock rate of 50 MHz and M2 has a clock rate of
100 MHz.
 M1 has a CPI of 2.8 and M2 has a CPI of 3.2 for a given
program.
 How many times faster is M2 than M1 for this program?

ExTime_M1 / ExTime_M2 = (IC_M1 x CPI_M1 / Clock Rate_M1) / (IC_M2 x CPI_M2 / Clock Rate_M2)
= (2.8 / 50) / (3.2 / 100) = 1.75

(IC_M1 = IC_M2, because both machines run the same program on the same instruction set, so the instruction counts cancel.)

 What would the clock rate of M1 have to be for them to have the
same execution time?

2.8 / Clock Rate_M1 = 3.2 / 100, so Clock Rate_M1 = 87.5 MHz
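
Both answers follow from ExTime = IC x CPI / clock rate with equal instruction counts. A hedged Python sketch is given below; the function names `relative_exec_time` and `matching_clock_rate` are hypothetical, and clock rates are kept in MHz since only their ratio matters.

```python
def relative_exec_time(cpi_1: float, rate_1: float, cpi_2: float, rate_2: float) -> float:
    """ExTime_1 / ExTime_2, assuming both machines execute the same instruction count."""
    return (cpi_1 / rate_1) / (cpi_2 / rate_2)


def matching_clock_rate(cpi_1: float, cpi_2: float, rate_2: float) -> float:
    """Clock rate machine 1 needs so that cpi_1 / rate_1 == cpi_2 / rate_2."""
    return cpi_1 * rate_2 / cpi_2


# M1: CPI 2.8 at 50 MHz; M2: CPI 3.2 at 100 MHz.
print(f"M2 is {relative_exec_time(2.8, 50, 3.2, 100):.2f} times faster")
print(f"M1 needs {matching_clock_rate(2.8, 3.2, 100):.1f} MHz to match M2")
```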

Performance Summary
 The two main measures of performance are
 execution time : time to do the task
 throughput : number of tasks completed per unit time
 Performance and execution time are reciprocals.
Increasing performance decreases execution time.
 The time to execute a given program can be computed as:
CPU time = Instruction count x CPI x clock cycle time
CPU time = Instruction count x CPI / clock rate
 These factors are affected by compiler technology, the
instruction set architecture, the machine organization, and
the underlying technology.
 When trying to improve performance, look at what
occurs frequently => make the common case fast.
Computer Benchmarks

 A benchmark is a program or set of programs used to
evaluate computer performance.
 Benchmarks allow us to make performance comparisons
based on execution times
 Benchmarks should
 Be representative of the type of applications run on the computer
 Not be overly dependent on one or two features of a computer
 Benchmarks can vary greatly in terms of their complexity
and their usefulness.
Amdahl's Law
Speedup due to an enhancement is defined as:

Speedup = ExTime_old / ExTime_new = Performance_new / Performance_old

 Suppose that an enhancement accelerates a fraction
Fraction_enhanced of the task by a factor Speedup_enhanced:

ExTime_new = ExTime_old x ((1 - Fraction_enhanced) + Fraction_enhanced / Speedup_enhanced)

Speedup = ExTime_old / ExTime_new = 1 / ((1 - Fraction_enhanced) + Fraction_enhanced / Speedup_enhanced)
Example of Amdahl’s Law
 Floating point instructions are improved to run twice as fast,
but only 10% of the time was spent on these instructions
originally. How much faster is the new machine?

Speedup = ExTime_old / ExTime_new = 1 / ((1 - Fraction_enhanced) + Fraction_enhanced / Speedup_enhanced)

Speedup = 1 / ((1 - 0.1) + 0.1 / 2) = 1.053
 The new machine is 1.053 times as fast, or 5.3% faster.
 How much faster would the new machine be if floating
point instructions become 100 times faster?

Speedup = 1 / ((1 - 0.1) + 0.1 / 100) = 1.110
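
Amdahl's law is easy to experiment with in code. The sketch below is illustrative only; `amdahl_speedup` is a hypothetical function name, and its arguments mirror Fraction_enhanced and Speedup_enhanced.

```python
def amdahl_speedup(fraction_enhanced: float, speedup_enhanced: float) -> float:
    """Overall speedup = 1 / ((1 - f) + f / s) for enhanced fraction f and factor s."""
    if not 0.0 <= fraction_enhanced <= 1.0:
        raise ValueError("fraction_enhanced must be between 0 and 1")
    if speedup_enhanced <= 0:
        raise ValueError("speedup_enhanced must be positive")
    return 1.0 / ((1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced)


# 10% of the time is floating point; try a 2x and a 100x improvement.
for factor in (2, 100):
    print(f"{factor}x faster FP -> overall speedup {amdahl_speedup(0.1, factor):.3f}")

# Upper bound as the floating point time goes to zero: 1 / (1 - 0.1).
print(f"limit: {1 / (1 - 0.1):.3f}")
```

Even a 100x faster floating point unit stays below the limit of about 1.111, because 90% of the original execution time is untouched.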
Another Example
Execution Time After Improvement =
Execution time unaffected + (Execution time affected/ improvement)

Example:
A program runs for 100 seconds on a machine, with multiply
responsible for 80 seconds of this time. If the multiply unit is made 4 times
faster, what is the overall speedup?

new execution time = 20 + 80/4 = 40 seconds

speedup = 100/40 = 2.5 times.
How fast must the multiply be to get 4, 5 and 6 times speedup?
The maximum speedup that can be achieved is 1 / (fraction not affected).
In this case 1/0.2 = 5 times. That is only possible with infinite speedup for
the multiply (it takes 0 time to execute).
Example Continues
 Assume that a program runs in 100 seconds on a
machine, with multiply operations responsible for 80
seconds. How much do I have to improve the speed of
multiplication if I want my program to run 2 times faster?

Execution time after improvement =
(Execution time affected by improvement / Amount of improvement) + Execution time unaffected

50 seconds = (80 seconds / n) + (100 – 80 seconds)

n = 80 seconds / 30 seconds = 2.67
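
Solving the same equation for n can be sketched in Python, which also covers the 4, 5 and 6 times targets for the 100-second program with 80 seconds of multiply. The function name `required_improvement` and its return conventions (`math.inf` for a target reached only in the limit, `None` for an unreachable target) are assumptions made for this illustration.

```python
import math


def required_improvement(total_time, affected_time, target_speedup):
    """Factor n by which the affected part must be sped up to hit the target.

    Solves: total_time / target_speedup = affected_time / n + unaffected_time.
    Returns math.inf if the target is reached only when the affected part
    takes zero time, and None if the target cannot be reached at all.
    """
    unaffected_time = total_time - affected_time
    target_time = total_time / target_speedup
    budget = target_time - unaffected_time  # time left for the affected part
    if budget < 0:
        return None
    if budget == 0:  # exact comparison is fine for these round example numbers
        return math.inf
    return affected_time / budget


for target in (2, 4, 5, 6):
    print(target, required_improvement(100, 80, target))
```

With these numbers a 2 times speedup needs n = 80/30, about 2.67, a 4 times speedup needs n = 16, a 5 times speedup is only the limiting case, and a 6 times speedup is out of reach because the unaffected 20 seconds alone exceed the target time.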
Summary of Performance Evaluation
 Good benchmarks, such as the SPEC benchmarks, can
provide an accurate method for evaluating and
comparing computer performance.
 MIPS and MFLOPS are easy to use, but inaccurate
indicators of performance.
 Amdahl’s law provides an efficient method for
determining speedup due to an enhancement.
 Make the common case fast!
