0% found this document useful (0 votes)
15 views21 pages

Lec 3

Uploaded by

medo.losy
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
15 views21 pages

Lec 3

Uploaded by

medo.losy
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 21

Performance Evaluation

Why Performance Evaluation?

DESIGN

EVALUATION

2
Performance Evaluation Concepts
• Computer performance evaluation is based on:
– Throughput
– Execution time
• Processor performance evaluation
– Based on execution time of a program:
PerformanceX = 1/ Execution timeX
Relative performance: n = PerformanceX / PerformanceY
n > 1 => X is n times faster than Y
• Terminology
– Improve performance = increase performance
– Improve execution time = decrease execution time
Example
• Impact on throughput and Execution time
of:
– Using a faster processor
• Decrease in Execution time
• Increase in throughput
– Adding more processors to a system
• Increase in throughput
• Decrease in Execution time if the program parts can
run in parallel (if overhead is low)
CPU Performance
• CPU performance is measured in terms of time!

• CPU time deals with


– Number of instructions executed to complete a
job
– How many clock cycles are needed to execute a
single instruction
– The length of the clock cycle (clock cycle time)
5
CPU Performance Equation
• CPU time = CPU clock cycles * cycle time
= CPU clock cycles / clock rate
– CPU clock cycles = IC * CPI
• IC: instruction count (number of instructions per program)
• CPI: average cycles per instruction

CPU time = IC * CPI * cycle time


seconds instructions clock cycles seconds
* *
=
program program instruction clock cycle
Scope of Performance Sources
CPU time = IC * CPI * Cycle time
Program

Compiler

ISA

Microarchitecture

Hardware

Abstraction level interdependence


Scope of Performance Sources
CPU time = IC * CPI * Cycle time
Program

Compiler

ISA

Microarchitecture

Hardware

Abstraction level interdependence


Scope of Performance Sources
CPU time = IC * CPI * Cycle time
Program

Compiler

ISA

Microarchitecture

Hardware

Abstraction level interdependence


Example 1
• A program runs in 10 secs on a 3.5 GHz processor.
A designer wants to build a new computer that can
run the program in 6 secs by increasing the clock
frequency. However the average “new” CPI will be
1.2 times higher.
• What faster clock rate should the designer use?

10 IC * CPI / 3.5 GHz (current exec. time)


=
6 IC * 1.2 CPI / X GHz (target execution time)
Solve for X, to obtain X = ___ GHz
X= 1.2*3.5*10/6
CPU Performance Equation

• CPU clock cycles = Σi (CPIi * ICi)


– ICi : count of instructions of class i
– CPIi : cycles that takes to execute instructions of class i
Example 2
• Comparing two compiler code segments
Instruction class i CPIi for instruction class i
A 1
B 2
C 3

Code Instruction counts (ICi) for instruction class


sequence A B C
1 2 1 2
2 4 1 1

• Which will be faster? S1 = 2 . 1 + 1 . 2 + 2 . 3 = 10 cyc


S2 = 4 . 1 + 1 . 2 + 1 . 3 = 9 cyc
Example 3
• Suppose we have 2 implementations of the same
instruction set architecture. Computer A has a clock cycle
time of 10 nsec and a CPI of 2.0 for some program, and
computer B has a clock cycle time of 20 nsec and a CPI of
1.2 for the same program. Which machine is faster for this
program?
• Answer:
Assume the program requires I instructions to be executed:
CPU time A = I × 2.0 × 10 nsec = 20 × I nsec
CPU timeB = I × 1.2 × 20 nsec = 24 × I nsec
∴ Computer A is faster than computer B.
Example 4
• Machine A runs a program in 20 seconds ‰
machine B runs the same program in 25
seconds ‰
how many times faster is machine
A? 25 /20 = 1.25
Example 5
Assume that a benchmark has 100
instructions: 25 instructions are loads/stores
(each take 2 cycles) 50 instructions are adds
(each takes 1 cycle) 25 instructions are
square root (each takes 50 cycles)
What is the CPI for this benchmark?
CPI = ((0.25 * 2) + (0.50 * 1) + (0.25 * 50))
= 13.5
Amdahl’s Law
• Law of diminishing returns for acceleration
Execution time before improvement
Speedup =
Execution time after improvement

Example: Two cases - F1: 25% improved 2X, F2: 50% improved 2X

75 percent not improved F1

50 percent n.i. F2
Time →
1
Speedup =
(1 - fraction enhanced) + (fraction enhanced/factor of improvement)
Example
• A program runs in 10 seconds
• What is the speedup after a faster floating
point unit is incorporated?

FP Unit 5 times faster 5 seconds in FP operations


6 2
1

Speedup
Speedup = 1.9
5
(1 – Fraction FP) + FractionFP 1.8
4 5 1.7
1.6
3 1.5 1
Speedup =
2
1.4 0.5
1.3 0.5 + FP factor of improvement
1 1.2
Fraction of FP 1.1 FP Unit im provem ent
0 1
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
Example
• A program runs in 10 seconds
• What is the speedup after a faster floating
point unit is incorporated?

FP Unit 5 times faster 5 seconds in FP operations


6 2
1

Speedup
Speedup = 1.9
5
(1 – Fraction FP) + FractionFP 1.8
4 5 1.7
1.6
3 1.5 1
Speedup =
2
1.4 0.5
1.3 0.5 + FP factor of improvement
1 1.2
Fraction of FP 1.1 FP Unit im provem ent
0 1
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
Amdahl’s Law - example
• A program runs in 100 seconds on a computer, with
multiply operations responsible for 80 seconds of this time
• How much do I have to improve the speed of
multiplication, if I want my program to run 5 times faster?
• Timeimproved = Timeunaffected + Timeaffected/(Improvement
Factor)
• 20 s = 20 s + 80 s / n
• 0 = 80 s / n
• There is no amount by which we can improve multiply to
achieve a fivefold increase in performance!

19
Amdahl’s Law
• On a large system, suppose we can upgrade a
CPU to make it 50% faster for $10,000 or
upgrade its disk drives for $7,000 to make them
150% faster.
• Processes spend 70% of their time running in the
CPU and 30% of their time waiting for disk
service.
• An upgrade of which component would offer the
greater benefit for the lesser cost?
• The processor option offers a 30% speedup:

• And the disk drive option gives a 22% speedup:


• Each 1% of improvement for the processor costs
$333 (10000/30), and for the disk a 1%
improvement costs $318 (7000/22).

You might also like