
Chapter 1 Performance

Using the performance equation: CPU time = Instruction count × CPI × Clock cycle time
For machine A: CPU time = Instruction count × 2.0 × 0.250 ns = 0.5 ns × Instruction count
For machine B: CPU time = Instruction count × 1.2 × 0.500 ns = 0.6 ns × Instruction count
Machine A is faster than machine B by a factor of 0.6/0.5 = 1.2, so machine A is 20% faster than machine B for this program.

Chapter 1

The Role of Performance

Outline
• Defining Performance
• Measuring Performance
• CPU Performance & Its Factors

Performance

• Performance is the key to understanding the underlying motivation for the hardware and its organization
• Measure, report, and summarize performance to enable users to
  – make intelligent choices
  – see through the marketing hype!
• Why is some hardware better than others for different programs?
• What factors of system performance are hardware related? (e.g., do we need a new machine, or a new operating system?)
• How does the machine's instruction set affect performance?

What do we measure?
Define performance…

The figure lists some typical passenger airplanes, together with their cruising speed, range, capacity, and throughput.

If we wanted to know which of the planes in this table had the best performance, we would first need to define performance.

What do we measure?
Define performance…

• Define performance in terms of speed?
• Define performance in terms of passenger throughput?
• So which of these airplanes has the best performance?

So we can define performance in several different ways.


Computer Performance:
TIME, TIME, TIME!!!
• Response time (elapsed time, latency): individual user concerns
  – how long does it take for my job to run?
  – how long does it take to execute (start to finish) my job?
  – how long must I wait for the database query?
• Throughput: systems manager concerns
  – how many jobs can the machine run at once?
  – what is the average execution rate?
  – how much work is getting done?
• If we upgrade a machine with a new processor, what do we increase?
• If we add a new machine to the lab, what do we increase?

Execution Time
• Elapsed time
  – counts everything (disk and memory accesses, waiting for I/O, running other programs, etc.) from start to finish
  – a useful number, but often not good for comparison purposes
    elapsed time = CPU time + wait time (I/O, other programs, etc.)
• CPU time
  – doesn't count waiting for I/O or time spent running other programs
  – can be divided into user CPU time and system CPU time (OS calls)
    CPU time = user CPU time + system CPU time
    elapsed time = user CPU time + system CPU time + wait time
• Our focus: user CPU time (CPU execution time or, simply, execution time)
  – time spent executing the lines of code that are in our program

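One way to see the elapsed, user, and system components on a real system is to sample the wall clock and the process CPU counters around a piece of work. The sketch below uses Python's standard `time.perf_counter` and `os.times`; the helper name `measure` and the workload `busy_work` are illustrative assumptions, not part of the slides.

```python
import os
import time


def measure(fn):
    """Run fn() once and return (elapsed, user_cpu, system_cpu) in seconds."""
    wall_start = time.perf_counter()
    cpu_start = os.times()  # process CPU counters before the work
    fn()
    cpu_end = os.times()    # process CPU counters after the work
    wall_end = time.perf_counter()
    elapsed = wall_end - wall_start
    user_cpu = cpu_end.user - cpu_start.user
    system_cpu = cpu_end.system - cpu_start.system
    return elapsed, user_cpu, system_cpu


def busy_work():
    # Hypothetical CPU-bound workload, chosen only for illustration.
    return sum(i * i for i in range(200_000))


elapsed, user_cpu, system_cpu = measure(busy_work)
# What remains after CPU time is time spent waiting (I/O, other programs, ...).
wait = elapsed - (user_cpu + system_cpu)
print(f"elapsed={elapsed:.4f}s user={user_cpu:.4f}s "
      f"system={system_cpu:.4f}s wait~{wait:.4f}s")
```

The CPU counters are usually coarser than the wall clock, so for very short runs the computed wait time can come out slightly negative; longer workloads give more meaningful splits.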
System Time & User Time

• System time - The CPU time spent in the operating system performing tasks on behalf of the program, such as executing system calls. Examples include I/O operations, context switches, inter-process communication, memory management, and interrupt requests.

• User time - The time the CPU spent running your code. It is called user time because the CPU is used by an operation in a program that a user has started.

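Definition of Performance(1)
• For some program running on machine X:

$$\text{Performance}_X = \frac{1}{\text{Execution time}_X}$$

• "X is n times faster than Y" means:

$$\frac{\text{Performance}_X}{\text{Performance}_Y} = \frac{\text{Execution time}_Y}{\text{Execution time}_X} = n$$
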
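As a small sketch of this definition, the ratio of performances can be computed directly from two execution times. The function names and the timings (10 s and 15 s) are hypothetical values chosen for illustration.

```python
def performance(execution_time):
    """Performance is the reciprocal of execution time."""
    return 1.0 / execution_time


def times_faster(time_x, time_y):
    """Return n such that machine X is n times faster than machine Y."""
    return performance(time_x) / performance(time_y)


# Hypothetical timings in seconds, for illustration only.
n = times_faster(time_x=10.0, time_y=15.0)
print(f"X is {n:.2f} times faster than Y")
```

Because performance is the reciprocal of execution time, the same ratio is obtained by dividing Y's execution time by X's.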
Definition of Performance(2)

Relative Performance

Clock Cycles
• Instead of reporting execution time in seconds, we often use cycles. In modern computers, hardware events progress cycle by cycle: in other words, each event (e.g., multiplication, addition, etc.) is a sequence of cycles.
• Clock ticks indicate the start and end of cycles:

[Figure: clock signal; the cycle time is the interval from one tick to the next]

• cycle time = time between ticks = seconds per cycle
• clock rate (frequency) = cycles per second (1 Hz = 1 cycle/sec, 1 MHz = 10^6 cycles/sec)
• Example: A 200 MHz clock has a cycle time of

$$\frac{1}{200 \times 10^{6}} \times 10^{9} = 5 \text{ nanoseconds}$$

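The conversion between clock rate and cycle time can be sketched in a few lines. The function name `cycle_time_ns` is an assumption made for this example.

```python
def cycle_time_ns(clock_rate_hz):
    """Clock cycle time in nanoseconds for a clock rate given in hertz."""
    return 1e9 / clock_rate_hz


print(cycle_time_ns(200e6))  # 200 MHz clock -> 5.0
```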
Performance Equation I

$$\frac{\text{seconds}}{\text{program}} = \frac{\text{cycles}}{\text{program}} \times \frac{\text{seconds}}{\text{cycle}}$$

equivalently

$$\text{CPU execution time for a program} = \text{CPU clock cycles for a program} \times \text{Clock cycle time}$$

• So, to improve performance one can either:
  – reduce the number of cycles for a program, or
  – reduce the clock cycle time, or, equivalently,
  – increase the clock rate

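A minimal sketch of this equation shows both levers at work. The cycle counts and clock rates below are hypothetical, picked only so the arithmetic is easy to follow.

```python
def cpu_time_seconds(clock_cycles, clock_rate_hz):
    """CPU time = clock cycles x clock cycle time = clock cycles / clock rate."""
    cycle_time = 1.0 / clock_rate_hz
    return clock_cycles * cycle_time


# Hypothetical program: 2 billion cycles on a 500 MHz clock.
base = cpu_time_seconds(2e9, 500e6)
# Lever 1: halve the number of cycles, same clock.
fewer_cycles = cpu_time_seconds(1e9, 500e6)
# Lever 2: same cycles, double the clock rate (halve the cycle time).
faster_clock = cpu_time_seconds(2e9, 1e9)

print(f"base={base:.1f}s fewer_cycles={fewer_cycles:.1f}s "
      f"faster_clock={faster_clock:.1f}s")
```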
How many cycles are required
for a program?
• Could assume that # of cycles = # of instructions

[Figure: the 1st, 2nd, 3rd, 4th, 5th, 6th, ... instructions laid out along a time axis]

• This assumption is incorrect, because:
  – Different instructions take different amounts of time (cycles)

How many cycles are required
for a program?

[Figure: instruction timeline; axis: time]

• Multiplication takes more time than addition
• Floating point operations take longer than integer ones
• Accessing memory takes more time than accessing registers
• Important point: changing the cycle time often changes the number of cycles required for various instructions because it means changing the hardware design.

Example
• Our favorite program runs in 10 seconds on computer A, which has a 400 MHz clock.
• We are trying to help a computer designer build a new machine B that will run this program in 6 seconds. The designer can use new (or perhaps more expensive) technology to substantially increase the clock rate, but has informed us that this increase will affect the rest of the CPU design, causing machine B to require 1.2 times as many clock cycles as machine A for the same program.
• What clock rate should we tell the designer to target?

Example (Sol.)

$$\frac{\text{seconds}}{\text{program}} = \frac{\text{cycles}}{\text{program}} \times \frac{\text{seconds}}{\text{cycle}}$$

• Our favorite program runs in 10 seconds on computer A, which has a 400 MHz clock. We are trying to help a computer designer build a new machine B that will run this program in 6 seconds. The designer can use new (or perhaps more expensive) technology to substantially increase the clock rate, but has informed us that this increase will affect the rest of the CPU design, causing machine B to require 1.2 times as many clock cycles as machine A for the same program. What clock rate should we tell the designer to target?

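The worked solution did not survive extraction, so the following is a sketch of one way to carry out the calculation, taking the 400 MHz figure from the problem statement. The function and parameter names are assumptions made for this example.

```python
def target_clock_rate_hz(time_a, clock_rate_a_hz, cycle_ratio, time_b):
    """Clock rate machine B needs to finish in time_b seconds.

    cycle_ratio is how many times more cycles B needs than A.
    """
    cycles_a = time_a * clock_rate_a_hz   # cycles = seconds x cycles/second
    cycles_b = cycle_ratio * cycles_a     # B needs 1.2x as many cycles
    return cycles_b / time_b              # rate = cycles / seconds


rate_b = target_clock_rate_hz(time_a=10.0, clock_rate_a_hz=400e6,
                              cycle_ratio=1.2, time_b=6.0)
print(f"{rate_b / 1e6:.0f} MHz")  # 800 MHz
```

The result is 1.2 × 10 / 6 = 2 times machine A's clock rate, whatever that rate is.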
CPU clock cycles

The term clock cycles per instruction, which is the average number of clock cycles each instruction takes to execute, is often abbreviated as CPI.

Different instructions may take different amounts of time depending on what they do; CPI is an average over all the instructions executed in the program.

CPI provides one way of comparing two different implementations of the same instruction set architecture, since the number of instructions executed for a program will, of course, be the same.

Back to the Same Formula, CPI (Cycles/Instruction)

$$\frac{\text{seconds}}{\text{program}} = \frac{\text{cycles}}{\text{program}} \times \frac{\text{seconds}}{\text{cycle}}$$

Performance Equation II

$$\text{CPU execution time for a program} = \text{Instruction count for a program} \times \text{average CPI} \times \text{Clock cycle time}$$

CPI Example I (Using Performance Equation)
• Suppose we have two implementations of the same instruction set architecture (ISA). For some program:
– machine A has a clock cycle time of 250 ps and a CPI of 2.0
– machine B has a clock cycle time of 500 ps and a CPI of 1.2

• Which machine is faster for this program, and by how much?

CPI Example I (Using Performance Equation)

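A short sketch of the comparison, using CPU time = Instruction count × CPI × Clock cycle time. Both machines run the same ISA, so the instruction count is identical and cancels in the ratio; the placeholder value of 1 is an assumption made only to keep the arithmetic visible.

```python
def cpu_time_ps(instruction_count, cpi, cycle_time_ps):
    """CPU time in picoseconds from the performance equation."""
    return instruction_count * cpi * cycle_time_ps


IC = 1  # same program on the same ISA: the count cancels in the ratio
time_a = cpu_time_ps(IC, cpi=2.0, cycle_time_ps=250)  # 500 ps per instruction
time_b = cpu_time_ps(IC, cpi=1.2, cycle_time_ps=500)  # 600 ps per instruction

ratio = time_b / time_a
print(f"Machine A is {ratio:.2f} times faster than machine B")
```

Machine A wins despite its higher CPI, because its cycle time is half that of machine B.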
CPI Example II
• A compiler designer is trying to decide between two code sequences for a particular machine.
• Based on the hardware implementation, there are three different classes of instructions: Class A, Class B, and Class C, and they require 1, 2, and 3 cycles (respectively).
• The first code sequence has 5 instructions: 2 of A, 1 of B, and 2 of C.
• The second sequence has 6 instructions: 4 of A, 1 of B, and 1 of C.
• Which sequence will be faster? By how much? What is the CPI for each sequence?

CPI Example II
Which sequence will be faster? How much? What is the CPI for each sequence?

$$\frac{\text{seconds}}{\text{program}} = \frac{\text{cycles}}{\text{program}} \times \frac{\text{seconds}}{\text{cycle}}$$

A compiler designer is trying to decide between two code sequences for a particular machine. Based on the hardware implementation, there are three different classes of instructions: Class A, Class B, and Class C, and they require one, two, and three cycles (respectively).

The first code sequence has 5 instructions: 2 of A, 1 of B, and 2 of C.

The second sequence has 6 instructions: 4 of A, 1 of B, and 1 of C.

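The questions can be answered by counting cycles for each sequence. The sketch below assumes both sequences run on the same machine at the same clock rate, so execution time is proportional to total cycles; the dictionary layout and function names are illustrative choices.

```python
# Cycles required by each instruction class, as given in the example.
CYCLES_PER_CLASS = {"A": 1, "B": 2, "C": 3}


def total_cycles(mix):
    """Total clock cycles for an instruction mix {class: count}."""
    return sum(count * CYCLES_PER_CLASS[cls] for cls, count in mix.items())


def cpi(mix):
    """Average cycles per instruction for the mix."""
    return total_cycles(mix) / sum(mix.values())


seq1 = {"A": 2, "B": 1, "C": 2}  # 5 instructions
seq2 = {"A": 4, "B": 1, "C": 1}  # 6 instructions

print(total_cycles(seq1), cpi(seq1))  # 10 2.0
print(total_cycles(seq2), cpi(seq2))  # 9 1.5
print(f"sequence 2 is {total_cycles(seq1) / total_cycles(seq2):.2f} times faster")
```

The second sequence executes more instructions but needs fewer cycles (9 versus 10), so it is faster by a factor of 10/9, about 1.11.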
What is MIPS?

• MIPS (million instructions per second): instruction execution rate => higher is better
• Issues:
– Cannot compare processors with different instruction sets, since the instruction count will certainly differ.
– Varies between programs on the same processor; hence a processor cannot have a single MIPS rating.

ECE369
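The slide does not state a formula, so the sketch below assumes the usual definition, MIPS = Instruction count / (Execution time × 10^6), which is equivalent to Clock rate / (CPI × 10^6). It is applied to the two machines from CPI Example I (cycle times of 250 ps and 500 ps correspond to 4 GHz and 2 GHz); the function name is an assumption.

```python
def mips(clock_rate_hz, cpi):
    """MIPS rating = clock rate / (CPI x 10^6), for one particular program."""
    return clock_rate_hz / (cpi * 1e6)


# Machines from CPI Example I; same ISA, so the ratings are comparable here.
mips_a = mips(clock_rate_hz=4e9, cpi=2.0)  # 250 ps cycle time
mips_b = mips(clock_rate_hz=2e9, cpi=1.2)  # 500 ps cycle time

print(f"A: {mips_a:.0f} MIPS, B: {mips_b:.0f} MIPS")
```

Since CPI depends on the program, running a different program on the same machine would change the result, which is the second issue listed above.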
MIPS Example

Amdahl's Law

• The performance enhancement of an improvement is limited by how much the improved feature is used. In other words: don't expect an overall enhancement proportional to how much you enhanced one part.

• Example:
"Suppose a program runs in 100 seconds on a machine, with multiply operations responsible for 80 seconds of this time. How much do we have to improve the speed of multiplication if we want the program to run 4 times faster?"
How about making it 5 times faster?

Amdahl's Law

1. Speedup = 4
2. Old execution time = 100 sec
3. New execution time = 100/4 = 25 sec
4. The affected part (multiplication) uses 80 sec
5. Unaffected part = 100 - 80 = 20 sec
6. Execution time new = Execution time unaffected + Execution time affected / Improvement
7. 25 = 20 + 80/Improvement
8. Improvement = 16

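The same steps can be sketched as a small function that solves for the required improvement factor. The function name and its error handling are assumptions made for this example.

```python
def required_improvement(total_time, affected_time, speedup):
    """Factor by which the affected part must speed up to reach the overall speedup."""
    unaffected = total_time - affected_time   # time the enhancement cannot touch
    target = total_time / speedup             # new overall execution time
    budget = target - unaffected              # time left for the affected part
    if budget <= 0:
        raise ValueError("target is unreachable: the unaffected part alone "
                         "takes at least the target time")
    return affected_time / budget


print(required_improvement(total_time=100, affected_time=80, speedup=4))  # 16.0
```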
How about making it 5 times faster?
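The worked answer for this slide was lost in extraction. Applying Amdahl's Law to a speedup of 5, with the figures from the example (100 sec total, 80 sec spent in multiplication, so 20 sec unaffected), suggests the following:

```latex
\begin{align*}
\text{New execution time} &= \frac{100}{5} = 20 \text{ sec} \\
20 &= 20 + \frac{80}{\text{Improvement}} \\
\frac{80}{\text{Improvement}} &= 0
\end{align*}
```

No finite improvement satisfies the last line: the unaffected 20 seconds already use up the whole target time, so a 5 times overall speedup cannot be reached by speeding up multiplication alone.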

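Performance Measurement Overview

$$\text{CPU time} = \text{CPU clock cycles for the program} \times \text{Clock cycle time}$$

$$\text{CPU time} = \frac{\text{CPU clock cycles for the program}}{\text{Clock rate}}$$

$$\text{CPI} = \frac{\text{CPU clock cycles for the program}}{\text{IC}}$$

$$\text{CPU time} = \text{IC} \times \text{CPI} \times \text{Clock cycle time}$$

$$\text{CPU time} = \frac{\text{IC} \times \text{CPI}}{\text{Clock rate}}$$

$$\text{CPU time} = \frac{\text{Seconds}}{\text{Program}} = \frac{\text{Instructions}}{\text{Program}} \times \frac{\text{Clock cycles}}{\text{Instruction}} \times \frac{\text{Seconds}}{\text{Clock cycle}}$$
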
Summary
• Performance is specific to a particular program
  – total execution time is a consistent summary of performance
• For a given architecture, performance increases come from:
  – increases in clock rate (without adverse CPI effects)
  – improvements in processor organization that lower CPI
  – compiler enhancements that lower CPI and/or instruction count
• Pitfall: expecting an improvement in one aspect of a machine's performance to improve total performance by a proportional amount
• You should not always believe everything you read! Read carefully! See newspaper articles, e.g., Exercise 2.37!!
