Performance
perf
Performance of Computer Systems
Many different factors must be taken into account when determining
performance:
• Technology
    • circuit speed (clock, MHz)
    • processor technology (how many transistors on a chip)
• Organization
    • type of processor (ILP)
    • configuration of the memory hierarchy
    • type of I/O devices
    • number of processors in the system
• Software
    • quality of the compilers
    • organization & quality of OS, databases, etc.
“Principles” of Experimentation
Meaningful metrics
    execution time & component metrics that explain it
Reproducibility
    machine configuration, compiler & optimization level, OS, input
Real programs
    no toys, kernels, synthetic programs
    SPEC is the norm (integer, floating point, graphics, webserver)
    TPC-B, TPC-C & TPC-D for database transactions
Simulation
    long executions, warm start to mimic steady-state behavior
    usually applications only; some OS simulation
    simulator “validation” & internal checks for accuracy
Metrics that Measure Performance
Raw speed: peak performance (never attained)
Execution Time
Relative Performance
$$\frac{\text{Performance}_A}{\text{Performance}_B} = \frac{\text{ExecutionTime}_B}{\text{ExecutionTime}_A} = n$$
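As a small illustration of the ratio, the sketch below computes n from two execution times. The function name and the timing values are hypothetical, chosen only for the example.

```python
def relative_performance(exec_time_a: float, exec_time_b: float) -> float:
    """Return n = ExecutionTime_B / ExecutionTime_A (= Performance_A / Performance_B)."""
    if exec_time_a <= 0 or exec_time_b <= 0:
        raise ValueError("execution times must be positive")
    return exec_time_b / exec_time_a


# Hypothetical measurements (seconds) of the same program and input on two machines.
time_a = 10.0
time_b = 15.0

n = relative_performance(time_a, time_b)
print(f"A is {n:.2f}x as fast as B")
```

A value of n greater than 1 means A is the faster machine; a value below 1 means B is.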
CPU Execution Time
The time the CPU spends executing an application
• no memory effects
• no I/O
• no effects of multiprogramming
CPI
$$\text{CPUClockCycles} = \sum_{i=1}^{n} (\text{CPI}_i \times C_i)$$
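A minimal sketch of the sum, assuming C_i is the number of executed instructions in class i and CPI_i is the cycles per instruction for that class (the slide does not define the symbols, so this reading is an assumption). The instruction mix is made up for illustration.

```python
def cpu_clock_cycles(instruction_classes):
    """Sum CPI_i * C_i over a list of (cpi, count) pairs."""
    return sum(cpi * count for cpi, count in instruction_classes)


def average_cpi(instruction_classes):
    """Overall CPI: total cycles divided by total instruction count."""
    total_instructions = sum(count for _, count in instruction_classes)
    return cpu_clock_cycles(instruction_classes) / total_instructions


# Hypothetical mix: (CPI of the class, number of instructions executed).
mix = [
    (1, 50_000),  # e.g. simple ALU operations
    (2, 30_000),  # e.g. loads and stores
    (4, 20_000),  # e.g. branches
]

print("cycles:", cpu_clock_cycles(mix))
print("average CPI:", average_cpi(mix))
```

For this mix the total is 190,000 cycles over 100,000 instructions, so the average CPI is 1.9.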
CPU Execution Time
CPUExecutionTime =
numberofInstructions * CPI * clockCycleTime
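The product can be evaluated directly. The sketch below uses hypothetical values (100,000 instructions, CPI of 1.9, a 500 MHz clock) and derives the clock cycle time as the reciprocal of the clock rate.

```python
def cpu_execution_time(instruction_count, cpi, clock_cycle_time_s):
    """CPU execution time = number of instructions * CPI * clock cycle time."""
    return instruction_count * cpi * clock_cycle_time_s


# Hypothetical values for illustration only.
instruction_count = 100_000
cpi = 1.9
clock_rate_hz = 500e6                      # 500 MHz
clock_cycle_time_s = 1.0 / clock_rate_hz   # 2 ns per cycle

seconds = cpu_execution_time(instruction_count, cpi, clock_cycle_time_s)
print(f"CPU execution time: {seconds * 1e6:.1f} microseconds")
```

Halving any one of the three factors halves the result, which is why all three have to be reported together.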
To measure:
• execution time: depends on all 3 factors
    • time the program
• number of instructions: determined by the ISA
    • programmable hardware counters
    • profiling
        • count number of times each basic block is executed
        • instruction sampling
• CPI: determined by the ISA & implementation
    • simulator: interpret (in software) every instruction &
      calculate the number of cycles it takes to simulate it
• clock cycle time: determined by the implementation & process
  technology
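"Time the program" can be sketched with Python's standard `time` module: `time.perf_counter()` gives elapsed (wall-clock) time and `time.process_time()` gives the CPU time charged to the process. The helper `time_call` and the workload `sum_of_squares` are hypothetical names for this example. Process CPU time still includes memory stalls, so it only approximates the idealized CPU execution time defined earlier.

```python
import time


def time_call(func, *args):
    """Run func(*args) once; return (result, wall_seconds, cpu_seconds)."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    result = func(*args)
    cpu_seconds = time.process_time() - cpu_start
    wall_seconds = time.perf_counter() - wall_start
    return result, wall_seconds, cpu_seconds


def sum_of_squares(n):
    """Stand-in workload: sum of i*i for i in range(n)."""
    total = 0
    for i in range(n):
        total += i * i
    return total


result, wall, cpu = time_call(sum_of_squares, 200_000)
print(f"wall-clock: {wall:.6f} s, CPU: {cpu:.6f} s")
```

For reproducible numbers, repeat the measurement several times and record the machine configuration, as the experimentation principles require.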
Metrics Not to Use
MIPS (millions of instructions per second)
$$\text{MIPS} = \frac{\text{instruction count}}{\text{execution time} \times 10^6} = \frac{\text{clock rate}}{\text{CPI} \times 10^6}$$
- instruction set-dependent (even true for similar architectures)
- implementation-dependent
- compiler technology-dependent
- program-dependent
+ intuitive: the higher, the better
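The sketch below shows how MIPS can mislead. It compares two hypothetical compiled versions of the same program on the same 500 MHz machine; the instruction counts and CPI values are invented for the example.

```python
CLOCK_RATE_HZ = 500e6  # hypothetical 500 MHz machine


def execution_time(instruction_count, cpi, clock_rate_hz=CLOCK_RATE_HZ):
    """Execution time in seconds = instructions * CPI / clock rate."""
    return instruction_count * cpi / clock_rate_hz


def mips_from_counts(instruction_count, execution_time_s):
    """MIPS = instruction count / (execution time * 10^6)."""
    return instruction_count / (execution_time_s * 1e6)


def mips_from_clock(clock_rate_hz, cpi):
    """MIPS = clock rate / (CPI * 10^6)."""
    return clock_rate_hz / (cpi * 1e6)


# Hypothetical compiler outputs: (instruction count, CPI).
versions = {"X": (8_000_000, 2.0), "Y": (12_000_000, 1.5)}

for name, (count, cpi) in versions.items():
    seconds = execution_time(count, cpi)
    rating = mips_from_counts(count, seconds)
    print(f"{name}: {seconds * 1e3:.1f} ms, {rating:.1f} MIPS")
```

Version Y earns the higher MIPS rating but takes longer to run, because it executes more (cheaper) instructions. Execution time is the measure that reflects what the user sees.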
Means
Measuring the performance of a workload
• arithmetic: used for averaging execution times
$$\frac{1}{n} \sum_{i=1}^{n} \text{time}_i$$
• harmonic: used for averaging rates ("the average of", as
opposed to "the average statistic of")
$$\frac{p}{\sum_{i=1}^{p} \frac{1}{\text{rate}_i}}$$
• weighted means: the programs are executed with different
frequencies, for example:
$$\frac{1}{n} \sum_{i=1}^{n} \text{time}_i \times \text{weight}_i$$
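The three means can be sketched as short functions. The sample times, rates, and weights are hypothetical. One assumption to flag: the weighted mean here divides by the sum of the weights, which agrees with the 1/n form above when the weights sum to n and also works for weights normalized to sum to 1.

```python
def arithmetic_mean(times):
    """(1/n) * sum of time_i; appropriate for execution times."""
    return sum(times) / len(times)


def harmonic_mean(rates):
    """p / sum of (1 / rate_i); appropriate for rates."""
    return len(rates) / sum(1.0 / r for r in rates)


def weighted_mean(times, weights):
    """Weighted arithmetic mean, normalized by the sum of the weights (assumption)."""
    if len(times) != len(weights):
        raise ValueError("times and weights must have the same length")
    return sum(t * w for t, w in zip(times, weights)) / sum(weights)


# Hypothetical workload data.
times = [2.0, 4.0, 6.0]        # seconds per program
rates = [10.0, 40.0]           # e.g. MFLOPS per program
weights = [0.5, 0.25, 0.25]    # relative execution frequencies

print("arithmetic mean of times:", arithmetic_mean(times))
print("harmonic mean of rates:", harmonic_mean(rates))
print("weighted mean of times:", weighted_mean(times, weights))
```

With these inputs the arithmetic mean is 4 s, the harmonic mean is 16, and the weighted mean is 3.5 s, since the fastest program runs most often.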
Means
Still true when measuring MFLOPS (a rate) with the harmonic mean
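One reason the harmonic mean suits a rate such as MFLOPS is sketched below. Assuming every program performs the same number of floating-point operations, the harmonic mean of the individual rates equals total operations divided by total time, while the arithmetic mean of the rates overstates it. The operation count and rates are hypothetical.

```python
def harmonic_mean(rates):
    """p / sum of (1 / rate_i)."""
    return len(rates) / sum(1.0 / r for r in rates)


# Hypothetical: each program performs 100 million floating-point operations.
FLOPS_PER_PROGRAM = 100e6
mflops_rates = [10.0, 40.0]  # measured MFLOPS for each program

# Time each program takes at its own rate (seconds).
times = [FLOPS_PER_PROGRAM / (rate * 1e6) for rate in mflops_rates]

# Overall rate for the whole workload: total operations / total time.
overall_mflops = (len(mflops_rates) * FLOPS_PER_PROGRAM) / (sum(times) * 1e6)

arithmetic_of_rates = sum(mflops_rates) / len(mflops_rates)

print("overall MFLOPS:", overall_mflops)
print("harmonic mean:", harmonic_mean(mflops_rates))
print("arithmetic mean of rates:", arithmetic_of_rates)
```

Here the workload takes 12.5 s in total, so its true rate is 16 MFLOPS, matching the harmonic mean; the arithmetic mean of the rates gives 25.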
Speedup