CPE131-1: Computer Architecture and Organization

Computer Performance Equations
❖ Basic Computer Performance Equation
  • time/program = time/cycle × cycles/instruction × instructions/program
  • where time per program is the required CPU time
❖ Ways to increase performance:
  • RISC Machines
    • Reduce the number of cycles per instruction
  • CISC Machines
    • Reduce the number of instructions per program
  • Vector and Parallel Processors
    • Reduce CPU time
❖ Options for increasing overall performance of a system:
  • CPU Optimization
    • Maximize the speed and efficiency of operations performed by the CPU (the performance equation addresses this optimization)
  • Memory Optimization
    • Maximize the efficiency of a code’s memory management
  • I/O Optimization
    • Maximize the efficiency of I/O operations

Mathematical Preliminaries
❖ Computer performance assessment is a quantitative science
  • Mathematical and statistical tools give us many ways in which to rate the overall performance of a system and the performance of its constituent components
❖ Measures of system performance depend on one’s viewpoint:
  • Computer user
    • Most concerned with response time
    • How long does it take for the system to carry out a task?
  • System administrator
    • Most concerned with throughput
    • How many concurrent tasks can the system carry out without adversely affecting response time?
  • Response time and throughput are related: if a system carries out a task in k seconds, its throughput is 1/k of these tasks per second
❖ Comparing the performance of two systems
  • Measure the time that it takes for each system to perform the same amount of work
  • If the same program is run on two systems, System A and System B, System A is n times faster than System B if:
    • running time (B) / running time (A) = n
  • System A is x% faster than System B if:
    • ([running time (B) / running time (A)] - 1) × 100 = x
❖ Measures of Central Tendency
  • Average the data in a way that makes sense
  • Measures of central tendency indicate the expected behavior of the sampled system (population)
  • Not all methods of averaging data are equal
  • The method depends on the nature of the data itself as well as the statistical distribution of the test results
❖ Arithmetic Mean
  • Given n measurements, add them together and divide by n
  • Should not be used when the data are highly variable or skewed toward lower or higher values
  • Weighted Arithmetic Mean
    • If we have some indication of how frequently each of the n elements is used, we can use the execution mix to calculate relative expected performance
    • Found by summing the products of each element's frequency (its fraction of the execution mix) and its measurement
❖ Geometric Mean
  • Gives us a consistent number with which to perform comparisons regardless of the distribution of data
  • It is the nth root of the product of the n measurements
  • G = (x1 × x2 × x3 × … × xn)^(1/n)
  • More helpful when comparing the relative performance of two systems
  • Systems under evaluation are normalized to the reference machine by taking the ratio of the run time of a program on the reference machine to the run time of the same program on the system being evaluated
❖ Harmonic Mean
  • Used for averaging rates or ratios (such as operations per second)
  • Allows us to form a mathematical expectation of throughput as well
    as to compare the relative throughput of systems or system components
  • H = n / (1/x1 + 1/x2 + 1/x3 + … + 1/xn)

BENCHMARKING

Performance Benchmarking
❖ Performance benchmarking is the science of making objective assessments of the performance of one system over another
  • Benchmarks are also useful for assessing performance improvements obtained by upgrading a computer or its components
  • Good benchmarks enable us to cut through advertising hype and statistical tricks
  • Good benchmarks identify the systems that provide good performance at the most reasonable cost

Clock Rate, MIPS, and FLOPS
❖ CPU speed is a misleading metric that is most often used by computer vendors touting their systems’ alleged superiority over others
  • MIPS: millions of instructions per second
    • Measures the rate at which the system can execute a typical mix of floating-point and integer arithmetic instructions, as well as logical operations
  • FLOPS: floating-point operations per second
    • Even more vexing than MIPS because there is no agreement as to what constitutes a floating-point operation

Synthetic Benchmarks
❖ Do not represent any particular workload or application
  • Independently compare the performance of many different systems through a standardized benchmarking application program
  • It follows that one could write a program in a third-generation language (3GL), compile it, and run it on various systems
  • The resulting execution time would lead to a single performance metric across all of the systems tested
❖ Whetstone
  • Published in 1976 by Harold J. Curnow and Brian A.
    Wichmann of the British National Physical Laboratory
  • Floating-point intensive, with many calls to library routines for computation of trigonometric and exponential functions
  • Results are reported in Kilo-Whetstone Instructions per second (KWIPS) or Mega-Whetstone Instructions per second (MWIPS)
❖ Linpack
  • A contraction of LINear algebra PACKage; a collection of subroutines, built on the Basic Linear Algebra Subprograms (BLAS), that solve systems of linear equations using double-precision arithmetic
  • Originally written in FORTRAN 77 and subsequently rewritten in C and Java
  • Sets a standard measure for FLOPS
❖ Dhrystone
  • A benchmarking program written by Reinhold P. Weicker of Siemens Nixdorf Information Systems that focuses on string manipulation and integer operations
  • The program is CPU bound, performing no I/O or system calls
  • Results are reported simply as Dhrystones per second (the number of times the test program can be run in one second)

PROGRAM OPTIMIZATION

Program Optimization Tips
❖ Give the compiler as much information as possible about what you are doing
  • Use constants and local variables where possible
  • If your language permits them, define prototypes and declare static functions
  • Use arrays instead of pointers when you can
❖ Avoid unnecessary type casting and minimize floating-point to integer conversions
❖ Avoid overflow and underflow
❖ Use a suitable data type (e.g., float, double, int)
❖ Consider using multiplication instead of division
❖ Eliminate all unnecessary branches
❖ Use iteration instead of recursion when possible
❖ Build conditional statements (e.g., if, switch, case) with the most probable cases first
❖ Declare variables in a structure in order of size with the largest ones first
❖ When a program is having performance problems, profile it before beginning optimization procedures
  • Profiling is the process of breaking your code
    into small chunks and timing each of these chunks to determine which of them consume the most time
❖ Never discard an algorithm based solely on its original performance
  • A fair comparison can occur only when all algorithms are fully optimized
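
Worked example: performance equation and system comparison

The basic performance equation and the "n times faster" / "x% faster" comparisons from the Mathematical Preliminaries slides can be tried out numerically. The sketch below is only an illustration: the function names and the workload figures (instruction count, CPI, clock period) are hypothetical and are not taken from any real machine.

```python
def cpu_time(instructions, cycles_per_instruction, seconds_per_cycle):
    """time/program = time/cycle x cycles/instruction x instructions/program."""
    return seconds_per_cycle * cycles_per_instruction * instructions


def throughput(seconds_per_task):
    """A task that takes k seconds gives a throughput of 1/k tasks per second."""
    return 1 / seconds_per_task


def times_faster(time_a, time_b):
    """Return n such that System A is n times faster than System B."""
    return time_b / time_a


def percent_faster(time_a, time_b):
    """Return x such that System A is x% faster than System B."""
    return (time_b / time_a - 1) * 100


# Hypothetical figures: the same 2,000,000-instruction program on two systems
# that share a 1 ns clock period but differ in cycles per instruction.
time_a = cpu_time(2_000_000, 1.5, 1e-9)  # System A, assumed CPI of 1.5
time_b = cpu_time(2_000_000, 3.0, 1e-9)  # System B, assumed CPI of 3.0

print(f"System A CPU time: {time_a:.6f} s")
print(f"System B CPU time: {time_b:.6f} s")
print(f"A is {times_faster(time_a, time_b):.2f} times faster than B")
print(f"A is {percent_faster(time_a, time_b):.1f}% faster than B")
print(f"A throughput: {throughput(time_a):.1f} runs per second")
```

Halving the cycles per instruction while holding the other two factors constant halves the CPU time, which is the RISC-style improvement listed under the ways to increase performance.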
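
Worked example: arithmetic, weighted, geometric, and harmonic means

The four averages described in the Mathematical Preliminaries slides can be written directly from their definitions. This is a minimal sketch with hypothetical sample data; it assumes the weights of the weighted mean are fractions of the execution mix that sum to 1, and that all measurements are positive.

```python
import math


def arithmetic_mean(xs):
    """Add the n measurements together and divide by n."""
    return sum(xs) / len(xs)


def weighted_arithmetic_mean(xs, weights):
    """Sum of frequency x measurement; weights are assumed to sum to 1."""
    return sum(w * x for x, w in zip(xs, weights))


def geometric_mean(xs):
    """G = (x1 * x2 * ... * xn)^(1/n)."""
    return math.prod(xs) ** (1 / len(xs))


def harmonic_mean(xs):
    """H = n / (1/x1 + 1/x2 + ... + 1/xn)."""
    return len(xs) / sum(1 / x for x in xs)


# Hypothetical run times (seconds) of three programs and their execution mix.
run_times = [10.0, 20.0, 30.0]
execution_mix = [0.5, 0.25, 0.25]
print("arithmetic mean:", arithmetic_mean(run_times))
print("weighted mean:  ", weighted_arithmetic_mean(run_times, execution_mix))

# Hypothetical run times normalized to a reference machine
# (reference run time / run time on the system being evaluated).
normalized = [2.0, 8.0]
print("geometric mean: ", geometric_mean(normalized))

# Hypothetical rates, e.g. operations per second on two equal-sized jobs.
rates = [40.0, 60.0]
print("harmonic mean:  ", harmonic_mean(rates))
```

Python's standard statistics module also provides geometric_mean and harmonic_mean; the hand-written versions are used here only so that each line mirrors the formula on the slide.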
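
Worked example: iteration instead of recursion

The tip "use iteration instead of recursion when possible" can be illustrated with a factorial. This is only a sketch with hypothetical function names; the slides do not prescribe a particular function, and the actual savings depend on the language and compiler or interpreter in use.

```python
def factorial_recursive(n):
    """Recursive form: one function call (and stack frame) per step."""
    if n <= 1:
        return 1
    return n * factorial_recursive(n - 1)


def factorial_iterative(n):
    """Iterative form: a single loop, no extra stack frames."""
    result = 1
    for i in range(2, n + 1):
        result *= i
    return result


# Both forms agree on small inputs.
print(factorial_recursive(10))
print(factorial_iterative(10))

# The iterative form also handles inputs deep enough that the recursive
# form would need thousands of nested calls.
print(len(str(factorial_iterative(5000))), "digits in 5000!")
```

The iterative version avoids the per-call overhead of the recursive one and is not bounded by the depth of the call stack.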
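
Worked example: profiling by timing chunks of a program

Profiling as described above (break the code into chunks and time each one) can be sketched with a wall-clock timer. The three chunk functions and the time_chunk helper are hypothetical stand-ins for the stages of a real program; the measured times will vary from machine to machine.

```python
import time


def time_chunk(label, func, *args):
    """Run one chunk of the program and return (label, seconds, result)."""
    start = time.perf_counter()
    result = func(*args)
    elapsed = time.perf_counter() - start
    return label, elapsed, result


# Hypothetical program split into three chunks.
def build_data(n):
    return [(i * 7919) % 10007 for i in range(n)]


def sort_data(data):
    return sorted(data)


def summarize(data):
    return sum(data) / len(data)


timings = []

label, elapsed, data = time_chunk("build", build_data, 50_000)
timings.append((label, elapsed))

label, elapsed, ordered = time_chunk("sort", sort_data, data)
timings.append((label, elapsed))

label, elapsed, average = time_chunk("summarize", summarize, ordered)
timings.append((label, elapsed))

# Report the chunks from most to least expensive.
ranked = sorted(timings, key=lambda item: item[1], reverse=True)
for label, elapsed in ranked:
    print(f"{label:10s} {elapsed:.6f} s")

slowest = ranked[0][0]
print("optimize first:", slowest)
```

The chunk at the top of the report is the first candidate for optimization. For larger programs, Python's built-in cProfile module gives a per-function breakdown without manual timing code.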