Measuring and Reporting Performance
The computer user is interested in reducing response time (the time
between the start and the completion of an event), also referred to as
execution time. The manager of a large data processing center may be
interested in increasing throughput (the total amount of work done in a
given time).
Even execution time can be defined in different ways, depending on what
we count. The most straightforward definition of time is called wall-clock time,
response time, or elapsed time: the latency to complete a task,
including disk accesses, memory accesses, input/output activities, and
operating system overhead.
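To make these definitions concrete, the sketch below times a single run two
ways, assuming a POSIX system for clock_gettime: once with the wall clock,
which charges the program for everything that happens while it runs, and once
with the processor clock, which counts only CPU time spent on the process.
The workload function is a placeholder chosen purely for illustration.

    /* A sketch of timing one run two ways (assumes a POSIX system for
       clock_gettime).  Wall-clock time charges the program for everything
       that happens while it runs; clock() counts only the CPU time spent
       on this process.  workload() is a placeholder for illustration. */
    #include <stdio.h>
    #include <time.h>

    static double workload(void)      /* stand-in for a real task */
    {
        double s = 0.0;
        for (long i = 0; i < 100000000L; i++)
            s += (double)i * 1e-9;
        return s;
    }

    int main(void)
    {
        struct timespec w0, w1;       /* wall-clock stamps */
        clock_t c0, c1;               /* CPU-time stamps   */

        clock_gettime(CLOCK_MONOTONIC, &w0);
        c0 = clock();
        double r = workload();
        c1 = clock();
        clock_gettime(CLOCK_MONOTONIC, &w1);

        double wall = (w1.tv_sec - w0.tv_sec)
                    + (w1.tv_nsec - w0.tv_nsec) / 1e9;
        double cpu  = (double)(c1 - c0) / CLOCKS_PER_SEC;

        printf("result %.3f: wall-clock %.3f s, CPU %.3f s\n", r, wall, cpu);
        return 0;
    }

On a loaded machine the two numbers can differ substantially, which is exactly
why the definition of time must be stated when reporting performance.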
Choosing Programs to Evaluate Performance
A computer user who runs the same programs day in and day out would
be the perfect candidate to evaluate a new computer. To evaluate a new system,
the user would simply compare the execution time of her workload, the
mixture of programs and operating system commands that users run on a
machine.
There are five levels of programs used in such circumstances, listed
below in decreasing order of accuracy of prediction:
1. Real applications—Although the buyer may not know what fraction of
time is spent on these programs, she knows that some users will run them to
solve real problems. Examples are compilers for C, text-processing software
like Word, and other applications like Photoshop. Real applications have input,
output, and options that a user can select when running the program. There is
one major downside to using real applications as benchmarks: Real applications
often encounter portability problems arising from dependences on the operating
system or compiler. Enhancing portability often means modifying the source
and sometimes eliminating some important activity, such as interactive
graphics, which tends to be more system-dependent.
2. Modified (or scripted) applications—In many cases, real applications
are used as the building block for a benchmark either with modifications to the
application or with a script that acts as a stimulus to the application. Applications
are modified for two primary reasons: to enhance portability or to focus on one
particular aspect of system performance. For example, to create a CPU-oriented
benchmark, I/O may be removed or restructured to minimize its impact on
execution time. Scripts are used to reproduce interactive behavior, which might
occur on a desktop system, or to simulate complex multiuser interaction, which
occurs in a server system.
3. Kernels—Several attempts have been made to extract small, key pieces from
real programs and use them to evaluate performance. Livermore Loops and
Linpack are the best-known examples. Unlike real programs, no user would run
kernel programs, for they exist solely to evaluate performance. Kernels are best
used to isolate the performance of individual features of a machine to explain
the reasons for differences in performance of real programs. (A sketch of a
Linpack-style kernel appears after this list.)
4. Toy benchmarks—Toy benchmarks are typically between 10 and 100
lines of code and produce a result the user already knows before running the toy
program. Programs like the Sieve of Eratosthenes, Puzzle, and Quicksort are
popular because they are small, easy to type, and run on almost any computer.
The best use of such programs is beginning programming assignments. (A
sketch of the Sieve appears after this list.)
5. Synthetic benchmarks—Similar in philosophy to kernels, synthetic
benchmarks try to match the average frequency of operations and operands of a
large set of programs. Whetstone and Dhrystone are the most popular synthetic
benchmarks.
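As an illustration of how small a kernel can be, the loop below is a sketch in
the spirit of Linpack, whose execution time is dominated by the DAXPY
operation y = a*x + y. The vector length and repetition count here are
arbitrary choices for this sketch, not values taken from any standard
benchmark.

    /* A sketch of a Linpack-style kernel: the DAXPY loop y = a*x + y,
       which dominates Linpack's execution time.  The vector length and
       repetition count are arbitrary choices for this illustration. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    #define N    1000000              /* vector length (arbitrary)    */
    #define REPS 100                  /* repeats, for measurable time */

    int main(void)
    {
        double *x = malloc(N * sizeof(double));
        double *y = malloc(N * sizeof(double));
        double  a = 3.0;
        if (x == NULL || y == NULL)
            return 1;

        for (long i = 0; i < N; i++) { x[i] = 1.0; y[i] = 2.0; }

        clock_t t0 = clock();
        for (int rep = 0; rep < REPS; rep++)
            for (long i = 0; i < N; i++)
                y[i] = a * x[i] + y[i];      /* the kernel itself */
        clock_t t1 = clock();

        /* roughly 2 floating-point operations per element per repetition */
        double secs = (double)(t1 - t0) / CLOCKS_PER_SEC;
        printf("DAXPY: %.3f s, about %.0f MFLOPS (y[0] = %.1f)\n",
               secs, 2.0 * N * REPS / secs / 1e6, y[0]);
        free(x);
        free(y);
        return 0;
    }

Because this kernel exercises little beyond floating-point arithmetic and
sequential memory access, a good DAXPY number says nothing about, say, I/O or
operating system performance; this is the sense in which kernels isolate
individual features of a machine.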
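Toy benchmarks are short enough to reproduce in full. The following sketch of
the Sieve of Eratosthenes counts the primes below an arbitrarily chosen limit;
the answer (1229 primes below 10,000) is known before the program runs, which
is the defining property of a toy benchmark.

    /* A sketch of the Sieve of Eratosthenes as a toy benchmark: count
       the primes below a fixed limit.  The limit is an arbitrary choice;
       the answer (1229 primes below 10,000) is known in advance. */
    #include <stdio.h>
    #include <string.h>

    #define LIMIT 10000

    int main(void)
    {
        static char composite[LIMIT]; /* nonzero once crossed out */
        int count = 0;

        memset(composite, 0, sizeof composite);
        for (int i = 2; i < LIMIT; i++) {
            if (composite[i])
                continue;
            count++;                             /* i is prime */
            for (int j = 2 * i; j < LIMIT; j += i)
                composite[j] = 1;                /* cross out multiples */
        }
        printf("%d primes below %d\n", count, LIMIT);
        return 0;
    }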