Cse - 321 - 2


Computer Architecture

Chapter - 2

• Discusses how to measure, report, and summarize performance.

• Describes the major factors that determine the performance of a computer.
Why is examining performance important?
• Hardware performance is often key to the effectiveness of an
entire system.

• Computer components are updated and improved frequently and rapidly, and their prices rise accordingly. Computer hardware and software cost an organization a lot of money, so it is very important that its IT department choose and buy the most appropriate and cost-effective computer hardware.
Why is assessing performance challenging?

• The scale and intricacy of modern software systems, together with the wide range of performance improvement techniques employed by hardware designers, have made performance assessment much more difficult.

• For different types of applications, different performance metrics may be appropriate, and different aspects of a computer system may be the most significant in determining overall performance.
Measuring Performance

• Time is the measure of computer performance.
• Program execution time is measured in seconds per program.
• Wall-clock time / response time / elapsed time / execution time – the total time to complete a task, including disk accesses, memory accesses, I/O activity, and OS overhead.
• Throughput – the total amount of work done in a given time.
Performance analysis
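$$\text{Performance}_X = \frac{1}{\text{Execution time}_X}$$

If the performance of X is greater than the performance of Y:

$$\text{Performance}_X > \text{Performance}_Y$$

$$\frac{1}{\text{Execution time}_X} > \frac{1}{\text{Execution time}_Y}$$

$$\text{Execution time}_Y > \text{Execution time}_X$$

That is, X is faster than Y.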
Continuation

• "X is n times faster than Y" means

$$\frac{\text{Performance}_X}{\text{Performance}_Y} = n$$

$$\frac{\text{Performance}_X}{\text{Performance}_Y} = \frac{\text{Execution time}_Y}{\text{Execution time}_X} = n$$
Relative performance

• Example: If machine A runs a program in 10 seconds and machine B runs the same program in 15 seconds, how much faster is A than B?
– A is n times faster than B if

$$\frac{\text{Performance}_A}{\text{Performance}_B} = \frac{\text{Execution time}_B}{\text{Execution time}_A} = n$$

$$n = \frac{15}{10} = 1.5$$

– A is 1.5 times faster than B.
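The ratio in this example can be checked with a few lines of Python. This is only a minimal sketch: the function name `relative_performance` is a hypothetical helper, not something from the slides.

```python
def relative_performance(time_x: float, time_y: float) -> float:
    """Return n such that X is n times faster than Y.

    n = Performance_X / Performance_Y = Execution time_Y / Execution time_X
    """
    if time_x <= 0 or time_y <= 0:
        raise ValueError("execution times must be positive")
    return time_y / time_x


# Machine A: 10 seconds, machine B: 15 seconds (values from the example).
n = relative_performance(10.0, 15.0)
print(f"A is {n} times faster than B")  # A is 1.5 times faster than B
```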
Continuation

• We could also say that machine B is 1.5 times slower than machine A, since

$$\frac{\text{Performance}_A}{\text{Performance}_B} = n \quad\Rightarrow\quad \text{Performance}_B = \frac{\text{Performance}_A}{n}$$
CPU execution time / CPU time

• CPU execution time (CPU time) is the time the CPU spends computing for a task. It does not include time spent waiting for I/O or running other programs.

CPU execution time (CPU time) < Response time
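The gap between CPU time and response time can be observed directly. The sketch below assumes Python's standard `time` module, where `time.perf_counter()` is a wall-clock timer and `time.process_time()` reports the user plus system CPU time of the current process; `measure` and `sample_task` are hypothetical names chosen for illustration.

```python
import time


def measure(task):
    """Run task() and return (elapsed_seconds, cpu_seconds)."""
    wall_start = time.perf_counter()   # wall-clock (response) time
    cpu_start = time.process_time()    # user + system CPU time of this process
    task()
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return wall, cpu


def sample_task():
    sum(i * i for i in range(200_000))  # CPU-bound work
    time.sleep(0.2)                     # waiting: uses almost no CPU time


wall, cpu = measure(sample_task)
print(f"elapsed = {wall:.3f} s, CPU = {cpu:.3f} s")
```

The sleep stands in for I/O waits: it adds to the elapsed time but barely to the CPU time, so the CPU figure comes out smaller.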


Continuation
CPU time = User CPU time + System CPU time

• User CPU time – the CPU time spent in the program

• System CPU time – the CPU time spent in the OS performing tasks on behalf of the program
Continuation

Execution time
  ├─ Time for I/O and others
  └─ CPU time
       ├─ User CPU time
       └─ System CPU time
Continuation

• Example: the Unix time command reports
  90.7u 12.9s 2:39 65%
– User CPU time: 90.7 seconds
– System CPU time: 12.9 seconds
– Elapsed time: 2:39 = 2 × 60 + 39 = 159 seconds
– Fraction of the elapsed time that is CPU time:

$$\frac{90.7 + 12.9}{159} \approx 0.65$$
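The 65% figure can be reproduced from the other three fields. A minimal sketch, assuming the elapsed field has the form `M:SS` as in the example; `cpu_fraction` is a hypothetical helper name.

```python
def cpu_fraction(user_s: float, system_s: float, elapsed: str) -> float:
    """Fraction of elapsed time spent as CPU time.

    elapsed is a string of the form 'M:SS', e.g. '2:39'.
    """
    minutes, seconds = elapsed.split(":")
    elapsed_s = int(minutes) * 60 + float(seconds)
    return (user_s + system_s) / elapsed_s


# Values from the example output: 90.7u 12.9s 2:39
fraction = cpu_fraction(90.7, 12.9, "2:39")
print(f"{fraction:.0%}")  # 65%
```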
Continuation
• Clock cycle – Almost all computers are
constructed using a clock that determines
when events take place. These discrete time
intervals are called clock cycles (ticks / clock
ticks / clock periods / clocks / cycles).

• Clock rate – Inverse of clock period.


Relating the Metrics

$$\text{CPU execution time for a program} = \text{CPU clock cycles for a program} \times \text{Clock cycle time}$$

$$\text{CPU execution time for a program} = \frac{\text{CPU clock cycles for a program}}{\text{Clock rate}}$$

A hardware designer can improve performance by reducing either the length of the clock cycle or the number of clock cycles required for a program.
Improving Performance
Our favorite program runs in 10 seconds on
computer A, which has a 400 MHz clock. We are
trying to help a computer designer build a machine
B, that will run this program in 6 seconds. The
designer has determined that a substantial increase
in the clock rate is possible, but this increase will
affect the rest of the CPU design, causing machine
B to require 1.2 times as many clock cycles as
machine A for this program. What clock rate
should we tell the designer to target?
Improving Performance (Cont.)
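$$\text{CPU time}_A = \frac{\text{CPU clock cycles}_A}{\text{Clock rate}_A}$$

$$10\ \text{seconds} = \frac{\text{CPU clock cycles}_A}{400 \times 10^6\ \text{cycles/sec}}$$

$$\text{CPU clock cycles}_A = 10\ \text{seconds} \times 400 \times 10^6\ \text{cycles/sec} = 4000 \times 10^6\ \text{cycles}$$

$$\text{CPU time}_B = \frac{\text{CPU clock cycles}_B}{\text{Clock rate}_B} = \frac{1.2 \times \text{CPU clock cycles}_A}{\text{Clock rate}_B}$$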
Improving Performance (Cont.)
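$$6\ \text{seconds} = \frac{1.2 \times 4000 \times 10^6\ \text{cycles}}{\text{Clock rate}_B}$$

$$\text{Clock rate}_B = \frac{1.2 \times 4000 \times 10^6\ \text{cycles}}{6\ \text{seconds}} = 800 \times 10^6\ \text{cycles/sec} = 800\ \text{MHz}$$

Machine B must therefore have twice the clock rate of A to run the program in 6 seconds.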
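The same derivation can be written as a short calculation. This is a sketch under the problem's stated numbers; `required_clock_rate` and its parameter names are hypothetical.

```python
def required_clock_rate(time_a_s: float, clock_rate_a_hz: float,
                        cycle_factor: float, target_time_s: float) -> float:
    """Clock rate (Hz) machine B needs to hit target_time_s.

    cycle_factor is how many times more clock cycles B needs than A.
    """
    cycles_a = time_a_s * clock_rate_a_hz   # cycles = time x clock rate
    cycles_b = cycle_factor * cycles_a
    return cycles_b / target_time_s         # clock rate = cycles / time


# A: 10 s at 400 MHz; B: 1.2x the cycles, target 6 s.
rate_b = required_clock_rate(10.0, 400e6, 1.2, 6.0)
print(f"{rate_b / 1e6:.0f} MHz")  # 800 MHz
```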
Hardware Software Interface

• Since the machine has to execute the instructions to run the program, the execution time must depend on the number of instructions in the program.

$$\text{CPU clock cycles (for a program)} = \text{Instructions for a program} \times \text{Average clock cycles per instruction (CPI)}$$
Using the Performance Equation

• Suppose we have two implementations of the same instruction set architecture. Machine A has a clock cycle time of 1 ns and a CPI of 2.0 for some program, and machine B has a clock cycle time of 2 ns and a CPI of 1.2 for the same program. Which machine is faster for this program, and by how much?
Continuation
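Let the number of instructions in the program be I.

$$\text{CPU clock cycles}_A = I \times 2.0$$

$$\text{CPU clock cycles}_B = I \times 1.2$$

$$\text{CPU time}_A = \text{CPU clock cycles}_A \times \text{Clock cycle time}_A = I \times 2.0 \times 1\ \text{ns} = 2I\ \text{ns}$$

$$\text{CPU time}_B = I \times 1.2 \times 2\ \text{ns} = 2.4I\ \text{ns}$$

$$\frac{\text{CPU performance}_A}{\text{CPU performance}_B} = \frac{\text{Execution time}_B}{\text{Execution time}_A} = \frac{2.4I\ \text{ns}}{2I\ \text{ns}} = 1.2$$

A is 1.2 times faster than B.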
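Because the instruction count I cancels in the ratio, any positive value gives the same answer. The sketch below picks an arbitrary count to show this; `cpu_time_ns` and the value of `instruction_count` are assumptions made for illustration.

```python
def cpu_time_ns(instruction_count: int, cpi: float, cycle_time_ns: float) -> float:
    """CPU time = instruction count x CPI x clock cycle time (in ns)."""
    return instruction_count * cpi * cycle_time_ns


instruction_count = 1_000_000  # hypothetical; the ratio does not depend on it

time_a = cpu_time_ns(instruction_count, cpi=2.0, cycle_time_ns=1.0)
time_b = cpu_time_ns(instruction_count, cpi=1.2, cycle_time_ns=2.0)

ratio = time_b / time_a  # Performance_A / Performance_B
print(f"A is {ratio:.1f} times faster than B")  # A is 1.2 times faster than B
```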
Continuation
• Basic performance equation

$$\text{CPU time} = \text{Instruction count} \times \text{CPI} \times \text{Clock cycle time}$$

$$\text{CPU time} = \frac{\text{Instruction count} \times \text{CPI}}{\text{Clock rate}}$$
Continuation

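• It is possible to compute the CPU clock cycles by looking at the different types of instructions and using their individual clock cycle counts.
• In such cases,

$$\text{CPU clock cycles} = \sum_{i=1}^{n} (\text{CPI}_i \times C_i)$$

where $C_i$ is the number of instructions of class $i$ executed and $\text{CPI}_i$ is the average number of cycles per instruction for that class.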
Comparing Code Segments
• Example
– The hardware designer supplied:

Instruction class   CPI for this class
A                   1
B                   2
C                   3

– Two code sequences require the following instruction counts:

Code sequence   Instruction counts for instruction class
                A   B   C
1               2   1   2
2               4   1   1

– Which code sequence executes the most instructions?
– Which will be faster?
– What is the CPI for each sequence?
Solution

• Sequence 1 executes 2 + 1 + 2 = 5
instructions.
• Sequence 2 executes 4 + 1 + 1 = 6
instructions.
• So sequence 2 executes the most instructions.
Solution
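• $\text{CPU clock cycles}_1 = (2 \times 1) + (1 \times 2) + (2 \times 3) = 2 + 2 + 6 = 10$ cycles

• $\text{CPU clock cycles}_2 = (4 \times 1) + (1 \times 2) + (1 \times 3) = 4 + 2 + 3 = 9$ cycles

• So code sequence 2 is faster, even though it executes more instructions.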


Solution
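$$\text{CPI}_1 = \frac{\text{CPU clock cycles}_1}{\text{Instruction count}_1} = \frac{10}{5} = 2$$

$$\text{CPI}_2 = \frac{\text{CPU clock cycles}_2}{\text{Instruction count}_2} = \frac{9}{6} = 1.5$$

When comparing two machines, we must look at all three components (instruction count, CPI, and clock cycle time), which combine to form execution time.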
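The code-sequence comparison can be reproduced by summing CPI times count over the instruction classes. This is a minimal sketch; `CPI_BY_CLASS` and `analyze` are hypothetical names, while the numbers come from the example.

```python
# CPI for each instruction class, as supplied by the hardware designer.
CPI_BY_CLASS = {"A": 1, "B": 2, "C": 3}


def analyze(counts: dict) -> tuple:
    """Return (instruction count, clock cycles, CPI) for one code sequence."""
    instructions = sum(counts.values())
    cycles = sum(CPI_BY_CLASS[cls] * n for cls, n in counts.items())
    return instructions, cycles, cycles / instructions


seq1 = analyze({"A": 2, "B": 1, "C": 2})
seq2 = analyze({"A": 4, "B": 1, "C": 1})

print(seq1)  # (5, 10, 2.0)
print(seq2)  # (6, 9, 1.5)
```

Sequence 2 has the higher instruction count but the lower cycle count, which is why instruction count alone is not a reliable predictor of speed.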
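Processor   Clock rate   CPI
P1          4 GHz        1.25
P2          3 GHz        0.75

Instruction count = 10^6

Show that the claim "the largest clock rate gives the largest performance" is a fallacy.

Here,

$$\text{CPU execution time}_{P1} = \frac{\text{CPI} \times \text{Instruction count}}{\text{Clock rate}} = \frac{1.25 \times 10^6}{4 \times 10^9} = 3.125 \times 10^{-4}\ \text{s}$$

$$\text{CPU execution time}_{P2} = \frac{0.75 \times 10^6}{3 \times 10^9} = 2.5 \times 10^{-4}\ \text{s}$$

$$\frac{\text{Performance}_{P1}}{\text{Performance}_{P2}} = \frac{\text{CPU execution time}_{P2}}{\text{CPU execution time}_{P1}} = \frac{2.5 \times 10^{-4}}{3.125 \times 10^{-4}} = 0.8$$

So $\text{Performance}_{P1} = 0.8 \times \text{Performance}_{P2}$.

P1 has the higher clock rate but the lower performance, so the claim is false.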
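The counterexample can be verified numerically. In this sketch the `processors` dictionary and the helper `cpu_time` are hypothetical names; the clock rates, CPIs, and instruction count are those given above.

```python
def cpu_time(instruction_count: int, cpi: float, clock_rate_hz: float) -> float:
    """CPU time (s) = instruction count x CPI / clock rate."""
    return instruction_count * cpi / clock_rate_hz


INSTRUCTIONS = 10**6
processors = {"P1": (4e9, 1.25), "P2": (3e9, 0.75)}  # name: (clock rate, CPI)

times = {name: cpu_time(INSTRUCTIONS, cpi, rate)
         for name, (rate, cpi) in processors.items()}

fastest = min(times, key=times.get)
highest_clock = max(processors, key=lambda name: processors[name][0])

print(fastest, highest_clock)  # P2 P1
```

The processor with the highest clock rate (P1) is not the one with the shortest execution time (P2).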
Check yourself:
Processor   Clock rate   CPI
P1          2 GHz        1.5
P2          1.5 GHz      1.0
P3          3 GHz        2.5

The instruction set is the same.

1. Which processor has the highest performance?
2. If the processors each execute a program in 10 s, find the number of cycles and the number of instructions.
3. If the execution time is to be reduced by 30% while the CPI increases by 20%, what clock rate is required?
MIPS (Million instructions per second)

A measurement of program execution speed based on the number of millions of instructions executed per second.

Limitations of MIPS:

Firstly, MIPS specifies the instruction execution rate but does not take into account the capabilities of the instructions.

Secondly, MIPS varies between programs on the same computer. Thus, a machine cannot have a single MIPS rating.

Finally, MIPS can vary inversely with performance.

MIPS as a Performance Measure
MFLOPS (Million floating-point operations per second) – Performance Metric

$$\text{MFLOPS} = \frac{\text{Number of floating-point operations in a program}}{\text{Execution time} \times 10^6}$$
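Both rate metrics can be computed the same way. In the sketch below, `mflops` follows the formula above, while `mips` uses the analogous standard definition (instruction count divided by execution time × 10^6), which the slides do not spell out and is therefore an assumption here. The input numbers are hypothetical.

```python
def mflops(fp_operations: float, execution_time_s: float) -> float:
    """MFLOPS = floating-point operations / (execution time x 10^6)."""
    return fp_operations / (execution_time_s * 1e6)


def mips(instruction_count: float, execution_time_s: float) -> float:
    """Assumed standard definition: instructions / (execution time x 10^6)."""
    return instruction_count / (execution_time_s * 1e6)


# Hypothetical program: 50 million floating-point operations in 2 seconds.
print(mflops(50e6, 2.0))  # 25.0

# Hypothetical program: 10^6 instructions in 2.5e-4 seconds.
print(round(mips(10**6, 2.5e-4)))
```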
Amdahl’s Law (self)

Earlier version of Amdahl’s law:

Latest version (second law) of Amdahl’s law:

$$\text{Speedup} = \frac{\text{Performance after improvement}}{\text{Performance before improvement}} = \frac{\text{Execution time before improvement}}{\text{Execution time after improvement}}$$
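The speedup ratio is straightforward to compute. The sketch below also includes the commonly stated textbook form of Amdahl's law for the execution time after an improvement; that formula is an assumption, since it does not appear in the text above. All function names and input numbers are hypothetical.

```python
def speedup(time_before_s: float, time_after_s: float) -> float:
    """Speedup = execution time before / execution time after."""
    return time_before_s / time_after_s


def time_after_improvement(affected_s: float, unaffected_s: float,
                           improvement_factor: float) -> float:
    """Assumed textbook form of Amdahl's law:
    time after = affected time / amount of improvement + unaffected time.
    """
    return affected_s / improvement_factor + unaffected_s


# Hypothetical program: 100 s total, of which 80 s is sped up 4x.
after = time_after_improvement(affected_s=80.0, unaffected_s=20.0,
                               improvement_factor=4.0)
print(after)                  # 40.0
print(speedup(100.0, after))  # 2.5
```

Even though part of the program runs 4 times faster, the overall speedup is only 2.5, because the unaffected 20 s remains.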
