C A Lecture-3

Computer Architecture Course Lecture -1
Computer Architecture

Lecture-03

Shahanaz Islam Shaown


Lecturer, CSE, UU

Chapter - 2

• Discusses how to measure, report, and summarize performance

• Describes the major factors that determine the performance of a computer.

Why is examining performance important?
• Hardware performance is often key to the effectiveness of an entire system.

Why is assessing performance challenging?
• The scale and intricacy of modern software systems, together with the wide range of performance improvement techniques employed by hardware designers, have made performance assessment much more difficult.

• For different types of applications, different performance metrics may be appropriate, and different aspects of a computer system may be the most significant in determining overall performance.

Measures

• Response Time / Execution Time: the time between the start and completion of a task.

• Throughput: the total amount of work done in a given time.

Throughput and Response Time

• Do the following changes to a computer system increase throughput, decrease response time, or both?

– Replacing the processor in a computer with a faster version – both.

– Adding additional processors to a system that uses processors for separate tasks – throughput (response time may also improve if tasks previously had to wait in a queue).

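Continuation

$$\text{Performance}_X = \frac{1}{\text{Execution time}_X}$$

If the performance of X is greater than the performance of Y:

$$\text{Performance}_X > \text{Performance}_Y$$

$$\frac{1}{\text{Execution time}_X} > \frac{1}{\text{Execution time}_Y}$$

$$\text{Execution time}_Y > \text{Execution time}_X$$

That is, X is faster than Y.
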
Continuation

• If X is n times faster than Y, it means

$$\frac{\text{Performance}_X}{\text{Performance}_Y} = n$$

$$\frac{\text{Performance}_X}{\text{Performance}_Y} = \frac{\text{Execution time}_Y}{\text{Execution time}_X} = n$$

Example
If machine A runs a program in 10 seconds and machine B runs the same program in 15 seconds, how much faster is A than B?

Solution

– A is n times faster than B if

$$\frac{\text{Performance}_A}{\text{Performance}_B} = \frac{\text{Execution time}_B}{\text{Execution time}_A} = n$$

$$n = \frac{15}{10} = 1.5$$

– A is 1.5 times faster than B

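The ratio in this solution can be checked with a few lines of Python. This is only a minimal sketch: the function name `times_faster` and its parameter names are assumptions made for illustration and do not come from the lecture.

```python
def times_faster(exec_time_x: float, exec_time_y: float) -> float:
    """Return n such that X is n times faster than Y.

    Performance is the reciprocal of execution time, so
    Performance_X / Performance_Y = exec_time_y / exec_time_x.
    """
    if exec_time_x <= 0 or exec_time_y <= 0:
        raise ValueError("execution times must be positive")
    return exec_time_y / exec_time_x


# Machine A needs 10 s and machine B needs 15 s for the same program.
n = times_faster(exec_time_x=10.0, exec_time_y=15.0)
print(f"A is {n} times faster than B")
```

Note that the argument order matters: the faster machine's time goes in the denominator.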
Continuation

• We could also say that machine B is 1.5 times slower than machine A, since

$$\frac{\text{Performance}_A}{\text{Performance}_B} = n$$

$$\text{Performance}_B = \frac{\text{Performance}_A}{n}$$

Measuring Performance

• Time is the measure of computer performance.
• Program execution time is measured in seconds per program.
• Wall-clock time / response time / elapsed time / execution time – total time to complete a task, including disk accesses, memory accesses, I/O activity, and OS overhead.

CPU execution time / CPU time

• The time the CPU spends computing for a task. It does not include time spent waiting for I/O or running other programs.

CPU execution time (CPU time) < Response time

Continuation

CPU time = User CPU time + System CPU time

• User CPU time – the CPU time spent in the program

• System CPU time – the CPU time spent in the OS performing tasks on behalf of the program

Continuation

[Diagram: execution time is divided into time for I/O and others, user CPU time, and system CPU time; user CPU time and system CPU time together make up CPU time.]

Continuation

• Example:
• Unix time command –
• 90.7u 12.9s 2:39 65%

User CPU time: 90.7 seconds
System CPU time: 12.9 seconds
Elapsed time: 2:39 = 2*60 + 39 = 159 seconds

Fraction of the elapsed time that is CPU time:

$$\frac{90.7 + 12.9}{159} = 0.65$$

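The 65% figure in the example follows directly from the other three fields. The sketch below recomputes it; the function name `cpu_fraction` is an assumption for illustration, and the numbers are the ones from the example line.

```python
def cpu_fraction(user_s: float, system_s: float, elapsed_s: float) -> float:
    """Fraction of elapsed (wall-clock) time spent as CPU time.

    CPU time = user CPU time + system CPU time.
    """
    return (user_s + system_s) / elapsed_s


# Fields of "90.7u 12.9s 2:39 65%"
elapsed = 2 * 60 + 39          # 2 minutes 39 seconds, in seconds
fraction = cpu_fraction(user_s=90.7, system_s=12.9, elapsed_s=elapsed)
print(f"elapsed = {elapsed} s, CPU share = {fraction:.0%}")
```

The remaining share of the elapsed time went to I/O waits or to other programs.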
Continuation

• System Performance – considering elapsed time on an unloaded system

• CPU Performance – considering user CPU time.

Continuation
• Clock cycle – Almost all computers are
constructed using a clock that
determines when events take place. These
discrete time intervals are called clock cycles
(ticks / clock ticks / clock periods / clocks /
cycles).

• Clock rate – Inverse of clock period.

Relating the Metrics

$$\text{CPU execution time for a program} = \text{CPU clock cycles for a program} \times \text{Clock cycle time}$$

$$\text{CPU execution time for a program} = \frac{\text{CPU clock cycles for a program}}{\text{Clock rate}}$$

A hardware designer can improve performance by reducing either the length of the clock cycle or the number of clock cycles required for a program.

Improving Performance
Our favorite program runs in 10 seconds on
computer A, which has a 400 MHz clock. We are
trying to help a computer designer build a machine
B, that will run this program in 6 seconds. The
designer has determined that a substantial increase
in the clock rate is possible, but this increase will
affect the rest of the CPU design, causing machine
B to require 1.2 times as many clock cycles as
machine A for this program. What clock rate
should we tell the designer to target?

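Improving Performance (Cont.)

$$\text{CPU time}_A = \frac{\text{CPU clock cycles}_A}{\text{Clock rate}_A}$$

$$10\ \text{seconds} = \frac{\text{CPU clock cycles}_A}{400 \times 10^{6}\ \text{cycles/sec}}$$

$$\text{CPU clock cycles}_A = 10\ \text{seconds} \times 400 \times 10^{6}\ \text{cycles/sec} = 4000 \times 10^{6}\ \text{cycles}$$

$$\text{CPU time}_B = \frac{\text{CPU clock cycles}_B}{\text{Clock rate}_B} = \frac{1.2 \times \text{CPU clock cycles}_A}{\text{Clock rate}_B}$$
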
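Improving Performance (Cont.)

$$6\ \text{seconds} = \frac{1.2 \times 4000 \times 10^{6}\ \text{cycles}}{\text{Clock rate}_B}$$

$$\text{Clock rate}_B = \frac{1.2 \times 4000 \times 10^{6}\ \text{cycles}}{6\ \text{seconds}} = 800 \times 10^{6}\ \text{cycles/sec} = 800\ \text{MHz}$$

Machine B must therefore have twice the clock rate of A to run the program in 6 seconds.
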
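The two steps of this solution (recover the cycle count of A, then solve for the clock rate of B) can be sketched in Python. The function name `required_clock_rate` and its parameters are assumptions chosen for illustration; the input values are those of the problem statement.

```python
def required_clock_rate(time_a_s: float, clock_rate_a_hz: float,
                        cycle_factor: float, target_time_s: float) -> float:
    """Clock rate (Hz) machine B needs to hit the target execution time.

    cycles_A   = CPU time_A * clock rate_A
    cycles_B   = cycle_factor * cycles_A
    clock rate_B = cycles_B / CPU time_B
    """
    cycles_a = time_a_s * clock_rate_a_hz
    cycles_b = cycle_factor * cycles_a
    return cycles_b / target_time_s


# 10 s on a 400 MHz machine A; B needs 1.2x the cycles and must finish in 6 s.
rate_b = required_clock_rate(time_a_s=10.0, clock_rate_a_hz=400e6,
                             cycle_factor=1.2, target_time_s=6.0)
print(f"Clock rate B = {rate_b / 1e6:.0f} MHz")
```

Changing `clock_rate_a_hz` lets the same function answer variants of the problem with a different starting clock.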
Example
Our favourite program runs in 10 seconds on computer A,
which has a 2 GHz clock. We are trying to help a computer
designer build a computer, B, which will run this program in 6
seconds. The designer has determined that a substantial
increase in the clock rate is possible, but this increase will affect
the rest of the CPU design, causing computer B to require 1.2
times as many clock cycles as computer A for this program.
What clock rate should we tell the designer to target?
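
No worked solution accompanies this variant in the text. The derivation below is a sketch that applies the same two steps as the 400 MHz example, using only the numbers stated in the problem.

```latex
\begin{align*}
\text{CPU clock cycles}_A &= 10\ \text{s} \times 2 \times 10^{9}\ \text{cycles/s}
                           = 20 \times 10^{9}\ \text{cycles} \\
\text{Clock rate}_B &= \frac{1.2 \times 20 \times 10^{9}\ \text{cycles}}{6\ \text{s}}
                     = 4 \times 10^{9}\ \text{cycles/s} = 4\ \text{GHz}
\end{align*}
```

As in the 400 MHz case, B needs twice the clock rate of A.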

Hardware Software Interface

• Since the machine has to execute the instructions to run the program, the execution time must depend on the number of instructions in the program.

$$\text{CPU clock cycles (for a program)} = \text{Instructions for a program} \times \text{Average clock cycles per instruction (CPI)}$$

Using the Performance Equation

• Suppose, we have two implementations of the


same instruction set architecture. Machine A has
a clock cycle time of 1 ns and a CPI of 2.0 for
some program, and machine B has a clock
cycle time of 2 ns and a CPI of 1.2 for the
same program. Which machine is faster for this
program, and by how much?

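Continuation
Let the number of instructions of the program be I.

$$\text{CPU clock cycles}_A = I \times 2.0$$

$$\text{CPU clock cycles}_B = I \times 1.2$$

$$\text{CPU time}_A = \text{CPU clock cycles}_A \times \text{Clock cycle time}_A = I \times 2.0 \times 1\ \text{ns} = 2I\ \text{ns}$$

$$\text{CPU time}_B = I \times 1.2 \times 2\ \text{ns} = 2.4I\ \text{ns}$$

$$\frac{\text{CPU performance}_A}{\text{CPU performance}_B} = \frac{\text{Execution time}_B}{\text{Execution time}_A} = \frac{2.4I\ \text{ns}}{2I\ \text{ns}} = 1.2$$

A is 1.2 times faster than B.
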
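Because the instruction count I cancels in the ratio, any positive value gives the same answer. The sketch below illustrates this; the function name `cpu_time_ns` and the instruction count of one million are assumptions for illustration only.

```python
def cpu_time_ns(instruction_count: int, cpi: float, cycle_time_ns: float) -> float:
    """CPU time in nanoseconds = instruction count * CPI * clock cycle time."""
    return instruction_count * cpi * cycle_time_ns


# Hypothetical instruction count; it cancels when the two times are divided.
instruction_count = 1_000_000

time_a = cpu_time_ns(instruction_count, cpi=2.0, cycle_time_ns=1.0)  # machine A
time_b = cpu_time_ns(instruction_count, cpi=1.2, cycle_time_ns=2.0)  # machine B

# Performance_A / Performance_B = time_B / time_A
ratio = time_b / time_a
print(f"A is {ratio:.1f} times faster than B")
```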
Continuation
• Basic performance equation

$$\text{CPU time} = \text{Instruction count} \times \text{CPI} \times \text{Clock cycle time}$$

$$\text{CPU time} = \frac{\text{Instruction count} \times \text{CPI}}{\text{Clock rate}}$$

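Continuation

• It is possible to compute the CPU clock cycles by looking at the different types of instructions and using their individual clock cycle counts.
• In such cases,

$$\text{CPU clock cycles} = \sum_{i} (\text{CPI}_i \times C_i)$$

where CPI_i is the average number of cycles per instruction for instruction class i, and C_i is the number of instructions of class i executed.
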
Comparing Code Segments
• Example
– The hardware designer supplied:

Instruction class    CPI for this class
A                    1
B                    2
C                    3

– Two code sequences require the following:

Code sequence    Instruction counts for instruction class
                 A    B    C
1                2    1    2
2                4    1    1

– Which code sequence executes the most instructions?
– Which will be faster?
– What is the CPI for each sequence?

Solution

• Sequence 1 executes 2 + 1 + 2 = 5 instructions.
• Sequence 2 executes 4 + 1 + 1 = 6 instructions.
• So sequence 2 executes the most instructions.

Solution
• CPU clock cycles_1 = (2×1) + (1×2) + (2×3) = 2 + 2 + 6 = 10 cycles

• CPU clock cycles_2 = (4×1) + (1×2) + (1×3) = 4 + 2 + 3 = 9 cycles

• So code sequence 2 is faster.

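Solution

$$\text{CPI}_1 = \frac{\text{CPU clock cycles}_1}{\text{Instruction count}_1} = \frac{10}{5} = 2$$

$$\text{CPI}_2 = \frac{\text{CPU clock cycles}_2}{\text{Instruction count}_2} = \frac{9}{6} = 1.5$$

When comparing two machines, we must look at all three components (instruction count, CPI, and clock cycle time), which combine to form execution time.
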
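The three questions of the code-sequence example can be answered in one pass over the tables. The sketch below uses the CPI values and instruction counts from the example; the names `CPI_BY_CLASS`, `SEQUENCES`, and `summarize` are assumptions introduced for illustration.

```python
# CPI for each instruction class, as supplied by the hardware designer.
CPI_BY_CLASS = {"A": 1, "B": 2, "C": 3}

# Instruction counts per class for each code sequence.
SEQUENCES = {
    1: {"A": 2, "B": 1, "C": 2},
    2: {"A": 4, "B": 1, "C": 1},
}


def summarize(counts: dict) -> tuple:
    """Return (instruction count, clock cycles, average CPI) for one sequence."""
    instructions = sum(counts.values())
    # CPU clock cycles = sum over classes of CPI_i * C_i
    cycles = sum(CPI_BY_CLASS[cls] * n for cls, n in counts.items())
    return instructions, cycles, cycles / instructions


results = {seq: summarize(counts) for seq, counts in SEQUENCES.items()}
for seq, (instructions, cycles, cpi) in results.items():
    print(f"sequence {seq}: {instructions} instructions, {cycles} cycles, CPI {cpi}")
```

Sequence 2 executes more instructions but needs fewer cycles, which is why instruction count alone does not decide which sequence is faster.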
Example

Processor    Clock rate    CPI
P1           4 GHz         1.25
P2           3 GHz         0.75

Instruction count = 10^6

Show that the claim "the largest clock rate gives the largest performance" is a fallacy.

Here,
CPU execution time, P1 = (CPI * Instructions) / clock rate
= (1.25*10^6) / (4*10^9)
= 3.125*10^-4 seconds

CPU execution time, P2 = (0.75*10^6) / (3*10^9)
= 2.5*10^-4 seconds

Performance P1 : Performance P2 = Execution time P2 / Execution time P1
= ((0.75*10^6) / (3*10^9)) / ((1.25*10^6) / (4*10^9))
= 0.8

So, performance P1 = 0.8 * performance P2

Here, P1 has the higher clock rate but the lower performance.
So the claim is false: it is a fallacy.

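The comparison of P1 and P2 can be reproduced numerically. This is a minimal sketch using the clock rates, CPI values, and instruction count from the example; the function name `cpu_time_s` is an assumption for illustration.

```python
def cpu_time_s(instruction_count: int, cpi: float, clock_rate_hz: float) -> float:
    """CPU time in seconds = (instruction count * CPI) / clock rate."""
    return instruction_count * cpi / clock_rate_hz


INSTRUCTIONS = 10**6

time_p1 = cpu_time_s(INSTRUCTIONS, cpi=1.25, clock_rate_hz=4e9)  # 4 GHz processor
time_p2 = cpu_time_s(INSTRUCTIONS, cpi=0.75, clock_rate_hz=3e9)  # 3 GHz processor

# Performance_P1 / Performance_P2 = time_P2 / time_P1
perf_ratio = time_p2 / time_p1
print(f"P1: {time_p1:.3e} s, P2: {time_p2:.3e} s, ratio = {perf_ratio:.2f}")
```

A ratio below 1 means P1 delivers less performance than P2 despite its higher clock rate, because its CPI is higher.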
MIPS (Million instructions per second)

A measurement of program execution speed based on the number of millions of instructions executed per second.

Limitations of MIPS:

Firstly, MIPS specifies the instruction execution rate but does not specify the capabilities of the instructions.

Secondly, MIPS varies between programs on the same computer. Thus, a machine cannot have a single MIPS rating.

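The second limitation can be illustrated with the standard relation MIPS = instruction count / (execution time × 10^6), which is equivalent to clock rate / (CPI × 10^6). In the sketch below the 4 GHz clock rate is a hypothetical value chosen for illustration, the two CPI values are those of code sequences 1 and 2 from the earlier example, and the function name `mips` is an assumption.

```python
def mips(clock_rate_hz: float, cpi: float) -> float:
    """MIPS rating = clock rate / (CPI * 10^6)."""
    return clock_rate_hz / (cpi * 1e6)


CLOCK_RATE_HZ = 4e9  # hypothetical machine; not a value from the lecture

# Same machine, two programs with different average CPI.
mips_seq1 = mips(CLOCK_RATE_HZ, cpi=2.0)   # CPI of code sequence 1
mips_seq2 = mips(CLOCK_RATE_HZ, cpi=1.5)   # CPI of code sequence 2
print(f"sequence 1: {mips_seq1:.0f} MIPS, sequence 2: {mips_seq2:.0f} MIPS")
```

The same hardware reports two different ratings, so a MIPS figure is meaningful only together with the program that produced it.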
MIPS as a Performance Measure

Amdahl’s Law

Earlier version of Amdahl’s law:

Latest version (second law) of Amdahl’s law:

Speedup = (Performance after improvement) / (Performance before improvement)
= (Execution time before improvement) / (Execution time after improvement)

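The speedup ratio can be sketched in Python. The formula for the earlier version is not reproduced in the text above, so the helper `time_after_improvement` assumes the commonly stated textbook form (time after = affected time / amount of improvement + unaffected time); that form, the function names, and the 100-second workload are all assumptions for illustration.

```python
def time_after_improvement(time_affected_s: float, time_unaffected_s: float,
                           improvement_factor: float) -> float:
    """Assumed textbook form: only the affected part of the time shrinks."""
    return time_affected_s / improvement_factor + time_unaffected_s


def speedup(time_before_s: float, time_after_s: float) -> float:
    """Speedup = execution time before improvement / execution time after."""
    return time_before_s / time_after_s


# Hypothetical program: 100 s total, of which 80 s is in a part made 4x faster.
before = 100.0
after = time_after_improvement(time_affected_s=80.0, time_unaffected_s=20.0,
                               improvement_factor=4.0)
overall = speedup(before, after)
print(f"time after = {after} s, overall speedup = {overall}")
```

Even though one part runs 4 times faster, the overall speedup is smaller because the unaffected 20 seconds remain.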