IT401 Computer Organization and Architecture: Prasun Ghosal

1

IT401 Computer Organization and Architecture
Prasun Ghosal
Department of Information Technology
Bengal Engineering and Science University,
Shibpur
2
Outline
How to measure, report, and summarize performance?
What are the major factors that determine the performance of a computer?
Execution time is the only adequate measure of performance
Benchmarks: what are they, and how are they used to evaluate performance?
3
Why study Performance?
Hardware performance is often key to the effectiveness of an entire system of hardware and software
The goal is not just to assess performance but also to understand what affects the performance of a machine
To improve the performance of software, understand how hardware affects system performance:
How well does a program use the instructions of the machine?
How well does the underlying hardware implement those instructions?
How well do the memory and I/O systems perform?
4
How to define performance?
Airplanes example, with each airplane described by:
Passenger capacity
Cruising range (miles)
Cruising speed (m.p.h.)
Passenger throughput (passengers * m.p.h.)
Which airplane has the best performance?
Highest cruising speed
Longest range
Largest capacity
Speed: highest cruising speed or highest throughput
Run a program on two different workstations: which is faster?
User: response time (execution time)
Computer center manager: throughput (how many tasks were performed during a time interval)
Relationship between response time and throughput
5
Performance
Use response time (execution time) as the measure. To maximize performance, minimize the execution time for a given task:
$$\text{Performance} = \frac{1}{\text{ExecutionTime}}$$
What does it mean that Performance(X) is greater than Performance(Y)?
$$\text{Performance}(X) > \text{Performance}(Y)$$
$$\frac{1}{\text{ExecutionTime}(X)} > \frac{1}{\text{ExecutionTime}(Y)}$$
$$\text{ExecutionTime}(Y) > \text{ExecutionTime}(X)$$
X is n times faster than Y:
$$\frac{\text{Performance}(X)}{\text{Performance}(Y)} = \frac{\text{ExecutionTime}(Y)}{\text{ExecutionTime}(X)} = n$$
6
Performance Example
Machine A runs a program in 10 seconds and machine B runs
the same program in 15 seconds, how much faster is A than B?
$$\frac{\text{Performance}(A)}{\text{Performance}(B)} = \frac{\text{ExecutionTime}(B)}{\text{ExecutionTime}(A)} = \frac{15}{10} = 1.5$$
A is 1.5 times faster than B
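The same ratio can be checked with a few lines of Python. This is only a minimal sketch; the function name `times_faster` is an illustrative choice and does not come from the lecture.

```python
def times_faster(time_x: float, time_y: float) -> float:
    """Return n such that X is n times faster than Y.

    Performance is 1 / execution time, so
    Performance(X) / Performance(Y) = time_y / time_x.
    """
    if time_x <= 0 or time_y <= 0:
        raise ValueError("execution times must be positive")
    return time_y / time_x


# Machine A takes 10 s, machine B takes 15 s for the same program.
ratio_a_over_b = times_faster(10, 15)
print(f"A is {ratio_a_over_b} times faster than B")
```

A result below 1 would mean that the first machine is the slower one.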
7
Measuring Performance 1/4
Time is the measure of computer performance (seconds per program)
Response time or elapsed time
Total time to complete a task, including everything (disk access, memory access, operating system overhead, ...)
CPU execution time (CPU time)
Time the CPU spends computing for this task; it does not include time spent waiting for I/O or running other programs (some computers are time-shared)
CPU execution time can be divided into
User CPU time: CPU time spent in the program
System CPU time: CPU time spent in the operating system performing tasks on behalf of the program
8
Measuring Performance 2/4
Example of user CPU time and system CPU time
Output of the Unix time command
90.7u 12.9s 2:39 65%
User CPU time: 90.7 sec
System CPU time: 12.9 sec
Elapsed time: 2:39 (2 minutes and 39 sec) = 159 sec
% of elapsed time that is CPU time = (90.7 + 12.9)/159 = 65%
Then 100 - 65 = 35% of elapsed time was spent doing something else (waiting for I/O, running other programs, ...)
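The percentage in this example can be reproduced as follows. This is a small sketch using the numbers from the time output above; the helper name `cpu_fraction` is made up for illustration.

```python
def cpu_fraction(user_s: float, system_s: float, elapsed_s: float) -> float:
    """Fraction of elapsed time spent as CPU time (user + system)."""
    return (user_s + system_s) / elapsed_s


elapsed = 2 * 60 + 39                      # 2:39 expressed in seconds
fraction = cpu_fraction(90.7, 12.9, elapsed)

print(f"elapsed time: {elapsed} s")
print(f"CPU time share: {fraction:.0%}")
print(f"other (I/O wait, other programs): {1 - fraction:.0%}")
```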
9
Measuring Performance 3/4
Express CPU execution time in terms of another metric that relates to how fast the hardware can perform basic functions
Computers are governed by a clock that runs at a constant rate and determines when events happen in hardware
The length of a clock period is the clock cycle time (measured in nanoseconds ($10^{-9}$ sec) or picoseconds ($10^{-12}$ sec))
Clock rate is 1/(clock cycle time) (measured in megahertz (MHz = $10^{6}$ Hz) or gigahertz (GHz = $10^{9}$ Hz))
1 hertz is 1 cycle/sec
CPU execution time = CPU clock cycles for a program * clock cycle time
CPU execution time = CPU clock cycles for a program / clock rate
How to improve CPU execution time?
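One way to explore that question is to plug numbers into the two formulas. The sketch below uses hypothetical values (2 billion cycles, a 500 MHz clock) that are not from the lecture; they only illustrate that CPU time falls when either the cycle count drops or the clock rate rises.

```python
def clock_cycle_time(clock_rate_hz: float) -> float:
    """Clock cycle time in seconds is the reciprocal of the clock rate."""
    return 1.0 / clock_rate_hz


def cpu_time(clock_cycles: float, clock_rate_hz: float) -> float:
    """CPU execution time = clock cycles for the program / clock rate."""
    return clock_cycles / clock_rate_hz


# Hypothetical program and machine (assumed values, not from the slides).
cycles = 2_000_000_000
rate = 500_000_000          # 500 MHz

base = cpu_time(cycles, rate)
faster_clock = cpu_time(cycles, 2 * rate)       # double the clock rate
fewer_cycles = cpu_time(cycles // 2, rate)      # halve the cycle count

print(f"cycle time at 500 MHz: {clock_cycle_time(rate) * 1e9:.1f} ns")
print(f"baseline: {base} s, 2x clock: {faster_clock} s, half cycles: {fewer_cycles} s")
```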
10
Measuring Performance 4/4
Relating to software:
Express CPU clock cycles in terms of program instructions
CPU clock cycles = Instructions for a program * Average clock cycles per instruction
Clock cycles per instruction (the average number of cycles each instruction takes to execute) is abbreviated as CPI
CPI can be used to compare two implementations of the same instruction set architecture (since the instruction count for a program remains the same)
11
Measuring Performance 1/5
CPU clock cycles = Instructions for a program * CPI
CPU time = CPU clock cycles * clock cycle time
CPU time = Instruction count * CPI * clock cycle time
CPU time = Instruction count * CPI / clock rate
$$\text{Time} = \frac{\text{Instructions}}{\text{Program}} \times \frac{\text{Clock cycles}}{\text{Instruction}} \times \frac{\text{Seconds}}{\text{Clock cycle}}$$
Basic Performance Components
CPU execution time
Instruction count
CPI
Clock cycle time
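The three components combine as in the sketch below. The two machines and their numbers are hypothetical (assumed for illustration): both implement the same instruction set, so the instruction count is identical and only CPI and clock cycle time differ.

```python
def cpu_time(instruction_count: float, cpi: float, clock_cycle_time_s: float) -> float:
    """CPU time = instruction count * CPI * clock cycle time."""
    return instruction_count * cpi * clock_cycle_time_s


# Hypothetical implementations of the same ISA (values are assumptions).
instruction_count = 1_000_000
time_a = cpu_time(instruction_count, cpi=2.0, clock_cycle_time_s=1e-9)  # 1 ns cycle
time_b = cpu_time(instruction_count, cpi=1.2, clock_cycle_time_s=2e-9)  # 2 ns cycle

# Performance ratio is the inverse ratio of execution times.
print(f"time A = {time_a:.4f} s, time B = {time_b:.4f} s")
print(f"A is {time_b / time_a:.2f} times faster than B")
```

Machine B has the lower CPI but still loses here because its clock cycle is twice as long, which is why no single component can be judged in isolation.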
12
Measuring Performance 2/5
How to determine the values of the performance components
CPU execution time: measurement
Clock cycle time: published as part of the documentation for a machine
Instruction count:
Software tools to profile execution, or a simulator of the architecture
Hardware counters, if available, to measure the number of instructions executed
CPI: varies by application, as well as among implementations of the same instruction set. Obtained through a detailed simulation or by combining hardware counters and simulation
CPI can be calculated if the different types of instructions and their individual clock cycle counts are known
13
Measuring Performance 3/5
$$\text{CPU clock cycles} = \sum_{i=1}^{n} (CPI_i \times C_i)$$
$C_i$: number of instructions of class i executed
$CPI_i$: average number of cycles per instruction for that instruction class
n: number of instruction classes
Overall program CPI dependent on
Number of cycles for each instruction type
Frequency of each instruction type in the program
execution
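A short sketch of this calculation follows. The three instruction classes, their CPI values, and the counts are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
def total_clock_cycles(counts, cpis):
    """CPU clock cycles = sum over classes of CPI_i * C_i."""
    if len(counts) != len(cpis):
        raise ValueError("counts and cpis must have the same length")
    return sum(c * cpi for c, cpi in zip(counts, cpis))


def overall_cpi(counts, cpis):
    """Overall CPI = total clock cycles / total instruction count."""
    return total_clock_cycles(counts, cpis) / sum(counts)


# Hypothetical instruction mix: classes with CPI 1, 2 and 3 (assumed values).
class_cpis = [1, 2, 3]
class_counts = [2, 1, 2]     # instructions executed in each class

print("clock cycles:", total_clock_cycles(class_counts, class_cpis))
print("overall CPI:", overall_cpi(class_counts, class_cpis))
```

Changing only the counts while keeping the per-class CPI fixed changes the overall CPI, which is the dependence on instruction frequency stated above.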
14
Measuring Performance 4/5
CPU clock cycles = Instructions for a program * CPI
$$\text{Performance} = \frac{1}{\text{ExecutionTime}}$$
CPU time = CPU clock cycles * clock cycle time
CPU time = CPU clock cycles for a program / clock rate
CPU time = Instruction count * CPI * clock cycle time
CPU time = Instruction count * CPI / clock rate
$$\text{CPU clock cycles} = \sum_{i=1}^{n} (CPI_i \times C_i)$$
15
Benchmarks 1/3
Concept of Workload
Informally, set of programs that the user runs day in and day out
Benchmarks
Programs specifically chosen to measure performance
Form a workload that the user hopes will predict the performance of the actual
workload
Best benchmark types are real programs
Use of benchmarks whose performance depends on small code segments
encourages optimizations in either the architecture or compiler
A problem: Compilers with special-purpose optimizations targeted at specific
benchmarks. Will such optimizations produce good or correct code with a real
application?
16
Benchmarks 2/3
COPYRIGHT 1998 MORGAN KAUFMANN PUBLISHERS, INC. ALL RIGHTS RESERVED
Matrix 300 in the SPEC suite in 1989
SPEC is the System Performance Evaluation Cooperative
For matrix 300, the enhanced compiler improves performance by a factor of more than 9, although it gives much less improvement on the other benchmarks.
SPEC benchmark web site: http://www.specbench.org
17
Benchmarks 3/3
Why are real programs not always used to measure performance?
Small size of benchmark (easier compilation and simulation)
Compilers might not be available for a new machine
Numerous published performance results are available for small
benchmarks
Benchmarks are OK for the initial design phase, but a working computer
system should be evaluated with a real program
Writing Performance reports
Reproducibility
Include everything needed to be able to duplicate the experiment
18
Comparing and Summarizing Performance 1/4
Selected benchmark
Agreed to use response time or throughput
How to summarize the performance of a group of benchmarks?
Program    M/C A    M/C B
P1             1       10
P2          1000      100
Total       1001      110
A is 10 times faster than B for P1
B is 10 times faster than A for P2
What is the relative performance of A and B?
Use total execution time
$$\frac{\text{Performance}(B)}{\text{Performance}(A)} = \frac{\text{ExecutionTime}(A)}{\text{ExecutionTime}(B)} = \frac{1001}{110} = 9.1$$
19
Comparing and Summarizing Performance 2/4
B is 9.1 times faster than A for P1 and P2 together
This gives one figure as a summary of performance, directly proportional to execution time
If the workload consists of running P1 and P2 an equal number of times, this statement predicts the relative execution times for the workload on each machine
The average of execution times that is directly proportional to total execution time is the arithmetic mean (AM)
$$AM = \frac{1}{n} \sum_{i=1}^{n} \text{Time}(i)$$
Time(i): execution time for the i-th program
n: total number of programs in the workload
A smaller mean indicates a smaller average execution time and thus better performance
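As a sketch, the totals and arithmetic means for the two-program workload (P1 and P2 on machines A and B) can be computed like this. The variable and function names are illustrative only.

```python
def arithmetic_mean(times):
    """AM = (1/n) * sum of the execution times."""
    return sum(times) / len(times)


# Execution times for P1 and P2 on each machine, as in the workload table.
times_a = [1, 1000]
times_b = [10, 100]

am_a = arithmetic_mean(times_a)
am_b = arithmetic_mean(times_b)

# Because n is the same for both machines, the ratio of means
# equals the ratio of total execution times.
print(f"total A = {sum(times_a)}, total B = {sum(times_b)}")
print(f"AM(A) = {am_a}, AM(B) = {am_b}")
print(f"B is {am_a / am_b:.1f} times faster than A")
```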
20
Comparing and Summarizing Performance 3/4
The arithmetic mean is proportional to execution time if the programs in the workload are each run an equal number of times. What happens if that is not the case?
Assign a weighting factor w(i) to each program to indicate the frequency of the program in the workload
Weighted arithmetic mean
$$\text{WeightedAM} = \sum_{i=1}^{n} w(i) \times \text{Time}(i)$$
AM is the special case of the weighted AM in which all weights are equal
21
Comparing and Summarizing Performance 4/4
Program    M/C A    M/C B    M/C C
P1             1       10       20
P2          1000      100       20
Table shows runtimes of P1 and P2 on three machines A, B, and C
Workload consists of P1 and P2.
P1 is run 10 times as often as P2
Find which machine is fastest for this workload and by how much?
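One possible way to work this exercise is sketched below. It assumes the weights are normalized so that they sum to 1, giving w(P1) = 10/11 and w(P2) = 1/11; that normalization is an assumption, since the slide only states the relative frequency.

```python
def weighted_am(times, weights):
    """Weighted AM = sum of w(i) * Time(i), with weights normalized to sum to 1."""
    total_weight = sum(weights)
    return sum((w / total_weight) * t for w, t in zip(weights, times))


# P1 runs 10 times as often as P2 (relative frequencies 10 : 1).
weights = [10, 1]
runtimes = {
    "A": [1, 1000],
    "B": [10, 100],
    "C": [20, 20],
}

means = {name: weighted_am(times, weights) for name, times in runtimes.items()}
fastest = min(means, key=means.get)

for name, value in means.items():
    print(f"M/C {name}: weighted AM = {value:.2f}")
for name, value in means.items():
    if name != fastest:
        print(f"{fastest} is {value / means[fastest]:.2f} times faster than {name}")
```

Under this weighting the weighted means come out to about 91.82 for A, 18.18 for B, and 20.00 for C, so B is the fastest machine for this workload.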
22
SPEC95 Benchmarks
CPU benchmark
Created by a set of computer companies in 1989
SPEC95 (8 integer and 10 floating point programs). Figure 2.6
SPEC95 web site (http://www.specbench.org/osg/cpu95/news/cpu95descr.html)
SPEC ratio for xxx.benchmark = xxx.benchmark reference time / xxx.benchmark run time
Normalized measure. Higher results indicate faster performance
Reference machine is a Sun SPARCstation 10/40
SPECint95 or SPECfp95 summary measurement is obtained by taking the geometric mean of the SPEC ratios
$$\sqrt[n]{\prod_{i=1}^{n} \text{SPECratio}(i)}$$
where $\prod_{i=1}^{n} a_i$ denotes the product $a_1 \times a_2 \times \cdots \times a_n$
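The sketch below shows how a SPEC-style summary figure could be formed from per-benchmark ratios. The reference and run times are made-up numbers, not real SPEC95 data, and `statistics.geometric_mean` requires Python 3.8 or later.

```python
from statistics import geometric_mean


def spec_ratio(reference_time: float, run_time: float) -> float:
    """SPEC ratio = reference time / run time (higher means faster)."""
    return reference_time / run_time


# Hypothetical benchmark times in seconds (illustrative values only).
reference_times = [100.0, 400.0]
run_times = [50.0, 50.0]

ratios = [spec_ratio(ref, run) for ref, run in zip(reference_times, run_times)]
summary = geometric_mean(ratios)    # n-th root of the product of the ratios

print("SPEC ratios:", ratios)
print(f"geometric mean: {summary:.2f}")
```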
23
SPEC95 Benchmark results for Pentium and
Pentium Pro
At same clock rate, Pentium Pro
is 1.4 to 1.5 times faster
When clock rate increased by a
certain factor, processor
performance increases by a lower
factor
Pentium clock rate from 100 to
200 MHz. SPECint95 performance
improves by only 1.7 (Why?)
24
SPEC95 Benchmark results for Pentium and
Pentium Pro
At same clock rate, Pentium
Pro is 1.7 to 1.8 times faster
Clock rate from 100 to 200
MHz, SPECfp95 improves by
only 1.4 (Why?)
The memory system becomes a bottleneck as processor speed increases; the effect is more evident on the floating-point benchmarks because of their size.
25
Performance Summary Example 1/2
Program    M/C A    M/C B
P1          1482      139
P2          2266      254
P3          6206      690
Which machine is faster according to total execution time? And by how much?
Total Execution Time (A) = 1482 + 2266 + 6206 = 9954
Total Execution Time (B) = 139 + 254 + 690 = 1083
Machine B is faster by a factor of 9954/1083 = 9.19
26
Performance Summary Example 2/2
Program    M/C A    M/C B
P1          1482      139
P2          2266      254
P3          6206      690
Which machine is faster by the geometric mean measure?
Remember how SPEC reported performance?
Normalize in reference to one machine
Choose A as the reference machine
Obtain execution time ratios (ET Ratio)
ET Ratio(P1) = ET(A)/ET(B) = 1482/139 = 10.66
ET Ratio(P2) = 2266/254 = 8.92
ET Ratio(P3) = 6206/690 = 8.99
$$\text{Geometric Mean} = (\text{Ratio}(P1) \times \text{Ratio}(P2) \times \text{Ratio}(P3))^{1/3}$$
Geometric Mean = 9.49
Machine B is 9.49 times faster than A according to the geometric mean measure
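Both summary measures for this three-program example can be recomputed with a short script. This is a sketch for checking the arithmetic; the variable names are illustrative.

```python
import math

# Execution times for P1, P2, P3 on machines A and B.
et_a = [1482, 2266, 6206]
et_b = [139, 254, 690]

# Measure 1: ratio of total execution times.
total_ratio = sum(et_a) / sum(et_b)

# Measure 2: geometric mean of per-program ratios ET(A)/ET(B),
# with machine A as the reference.
ratios = [a / b for a, b in zip(et_a, et_b)]
geo_mean = math.prod(ratios) ** (1 / len(ratios))

print("per-program ratios:", [round(r, 2) for r in ratios])
print(f"total-time ratio: {total_ratio:.2f}")
print(f"geometric mean:   {geo_mean:.2f}")
```

The two measures agree that B is faster but give slightly different factors, because the geometric mean weights each program's ratio equally regardless of how long the program runs.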
27
Amdahl's Law 1/3
Pitfall
Expecting the improvement of one aspect of a machine to increase performance by an amount proportional to the size of the improvement
Program runs in 100 sec on a machine
Multiply operations are responsible for 80 sec of this time
How much do we need to improve the speed of multiplication if the program is to run 5 times faster?
Execution time after improvement = (Execution time affected by improvement / Amount of improvement) + Execution time unaffected
Execution time after improvement = 80/n + (100 - 80) = 20 (= 100/5)
20 = 80/n + 20, so 80/n = 0: no n can be found to achieve the requested improvement
Make the common case fast
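The same reasoning can be turned into a small solver. This is a sketch; the function name `required_improvement` is hypothetical, and the second call (a target of 4 times faster) is an extra case added here for contrast rather than part of the slide.

```python
from typing import Optional


def required_improvement(total: float, affected: float,
                         target_speedup: float) -> Optional[float]:
    """Solve target_time = affected / n + unaffected for n.

    Returns None when the target cannot be reached, i.e. when the
    unaffected part alone already takes the whole target time or more.
    """
    unaffected = total - affected
    target_time = total / target_speedup
    remaining = target_time - unaffected    # time left for the improved part
    if remaining <= 0:
        return None
    return affected / remaining


print(required_improvement(100, 80, 5))   # None: 20 s is unaffected, target is 20 s
print(required_improvement(100, 80, 4))   # 16.0: multiply must get 16 times faster
```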
28
Amdahl's Law 2/3
Another form of Amdahl's Law (to yield speedup)
Speedup = Performance after improvement / Performance before improvement
Speedup = Execution time before improvement / Execution time after improvement
Assume new hardware is added to the machine
f = fraction of all operations that use the new hardware
s = speedup of those operations using the new hardware
Execution time with the new hardware is $T_{new}$
Execution time without the new hardware is $T_{old}$
$$T_{new} = f \times \frac{T_{old}}{s} + (1 - f) \times T_{old}$$
Overall speedup $S = T_{old} / T_{new}$
$$\text{Speedup} = \frac{s}{s - f \times (s - 1)}$$
Overall speedup for several values of f and s:
s      f = 0.1     f = 0.5     f = 0.9     f = 0.99
2      1.052632    1.333333    1.818182    1.980198
5      1.086957    1.666667    3.571429    4.807692
10     1.098901    1.818182    5.263158    9.174312
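The tabulated speedups follow directly from the formula Speedup = s / (s - f * (s - 1)). A minimal sketch that regenerates them (the function name `amdahl_speedup` is an illustrative choice):

```python
def amdahl_speedup(f: float, s: float) -> float:
    """Overall speedup when a fraction f of the work is sped up by a factor s."""
    if not 0.0 <= f <= 1.0:
        raise ValueError("f must be between 0 and 1")
    if s <= 0:
        raise ValueError("s must be positive")
    return s / (s - f * (s - 1))


fractions = [0.1, 0.5, 0.9, 0.99]
factors = [2, 5, 10]

# Print one row per s, one column per f.
print("s     " + "  ".join(f"f={f:<8}" for f in fractions))
for s in factors:
    row = "  ".join(f"{amdahl_speedup(f, s):<10.6f}" for f in fractions)
    print(f"{s:<5} {row}")
```

Even with s = 10, the overall speedup stays close to 1 unless f is large, which is the quantitative content of "make the common case fast".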
29
Amdahl's Law 3/3
Example of memory versus processor speedup
A = B op C
Assume a memory access takes 4 cycles and a typical operation takes 2 cycles
Which of the following achieves the best increase in performance?
Increase memory speed by 50%
Double operation speed
First, calculate how many memory accesses are needed:
1 to get the instruction from memory
2 to get B and C from memory
1 to store the result (A) back in memory
So we need a total of 4 memory accesses
Memory access time = 4 (accesses) * 4 (cycles/access) = 16 cycles
Operation time = 1 (operation) * 2 (cycles/operation) = 2 cycles
Total number of cycles = 16 + 2 = 18
Option 1: increase memory speed by 50%
s1 = 1.5 (how?)
f1 = memory access time / total time = 16/18 = 0.889
S1 = 1.42
Option 2: double operation speed
s2 = 2
f2 = operation time / total time = 2/18 = 0.111
S2 = 1.059
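These two speedups can be cross-checked without the closed-form formula by counting cycles directly. The sketch below assumes that "50% faster memory" means each access takes 4/1.5 cycles and that "double operation speed" means the operation takes 2/2 cycles.

```python
MEMORY_ACCESSES = 4        # fetch instruction, read B, read C, write A
CYCLES_PER_ACCESS = 4
OPERATIONS = 1
CYCLES_PER_OPERATION = 2


def total_cycles(memory_speedup: float = 1.0, operation_speedup: float = 1.0) -> float:
    """Total cycles for A = B op C under the given speedup factors."""
    memory = MEMORY_ACCESSES * CYCLES_PER_ACCESS / memory_speedup
    operation = OPERATIONS * CYCLES_PER_OPERATION / operation_speedup
    return memory + operation


baseline = total_cycles()
speedup_memory = baseline / total_cycles(memory_speedup=1.5)
speedup_operation = baseline / total_cycles(operation_speedup=2.0)

print(f"baseline cycles: {baseline:.0f}")
print(f"option 1 (memory 50% faster): speedup {speedup_memory:.2f}")
print(f"option 2 (operation 2x):      speedup {speedup_operation:.3f}")
```

Speeding up memory wins because memory accesses account for 16 of the 18 cycles.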
30
MIPS as a Performance Metric
MIPS is million instructions per second
$$\text{MIPS} = \frac{\text{Instruction count}}{\text{Execution time} \times 10^{6}}$$
Instruction execution rate (instructions/sec)
Faster machines have a higher MIPS rating
Problems with MIPS
Does not take into account the capabilities of instructions (cannot compare computers with different ISAs)
Varies between programs on the same computer (a machine cannot have a single MIPS rating for all programs)
Can vary inversely with performance
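The last point can be illustrated with a hypothetical case. All numbers below are assumptions chosen for illustration: a 500 MHz machine, three instruction classes with CPI 1, 2 and 3, and two compiled versions of the same program with different instruction mixes.

```python
CLOCK_RATE_HZ = 500_000_000          # assumed 500 MHz clock
CLASS_CPI = [1, 2, 3]                # assumed CPI per instruction class


def execution_time(counts):
    """CPU time = sum(CPI_i * C_i) / clock rate."""
    cycles = sum(c * cpi for c, cpi in zip(counts, CLASS_CPI))
    return cycles / CLOCK_RATE_HZ


def mips(counts):
    """MIPS = instruction count / (execution time * 10^6)."""
    return sum(counts) / (execution_time(counts) * 1e6)


# Hypothetical instruction counts per class for two versions of one program.
version_1 = [5_000_000, 1_000_000, 1_000_000]
version_2 = [10_000_000, 1_000_000, 1_000_000]

for name, counts in (("version 1", version_1), ("version 2", version_2)):
    print(f"{name}: time = {execution_time(counts):.3f} s, MIPS = {mips(counts):.0f}")
```

In this made-up case version 2 earns the higher MIPS rating yet takes longer to run, because it executes many more simple instructions to do the same work.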
