Performance Evaluation of Parallel Computers

This document discusses performance evaluation of parallel computers. It defines key metrics like parallel runtime, speedup and efficiency. It explains how these metrics relate to the number of processors. It also covers standard performance measures, sources of parallel overhead, and performance laws like Amdahl's law, Gustafson's law and Sun & Ni's law. These laws describe how speedup is theoretically bounded based on the fraction of sequential vs parallel work.

Uploaded by

swapnil dwivedi

Performance Evaluation of Parallel Computers

NARENDRA KUMAR

Basics of Performance Evaluation

• A sequential algorithm is evaluated in terms of its execution time, expressed as a function of its input size.

• For a parallel algorithm, the execution time depends not only on the input size but also on factors such as the parallel architecture and the number of processors.

• Performance Metrics
  - Parallel run time
  - Speedup
  - Efficiency

• Standard Performance Measures
  - Peak performance
  - Sustained performance
  - Instruction execution rate (in MIPS)
  - Floating-point capability (in MFLOPS)

Performance Metrics

• Parallel Runtime
  - The parallel run time T(n) of a program or application is the time required to run the program on an n-processor parallel computer.
  - When n = 1, T(1) denotes the sequential run time of the program on a single processor.

• Speedup
  - Speedup S(n) is defined as the ratio of the time taken to run a program on a single processor to the time taken to run it on a parallel computer with n identical processors:
    $S(n) = \frac{T(1)}{T(n)}$
  - It measures how much faster the program runs on a parallel computer than on a single processor.

Performance Metrics

• Efficiency
  - The efficiency E(n) of a program on n processors is defined as the ratio of the speedup achieved to the number of processors used to achieve it:
    $E(n) = \frac{S(n)}{n} = \frac{T(1)}{n\,T(n)}$

• The relationship between execution time, speedup, efficiency and the number of processors used is depicted in the graphs on the next slides.

• In the ideal case:
  - Speedup is expected to be linear, i.e. it grows linearly with the number of processors, but in most cases it falls short of this because of parallel overhead.

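The two ratios defined above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides; the run times used (120 s on one processor, 20 s on eight) are hypothetical measurements chosen only for the example.

```python
def speedup(t1: float, tn: float) -> float:
    """S(n) = T(1) / T(n): sequential run time over parallel run time."""
    if t1 <= 0 or tn <= 0:
        raise ValueError("run times must be positive")
    return t1 / tn


def efficiency(t1: float, tn: float, n: int) -> float:
    """E(n) = S(n) / n: speedup per processor used."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return speedup(t1, tn) / n


# Hypothetical measurements: T(1) = 120 s, T(8) = 20 s.
s8 = speedup(120.0, 20.0)
e8 = efficiency(120.0, 20.0, 8)
print(f"S(8) = {s8:.2f}, E(8) = {e8:.2f}")
```

Linear speedup would give S(8) = 8 and E(8) = 1; the gap in this example is what the slides attribute to parallel overhead.
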
Performance Metrics

• Graphs showing the relationship between T(n), S(n), E(n) and the number of processors
  [Figure: T(n) versus number of processors]
  [Figure: S(n) versus number of processors]
  [Figure: E(n) versus number of processors]

Performance Measures

• Standard Performance Measures
  - Most of the standard measures adopted by the industry to compare the performance of parallel computers are based on two concepts:
    - Peak performance: the theoretical maximum, based on the best possible utilization of all resources
    - Sustained performance: based on running application-oriented benchmarks
  - Generally measured in units of:
    - MIPS (million instructions per second), reflecting the instruction execution rate
    - MFLOPS (million floating-point operations per second), reflecting the floating-point capability

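As a rough sketch of how a sustained rate relates to a peak rate, the helper below converts an operation count and an elapsed time into millions of operations per second. The operation count, run time and peak figure are hypothetical values chosen for illustration, not measurements of any real machine.

```python
def rate_in_millions(operation_count: float, seconds: float) -> float:
    """Operations per second, in millions (MIPS for instructions, MFLOPS for flops)."""
    if seconds <= 0:
        raise ValueError("elapsed time must be positive")
    return operation_count / seconds / 1e6


# Hypothetical benchmark run: 4.0e9 floating-point operations in 2.5 s.
sustained_mflops = rate_in_millions(4.0e9, 2.5)

# Hypothetical peak rating of the same machine, in MFLOPS.
peak_mflops = 2000.0

fraction_of_peak = sustained_mflops / peak_mflops
print(f"sustained = {sustained_mflops:.0f} MFLOPS, "
      f"{fraction_of_peak:.0%} of peak")
```
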
Performance Measures

• Benchmarks
  - Benchmarks are a set of programs or program fragments used to compare the performance of various machines.
  - Machines are run against these benchmark tests and their performance is measured.
  - When it is not possible to run the actual applications on the different machines, the results of the benchmark programs that most closely resemble those applications are used to evaluate the performance of each machine.

Performance Measures

• Benchmarks
  - Kernel benchmarks
    - Program fragments extracted from real programs
    - Each fragment is the heavily used core responsible for most of the execution time
  - Synthetic benchmarks
    - Small programs created especially for benchmarking purposes
    - These benchmarks do not perform any useful computation

• Examples
  - LINPACK
  - LAPACK
  - Livermore Loops
  - SPECmarks
  - NAS Parallel Benchmarks
  - Perfect Club Parallel Benchmarks

Parallel Overhead

• Sources of Parallel Overhead
  - In practice, parallel computers do not achieve linear speedup or an efficiency of 1 because of parallel overhead. Its major sources are:
    • Inter-processor communication
    • Load imbalance
    • Inter-task dependency
    • Extra computation
    • Parallel balance point

Speedup Performance Laws

• Speedup Performance Laws
  - Amdahl’s Law: based on a fixed problem size (fixed work load)
  - Gustafson’s Law: for scaled problems, where the problem size increases with the machine size, i.e. the number of processors
  - Sun & Ni’s Law: applied to scaled problems bounded by memory capacity

Speedup Performance Laws

• Amdahl’s Law (1967)
  - For a given problem size, the speedup does not increase linearly as the number of processors increases. In fact, the speedup tends to saturate.
  - This is a consequence of Amdahl’s Law.

• According to Amdahl’s Law, a program contains two types of operations:
  - Completely sequential
  - Completely parallel

• Let the time Ts taken to perform the sequential operations be a fraction α (0 < α ≤ 1) of the total execution time T(1) of the program. Then the time Tp taken to perform the parallel operations is the remaining fraction (1 - α) of T(1).

Speedup Performance Laws

• Amdahl’s Law
  - Thus, Ts = α · T(1) and Tp = (1 - α) · T(1).
  - Assuming that the parallel operations achieve linear speedup (i.e. on n processors they take 1/n of the time they take on a single processor), then:
    $T(n) = T_s + \frac{T_p}{n} = \alpha\,T(1) + \frac{(1-\alpha)\,T(1)}{n}$
  - Thus, the speedup with n processors will be:
    $S(n) = \frac{T(1)}{T(n)} = \frac{1}{\alpha + \frac{1-\alpha}{n}} = \frac{n}{1 + (n-1)\,\alpha}$

Speedup Performance Laws

• Amdahl’s Law
  - Sequential operations tend to dominate the speedup as n becomes very large.
  - As n → ∞, S(n) → 1/α.
  - This means that, no matter how many processors are employed, the speedup for this problem is limited to 1/α.
  - This is known as the sequential bottleneck of the problem.
  - Note: the sequential bottleneck cannot be removed just by increasing the number of processors.

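The saturation can be seen numerically with a short Python sketch. It assumes the standard fixed-workload form of Amdahl's Law, S(n) = 1 / (α + (1 - α)/n), which follows from T(n) = Ts + Tp/n; the sequential fraction α = 0.1 is a hypothetical value chosen for illustration.

```python
def amdahl_speedup(alpha: float, n: int) -> float:
    """Fixed-workload speedup S(n) = 1 / (alpha + (1 - alpha) / n)."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must satisfy 0 < alpha <= 1")
    if n < 1:
        raise ValueError("n must be at least 1")
    return 1.0 / (alpha + (1.0 - alpha) / n)


alpha = 0.1  # hypothetical sequential fraction
for n in (1, 10, 100, 1000, 10000):
    print(f"n = {n:>5}: S(n) = {amdahl_speedup(alpha, n):.3f}")

# The sequential bottleneck: S(n) never exceeds 1 / alpha.
print(f"upper bound 1/alpha = {1.0 / alpha:.1f}")
```

With a 10% sequential fraction the speedup stays below 10 however many processors are added.
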
Speedup Performance Laws

• Amdahl’s Law
  - A major shortcoming in applying Amdahl’s Law is inherent in its own assumption:
    - The total work load, or problem size, is fixed.
    - Thus, the execution time decreases as the number of processors increases.
  - A successful way of overcoming this shortcoming is therefore to increase the problem size!

Speedup Performance Laws

• Amdahl’s Law
  [Figure: graph illustrating Amdahl’s Law]

Speedup Performance Laws

• Gustafson’s Law (1988)
  - It relaxes the restriction of a fixed problem size and uses the notion of a fixed execution time to get around the sequential bottleneck.

• According to Gustafson’s Law:
  - If the number of parallel operations in the problem is increased (scaled up) sufficiently, the sequential operations will no longer be a bottleneck.

• In accuracy-critical applications, it is desirable to solve the largest possible problem on a larger machine, rather than a smaller problem on a smaller machine, in almost the same execution time.

Speedup Performance Laws

• Gustafson’s Law
  - As the machine size increases, the work load (or problem size) is also increased so as to keep the execution time of the problem fixed.
  - Let Ts be the constant time taken to perform the sequential operations, and
  - let Tp(n,W) be the time taken to perform the parallel operations of a problem of size (work load) W using n processors.
  - Then the speedup with n processors is:
    $S'(n) = \frac{T_s + T_p(1,W)}{T_s + T_p(n,W)}$

Speedup Performance Laws

• Gustafson’s Law
  [Figure: Gustafson’s Law]

Speedup Performance Laws

• Gustafson’s Law
  - Assuming that the parallel operations achieve linear speedup (i.e. on n processors they take 1/n of the time they take on one processor), Tp(1,W) = n · Tp(n,W).
  - Let α be the fraction of the sequential work load in the problem, i.e.
    $\alpha = \frac{T_s}{T_s + T_p(n,W)}$
  - Then the speedup with n processors can be expressed as:
    $S'(n) = \frac{T_s + n\,T_p(n,W)}{T_s + T_p(n,W)} = \alpha + n\,(1-\alpha) = n - \alpha\,(n-1)$

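A minimal Python sketch of the scaled speedup, assuming the standard fixed-time form of Gustafson's Law, S'(n) = n - α(n - 1), where α is the sequential fraction of the execution time on the n-processor machine. The value α = 0.1 is hypothetical.

```python
def gustafson_speedup(alpha: float, n: int) -> float:
    """Fixed-time (scaled) speedup S'(n) = n - alpha * (n - 1)."""
    if not 0 <= alpha <= 1:
        raise ValueError("alpha must satisfy 0 <= alpha <= 1")
    if n < 1:
        raise ValueError("n must be at least 1")
    return n - alpha * (n - 1)


alpha = 0.1  # hypothetical sequential fraction of the scaled run
for n in (1, 10, 100, 1000):
    print(f"n = {n:>4}: S'(n) = {gustafson_speedup(alpha, n):.1f}")
```

Unlike the fixed-workload case, this speedup keeps growing linearly in n, with slope (1 - α), because the parallel part of the work is scaled up with the machine.
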
Speedup Performance Laws

• Sun & Ni’s Law (1993)
  - This law defines a memory-bounded speedup model that generalizes both Amdahl’s Law and Gustafson’s Law to maximize the use of both processor and memory capacities.
  - The idea is to solve the largest possible problem, limited only by memory capacity.
  - This inherently demands an increased (scaled) work load, providing:
    - Higher speedup,
    - Higher efficiency, and
    - Better resource (processor and memory) utilization
  - But it may result in a slight increase in execution time to achieve this scalable speedup performance!

Speedup Performance Laws

• Sun & Ni’s Law
  - According to this law, the speedup S*(n) can be defined by:
    $S^*(n) = \frac{\alpha + (1-\alpha)\,G(n)}{\alpha + (1-\alpha)\,\frac{G(n)}{n}}$
    where G(n) is the factor by which the work load increases when the memory capacity is increased n times.
  - Assumptions made while deriving the above expression:
    • A global address space is formed from all the individual memory spaces, i.e. there is a distributed shared memory space.
    • All the available memory capacity is used up in solving the scaled problem.

Speedup Performance Laws

• Sun & Ni’s Law
  - Special cases:
    • G(n) = 1: the problem size is fixed, i.e. Amdahl’s Law.
    • G(n) = n: the work load increases n times when the memory is increased n times, i.e. the scaled problem of Gustafson’s Law.
    • G(n) > n: the computational work load (time) increases faster than the memory requirement.
  - Comparing the speedup factors S*(n), S’(n) and S(n), we find S*(n) ≥ S’(n) ≥ S(n).

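The special cases can be checked numerically with a short Python sketch. It assumes the usual form of the memory-bounded speedup, S*(n) = (α + (1 - α)G(n)) / (α + (1 - α)G(n)/n), where G(n) is the work load growth factor. The choices α = 0.1, n = 16 and the growth function G(n) = n^1.5 are illustrative assumptions, not values from the slides.

```python
from typing import Callable


def memory_bounded_speedup(alpha: float, n: int,
                           g: Callable[[int], float]) -> float:
    """S*(n) = (alpha + (1 - alpha) * G(n)) / (alpha + (1 - alpha) * G(n) / n)."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must satisfy 0 < alpha <= 1")
    if n < 1:
        raise ValueError("n must be at least 1")
    gn = g(n)
    return (alpha + (1.0 - alpha) * gn) / (alpha + (1.0 - alpha) * gn / n)


alpha, n = 0.1, 16  # hypothetical sequential fraction and machine size

fixed_size = memory_bounded_speedup(alpha, n, lambda k: 1.0)       # G(n) = 1
fixed_time = memory_bounded_speedup(alpha, n, lambda k: float(k))  # G(n) = n
memory_bound = memory_bounded_speedup(alpha, n, lambda k: k ** 1.5)  # G(n) > n

print(f"G(n) = 1     (Amdahl):    {fixed_size:.2f}")
print(f"G(n) = n     (Gustafson): {fixed_time:.2f}")
print(f"G(n) = n^1.5 (assumed):   {memory_bound:.2f}")
```

For these inputs the three values come out in the order stated on the slide, S*(n) ≥ S’(n) ≥ S(n).
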
Speedup Performance Laws

• Sun & Ni’s Law
  [Figure: Sun & Ni’s Law]

Scalability Metric

• Scalability
  - For a fixed problem size, increasing the number of processors decreases the efficiency.
  - Increasing the amount of computation per processor increases the efficiency.
  - To keep the efficiency fixed, both the problem size and the number of processors must be increased simultaneously.

• A parallel computing system is said to be scalable if its efficiency can be kept fixed by simultaneously increasing the machine size and the problem size.

• The scalability of a parallel system is a measure of its capacity to increase speedup in proportion to the machine size.

Scalability Metric

• Isoefficiency Function
  - The isoefficiency function can be used to measure the scalability of parallel computing systems.
  - It shows how the problem size must grow as a function of the number of processors used in order to maintain some constant efficiency.
  - The general form of the function is derived using an equivalent definition of efficiency:
    $E = \frac{U}{U + O}$
  - Here U is the time taken to do the useful computation (essential work), and O is the parallel overhead. (Note: O is zero for sequential execution.)

Scalability Metric

• Isoefficiency Function
  - If the efficiency is fixed at some constant value K, then
    $K = \frac{U}{U + O} \;\Rightarrow\; U = \frac{K}{1-K}\,O = K'\,O$
  - Here K’ = K / (1 - K) is a constant for the fixed efficiency K.
  - This function is known as the isoefficiency function of the parallel computing system.
  - A small isoefficiency function means that small increments in the problem size (U) are sufficient for efficient utilization of an increasing number of processors, indicating high scalability.
  - A large isoefficiency function indicates a poorly scalable system.

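The relation U = K’ · O can be sketched in Python as follows. The sketch assumes the efficiency definition E = U / (U + O) and K’ = K / (1 - K); the overhead model O(n) = n · log2(n) is a purely hypothetical stand-in for a measured or analytically derived overhead.

```python
import math


def efficiency(useful: float, overhead: float) -> float:
    """E = U / (U + O)."""
    return useful / (useful + overhead)


def required_useful_work(k: float, overhead: float) -> float:
    """Useful work U needed to hold efficiency at K: U = K / (1 - K) * O."""
    if not 0 < k < 1:
        raise ValueError("K must satisfy 0 < K < 1")
    return k / (1.0 - k) * overhead


def overhead_model(n: int) -> float:
    """Hypothetical overhead O(n) = n * log2(n); an assumption for illustration."""
    return n * math.log2(n)


target = 0.8  # efficiency K to be maintained
for n in (2, 4, 8, 16, 32):
    o = overhead_model(n)
    u = required_useful_work(target, o)
    print(f"n = {n:>2}: O = {o:7.1f}, required U = {u:7.1f}, "
          f"E = {efficiency(u, o):.2f}")
```

How fast the required U grows with n is what the isoefficiency function captures: the slower the growth, the more scalable the system.
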
Performance Measurement Tools

• Performance Analysis

• Search-Based Tools

• Visualization
  - Utilization displays: processor (utilization) count, utilization summary, Gantt charts, concurrency profile, Kiviat diagrams
  - Communication displays: message queues, communication matrix, communication traffic, hypercube
  - Task displays: task Gantt, task summary

Performance Measurement Tools

• Instrumentation
  - One way to collect data about an application is to instrument the application executable so that, when it executes, it generates the required information as a side effect.
  - Ways to add instrumentation:
    - By inserting it directly into the application source code, or
    - By placing it in the runtime libraries, or
    - By modifying the linked executable, etc.
  - Instrumenting an application inevitably perturbs it to some degree (the intrusion problem).

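The first option, inserting instrumentation directly into the source code, can be sketched in Python with a timing decorator. The names `instrumented`, `timings` and `partial_sum` are hypothetical, introduced only for this example; real tools typically record far more than elapsed time.

```python
import functools
import time
from collections import defaultdict

# Hypothetical in-memory store: function name -> list of elapsed times (seconds).
timings = defaultdict(list)


def instrumented(func):
    """Wrap func so that each call records its elapsed time as a side effect."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            # Recorded even if func raises, so failed calls are also counted.
            timings[func.__name__].append(time.perf_counter() - start)
    return wrapper


@instrumented
def partial_sum(values):
    """Stand-in for a piece of application work."""
    return sum(values)


result = partial_sum(range(1000))
print(result, f"{timings['partial_sum'][0]:.6f} s")
```

The timer reads and the list append inside `wrapper` are themselves extra work performed by the application, which is a small instance of the perturbation mentioned above.
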
Performance Measurement Tools

• Instrumentation
  - Intrusion includes both:
    - Direct contention for resources (e.g. CPU, memory, communication links)
    - Secondary interference with resources (e.g. interaction with cache replacement or virtual memory)
  - To address such effects, you may adopt the following approaches:
    - Accept that intrusion affects the measurement and treat the resulting data as an approximation.
    - Leave the added instrumentation in the final implementation.
    - Try to minimize the intrusion.
    - Quantify the intrusion and compensate for it!
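
The last approach, quantifying the intrusion and compensating for it, can be sketched as below. This is a simplified illustration under stated assumptions: it estimates only the direct cost of the timer reads (not secondary effects such as cache interference), and the function names `work`, `timed_call` and `probe_cost` are hypothetical.

```python
import time


def work(n: int) -> int:
    """Stand-in for the application code being measured."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def timed_call(func, *args):
    """Run func and return (result, elapsed seconds), probe cost included."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


def probe_cost(repeats: int = 10_000) -> float:
    """Estimate the average cost of one start/stop pair of timer reads."""
    start = time.perf_counter()
    for _ in range(repeats):
        t0 = time.perf_counter()
        _ = time.perf_counter() - t0
    return (time.perf_counter() - start) / repeats


overhead = probe_cost()
_, measured = timed_call(work, 100_000)

# Compensate by subtracting the estimated direct cost of the probe.
compensated = max(measured - overhead, 0.0)
print(f"measured = {measured:.6f} s, probe = {overhead:.9f} s, "
      f"compensated = {compensated:.6f} s")
```

The estimate from `probe_cost` also includes a little loop overhead, so the compensated value is still an approximation, in line with the first approach listed above.
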
