Assignment 1

The document outlines an assignment on parallel computing that is due on September 17, 2019. It contains 5 questions related to parallelizing algorithms for summing numbers, optimizing memory access for vector and matrix operations, and analyzing performance on single-processor and multi-processor systems based on cache hit rates and memory access times. The student is asked to analyze algorithms, calculate performance metrics like speedup and efficiency, and determine peak achievable performance for different parallel problems and system configurations.

Uploaded by

Samira Yomi


Assignment 1

Due on September 17, 2019

1. Parallelize the program for finding the sum of n numbers (a1 + a2 + ... + an) using different
numbers of processors (Algorithm 1: n/2 processors; Algorithm 2: n/log2 n processors).

(1) Draw the diagrams for the implementations of Algorithm 1 and Algorithm 2 respectively.

(2) Find the numbers of operations for the implementations of the sequential algorithm,
Algorithm 1, and Algorithm 2 respectively.

(3) Calculate the speedups of Algorithm 1 and Algorithm 2 respectively.

(4) Calculate the efficiencies of Algorithm 1 and Algorithm 2 respectively.

(5) Are Algorithm 1 and Algorithm 2 cost optimal respectively? Justify your answer.
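
As a rough illustration of the kind of parallel summation Question 1 asks about, the sketch below performs a pairwise (tree) reduction sequentially. Each pass of the outer loop stands for one parallel step, since the additions inside a pass are independent of one another. The function name `tree_sum` and the `steps` output parameter are hypothetical and chosen only for this example; this is one possible reading of an n/2-processor scheme, not necessarily the intended Algorithm 1.

```c
#include <assert.h>
#include <stddef.h>

/* Sums a[0..n-1] in place by pairwise (tree) reduction.
 * Each pass of the outer loop models one parallel step: the additions
 * in the inner loop touch disjoint elements, so they could run at once
 * on separate processors. The number of passes is stored in *steps
 * when steps is not NULL. The input array is overwritten. */
double tree_sum(double *a, size_t n, unsigned *steps)
{
    unsigned passes = 0;

    assert(a != NULL || n == 0);
    if (n == 0) {
        if (steps != NULL)
            *steps = 0;
        return 0.0;
    }
    for (size_t stride = 1; stride < n; stride *= 2) {
        for (size_t i = 0; i + stride < n; i += 2 * stride)
            a[i] += a[i + stride];   /* combine two partial sums */
        passes++;
    }
    if (steps != NULL)
        *steps = passes;
    return a[0];
}
```

For an array of 8 elements the first pass performs 4 independent additions, the second 2, and the third 1, so the sum is available after 3 passes.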

2. Consider a memory system with a level 1 cache of 32 KB and DRAM of 512 MB with
the processor operating at 1 GHz. The latency to L1 cache is one cycle and the latency to
DRAM is 100 cycles. In each memory cycle, the processor fetches four words (cache line
size is four words). What is the peak achievable performance of a dot product of two
vectors? Note: Where necessary, assume an optimal cache placement policy.

    /* dot product loop */
    for (i = 0; i < dim; i++)
        dot_prod += a[i] * b[i];
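
One way to reason about questions of this kind is a simple cost model in which DRAM line fetches dominate the running time. The helper below is a sketch under several assumptions that are not stated in the assignment: fetches do not overlap with each other or with computation, cache hits and arithmetic are free, and each loop iteration counts as two floating-point operations (one multiply, one add). The name `peak_mflops` and its parameters are hypothetical.

```c
#include <assert.h>

/* Estimated rate in MFLOPS for a loop that performs `flops`
 * floating-point operations per `lines` cache-line fetches from DRAM.
 * Assumes (for illustration only) that fetch time dominates and that
 * nothing overlaps. */
double peak_mflops(double flops, double lines,
                   double dram_cycles, double clock_ghz)
{
    assert(flops >= 0.0 && lines > 0.0);
    assert(dram_cycles > 0.0 && clock_ghz > 0.0);

    /* cycles / (cycles per ns) = time in ns */
    double ns = lines * dram_cycles / clock_ghz;

    /* flops per ns is GFLOPS; scale by 1000 for MFLOPS */
    return flops * 1000.0 / ns;
}
```

Under this model, one four-word line of `a` and one of `b` support four iterations of the loop, that is 8 operations per 2 line fetches, and `peak_mflops(8.0, 2.0, 100.0, 1.0)` evaluates to 40. Different assumptions about overlap or prefetching would give a different figure.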

3. Now consider the problem of multiplying a dense matrix with a vector using a two-loop
dot-product formulation. The matrix is of dimension 4K x 4K. (Each row of the matrix
takes 16 KB of storage.) What is the peak achievable performance of this technique using
a two-loop dot-product based matrix-vector product?

    /* matrix-vector product loop */
    for (i = 0; i < dim; i++)
        for (j = 0; j < dim; j++)
            c[i] += a[i][j] * b[j];

4. Extending this further, consider the problem of multiplying two dense matrices of
dimension 4K x 4K. What is the peak achievable performance using a three-loop dot-product
based formulation? (Assume that matrices are laid out in a row-major fashion.)

    /* matrix-matrix product loop */
    for (i = 0; i < dim; i++)
        for (j = 0; j < dim; j++)
            for (k = 0; k < dim; k++)
                c[i][j] += a[i][k] * b[k][j];
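
The row-major assumption in Question 4 matters because it determines the stride of each access in the inner loop. The sketch below writes the same three-loop product over flat arrays so that the index arithmetic is explicit. The function name `matmul_rowmajor` and the flat-array layout are assumptions made for illustration, not part of the assignment.

```c
#include <assert.h>
#include <stddef.h>

/* Computes c = a * b for dim x dim matrices stored row-major in flat
 * arrays, so element (r, s) lives at index r * dim + s. */
void matmul_rowmajor(const double *a, const double *b,
                     double *c, size_t dim)
{
    assert(a != NULL && b != NULL && c != NULL);

    for (size_t i = 0; i < dim; i++) {
        for (size_t j = 0; j < dim; j++) {
            double sum = 0.0;
            for (size_t k = 0; k < dim; k++) {
                /* As k grows, the index into a moves by 1 element
                 * (along a row) while the index into b moves by dim
                 * elements (down a column). */
                sum += a[i * dim + k] * b[k * dim + j];
            }
            c[i * dim + j] = sum;
        }
    }
}
```

Because consecutive values of `k` touch adjacent elements of `a` but elements of `b` that are `dim` apart, the accesses to `b` in the inner loop land on a different cache line each time when `dim` exceeds the line size.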

5. Consider an SMP with a distributed shared-address-space. Consider a simple cost
model in which it takes 10 ns to access local cache, 100 ns to access local memory, and
400 ns to access remote memory. A parallel program is running on this machine. The
program is perfectly load balanced with 80% of all accesses going to local cache, 10% to
local memory, and 10% to remote memory. What is the effective memory access time for
this computation? If the computation is memory bound, what is the peak computation
rate?

Now consider the same computation running on one processor. Here, the processor hits
the cache 70% of the time and local memory 30% of the time. What is the effective peak
computation rate for one processor? What is the fractional computation rate of a processor
in a parallel configuration as compared to the serial configuration?

Hint: Notice that the cache hit rate for multiple processors is higher than that for one
processor. This is typically because the aggregate cache available on multiprocessors is
larger than on single-processor systems.
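
The effective access time in a cost model like the one in Question 5 is a weighted average of the per-level latencies. The sketch below computes that average and converts it to a memory-bound rate. The function names are hypothetical, and the conversion assumes one operation per memory access, which the assignment does not state.

```c
#include <assert.h>
#include <stddef.h>

/* Weighted-average access time in ns:
 * the sum over all levels of fraction[i] * latency_ns[i]. */
double effective_access_ns(const double *fraction,
                           const double *latency_ns, size_t levels)
{
    assert(fraction != NULL && latency_ns != NULL);

    double total = 0.0;
    for (size_t i = 0; i < levels; i++)
        total += fraction[i] * latency_ns[i];
    return total;
}

/* Memory-bound rate in MFLOPS, assuming (for illustration) exactly
 * one operation per memory access. */
double memory_bound_mflops(double access_ns)
{
    assert(access_ns > 0.0);
    return 1000.0 / access_ns;   /* 1 op per ns would be 1000 MFLOPS */
}
```

With fractions {0.8, 0.1, 0.1} and latencies {10, 100, 400} ns the weighted average comes to about 58 ns, and with fractions {0.7, 0.3} and latencies {10, 100} ns it comes to about 37 ns. Dividing the two resulting rates gives the fractional rate asked for at the end of the question.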
