Parallel and Distributed Computing
Subject Code:
Credit structure: 2, 0, 2, 4, 4
Course Description
The goal of the course is to provide an introduction to parallel computing, including parallel computer architectures, analytical modeling of parallel programs, and principles of parallel algorithm design. We introduce existing mainstream parallel programming environments and their current state of development, giving students the basic knowledge of parallel programming. The labs will guide students in using tools to write parallel programs, enabling them to master the skills to design, code, and debug parallel software.
Objectives
1. To introduce the fundamentals of parallel and distributed computing, including parallel and distributed architectures and paradigms.
2. To understand the technologies, system architecture, and communication architecture that propelled the growth of parallel and distributed computing systems.
3. To develop and execute basic parallel and distributed applications using basic programming models and tools.
Expected Outcomes
Students who complete this course successfully are expected to:
Module 4: Communication (4 hours) [5, 17]
Message Passing; Distributed Shared Memory; Group Communication; Case Study (Sockets, MPI, OpenMP, RPC and Java RMI)
Module 5: Coordination (5 hours) [5, 17]
Time and Global States; Coordination and Agreement; Transaction and Concurrency Control; Distributed Transactions
Module 6: Distributed Systems (6 hours) [5, 17]
Characterization of Distributed Systems; Distributed File System; Distributed Web-based System; Distributed Coordination-based System
Module 7: Distributed Computing Platforms (2 hours) [5, 17]
Cluster Computing; Grid Computing; Cloud Computing
Module 8: Recent Trends (2 hours) [11]
MapReduce Paradigm; Hadoop; Spark
Lecture Hours: 30
Lab (Indicative List of Experiments) [5, 17, 14]
1. OpenMP Basic programs such as Vector Addition and Dot Product (a sketch follows this list)
3. OpenMP Combined parallel loop reduction and Orphaned parallel loop reduction
4. OpenMP Matrix multiply (specify run on a GPU card; the scale of the data and the complexity of the problem need to be specified)
13. CUDA
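A minimal OpenMP sketch for lab item 1, covering both vector addition and dot product; the array size is an illustrative choice, not prescribed by the lab sheet:

#include <stdio.h>
#include <stdlib.h>
#include <omp.h>

#define N 1000000  /* illustrative problem size */

int main(void) {
    double *a = malloc(N * sizeof(double));
    double *b = malloc(N * sizeof(double));
    double *c = malloc(N * sizeof(double));
    double dot = 0.0;
    for (int i = 0; i < N; i++) { a[i] = i; b[i] = 2.0 * i; }

    /* Vector addition: each iteration is independent, so the loop
       can be divided among threads with a simple parallel for. */
    #pragma omp parallel for
    for (int i = 0; i < N; i++)
        c[i] = a[i] + b[i];

    /* Dot product: a reduction clause combines the per-thread
       partial sums into a single result. */
    #pragma omp parallel for reduction(+:dot)
    for (int i = 0; i < N; i++)
        dot += a[i] * b[i];

    printf("c[N-1] = %f, dot = %f\n", c[N - 1], dot);
    free(a); free(b); free(c);
    return 0;
}

Compile with an OpenMP-capable compiler (e.g., gcc -fopenmp) and set OMP_NUM_THREADS to vary the thread count.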
Sample Exercises
1. Assume that a program generates large quantities of floating point data that is stored
in an array. In order to determine the distribution of the data, we can make a
histogram of the data. To make a histogram, we simply divide the range of the data
up into equal sized subintervals, or bins; determine the number of measurements in
each bin; and plot a bar graph showing the relative sizes of the bins. Use MPI to
implement the histogram.
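A minimal MPI sketch of this exercise; the data range, bin count, and random test data are illustrative assumptions (in the exercise the data would come from the program being measured), with MPI_Reduce combining the per-process counts:

#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

#define DATA_PER_PROC 100000  /* illustrative */
#define BIN_COUNT 10

int main(int argc, char *argv[]) {
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    double min = 0.0, max = 5.0;           /* assumed data range */
    double width = (max - min) / BIN_COUNT;
    int local_hist[BIN_COUNT] = {0}, global_hist[BIN_COUNT] = {0};

    /* Each process bins its own share of the data locally. */
    srand(rank + 1);
    for (int i = 0; i < DATA_PER_PROC; i++) {
        double x = min + (max - min) * rand() / (double) RAND_MAX;
        int bin = (int)((x - min) / width);
        if (bin == BIN_COUNT) bin--;       /* x == max edge case */
        local_hist[bin]++;
    }

    /* Combine the per-process counts into one global histogram. */
    MPI_Reduce(local_hist, global_hist, BIN_COUNT, MPI_INT,
               MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        for (int b = 0; b < BIN_COUNT; b++)
            printf("bin %d [%.2f, %.2f): %d\n", b,
                   min + b * width, min + (b + 1) * width,
                   global_hist[b]);
    MPI_Finalize();
    return 0;
}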
2. Suppose we toss darts randomly at a square dartboard with a circle inscribed in it. The fraction of tosses that land inside the circle approximates π/4, since the ratio of the area of the circle to the area of the square is π/4.
We can use this formula to estimate the value of π with a random number generator:
number_in_circle = 0;
for (toss = 0; toss < number_of_tosses; toss++) {
    x = 2.0 * rand() / RAND_MAX - 1.0;  /* random double in [-1, 1] */
    y = 2.0 * rand() / RAND_MAX - 1.0;
    distance_squared = x * x + y * y;
    if (distance_squared <= 1) number_in_circle++;
}
pi_estimate = 4 * number_in_circle / (double) number_of_tosses;
This is called a Monte Carlo method, since it uses randomness. Write a program
that uses the above Monte Carlo method to estimate π (MPI / Pthreads / OpenMP).
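A minimal MPI version of the loop above; the per-process toss count and the per-rank seeding scheme are illustrative choices:

#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

int main(int argc, char *argv[]) {
    int rank, size;
    long long local_tosses = 1000000, local_in = 0, total_in = 0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    srand(rank + 1);  /* distinct seed per process */
    for (long long toss = 0; toss < local_tosses; toss++) {
        double x = 2.0 * rand() / RAND_MAX - 1.0;
        double y = 2.0 * rand() / RAND_MAX - 1.0;
        if (x * x + y * y <= 1.0) local_in++;
    }

    /* Sum the per-process hit counts on rank 0. */
    MPI_Reduce(&local_in, &total_in, 1, MPI_LONG_LONG,
               MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("pi estimate = %f\n",
               4.0 * total_in / (double)(local_tosses * size));
    MPI_Finalize();
    return 0;
}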
3. Consider a problem where you have a large array of N data elements, and
you want to perform the same computation on each element of the array. Take
suitable initial conditions for the array. The operation to perform on each
element is a given function f. After the operation has been performed on all
array elements, we want to calculate the sum of all of the array elements.
Write a parallel code using the Master-Slave scheme, where the Master
initializes the array, distributes chunks of the array to the slave workers to
do the computation, and then gathers the computed data to ultimately produce
the sum (a sketch follows). Also, achieve a load balance between processors.
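A minimal sketch of the master-slave scheme using MPI collectives; the array size N, the initial values, and the per-element function f (here sqrt) are illustrative assumptions, and equal-sized chunks stand in for a simple static load balance:

#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <mpi.h>

double f(double x) { return sqrt(x); }  /* assumed per-element operation */

int main(int argc, char *argv[]) {
    int rank, size;
    const int N = 1 << 20;  /* illustrative; assume N divisible by size */

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    int chunk = N / size;
    double *data = NULL;
    double *local = malloc(chunk * sizeof(double));

    if (rank == 0) {                 /* master initializes the array */
        data = malloc(N * sizeof(double));
        for (int i = 0; i < N; i++) data[i] = i + 1.0;
    }

    /* Distribute equal-sized chunks to all processes. */
    MPI_Scatter(data, chunk, MPI_DOUBLE,
                local, chunk, MPI_DOUBLE, 0, MPI_COMM_WORLD);

    double local_sum = 0.0;
    for (int i = 0; i < chunk; i++) local_sum += f(local[i]);

    /* Gather partial sums and combine them on the master. */
    double total = 0.0;
    MPI_Reduce(&local_sum, &total, 1, MPI_DOUBLE,
               MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0) { printf("sum = %f\n", total); free(data); }
    free(local);
    MPI_Finalize();
    return 0;
}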
4. Implement a global sum across processes using:
a. Communications in a ring.
b. Hypercube communications.
Put in timing calls using MPI_Wtime to test the timing of your routines compared to
using the corresponding MPI collective routine (e.g., MPI_Allreduce) to do the same
computation (a sketch of variant a follows).
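A sketch of variant (a), assuming each process contributes one double to a global sum that travels around the ring; MPI_Wtime brackets both the hand-written version and the library collective for comparison:

#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[]) {
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    double my_value = rank + 1.0;      /* illustrative local value */
    double sum = my_value, incoming;
    int left = (rank - 1 + size) % size, right = (rank + 1) % size;

    double start = MPI_Wtime();
    /* Ring allreduce: after size-1 steps, every value has been
       forwarded all the way around and every process has the sum. */
    double sending = my_value;
    for (int step = 0; step < size - 1; step++) {
        MPI_Sendrecv(&sending, 1, MPI_DOUBLE, right, 0,
                     &incoming, 1, MPI_DOUBLE, left, 0,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        sum += incoming;
        sending = incoming;            /* forward what we received */
    }
    double ring_time = MPI_Wtime() - start;

    /* Compare against the library collective. */
    double lib_sum;
    start = MPI_Wtime();
    MPI_Allreduce(&my_value, &lib_sum, 1, MPI_DOUBLE,
                  MPI_SUM, MPI_COMM_WORLD);
    double lib_time = MPI_Wtime() - start;

    if (rank == 0)
        printf("ring: %f (%.6fs)  allreduce: %f (%.6fs)\n",
               sum, ring_time, lib_sum, lib_time);
    MPI_Finalize();
    return 0;
}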
5. Conway's Game of Life is played on a rectangular grid of cells that may or may not
contain an organism. The state of the cells is updated each time step by applying the
following set of rules:
Every organism with two or three neighbors survives.
Every organism with four or more neighbors dies from overcrowding.
Every organism with zero or one neighbors dies from isolation.
Every empty cell adjacent to exactly three organisms gives birth to a new one.
Create an MPI program that evolves a board of arbitrary size (dimensions could be
specified at the command line) over several iterations. The board could be randomly
generated or read from a file. Try applying the geometric decomposition pattern to
partition the work among your processes (see the sketch after this exercise).
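One possible shape for the geometric decomposition, sketched under the assumption of a row-block partition with one ghost row on each side; the board size, step count, and random initialization are illustrative, and cells beyond the board edges are treated as dead:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <mpi.h>

int main(int argc, char *argv[]) {
    int rank, size, rows = 64, cols = 64, steps = 100; /* illustrative */
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    int local = rows / size;   /* assume rows divisible by size */
    /* local rows plus one ghost row above and one below */
    char *grid = calloc((local + 2) * cols, 1);
    char *next = calloc((local + 2) * cols, 1);
    srand(rank + 1);
    for (int i = 1; i <= local; i++)
        for (int j = 0; j < cols; j++)
            grid[i * cols + j] = rand() % 2;

    int up = rank > 0 ? rank - 1 : MPI_PROC_NULL;
    int down = rank < size - 1 ? rank + 1 : MPI_PROC_NULL;

    for (int s = 0; s < steps; s++) {
        /* Exchange boundary rows with neighbours; MPI_PROC_NULL
           makes the board edges behave as dead cells. */
        MPI_Sendrecv(&grid[1 * cols], cols, MPI_CHAR, up, 0,
                     &grid[(local + 1) * cols], cols, MPI_CHAR, down, 0,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        MPI_Sendrecv(&grid[local * cols], cols, MPI_CHAR, down, 1,
                     &grid[0], cols, MPI_CHAR, up, 1,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);

        /* Apply the rules to the interior rows. */
        for (int i = 1; i <= local; i++)
            for (int j = 0; j < cols; j++) {
                int n = 0;
                for (int di = -1; di <= 1; di++)
                    for (int dj = -1; dj <= 1; dj++)
                        if ((di || dj) && j + dj >= 0 && j + dj < cols)
                            n += grid[(i + di) * cols + (j + dj)];
                char alive = grid[i * cols + j];
                next[i * cols + j] = alive ? (n == 2 || n == 3) : (n == 3);
            }
        memcpy(grid, next, (local + 2) * cols);
    }
    free(grid); free(next);
    MPI_Finalize();
    return 0;
}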
6. A search engine can be implemented using a farm of servers, each of which contains
a subset of the data that can be searched. Assume that this server farm has a single
front end that interacts with clients who submit queries. Implement the above server
farm using the master-worker pattern (a sketch follows).
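A minimal sketch of the master-worker idea for this exercise, with hard-coded stand-in "documents" and a substring match in place of a real index; the query string and the 64-server gather buffer are illustrative assumptions:

#include <stdio.h>
#include <string.h>
#include <mpi.h>

int main(int argc, char *argv[]) {
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Each server holds a subset of the data; here, a few strings. */
    const char *docs[2];
    docs[0] = rank % 2 ? "parallel computing" : "distributed systems";
    docs[1] = rank % 2 ? "message passing" : "shared memory";

    /* The front end (rank 0) takes the client query and broadcasts
       it to every server in the farm. */
    char query[64] = "";
    if (rank == 0) strcpy(query, "parallel");
    MPI_Bcast(query, 64, MPI_CHAR, 0, MPI_COMM_WORLD);

    /* Each worker searches only its own subset... */
    int local_hits = 0;
    for (int d = 0; d < 2; d++)
        if (strstr(docs[d], query)) local_hits++;

    /* ...and the front end gathers the per-server hit counts
       (buffer assumes at most 64 servers). */
    int hits[64];
    MPI_Gather(&local_hits, 1, MPI_INT, hits, 1, MPI_INT,
               0, MPI_COMM_WORLD);
    if (rank == 0)
        for (int p = 0; p < size; p++)
            printf("server %d: %d hit(s)\n", p, hits[p]);
    MPI_Finalize();
    return 0;
}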
8. Build a P2P infrastructure in which every peer has the ability to publish XML
snippets and to express interest in particular XML fragments.
Text Books
1. George Coulouris, Jean Dollimore, Tim Kindberg, and Gordon Blair, Distributed Systems:
Concepts and Design, 5th Edition, Pearson / Addison Wesley, 2012
2. Ananth Grama, Anshul Gupta, George Karypis and Vipin Kumar, Introduction to Parallel
Computing, Pearson, 2nd Edition, 2008
Reference Books
1. Andrew S. Tanenbaum and Maarten Van Steen, Distributed Systems: Principles and
Paradigms, Pearson, 2nd Edition, 2006
2. Pradeep K. Sinha, Distributed Operating System: Concepts and Design, PHI Learning
Pvt. Ltd., 2007
Knowledge Areas that contain topics and learning outcomes covered in the course
Total hours: 30
This section introduces diverse multiprocessor architectures such as shared memory, message
passing, and data parallel. We cover how communication and coordination occur in parallel
computing systems.
This section covers parallel algorithm design and development for shared memory and other
parallel architectures. It introduces languages for parallel programming such as OpenMP and
MPI. We also discuss multi-threading and other relevant literature.
This course is designed with 100 minutes of in-classroom sessions per week, video/reading
instructional material, 100 minutes of lab hours per week, as well as 200 minutes of non-contact
time spent on implementing a course-related project. Generally, this course should combine
lectures, in-class discussion, case studies, guest lectures, mandatory off-class reading material,
and quizzes.
Additional weightage will be given based on students' rank in crowd-sourced projects / Kaggle-like
competitions.
30 Hours (2 credit hours / week schedule)