
Parallel and Distributed Computing

Subject Code:
L,T,P,J,C: 2,0,2,4,4
Course Description
The goal of the course is to provide an introduction to parallel computing,
including parallel computer architectures, analytical modeling of parallel
programs, and principles of parallel algorithm design. We introduce existing
mainstream parallel programming environments and their current state of
development, giving students the basic knowledge needed for parallel
programming. The labs guide students in using tools to write parallel
programs, enabling them to master the skills needed to design, code, and
debug parallel software.
Objectives
1. To introduce the fundamentals of parallel and distributed computing,
including parallel and distributed architectures and paradigms.
2. To understand the technologies, system architectures, and communication
architectures that propelled the growth of parallel and distributed
computing systems.
3. To develop and execute basic parallel and distributed applications using
basic programming models and tools.

Expected Outcome
Students who complete this course successfully are expected to be able to:

1. Design and implement distributed computing systems.
2. Assess models for distributed systems.
3. Design and implement distributed algorithms.
4. Experiment with mechanisms such as client/server and P2P algorithms,
remote procedure calls (RPC/RMI), and consistency.
5. Analyse the requirements for programming parallel systems and critically
evaluate the strengths and weaknesses of parallel programming models.
6. Differentiate between the major classes of parallel processing systems.
7. Analyse the efficiency of a parallel processing system and evaluate the
types of applications for which parallel programming is useful.

Module  Topics                                                           L Hrs  SLO

1  Parallelism Fundamentals                                              2      2
   Motivations - Key Concepts - Challenges - Overview: parallel
   computing, architectural demands and trends, goals of parallelism,
   communication, coordination

2  Parallel Architectures                                                4      5, 17
   Multi-core Processors - GPUs - Instruction Level Support for
   Parallel Programming - Shared vs. Distributed Memory - Symmetric
   Multiprocessing (SMP) - SIMD - Vector Processing

3  Parallel Decomposition                                                5      5, 17
   Preliminaries - Decomposition Techniques - Characteristics of Tasks
   and Interactions - Parallel Programming Models - Parallel Algorithm
   Design

4  Communication                                                         4      5, 17
   Message Passing - Distributed Shared Memory - Group Communication -
   Case Study (Sockets, MPI, OpenMP, RPC and Java RMI)

5  Coordination                                                          5      5, 17
   Time and Global States - Coordination and Agreement - Transaction
   and Concurrency Control - Distributed Transactions

6  Distributed Systems                                                   6      5, 17
   Characterization of Distributed Systems - Distributed File System -
   Distributed Web-based System - Distributed Coordination-based System

7  Distributed Computing Platforms                                       2      5, 17
   Cluster Computing - Grid Computing - Cloud Computing

8  Recent Trends                                                         2      11
   MapReduce Paradigm - Hadoop - Spark

Lecture Hours                                                            30
Lab: Indicative list of experiments (in the areas of)                    SLO: 5, 17, 14

1. OpenMP: basic programs such as vector addition and dot product

2. OpenMP: loop work-sharing and sections work-sharing

3. OpenMP: combined parallel loop reduction and orphaned parallel loop reduction

4. OpenMP: matrix multiply (specify a run on a GPU card; the scale of the data and
the complexity of the problem need to be specified)

5. MPI: basics of MPI

6. MPI: communication between MPI processes

7. MPI: advanced communication between MPI processes

8. MPI: collective operations with synchronization

9. MPI: collective operations with data movement

10. MPI: collective operations with collective computation

11. MPI: non-blocking operations

12. LRMS: Grid / P2P / Cloud

13. CUDA

Sample Exercises

1. Assume that a program generates large quantities of floating-point data that are stored
in an array. In order to determine the distribution of the data, we can make a
histogram of the data. To make a histogram, we simply divide the range of the data
into equal-sized subintervals, or bins; determine the number of measurements in
each bin; and plot a bar graph showing the relative sizes of the bins. Use MPI to
implement the histogram.
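A minimal sketch of one possible approach, assuming a fixed, known data range, a fixed
number of bins, and data already distributed so that each process holds its share in
local_data (the names below are illustrative): each process counts its own measurements
into local bins and a single MPI_Reduce sums the counts on rank 0.

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

/* Count local_n measurements held by this process into bin_count bins over
   [min_meas, max_meas) and combine the per-process counts on rank 0. */
void histogram(const double *local_data, int local_n,
               double min_meas, double max_meas, int bin_count,
               MPI_Comm comm) {
    double bin_width = (max_meas - min_meas) / bin_count;
    int *local_bins  = calloc(bin_count, sizeof(int));
    int *global_bins = calloc(bin_count, sizeof(int));
    int rank;

    for (int i = 0; i < local_n; i++) {
        int b = (int)((local_data[i] - min_meas) / bin_width);
        if (b < 0) b = 0;                       /* clamp values at the edges */
        if (b >= bin_count) b = bin_count - 1;
        local_bins[b]++;
    }

    /* Sum the per-process counts; only rank 0 receives the result. */
    MPI_Reduce(local_bins, global_bins, bin_count, MPI_INT, MPI_SUM, 0, comm);

    MPI_Comm_rank(comm, &rank);
    if (rank == 0)
        for (int b = 0; b < bin_count; b++)
            printf("bin %d: %d\n", b, global_bins[b]);

    free(local_bins);
    free(global_bins);
}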

2. Suppose we toss darts randomly at a square dartboard whose bullseye is at the
origin and whose sides are 2 feet in length. Suppose also that there is a circle
inscribed in the square dartboard. The radius of the circle is 1 foot, and its area is
π square feet. If the points that are hit by the darts are uniformly distributed (and we
always hit the square), then the number of darts that hit inside the circle should
approximately satisfy the equation

    number_in_circle / total_number_of_tosses ≈ π / 4,

since the ratio of the area of the circle to the area of the square is π/4.

We can use this formula to estimate the value of π with a random number generator:
number_in_circle = 0;
for (toss = 0; toss < number_of_tosses; toss++) {
    x = random double between -1 and 1;
    y = random double between -1 and 1;
    distance_squared = x * x + y * y;
    if (distance_squared <= 1) number_in_circle++;
}

pi_estimate = 4 * number_in_circle / (double) number_of_tosses;

This is called a Monte Carlo method, since it uses randomness. Write a program
that uses the above Monte Carlo method to estimate π (MPI / Pthreads / OpenMP).
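A compact parallel version is sketched below using OpenMP (one of the suggested
options); it assumes the POSIX rand_r() so that each thread keeps its own
random-number state, and the toss count is illustrative.

#include <omp.h>
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    const long number_of_tosses = 100000000;   /* illustrative toss count */
    long number_in_circle = 0;

    #pragma omp parallel reduction(+ : number_in_circle)
    {
        /* Per-thread seed so threads do not share RNG state. */
        unsigned int seed = 1234u + (unsigned int)omp_get_thread_num();
        #pragma omp for
        for (long toss = 0; toss < number_of_tosses; toss++) {
            double x = 2.0 * rand_r(&seed) / RAND_MAX - 1.0;  /* in [-1, 1] */
            double y = 2.0 * rand_r(&seed) / RAND_MAX - 1.0;
            if (x * x + y * y <= 1.0)
                number_in_circle++;
        }
    }

    double pi_estimate = 4.0 * number_in_circle / (double)number_of_tosses;
    printf("pi estimate = %f\n", pi_estimate);
    return 0;
}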
3. Consider a problem where you have a large array of data and you want to perform the
same computation on each element of the array. Initialize the array with the given
initial conditions and apply the given operation to each element. After the operation
has been performed on all array elements, we want to calculate the sum of all of the
array elements. Write a parallel code using a master-slave scheme where the master
initializes the array, distributes chunks of the array to the slave workers to do the
computation, and then gathers the computed data to ultimately produce the sum.
Also, achieve a load balance between processors.
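One possible layout is sketched below using MPI collectives; the array size N, the
initial values, and the per-element function f are placeholders standing in for the
values specified in the assignment, and the sketch uses a static, equal-sized
distribution (a dynamic work queue built from MPI_Send/MPI_Recv would give better
load balance when the operation is irregular).

#include <mpi.h>
#include <math.h>
#include <stdio.h>
#include <stdlib.h>

#define N 1000000            /* placeholder size, assumed divisible by the process count */

static double f(double x) {  /* placeholder for the per-element operation */
    return sqrt(x);
}

int main(int argc, char **argv) {
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    int chunk = N / size;
    double *a = NULL;
    if (rank == 0) {                      /* master initializes the array */
        a = malloc(N * sizeof(double));
        for (int i = 0; i < N; i++) a[i] = i;   /* placeholder initial values */
    }

    /* Distribute equal chunks to every process (the master keeps one too). */
    double *local = malloc(chunk * sizeof(double));
    MPI_Scatter(a, chunk, MPI_DOUBLE, local, chunk, MPI_DOUBLE, 0, MPI_COMM_WORLD);

    double local_sum = 0.0;
    for (int i = 0; i < chunk; i++) local_sum += f(local[i]);

    /* Gather the partial sums back on the master. */
    double total = 0.0;
    MPI_Reduce(&local_sum, &total, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0) printf("sum = %f\n", total);
    free(local); free(a);
    MPI_Finalize();
    return 0;
}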

4. Write a code that sets a real variable on each of the processors equal to the rank
of the task. Then write your own routine to perform a reduction operation over all
processors to sum the values using only MPI_Send and MPI_Recv calls. Do this global
reduction operation using the following communication algorithms:

a. Communications in a ring.

b. Hypercube communications.

Put in timing calls using MPI_Wtime to test the timing of your routines compared to
using the corresponding built-in MPI reduction routine to do the same computation.
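For part (a), a sketch of the ring version is shown below: each process contributes
its rank, the partial values circulate around the ring using only MPI_Send and
MPI_Recv (with send/receive order alternated by rank parity to avoid deadlock), and
MPI_Wtime brackets both the hand-written loop and MPI_Allreduce for comparison. The
hypercube version follows the same pattern with partners chosen by flipping one bit
of the rank per step.

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    double val = (double)rank;              /* the value owned by this task */
    double sum = val, tosend = val, incoming;
    int next = (rank + 1) % size;
    int prev = (rank + size - 1) % size;

    double t0 = MPI_Wtime();
    for (int step = 0; step < size - 1; step++) {
        /* Alternate send/receive order by parity so the ring never deadlocks. */
        if (rank % 2 == 0) {
            MPI_Send(&tosend, 1, MPI_DOUBLE, next, 0, MPI_COMM_WORLD);
            MPI_Recv(&incoming, 1, MPI_DOUBLE, prev, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
        } else {
            MPI_Recv(&incoming, 1, MPI_DOUBLE, prev, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            MPI_Send(&tosend, 1, MPI_DOUBLE, next, 0, MPI_COMM_WORLD);
        }
        sum += incoming;                    /* accumulate the value just received */
        tosend = incoming;                  /* and forward it on the next step    */
    }
    double t1 = MPI_Wtime();

    double check;
    double t2 = MPI_Wtime();
    MPI_Allreduce(&val, &check, 1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);
    double t3 = MPI_Wtime();

    if (rank == 0)
        printf("ring: %.1f in %.6f s, MPI_Allreduce: %.1f in %.6f s\n",
               sum, t1 - t0, check, t3 - t2);
    MPI_Finalize();
    return 0;
}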

5. Conway's Game of Life is played on a rectangular grid of cells that may or may not
contain an organism. The state of the cells is updated each time step by applying the
following set of rules:

Every organism with two or three neighbours survives.

Every organism with four or more neighbours dies from overpopulation.

Every organism with zero or one neighbours dies from isolation.

Every empty cell adjacent to three organisms gives birth to a new one.

Create an MPI program that evolves a board of arbitrary size (dimensions could be
specified at the command line) over several iterations. The board could be randomly
generated or read from a file. Try applying the geometric decomposition pattern to
partition the work among your processes.
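A common way to apply the geometric decomposition pattern here is a row-block split of
the board, with one ghost row above and below each process's block; a sketch of the
per-iteration ghost-row exchange is shown below. The board layout and names are
assumptions (a wrap-around board would use neighbours modulo the process count instead
of MPI_PROC_NULL).

#include <mpi.h>

/* The local board is stored row-major with cols columns; rows 1..local_rows are
   owned by this process, while rows 0 and local_rows+1 hold copies of the
   neighbours' boundary rows. */
void exchange_ghost_rows(char *board, int local_rows, int cols,
                         int rank, int size, MPI_Comm comm) {
    int up   = (rank == 0)        ? MPI_PROC_NULL : rank - 1;
    int down = (rank == size - 1) ? MPI_PROC_NULL : rank + 1;

    /* Send my first owned row up; receive the lower ghost row from below. */
    MPI_Sendrecv(&board[1 * cols], cols, MPI_CHAR, up, 0,
                 &board[(local_rows + 1) * cols], cols, MPI_CHAR, down, 0,
                 comm, MPI_STATUS_IGNORE);
    /* Send my last owned row down; receive the upper ghost row from above. */
    MPI_Sendrecv(&board[local_rows * cols], cols, MPI_CHAR, down, 1,
                 &board[0 * cols], cols, MPI_CHAR, up, 1,
                 comm, MPI_STATUS_IGNORE);
}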

6. A search engine can be implemented using a farm of servers, each of which contains a
subset of the data that can be searched. Assume that this server farm has a single
front end that interacts with clients who submit queries. Implement the above server
farm using the master-worker pattern.

7. Use OpenMP to implement a producer-consumer program in which some of the
threads are producers and others are consumers. The producers read text from a
collection of files, one per producer. They insert lines of text into a single shared
queue. The consumers take the lines of text and tokenize them. Tokens are words
separated by white space. When a consumer finds a token, it writes it to stdout.
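A compact sketch of one way to structure this is below, assuming one producer per input
file named on the command line and an equal number of consumers. The shared queue is a
fixed-size circular buffer guarded by a named critical section, and the busy waiting is
kept only for brevity (locks with condition handling, or OpenMP tasks, would be a
cleaner production choice); strdup and strtok_r are POSIX functions.

#include <omp.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define QCAP 1024                      /* capacity of the shared line queue */
#define LINE_LEN 1024

static char *queue[QCAP];
static int head = 0, tail = 0, count = 0, producers_done = 0;

int main(int argc, char **argv) {
    int nfiles = argc - 1;             /* one producer (and one consumer) per file */
    if (nfiles < 1) { fprintf(stderr, "usage: %s file...\n", argv[0]); return 1; }

    #pragma omp parallel num_threads(2 * nfiles)
    {
        int id = omp_get_thread_num();
        if (id < nfiles) {                            /* producer threads */
            FILE *fp = fopen(argv[id + 1], "r");
            char line[LINE_LEN];
            while (fp && fgets(line, sizeof line, fp)) {
                char *copy = strdup(line);
                int queued = 0;
                while (!queued) {                     /* spin until there is room */
                    #pragma omp critical(queue_lock)
                    if (count < QCAP) {
                        queue[tail] = copy;
                        tail = (tail + 1) % QCAP;
                        count++;
                        queued = 1;
                    }
                }
            }
            if (fp) fclose(fp);
            #pragma omp critical(queue_lock)
            producers_done++;                         /* signal this producer is done */
        } else {                                      /* consumer threads */
            for (;;) {
                char *line = NULL;
                int finished = 0;
                #pragma omp critical(queue_lock)
                {
                    if (count > 0) {
                        line = queue[head];
                        head = (head + 1) % QCAP;
                        count--;
                    } else if (producers_done == nfiles) {
                        finished = 1;                 /* queue empty, no producers left */
                    }
                }
                if (finished) break;
                if (line) {                           /* tokenize on white space */
                    char *saveptr = NULL;
                    for (char *tok = strtok_r(line, " \t\n", &saveptr); tok;
                         tok = strtok_r(NULL, " \t\n", &saveptr))
                        printf("%s\n", tok);
                    free(line);
                }
            }
        }
    }
    return 0;
}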

8. Build a P2P infrastructure in which every peer has the ability to publish XML
snippets and to express interest in particular XML fragments.

Project
Generally a team project; 60 [non-contact] hrs.                          SLO: 5, 7, 13
Assessment on a continuous basis with a minimum of 3 reviews.
Projects may be given as group projects.

Some of the projects are:

1. Implementing a distributed banking system
2. Implementing a mail box
3. Implementing a distributed hash table
4. Implementing / using a distributed file system
5. Implementing a message bus
6. Building an application using parallelization concepts
7. Performance measurements of different clustering approaches
8. Grid
9. Cloud
10. Map Reduce
11. JXTA

Text Books
1. George Coulouris, Jean Dollimore, Tim Kindberg, and Gordon Blair, Distributed Systems:
Concepts and Design, 5th Edition, Pearson / Addison Wesley, 2012

2. Ananth Grama, Anshul Gupta, George Karypis and Vipin Kumar, Introduction to Parallel
Computing, Pearson, 2nd Edition, 2008

Reference Books
1. Andrew S. Tanenbaum and Maarten Van Steen, Distributed Systems: Principles and
Paradigms, Pearson, 2nd Edition, 2006

2. Pradeep K. Sinha, Distributed Operating System: Concepts and Design, PHI Learning
Pvt. Ltd., 2007
Parallel and Distributed Computing
Knowledge Areas that contain topics and learning outcomes covered in the course

Knowledge Area                                      Total Hours of Coverage

CS: IM (Information Management)                     6

CS: PD (Parallel and Distributed Computing)         14

CS: SF (Systems Fundamentals)                       10

Body of Knowledge coverage

KA      Knowledge Unit               Topics Covered                                    Hours

CS: PD  Parallelism Fundamentals     Multiple simultaneous computations; goals of      2
                                     parallelism; parallelism, communication and
                                     coordination; map-reduce

CS: PD  Parallel Architecture        Multicore processors; shared vs. distributed      4
                                     memory; symmetric multiprocessing (SMP);
                                     SIMD; vector processing

CS: PD  Communication and            Shared memory; consistency; message passing       9
        Coordination

CS: PD  Parallel Decomposition       Independence and partitioning; data and task      4
                                     decomposition; synchronization

CS: PD  Distributed Systems          Faults (characteristics of distributed            6
                                     systems); core distributed algorithms;
                                     distributed service design

CS: PD  Cloud Computing              Cloud services; virtualization; distributed       2
                                     file system; clusters; grid computing

CS: PD  Parallel Algorithms,         Speed-up; parallel algorithms                     1
        Analysis, and Programming

CS: NC  Networked Applications       Distributed applications (client/server,          2
                                     peer-to-peer, cloud, etc.); socket APIs

Total hours                                                                            30

Where does the course fit in the curriculum?


This course is a core course.
It is suitable from the 5th semester onwards.
Knowledge of at least one programming language is essential.
Knowledge of Computer Architecture and Operating Systems is essential.

What is covered in the course?

Part I: Parallel architecture

This section introduces diverse multiprocessor architectures such as shared memory, message
passing, and data parallel. We cover how communication and coordination occur in parallel
computing systems.

Part II: Parallel algorithm

This section covers parallel algorithm design and development for shared memory and other
parallel architectures. It introduces parallel programming environments such as OpenMP and
MPI. We also discuss multithreading and other relevant literature.

Part III: Parallel decomposition

This section deals with parallel decomposition techniques for solving large-scale problems
using parallel architectures.
Part IV: Parallel Performance

This section describes the performance of parallel systems and algorithms. It explains the
reasons for poor performance and suggests ways to improve the performance of parallel
applications.

What is the format of the course?

This course is designed with 100 minutes of in-classroom sessions per week, video/reading
instructional material, 100 minutes of lab hours per week, and 200 minutes of non-contact
time per week spent on implementing the course-related project. Generally, this course
should combine lectures, in-class discussion, case studies, guest lectures, mandatory
off-class reading material, and quizzes.

How are students assessed?

Students are assessed on a combination of group activities, classroom discussion, projects,
and continuous and final assessment tests.

Additional weightage will be given based on their rank in crowd-sourced projects / Kaggle-like
competitions.

Students can earn additional weightage based on a certificate of completion of a related
MOOC course.
Session-wise plan

Class  Lab    Topic Covered                                Level of     Reference  Remarks
Hours  Hours                                               Mastery      Book

2      -      Motivations, Key Concepts, Challenges        Usage        1, 3
2      -      Multi-core Processors, GPUs, Shared vs.      Usage        1, 4
              Distributed Memory
3      -      Preliminaries, Decomposition Techniques      Usage        1
3      -      Challenges in handling Big Data              Usage        1
4      -      Parallel Programming Models, Parallel        Familiarity  2
              Algorithm Design
3      -      Message Passing, Distributed Shared Memory   Usage        1, 2, 4
-      4      Sockets, MPI and Java RMI                    Familiarity  1, 2, 4    LAB Component
3      4      Time and Global States, Coordination and     Usage        1, 2, 4    LAB Component
              Agreement
3      4      Using MPI and RMI for Transaction and        Usage        1, 2, 4    LAB Component
              Concurrency Control, Distributed
              Transactions
3      2      Characterization of Distributed Systems,     Usage        1, 3, 4    LAB Component
              Distributed File System
1      4      Distributed Web-based System, Distributed    Usage        1, 3, 4    LAB Component
              Coordination-based System
-      4      Cluster Computing                            -            -          LAB Component
-      4      Grid Computing                               Assessment   -          LAB Component
1      4      Cloud Computing                              Usage        -          LAB Component
2      -      Recent Trends                                Familiarity  -

Total: 30 class hours (2 credit hours / week), 30 lab hours (2 credit hours / week)
