Week1-Parallel-and-Distributed-Computing
Distributed Computing
Instructor
Muhammad Danish Khan
Lecturer, Department of Computer Science
FAST NUCES Karachi
m.danish@nu.edu.pk
Week 14: Distributed System Models and Enabling Technologies, Assignment Task(s)
Week 15: Distributed System Models and Enabling Technologies, Quiz-3, Project Evaluations
Week 16: Distributed System Models and Enabling Technologies, Project Evaluations
LMS: Google Classroom
Section 5C Class Code: eg2u47t
https://classroom.google.com/c/Mzg4NTA1MTMzNDI3?cjc=eg2u47t
Operating System Concepts
Program
◦ Set of instructions and associated data
◦ resides on the disk and is loaded by the operating system to perform some task.
◦ E.g., an executable file or a Python script file.
Process
◦ A program in execution.
◦ In order to run a program, the operating system's kernel is first asked to create a new
process, which is an environment in which a program executes.
◦ consists of instructions, user-data, and system-data segments, together with resources such as CPU time, memory, address space, and disk acquired at runtime.
Thread
◦ the smallest unit of execution in a process.
◦ A thread simply executes instructions serially.
◦ A process can have multiple threads running as part of it.
◦ Processes don't share resources amongst themselves, whereas the threads of a process share the resources allocated to that particular process, including its memory address space (see the sketch below).
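To make this concrete, here is a minimal C++ sketch (not from the slides; the names and counts are illustrative): two std::thread objects run inside a single process and both update a counter that lives in the process's shared address space.

#include <iostream>
#include <mutex>
#include <thread>

int counter = 0;           // shared by every thread of this process
std::mutex counter_mutex;  // serializes access to the shared counter

void work() {
    for (int i = 0; i < 100000; ++i) {
        std::lock_guard<std::mutex> lock(counter_mutex);
        ++counter;
    }
}

int main() {
    std::thread t1(work);  // two threads executing within the same process
    std::thread t2(work);
    t1.join();
    t2.join();
    std::cout << "counter = " << counter << '\n';  // prints 200000
    return 0;
}

With GCC or Clang this typically builds with the -pthread flag; two separate processes could not share counter this way without an explicit shared-memory mechanism.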
Parallel Execution
Parallelism
The term parallelism means that an application splits its task into smaller subtasks that can be processed in parallel, for instance on multiple CPUs at the exact same time, as sketched below.
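A minimal sketch of this idea (the workload and sizes are assumed for illustration): summing a large array is split into two subtasks that std::async may run on different CPU cores, and the partial sums are then combined.

#include <future>
#include <iostream>
#include <numeric>
#include <vector>

int main() {
    std::vector<int> data(1'000'000, 1);
    auto mid = data.begin() + data.size() / 2;

    // each subtask can run on its own core at the same time
    auto left  = std::async(std::launch::async,
                            [&] { return std::accumulate(data.begin(), mid, 0LL); });
    auto right = std::async(std::launch::async,
                            [&] { return std::accumulate(mid, data.end(), 0LL); });

    long long total = left.get() + right.get();  // combine the partial results
    std::cout << "sum = " << total << '\n';      // prints 1000000
    return 0;
}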
Serial Execution vs. Parallel Execution
Transmission speeds - the speed of a serial computer is directly dependent upon how
fast data can move through hardware.
◦ Absolute limits are the speed of light (30 cm/nanosecond) and the transmission limit of copper
wire (9 cm/nanosecond). Increasing speeds necessitate increasing proximity of processing
elements.
◦ A problem can often be solved in less time with multiple compute resources than with a single compute resource.
LD  $12, (100)   ; load the word at memory address 100 into register $12
ADD $11, $12     ; $11 = $11 + $12 (needs the value loaded above)
SUB $10, $11     ; $10 = $10 - $11 (needs the result of the ADD)
INC $10          ; $10 = $10 + 1
SW  $13, ($10)   ; store $13 at the address now held in $10
Each instruction depends on the result of the one before it, so they must execute one after another on a single processor.
#include <iostream>

int sample2();          // forward declaration: sample1() calls sample2()

int sample1()
{
    int x = sample2();  // depends on the result of sample2()
    return x;
}

float sample3()
{
    float pi = 3.14f;   // independent of sample1() and sample2()
    return pi;
}

int sample2()
{
    int i;
    std::cin >> i;      // read an integer from standard input
    return i;
}
Parallel Computing: what for?
Example applications include:
… ..
Flynn Taxonomy
Flynn's Taxonomy is a classification system for computer architectures introduced by Michael J. Flynn in 1966. It categorizes computer systems based on the number of instruction streams and data streams they can handle simultaneously. This taxonomy is particularly useful for understanding parallel processing and the design of processors.
Flynn Taxonomy
Based on the number of concurrent instruction streams (single or multiple) and data streams (single or multiple) available in the architecture.
Single Instruction, Single Data (SISD)
It represents the organization of a single computer containing a control
unit, processor unit and a memory unit.
This is the oldest and, until recently, the most prevalent form of computer.
◦ Examples: most PCs, single CPU workstations and mainframes
Single Instruction, Multiple Data (SIMD)
Single instruction: All processing units execute the same instruction at any given clock cycle
Multiple data: Each processing unit can operate on a different data element
The processing units are made to operate under the control of a common control unit, thus
providing a single instruction stream and multiple data streams.
◦ Best suited for specialized problems characterized by a high degree of regularity, such as image processing.
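As a minimal illustration of SIMD (assuming an x86 CPU with SSE support; the data values are made up): the single _mm_add_ps instruction below performs four float additions at once, one instruction stream operating on multiple data elements.

#include <cstdio>
#include <xmmintrin.h>

int main() {
    float x[4] = {1.0f, 2.0f, 3.0f, 4.0f};
    float y[4] = {10.0f, 20.0f, 30.0f, 40.0f};
    float out[4];

    __m128 a = _mm_loadu_ps(x);    // load four floats
    __m128 b = _mm_loadu_ps(y);
    __m128 c = _mm_add_ps(a, b);   // one instruction, four additions
    _mm_storeu_ps(out, c);

    std::printf("%.1f %.1f %.1f %.1f\n", out[0], out[1], out[2], out[3]);
    return 0;
}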
Parallel Task
◦ A task that can be executed by multiple processors safely (yields correct results)
Serial Execution
◦ Execution of a program sequentially, one statement at a time. In the simplest sense, this is
what happens on a one processor machine. However, virtually all parallel tasks will have
sections of a parallel program that must be executed serially.
Parallel Execution
◦ Execution of a program by more than one task, with each task being able to execute the same or
different statement at the same moment in time.
Shared Memory
◦ From a strictly hardware point of view, describes a computer architecture where all processors have
direct (usually bus based) access to common physical memory.
◦ In a programming sense, it describes a model where parallel tasks all have the same "picture" of memory and can directly address and access the same
logical memory locations regardless of where the physical memory actually exists.
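A minimal sketch of the shared-memory programming model using OpenMP (one possible choice, not prescribed by these slides): all threads address the same array in the same logical address space, and the loop iterations are divided among them.

#include <cstdio>

int main() {
    const int n = 1000;
    double a[n];                 // a single copy, visible to every thread

    #pragma omp parallel for     // iterations are shared out among the threads
    for (int i = 0; i < n; ++i) {
        a[i] = 2.0 * i;
    }

    std::printf("a[%d] = %.1f\n", n - 1, a[n - 1]);
    return 0;
}

Built with -fopenmp on GCC/Clang; without OpenMP the pragma is ignored and the loop simply runs serially.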
Distributed Memory
◦ In hardware, refers to network based memory access for physical memory that is not common. As a
programming model, tasks can only logically "see" local machine memory and must use
communications to access memory on other machines where other tasks are executing.
Communications
◦ Parallel tasks typically need to exchange data. There are several ways this can be accomplished, such as
through a shared memory bus or over a network, however the actual event of data exchange is
commonly referred to as communications regardless of the method employed.
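A minimal sketch of distributed-memory communication using MPI (one common message-passing library; the value and ranks are illustrative): each process sees only its own local memory, so rank 0 must explicitly send the value for rank 1 to obtain it.

#include <cstdio>
#include <mpi.h>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        int value = 42;                                // exists only in rank 0's memory
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        int value = 0;
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        std::printf("rank 1 received %d\n", value);    // obtained only via communication
    }

    MPI_Finalize();
    return 0;
}

Typically launched with a command such as mpirun -np 2 ./a.out.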
Synchronization
◦ The coordination of parallel tasks in real time, very often associated with communications. Often
implemented by establishing a synchronization point within an application where a task may not
proceed further until another task(s) reaches the same or logically equivalent point.
◦ Synchronization usually involves waiting by at least one task, and can therefore cause a parallel
application's wall clock execution time to increase.
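A minimal sketch of a synchronization point using an OpenMP barrier (one possible implementation, not the only one): no thread starts phase 2 until every thread has finished phase 1, so faster threads wait, which is the source of the extra wall-clock time mentioned above.

#include <cstdio>
#include <omp.h>

int main() {
    #pragma omp parallel
    {
        int id = omp_get_thread_num();
        std::printf("thread %d finished phase 1\n", id);

        #pragma omp barrier      // synchronization point: all threads must arrive here

        std::printf("thread %d starting phase 2\n", id);
    }
    return 0;
}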
Granularity
◦ In parallel computing, granularity is a qualitative measure of the ratio of computation to communication.
◦ Coarse: relatively large amounts of computational work are done between communication events
◦ Fine: relatively small amounts of computational work are done between communication events
Observed Speedup
◦ Observed speedup of a code which has been parallelized, defined as: wall-clock time of serial execution / wall-clock time of parallel execution.
◦ One of the simplest and most widely used indicators for a parallel program's performance.
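For example (hypothetical timings, assumed for illustration): if the serial version of a program takes 80 seconds of wall-clock time and the parallelized version takes 20 seconds, the observed speedup is 80 / 20 = 4.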
Parallel Overhead
◦ The amount of time required to coordinate parallel tasks, as opposed to doing useful work. Parallel
overhead can include factors such as:
◦ Task start-up time
◦ Synchronizations
◦ Data communications
◦ Software overhead imposed by parallel compilers, libraries, tools, operating system, etc.
◦ Task termination time
Massively Parallel
◦ Refers to the hardware that comprises a given parallel system - having many processors. The meaning of "many" keeps increasing, but currently BG/L* pushes this number to 6 digits.
*Blue Gene is an IBM project aimed at designing supercomputers that can reach operating
speeds in the petaFLOPS (PFLOPS) range, with low power consumption.
Scalability
◦ Refers to a parallel system's (hardware and/or software) ability to
demonstrate a proportionate increase in parallel speedup with the addition of
more processors.
Multiple processors can operate independently but share the same memory resources.
Changes in a memory location effected by one processor are visible to all other processors.
Shared memory machines can be divided into two main classes based upon memory access times: UMA and NUMA.
Shared Memory: UMA vs. NUMA
Uniform Memory Access (UMA):
◦ Most commonly represented today by Symmetric Multiprocessor (SMP) machines
◦ Identical processors with equal access and access times to memory
◦ Sometimes called CC-UMA - Cache Coherent UMA.
Distributed Memory: Disadvantages
◦ The programmer is responsible for many of the details associated with data communication between
processors.
◦ It may be difficult to map existing data structures, based on global memory, to this memory
organization.
◦ Non-uniform memory access (NUMA) times
Hybrid Distributed-Shared Memory
Comparison of Shared and Distributed Memory Architectures