The document discusses various parallel programming models, including Shared Memory, Distributed Memory, and Hybrid Models, highlighting their architectures and programming paradigms such as OpenMP and MPI. It emphasizes the importance of choosing the right model based on available resources and personal preference, along with the significance of performance metrics, security, and energy efficiency in parallel computing. Additionally, it introduces high-level programming models like SPMD and MPMD, and mentions applications in cloud computing and GPU programming.
Lecture 6 Parallel Programming Models
University of Lahore, Sargodha Campus

Parallel Computers
• Programming model types
  – Shared Memory
  – Distributed Memory
  – Hybrid Model

Parallel Programming Models
• Parallel programming models exist as an abstraction above hardware and memory architectures
• These models are NOT specific to a particular type of machine or memory architecture
• These models can (theoretically) be implemented on any underlying hardware
• Examples from the past:
  – SHARED memory model on a DISTRIBUTED memory machine: the Kendall Square Research (KSR) ALLCACHE approach, "virtual shared memory"
  – DISTRIBUTED memory model on a SHARED memory machine: Message Passing Interface (MPI) on the SGI Origin 2000, which employed the CC-NUMA type of shared memory architecture; however, MPI is more commonly run over a network of distributed memory machines
• Which model to use? A combination of what is available and personal choice

Shared Memory
• Architecture: processors have direct access to global memory and I/O through a bus or fast switching network
• A cache coherency protocol guarantees consistency of memory and I/O accesses
• Each processor also has its own memory (cache)
• Data structures are shared in the global address space
• Concurrent access to shared memory must be coordinated
• Programming Models
  – Multithreading (thread libraries)
  – OpenMP

[Figure: processors P0 ... Pn, each with a private cache, connected by a shared bus to a global shared memory]

Threads Model
• Threads implementations commonly comprise:
  – A library of subroutines that are called from within parallel source code
  – A set of compiler directives embedded in either serial or parallel source code
• Historically, hardware vendors implemented their own proprietary versions of threads, making it difficult for programmers to develop portable threaded applications
• Standardization efforts: POSIX Threads (IEEE POSIX 1003.1c) and OpenMP (industry standard)
  – POSIX Threads: part of Unix/Linux, library based
  – OpenMP: compiler directive based, portable/multi-platform
• Others: Microsoft threads, Java and Python threads, CUDA threads for GPUs

OpenMP
• OpenMP: portable shared memory parallelism
• Higher-level API for writing portable multithreaded applications
• Provides a set of compiler directives and library routines for parallel application programmers
• API bindings for Fortran, C, and C++

Distributed Memory Architecture
• Each processor has direct access only to its local memory
• Processors are connected via a high-speed interconnect
• Data structures must be distributed
• Data exchange is done via explicit processor-to-processor communication: send/receive messages
• Programming Models
  – Widely used standard: MPI
  – Others: PVM, Express, P4, Chameleon, PARMACS, ...
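To make the send/receive style concrete, here is a minimal sketch in Python using `multiprocessing.Pipe`. It is an analogy to the message passing model, not the actual MPI API; the names `worker` and `rank` are illustrative, borrowed from MPI convention.

```python
# Sketch of explicit processor-to-processor message passing:
# each process owns its data locally and sends results as messages.
from multiprocessing import Process, Pipe

def worker(conn, rank):
    # Each "processor" holds only its local data; nothing is shared.
    local_data = list(range(rank * 4, rank * 4 + 4))
    conn.send(sum(local_data))   # explicit send to the parent process
    conn.close()

if __name__ == "__main__":
    conns, procs = [], []
    for rank in range(4):
        parent, child = Pipe()
        p = Process(target=worker, args=(child, rank))
        p.start()
        conns.append(parent)
        procs.append(p)
    # "Gather": the parent explicitly receives one partial sum per process.
    total = sum(conn.recv() for conn in conns)
    for p in procs:
        p.join()
    print(total)   # sum of 0..15 = 120
```

Note that, unlike the shared memory model, no coordination of concurrent access is needed here: data is exchanged only through explicit messages.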
[Figure: processors P0 ... Pn, each with its own local memory, connected by a communication interconnect]

Message Passing Interface
MPI provides:
• Point-to-point communication
• Collective operations
  – Barrier synchronization
  – Gather/scatter operations
  – Broadcast, reductions
• Different communication modes
  – Synchronous/asynchronous
  – Blocking/non-blocking
  – Buffered/unbuffered
• Predefined and derived datatypes
• Virtual topologies
• Parallel I/O (MPI-2)
• C/C++ and Fortran bindings

Hybrid Model
• A hybrid model combines more than one of the previously described programming models.
• A common example is the combination of the message passing model (MPI) with the threads model (OpenMP):
  – Threads perform computationally intensive kernels using local, on-node data
  – Communication between processes on different nodes occurs over the network using MPI
• Another similar and increasingly popular example is using MPI with CPU-GPU (Graphics Processing Unit) programming:
  – MPI tasks run on CPUs using local memory and communicate with each other over a network
  – Computationally intensive kernels are off-loaded to the GPUs on each node
  – Data exchange between node-local memory and GPUs uses CUDA (or an equivalent)

High-Level Programming Models
• Single Program Multiple Data (SPMD)
• Multiple Program Multiple Data (MPMD)

SPMD
• Built upon any combination of the previously mentioned parallel programming models
• SINGLE PROGRAM: all tasks execute their own copy of the same program simultaneously. The program can use threads, message passing, data parallelism, or a hybrid.
• MULTIPLE DATA: all tasks may use different data
• Tasks do not necessarily have to execute the entire program; perhaps only a portion of it

MPMD
• Built upon any combination of the previously mentioned parallel programming models
• MULTIPLE PROGRAM: tasks may execute different programs simultaneously. The programs can use threads, message passing, data parallelism, or a hybrid.
• MULTIPLE DATA: all tasks may use different data
• MPMD applications are not as common as SPMD applications

Parallel and Distributed Programming Models
• OpenMP
• MPI
  – For message passing systems
• MapReduce and BigTable
  – For internet clouds and data centers
  – Service clouds require extending Hadoop, EC2, and S3 to facilitate distributed computing over a distributed storage system
• CUDA
  – For NVIDIA GPUs
• Open Grid Services Architecture (OGSA)
  – For grid application development

Performance, Security, and Energy Efficiency
• Performance metrics
  – CPU speed, FLOPS, job response time, network latency, system throughput, network bandwidth, system overhead (OS boot time, compile time, etc.)
• Scalability
  – Machine (size), software, application, and technology scalability
  – Amdahl's law
• Security
  – Threats to systems and networks
  – Confidentiality, integrity, and availability
  – Copyright protection
  – System defense technologies
  – Data protection infrastructures (e.g., intrusion detection systems, IDS)
• Energy efficiency
  – Distributed power management
  – Energy consumption of unused servers
  – Reducing energy in active servers

That's all for today!!
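The SPMD pattern above, one program whose behavior is selected by each task's identity, can be sketched in Python. This is illustrative only; the names `rank` and `WORLD_SIZE` follow MPI convention and are not part of any real API.

```python
# SPMD sketch: every task runs the SAME program, but uses its rank
# to select DIFFERENT data (and could branch to run only a portion
# of the program).
from multiprocessing import Pool

WORLD_SIZE = 4
DATA = list(range(100))

def spmd_task(rank):
    # Same code for all ranks; the rank picks this task's slice of
    # the data (a cyclic distribution across WORLD_SIZE tasks).
    chunk = DATA[rank::WORLD_SIZE]
    return sum(x * x for x in chunk)

if __name__ == "__main__":
    with Pool(WORLD_SIZE) as pool:
        partials = pool.map(spmd_task, range(WORLD_SIZE))
    print(sum(partials))   # equals the sum of squares 0..99
```

An MPMD version would instead launch different programs for different tasks, e.g. one coordinator program and several distinct worker programs.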
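As a footnote to the Amdahl's law bullet under scalability: the law bounds the speedup of a program whose parallelizable fraction is P when run on N processors. A quick sketch (the helper name is illustrative):

```python
# Amdahl's law: speedup S(N) = 1 / ((1 - P) + P / N), where P is the
# parallelizable fraction of the program and N the processor count.
def amdahl_speedup(P, N):
    return 1.0 / ((1.0 - P) + P / N)

# Example: with 90% of the work parallelizable, 8 processors give only
# about 4.7x speedup, and the limit as N grows is 1 / (1 - P) = 10x.
print(round(amdahl_speedup(0.9, 8), 2))   # 4.71
```

This is why the serial fraction, not the processor count, ultimately limits application scalability.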