Unit 5 - Advanced Computer Architecture - WWW - Rgpvnotes.in
Shared-Variable Model
A program is a collection of processes. Parallelism depends on how inter-process communication (IPC) is
implemented. Fundamental issues in parallel programming are centered on the specification, creation,
suspension, reactivation, migration, termination, and synchronization of concurrent processes residing in the
same or different processors.
By limiting the scope and access rights, the process address space may be shared or restricted. To
ensure orderly IPC, the mutual exclusion property requires that a shared object be accessed exclusively by one
process at a time.
Issues with the shared-variable model-
Critical sections.
Memory consistency.
Atomicity of memory operations.
Fast synchronization.
Shared data structures.
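To make the mutual exclusion requirement concrete, here is a minimal sketch using POSIX threads; the shared counter, thread count, and iteration count are arbitrary illustrative choices, not part of the original notes. Each thread enters a critical section before updating the shared variable, so only one process accesses it at a time.

#include <pthread.h>
#include <stdio.h>

#define NUM_THREADS 4
#define INCREMENTS  100000

static long counter = 0;                      /* shared variable */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

static void *worker(void *arg)
{
    for (int i = 0; i < INCREMENTS; i++) {
        pthread_mutex_lock(&lock);            /* enter critical section */
        counter++;                            /* exclusive access to the shared object */
        pthread_mutex_unlock(&lock);          /* leave critical section */
    }
    return NULL;
}

int main(void)
{
    pthread_t t[NUM_THREADS];
    for (int i = 0; i < NUM_THREADS; i++)
        pthread_create(&t[i], NULL, worker, NULL);
    for (int i = 0; i < NUM_THREADS; i++)
        pthread_join(t[i], NULL);
    printf("counter = %ld\n", counter);       /* always NUM_THREADS * INCREMENTS */
    return 0;
}

Without the lock, the increments would interleave and the final count would be unpredictable; the mutex enforces the mutual exclusion property described above.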
Message-Passing Model
Two processes A and B communicate with each other by passing messages through a direct network. The
messages may be instructions, data, synchronization signals, or interrupt signals. The delay caused by message
passing is much longer than that of the shared-variable model on the same memory. Two message-passing
programming models are introduced here.
Synchronous message passing-
It synchronizes the sender and receiver processes in both time and space, just like a telephone call.
No shared memory.
No need for mutual exclusion.
No buffer is used in the communication channel.
It can be blocked by the channel being busy or by errors.
Only one message is allowed to be transmitted via a channel at a time.
Sender and receiver must be coupled in both time and space synchronously.
Also called a blocking communication scheme (a code sketch follows below).
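A minimal sketch of synchronous (blocking) message passing between two processes A and B, assuming MPI as the message-passing library; MPI_Ssend completes only after the matching receive has started, so sender and receiver are coupled in time. The message value and tag are illustrative assumptions. Run with two processes, e.g. mpirun -np 2.

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, value = 0;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {                       /* process A: sender */
        value = 42;
        /* blocks until the matching receive has been posted */
        MPI_Ssend(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {                /* process B: receiver */
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("received %d\n", value);
    }
    MPI_Finalize();
    return 0;
}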
Asynchronous message passing-
Does not need to synchronize the sender and receiver in time and space.
Message-passing programming is gradually changing once the virtual memories from all nodes are
combined and spread over the processing nodes.
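For contrast, a sketch of asynchronous (non-blocking) message passing with the same assumed MPI setup: MPI_Isend and MPI_Irecv return immediately, so the sender and receiver are not coupled in time, and MPI_Wait later completes the operation. The payload value is again illustrative.

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, out = 7, in = 0;
    MPI_Request req;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        MPI_Isend(&out, 1, MPI_INT, 1, 0, MPI_COMM_WORLD, &req);
        /* the sender may continue with other work here */
        MPI_Wait(&req, MPI_STATUS_IGNORE);   /* complete the send later */
    } else if (rank == 1) {
        MPI_Irecv(&in, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &req);
        /* the receiver may also compute before the message arrives */
        MPI_Wait(&req, MPI_STATUS_IGNORE);
        printf("received %d\n", in);
    }
    MPI_Finalize();
    return 0;
}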
Figure 5.3: (a) Synchronous message passing (b) Asynchronous message passing
Data-Parallel Model
The data parallel code is easier to write and to debug because parallelism is explicitly handled by hardware
synchronization and flow control. Data parallel languages are modified directly from standard serial
programming languages.
Data-parallel programming emphasizes local computations and data routing operations such as permutation,
replication, reduction, and parallel prefix. It is applied to fine-grain problems using regular grids, stencils, and
multidimensional signal/image data sets.
In the data-parallel model, tasks are assigned to processes and each task performs similar types of operations on
different data. Data parallelism is a consequence of a single operation being applied to multiple data
items.
The data-parallel model can be applied to shared-address-space and message-passing paradigms. In the data-parallel
model, interaction overheads can be reduced by selecting a locality-preserving decomposition, by using
optimized collective interaction routines, or by overlapping computation and interaction.
The primary characteristic of data-parallel model problems is that the intensity of data parallelism increases
with the size of the problem, which in turn makes it possible to use more processes to solve larger problems.
Example: Dense matrix multiplication.
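A minimal sketch of the dense matrix multiplication example in data-parallel style, assuming OpenMP; the matrix size N and the constant initial values are illustrative choices, not part of the notes. The same multiply-accumulate operation is applied to different rows of the result in parallel.

#include <omp.h>
#include <stdio.h>

#define N 512

static double A[N][N], B[N][N], C[N][N];

int main(void)
{
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++) {
            A[i][j] = 1.0;
            B[i][j] = 2.0;
        }

    /* every iteration of i applies the same multiply-accumulate
       operation to a different row of the result (data parallelism) */
    #pragma omp parallel for
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++) {
            double sum = 0.0;
            for (int k = 0; k < N; k++)
                sum += A[i][k] * B[k][j];
            C[i][j] = sum;
        }

    printf("C[0][0] = %f\n", C[0][0]);   /* expected 2.0 * N */
    return 0;
}

Because the iterations are independent, more processors can be used as N grows, matching the observation above that the intensity of data parallelism increases with problem size.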
Role of Compiler
The role of the compiler is to remove the burden of program optimization and code generation from the
programmer. A parallelizing compiler consists of three major phases: flow analysis, optimization, and code generation.
Flow Analysis
o Reveals program flow patterns in order to determine data and control dependences in the source code (illustrated in the example after this list).
o Flow analysis is conducted at different execution levels on different parallel computers.
o Instruction-level parallelism is exploited in superscalar or VLIW processors, loop-level parallelism in
SIMD or vector computers, and task-level parallelism in multiprocessors, multicomputers, or a network of workstations.
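As a hypothetical illustration of what flow analysis detects, the two loops below differ only in their data dependences: the first carries a dependence across iterations and must run serially, while the second has independent iterations that can be parallelized or vectorized. The function and array names are arbitrary.

/* loop-carried (flow) dependence: a[i] reads a[i-1] written by the
   previous iteration, so iterations must execute in order            */
void dependent(double *a, const double *b, int n)
{
    for (int i = 1; i < n; i++)
        a[i] = a[i - 1] + b[i];
}

/* no cross-iteration dependence: each a[i] depends only on data
   produced outside the loop, so iterations may run in parallel       */
void independent(double *a, const double *b, const double *c, int n)
{
    for (int i = 0; i < n; i++)
        a[i] = b[i] + c[i];
}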
Optimization
o The transformation of user programs in order to exploit the hardware capabilities as much as
possible.
o Transformation can be conducted at the loop level, locality level, or prefetching level.
o The ultimate goal of program optimization is to maximize the speed of code execution.
o It involves minimization of code length and memory accesses and the exploitation of parallelism in programs.
o Optimization should sometimes be conducted at the algorithmic level and must involve the programmer.
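As one concrete sketch of a loop-level and locality-level transformation an optimizer might apply, the example below interchanges the loops of a column-wise array traversal so that memory is accessed contiguously in row-major C, reducing memory-access cost; the array dimensions are assumed for illustration.

#define ROWS 1024
#define COLS 1024

static double m[ROWS][COLS];

/* before: column-wise traversal, poor cache locality in row-major C */
void scale_colwise(double s)
{
    for (int j = 0; j < COLS; j++)
        for (int i = 0; i < ROWS; i++)
            m[i][j] *= s;
}

/* after loop interchange: row-wise traversal, contiguous memory accesses */
void scale_rowwise(double s)
{
    for (int i = 0; i < ROWS; i++)
        for (int j = 0; j < COLS; j++)
            m[i][j] *= s;
}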
Code Generation
o Code generation usually involves transformation from one representation to another, called an
intermediate form.
o It is even more demanding because parallel constructs must be included.
o Code generation is closely tied to the instruction scheduling policies used.
o The code is optimized to encourage a high degree of parallelism.
o Parallel code generation is very different for different computer classes; they may be software-scheduled
or hardware-scheduled.
Language features are classified into six categories-
Optimization features.
Availability features.
o Enhance user friendliness and make the language portable to a larger class of parallel
computers.
o Expand the applicability of software libraries.
o Scalability – scalable to the number of processors available and independent of hardware
topology.
o Compatibility – compatible with an established sequential language.
o Portability – portable to shared-memory multiprocessors, message-passing multicomputers, or
both.
Control of parallelism
o Coarse, medium, or fine grains
o Explicit versus implicit parallelism
o Global parallelism in the entire program
o Take-split parallelism
o Shared task queue
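As a small sketch of explicit control of parallelism and grain size at the language level, assuming OpenMP: the schedule clause fixes how iterations are chunked across threads, and the chunk size of 64 is an arbitrary illustrative choice standing in for a medium-grain decision.

#include <omp.h>

void apply(double *a, const double *b, int n)
{
    /* explicit parallelism; schedule(static, 64) fixes the grain:
       each thread works on chunks of 64 iterations at a time      */
    #pragma omp parallel for schedule(static, 64)
    for (int i = 0; i < n; i++)
        a[i] = 2.0 * b[i];
}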
Software Tools and Environments
Tools support individual process tasks such as checking the consistency of a design, compiling a
program, comparing test results, etc. Tools may be general-purpose, stand-alone tools (e.g. a word
processor) or may be grouped into workbenches.
Workbenches support process phases or activities such as specification, design, etc. They normally
consist of a set of tools with some greater or lesser degree of integration.
Environments support all or at least a substantial part of the software process. They normally include
several different workbenches which are integrated in some way.