Unit 1 - Part - 2
•The distinction between NUMA and UMA platforms is important from the
point of view of algorithm design. NUMA machines require locality from
underlying algorithms for performance.
•Programming these platforms is easier since reads and writes are
implicitly visible to other processors.
•However, read-write accesses to shared data must be coordinated (this will
be discussed in greater detail when we talk about threads programming).
•Caches in such machines require coordinated access to multiple copies.
This leads to the cache coherence problem.
•A weaker model of these machines provides an address map, but not
coordinated access. These models are called non-cache-coherent shared
address space machines.
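The need to coordinate read-write accesses to shared data can be illustrated with a minimal sketch. It assumes Python's standard `threading` module; the function name `run_counter` and its parameters are hypothetical, chosen only for this example.

```python
import threading

def run_counter(num_threads=4, increments=10_000):
    """Increment one shared counter from several threads, guarded by a lock."""
    counter = 0
    lock = threading.Lock()

    def worker():
        nonlocal counter
        for _ in range(increments):
            # The read-modify-write below is the shared access that must be
            # coordinated; the lock lets only one thread perform it at a time.
            with lock:
                counter += 1

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter
```

With the lock in place, the result is always `num_threads * increments`. Without it, updates from different threads could interleave and some increments could be lost.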
Shared-Address-Space vs. Shared Memory machines
of Parallel Platforms
Architecture of an Ideal Parallel Computer
•A natural extension of the Random Access Machine
(RAM) serial architecture is the Parallel Random
Access Machine, or PRAM.
•PRAMs consist of p processors and a global
memory of unbounded size that is uniformly
accessible to all processors.
•Processors share a common clock but may execute
different instructions in each cycle.
•Depending on how simultaneous memory accesses are
handled, PRAMs can be divided into four subclasses.
–Exclusive-read, exclusive-write (EREW) PRAM.
–Concurrent-read, exclusive-write (CREW) PRAM.
–Exclusive-read, concurrent-write (ERCW) PRAM.
–Concurrent-read, concurrent-write (CRCW) PRAM.
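The four subclasses can be read as rules about which simultaneous accesses are allowed in one cycle. The sketch below is an illustration under simplifying assumptions: `PRAM_MODELS` and `step_is_legal` are hypothetical names, each access is given as a plain memory address, and a read and a write to the same address in the same cycle is not checked.

```python
from collections import Counter

# Which kinds of simultaneous access each PRAM subclass permits.
PRAM_MODELS = {
    "EREW": {"concurrent_read": False, "concurrent_write": False},
    "CREW": {"concurrent_read": True, "concurrent_write": False},
    "ERCW": {"concurrent_read": False, "concurrent_write": True},
    "CRCW": {"concurrent_read": True, "concurrent_write": True},
}

def step_is_legal(model, reads, writes):
    """Check one cycle of accesses against a PRAM subclass.

    reads, writes: lists of memory addresses, one entry per processor access.
    """
    rules = PRAM_MODELS[model]
    # An address that appears more than once is accessed concurrently.
    has_concurrent_read = any(n > 1 for n in Counter(reads).values())
    has_concurrent_write = any(n > 1 for n in Counter(writes).values())
    if has_concurrent_read and not rules["concurrent_read"]:
        return False
    if has_concurrent_write and not rules["concurrent_write"]:
        return False
    return True
```

For example, two processors reading address 0 in the same cycle is legal under CREW and CRCW but not under EREW or ERCW.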
•What does concurrent write mean, anyway?
–Common: write only if all values are identical.
–Arbitrary: write the data from a randomly selected processor.
–Priority: follow a predetermined priority order.
–Sum: Write the sum of all data items.
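The four concurrent-write protocols above can be sketched as a small resolver for the writes aimed at a single memory cell in one cycle. This is an illustrative sketch, not a reference implementation: the function name `resolve_concurrent_write` is hypothetical, and treating a lower processor id as higher priority is an assumption (the priority order is simply "predetermined").

```python
import random

def resolve_concurrent_write(policy, writes, rng=random):
    """Resolve simultaneous writes to one memory cell.

    writes: list of (processor_id, value) pairs issued in the same cycle.
    Returns the value stored, or None if no write takes place.
    """
    if not writes:
        return None
    values = [value for _, value in writes]
    if policy == "common":
        # Write only if every processor supplies the same value.
        return values[0] if all(v == values[0] for v in values) else None
    if policy == "arbitrary":
        # Write the value of one randomly selected processor.
        return rng.choice(values)
    if policy == "priority":
        # Assumption: the lowest processor id has the highest priority.
        return min(writes, key=lambda w: w[0])[1]
    if policy == "sum":
        # Write the sum of all data items.
        return sum(values)
    raise ValueError(f"unknown policy: {policy}")
```

For instance, with writes `[(2, 7), (0, 9)]`, the priority policy stores 9 (processor 0 wins under the assumed order) and the sum policy stores 16.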
Physical Complexity of an Ideal Parallel Computer
•Processors and memories are connected via switches.
•Since these switches must operate in O(1) time at
the level of words, for a system of p processors and m
words, the switch complexity is O(mp).
•Clearly, for meaningful values of p and m, a true
PRAM is not realizable.
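To get a feel for the O(mp) bound, the sketch below multiplies out the switch count for sample sizes. The function name `pram_switch_count` and the sample values of p and m are assumptions chosen for illustration, and the hidden constant factor is taken to be 1.

```python
def pram_switch_count(p, m):
    """Switches needed to connect p processors to m memory words,
    taking the constant factor in O(mp) as 1 (an assumption)."""
    return p * m

# Hypothetical sizes: 1024 processors and 2**30 memory words.
switches = pram_switch_count(1024, 2**30)
print(switches)  # prints 1099511627776, i.e. 2**40
```

Even at these modest sizes the count is on the order of a trillion switches, which is why a true PRAM is not realizable.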
Interconnection Networks for Parallel Computers
•Interconnection networks carry data between processors
and to memory.
•Interconnects are made of switches and links (wires, fiber).
•Interconnects are classified as static or dynamic.
•Static networks consist of point-to-point communication links
among processing nodes and are also referred to as direct
networks.
•Dynamic networks are built using switches and
communication links. Dynamic networks are also referred to
as indirect networks.
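The static/dynamic distinction can be made concrete with two small link tables. This is a sketch only: the ring and the single central switch are illustrative topologies chosen here, not ones named above, and all function names are hypothetical.

```python
def static_ring(n):
    """Static (direct) network: each processing node has point-to-point
    links to its two neighbors on a ring."""
    return {i: sorted({(i - 1) % n, (i + 1) % n}) for i in range(n)}

def dynamic_single_switch(n):
    """Dynamic (indirect) network: nodes link only to a switch, which
    establishes paths between them on demand."""
    links = {f"node{i}": ["switch"] for i in range(n)}
    links["switch"] = [f"node{i}" for i in range(n)]
    return links

def has_direct_link(links, a, b):
    """True if a communication link joins a and b with nothing in between."""
    return b in links.get(a, [])
```

In the ring, node 0 is linked directly to nodes 1 and 3. In the switched network no two processing nodes are linked directly; every path passes through the switch.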
www.paruluniversity.ac.in