
Introduction to Parallel Architectures

Instructor: Josep Torrellas


CS533

Copyright Josep Torrellas 2012


Introduction
• Why parallel architectures:
– Absolute performance
– Power and energy
– Complexity
– Cost-performance
• Key enabling factors:
– Advances in semiconductor and interconnect technology
– Advances in software technology



Classification of Parallel Machines
• Michael Flynn’s classification
• Model-based classification



Flynn’s Classification
• Based on the notions of:
– Instruction streams and data streams
• Parallel organizations are characterized by the multiplicity of
hardware provided to service instruction (I) and data (D) streams:
– SISD: Single Instruction, Single Data stream (uniprocessors)
– SIMD: Single Instruction, Multiple Data streams (GPUs)
– MISD: Multiple Instruction, Single Data stream (rare in practice)
– MIMD: Multiple Instruction, Multiple Data streams (multicores)



Model-Based Classification
• Shared-memory
• Message-passing
• Dataflow
• Systolic
• Data parallel



Shared-Memory Architectures
• Key feature: all processors in the system can directly access all
memory locations in the system, thus providing a convenient
and cheap mechanism for multiple processors to share data
– Convenient: (i) location transparency, (ii) abstraction
supported is same as that of uniprocessors
– Cheap: as compared to other models (more later)
• Memory can be centrally placed or distributed
• A better name is single-address-space machines



Shared-Memory Architectures
• Programming model:
– A variety of parallel models can easily be supported: the fork-join
model, the task queue model, the data parallel model
– Parallel threads use shared memory for communication and
synchronization
• A problem traditionally cited with such machines is scalability
• However, the programming model is very general and can easily
emulate the other models
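The fork-join model mentioned above can be sketched in a few lines. The sketch below is illustrative, not from the slides: Python threads stand in for processors, a shared list and a shared counter stand in for shared memory, and a lock provides the synchronization.

```python
# Fork-join sketch in the shared-memory model: threads communicate by
# writing directly to shared locations and synchronize with a lock.
# Names (worker, partial_sums, total) are illustrative.
import threading

data = list(range(8))
partial_sums = [0] * 2            # shared memory visible to all threads
lock = threading.Lock()
total = 0

def worker(tid, chunk):
    global total
    s = sum(chunk)
    partial_sums[tid] = s         # direct store to a shared location
    with lock:                    # synchronization via shared memory
        total += s

threads = [threading.Thread(target=worker, args=(i, data[i*4:(i+1)*4]))
           for i in range(2)]
for t in threads: t.start()       # fork
for t in threads: t.join()        # join
print(total)                      # 28
```

Note that the threads never exchange messages; location transparency means a plain assignment to `partial_sums[tid]` is visible to every other thread.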



Message Passing Architectures
• Processors can directly access only their local memory; all
communication and synchronization happen via messages
• Sending a message often incurs many overheads:
– Building a header, copying data into network buffers,
sending data, receiving data into buffers, copying data from
kernel to user process address space
– Many of these steps may require OS intervention
• Synchronization using messages is typically based on various
handshake protocols
• One of the main advantages is easy scalability
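The explicit send/receive discipline and the shutdown handshake can be sketched as follows. This is an emulation within one process, not real message-passing hardware: two Python threads communicate only through queues, with no shared variables; the names (worker, request_q, reply_q) are illustrative.

```python
# Message-passing sketch: all communication is an explicit send
# (queue.put) or a blocking receive (queue.get); a None message is the
# shutdown handshake.
import threading, queue

request_q = queue.Queue()            # messages flowing to the worker
reply_q = queue.Queue()              # messages flowing back

def worker():
    while True:
        msg = request_q.get()        # blocking receive
        if msg is None:              # shutdown handshake
            break
        reply_q.put(msg * msg)       # explicit send of the result

t = threading.Thread(target=worker)
t.start()
for x in [1, 2, 3]:
    request_q.put(x)                 # send
results = [reply_q.get() for _ in range(3)]   # receive
request_q.put(None)                  # initiate shutdown
t.join()
print(results)                       # [1, 4, 9]
```

On real message-passing machines each `put`/`get` would cross the network, incurring the header-building, buffering, and copying overheads listed above.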



Message Passing Architectures
• A variety of programming models can be supported:
– Actor model, Concurrent-object oriented programming
• Very popular:
– Clusters
– Cloud computing
– Supercomputing



Dataflow
• In the dataflow model, an instruction is activated by the availability
of its data operands
A = (B + 1) * (B - C)
D = C * E
F = A * D
• In control flow models, computation is a series of instructions with
implicit or explicit sequencing between them
• One of the advantages is that all dependences are explicitly present
in the dataflow graph, so parallelism is not hidden from hardware
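A toy interpreter makes the firing rule concrete: each instruction of the slide's example fires as soon as all of its operands are available, regardless of program order. The function and variable names here are illustrative, and the input values are arbitrary.

```python
# Minimal dataflow interpreter: an instruction is (dest, op, sources).
# It fires when every source value is present in the environment.
def run_dataflow(instrs, env):
    env = dict(env)
    pending = list(instrs)
    while pending:
        fired = False
        for dest, op, srcs in pending:
            if all(s in env for s in srcs):      # operands available?
                env[dest] = op(*[env[s] for s in srcs])  # fire
                pending.remove((dest, op, srcs))
                fired = True
                break
        if not fired:
            raise ValueError("no instruction can fire")
    return env

# A = (B+1)*(B-C);  D = C*E;  F = A*D  -- listed deliberately out of order
program = [
    ("F", lambda a, d: a * d, ("A", "D")),
    ("A", lambda b, c: (b + 1) * (b - c), ("B", "C")),
    ("D", lambda c, e: c * e, ("C", "E")),
]
env = run_dataflow(program, {"B": 3, "C": 1, "E": 2})
print(env["F"])   # A = 4*2 = 8, D = 2, so F = 16
```

Note that A and D have no dependence on each other, so real dataflow hardware could fire them in parallel; only F must wait for both.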



Dataflow
• Many variations exist
• Some issues:
– Granularity of operations (locality issues)
– Efficient handling of complex data structures like arrays
– Complexity of the matching store
– Problems due to excess of parallelism



Systolic Architectures
• Basic principle:
– Replace a single PE with a regular array of PEs and carefully
orchestrate the flow of data between the PEs
– Result: high throughput without increasing memory
bandwidth requirements
• Distinguishing features from regular pipelined computers:
– Array structure can be non-linear (e.g. hexagonal)
– Pathways between the PEs may be multidirectional
– PEs may have local instruction and data memory, and are
more complex than a stage of a pipelined computer
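One classic systolic computation, odd-even transposition sort on a linear array of PEs, can be simulated in software. This sketch is not from the slides: each element of the list stands in for the value held by one PE, and every cycle each PE compare-exchanges only with one neighbor, so all communication is local and regular, which is the hallmark of a systolic design.

```python
# Odd-even transposition sort on a linear systolic array: PE i holds
# a[i]; on even cycles pairs (0,1), (2,3), ... compare-exchange, on odd
# cycles pairs (1,2), (3,4), ...  n cycles suffice to sort n values.
def systolic_sort(values):
    a = list(values)                  # a[i] = value held by PE i
    n = len(a)
    for step in range(n):
        start = step % 2              # alternate even/odd neighbor pairs
        for i in range(start, n - 1, 2):
            if a[i] > a[i + 1]:       # local compare-exchange with neighbor
                a[i], a[i + 1] = a[i + 1], a[i]
    return a

print(systolic_sort([5, 1, 4, 2, 3]))   # [1, 2, 3, 4, 5]
```

In hardware, all compare-exchanges of a cycle happen simultaneously, so n PEs sort n values in n cycles with no global memory traffic at all.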



Systolic Architectures
• Issues:
– System integration: shipping data from the host to the array and back
– Cell architecture and communication architecture
– Software for automatically mapping computations to systolic arrays
– General purpose systolic arrays
• Popular in signal processing



Data Parallel Architectures
• Programming model assumes that there is a processor associated with
each member of a collection of data
• All processors execute similar operations on different data, though
not necessarily in lockstep
• Useful for highly parallel codes
• Used by graphics processors
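The one-processor-per-element view amounts to mapping the same operation over a collection. In this illustrative sketch (a SAXPY-style computation, with names chosen for this example), a small thread pool stands in for the per-element processors.

```python
# Data-parallel sketch: conceptually one processor per data element,
# all running the same operation (a*x + y) on their own element.
from multiprocessing.pool import ThreadPool

def saxpy_element(args):
    a, x, y = args
    return a * x + y                 # the operation every "processor" runs

a = 2.0
xs = [1.0, 2.0, 3.0, 4.0]
ys = [10.0, 20.0, 30.0, 40.0]
with ThreadPool(4) as pool:          # pool stands in for the PE array
    result = pool.map(saxpy_element, [(a, x, y) for x, y in zip(xs, ys)])
print(result)                        # [12.0, 24.0, 36.0, 48.0]
```

There is no ordering constraint between elements, which is exactly why such code maps well onto GPUs.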

