
Information Technology

FIT3143 Parallel Computing


Semester 2 2024

Topic 1:
Introduction to Parallel Computing
Dr Carlo Kopp, MACM, SMIEEE, AFAIAA
Sima, Fountain and Kacsuk, Advanced Computer Architectures - a Design Space Approach, Chapter 1
Faculty of Information Technology
© 2024 Monash University
Why Study Parallel Computing?
§ Parallel computing hardware is now pervasive – smartphones, tablets,
notebooks, desktops, servers, clusters, clouds all employ parallel hardware in
various forms. Very few systems today use single core CPUs
§ Designing, developing and implementing code to run well on parallel hardware
requires a robust understanding of parallelism and how it impacts code design
and system performance
§ Theory: Parallelism arises in different forms that impact performance and code
design in different ways, and these require different approaches to system and code design
§ Theory: Without a sound theoretical background, good solutions are impossible
§ Practice: The ability to analyse a parallel computing problem and match it to
available hardware
§ Practice: The ability to code parallel applications on real world hardware

2
Parallelism Concepts and Context
Why Parallelism?
§ The basic reason why parallel systems are built is to improve
computational performance – in the simplest of terms to “make
applications run much faster”
§ Since the 1950s parallelism has been used in various ways to extract
more performance out of hardware that is limited in how fast it can
execute machine instructions
§ Parallelism has been used to improve the performance of individual
CPUs using techniques like pipelining, superscalar processing, and
vector processing
§ Parallelism has been used to improve the performance of systems by
aggregating multiple CPUs or processing elements (in GPUs, NPUs)

4
What is Parallelism?
§ In the most basic sense, all parallelism involves exploiting concurrency when
executing instructions, or code comprising many instructions
§ If a CPU has an arithmetic unit that can execute one machine instruction, e.g. a
multiplication, in say 1 nanosecond, then the ability to execute ten such
instructions simultaneously provides a tenfold speedup – ten multiplies complete
in the time one would otherwise take (see the sketch below)
§ Parallelism always incurs costs in hardware – for instance, a CPU that can
execute ten machine instructions concurrently will be more costly, just as a
multicore CPU chip with ten cores will also be more costly than a single core
CPU, and a server with ten multicore CPU chips will be at least ten times as
expensive as a single multicore CPU chip machine, all else being equal
§ While hardware imposes many limits on parallelism, algorithms often present a
much bigger challenge to achieving high concurrency in a parallel system
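As a minimal sketch of the idea (not from the slides; OpenMP and the array names are illustrative assumptions), ten independent multiplications can be issued concurrently:

/* Minimal sketch: ten independent multiplies run concurrently with
 * OpenMP. Compile with: gcc -fopenmp speedup.c -o speedup */
#include <stdio.h>

int main(void) {
    double a[10], b[10], c[10];
    for (int i = 0; i < 10; i++) { a[i] = i; b[i] = 2.0; }

    /* The ten multiplications are independent, so they can run on ten
     * cores at once; the ideal speedup over a sequential loop is 10x,
     * ignoring thread start-up and scheduling overhead. */
    #pragma omp parallel for
    for (int i = 0; i < 10; i++)
        c[i] = a[i] * b[i];

    for (int i = 0; i < 10; i++)
        printf("c[%d] = %f\n", i, c[i]);
    return 0;
}

In practice a ten-element loop is far too small to benefit, since the cost of creating threads outweighs the work; the point is only to show where the concurrency comes from.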

5
What is Parallel Computing?
§ Parallel computing involves programming for concurrency on many
processors, rather than sequential execution on a single processor
§ Parallel computing requires learning a different approach to
programming to bypass the limits of sequential processing;
§ Programmers need to understand parallel architectures and as needed
re-design applications for parallel platforms;
§ In considering performance, a programmer working in a “traditional”
sequential system thinks in terms of linear timescales;
§ In a parallel environment, concurrency and synchronisation must also
be considered, as the sketch below illustrates.
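A minimal sketch of why synchronisation matters (an assumed example, not from the slides): two threads update one shared counter, and without the mutex the increments would race and the final value would be unpredictable.

/* Two threads incrementing a shared counter; the mutex serialises the
 * updates. Compile with: gcc counter.c -pthread -o counter */
#include <pthread.h>
#include <stdio.h>

static long counter = 0;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

static void *worker(void *arg) {
    (void)arg;
    for (int i = 0; i < 1000000; i++) {
        pthread_mutex_lock(&lock);    /* protect the shared update */
        counter++;
        pthread_mutex_unlock(&lock);
    }
    return NULL;
}

int main(void) {
    pthread_t t1, t2;
    pthread_create(&t1, NULL, worker, NULL);
    pthread_create(&t2, NULL, worker, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    printf("counter = %ld\n", counter);   /* 2000000 with the lock held */
    return 0;
}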

6
Where is Parallel Computing Used?
§ The most common application of parallel computing techniques today is in
commodity computing products
§ Multicore CPU chips are used in portable devices like smartphones and tablets,
portable equipment like notebooks, and all types of desktops and small servers
– gaming desktops now use up to 24 core CPUs, and servers up to 60 core
CPUs
§ Cloud systems used in data centres use many thousands of multicore CPUs
networked by a fabric, and are typical parallel/distributed systems
§ Supercomputers in the Exascale performance class arrived in 2022 – these are
massively parallel / distributed systems with performance in the class of 10^18
IEEE 754 Double Precision (64-bit) operations per second and are typically
used to solve huge scientific computing problems

7
Intel Sapphire Rapids 2023 (Up to 60 cores)

8
Hewlett Packard Enterprise / Cray Frontier / OLCF-5 (2022)

§ 606,208 CPU cores in 9,472 AMD EPYC 7713 “Trento” 64 core 2 GHz CPUs
§ 8,335,360 GPU cores in 37,888 AMD Instinct MI250X GPUs
§ Fabric: HPE Slingshot 64-port switches using 200 Gbps QSFP Terabit Ethernet
§ Developed by HPE Cray and AMD for the U.S. Oak Ridge National Laboratory
and U.S. Department of Energy
https://www.flickr.com/photos/olcf/52117623843/

9
Large Scale Computational Problems (Sankar, 2008)
§ Science
– Global climate modelling
– Biology: genomics; protein folding; drug design
– Astrophysical modelling
– Computational Chemistry
– Computational Material Sciences and Nanosciences
§ Engineering
– Semiconductor design
– Earthquake and structural modelling, remote sensing
– Computational fluid dynamics (airplane design) & combustion (engine design)
– Simulation
– Deep learning
– Game design
– Telecommunications (e.g. Network monitoring & optimization)
– Autonomous systems
§ Business
– Financial derivatives and economic modelling
– Transaction processing, web services and search engines
– Analytics

10
Parallel versus Distributed Computing
Parallel versus Distributed Computing
§ Parallel and Distributed Computing are frequently confused by novices,
because many applications are both parallel and distributed at the same time;
§ Parallel Computing is mostly focused on problems where the same computing
task is divided up to execute concurrently on many processing cores or
components, regardless of whether these are on the same chip, in the same
computer, or distributed across a fabric or network connecting many computers;
§ Distributed Computing is mostly focused on problems where the same or
different computing tasks are concurrently executed on multiple cores
distributed across a fabric or network connecting many computers;
§ This overlap results in a need to understand a number of important distributed
computing concepts to solve many parallel programming problems

12
Parallel versus Distributed Computing
[Diagram: concerns shared by parallel and distributed systems – scalability,
latency, bandwidth, synchronisation, reliability. Systems that are both
distributed and parallel include clusters and clouds.]

13
Conventional versus Parallel Processing

[Diagram: conventional, i.e. “sequential”, computing versus parallel computing]

Adapted from https://computing.llnl.gov/tutorials/parallel_comp/

14
The Diversity Problem – How to Classify Parallel Processors?
§ Until the 1990s most computers in use in desktop and portable
applications were single CPU (core) machines
§ Large mainframes and servers were mainly machines with multiple
single CPU (core) processors sharing a main memory
§ A small minority of supercomputers were vector processing machines
§ Increasing chip density during the 1990s led to many changes:
a) Multicore CPU chips starting with dual and quad core devices
b) Networking of multicore CPU machines into clusters
c) Multimedia coprocessors to process streaming data and vectors
§ Density now permits single chip solutions for arbitrary parallel models

15
The Taxonomy Problem

§ A taxonomy of parallel architectures can be built based on three relationships:
1. Relationship between PE and the instruction sequence executed
2. Relationship between PE and the memory
3. Relationship between PE and the interconnection network
§ Where a PE (PU) is a Processing Element (Processing Unit) e.g. a CPU
or Execution Unit
§ Please note that different textbooks and papers often use different
labels even if the definitions are fundamentally the same
§ The most widely used taxonomy is Flynn’s 1966 model

16
Flynn’s Taxonomy
Flynn’s Taxonomy

§ Michael Flynn (1966) developed a taxonomy of parallel systems based on the
number of independent instruction and data streams.
A. SISD: Single Instruction Stream- Single Data Stream
(sequential Von Neumann machine)
B. MISD: Multiple Instruction Stream- Single Data Stream
C. SIMD: Single Instruction Stream- Multiple Data Stream
D. MIMD: Multiple Instruction Stream- Multiple Data
Stream

18
Flynn’s Taxonomy

Defined by Prof Michael J Flynn at Stanford University

                     Instructions
                  One          Many
Data    One       SISD         MISD
        Many      SIMD         MIMD

§ SISD: Single Instruction Single Data
§ MISD: Multiple Instruction Single Data
§ SIMD: Single Instruction Multiple Data
§ MIMD: Multiple Instruction Multiple Data

19
Flynn’s Taxonomy

20
Single Instruction Single Data (Von Neumann Model)
§ A serial (non-parallel) computer
§ Single instruction: only one instruction stream is being acted
on by the CPU during any one clock cycle
§ Single data: only one data stream is being used as input
during any one clock cycle
§ Deterministic execution
§ This is the oldest and, until recently, the most prevalent form
of computer
§ Examples: most single core CPU notebooks, desktops,
workstations and servers

21
Single Instruction Single Data (Von Neumann Model)
§ Consists of a processor
executing a program stored in a
(main) memory.
§ Each main memory location is
identified by its address.
Addresses start at 0 and extend
to 2^n - 1 when there are n bits
(binary digits) in the address –
e.g. a 32-bit address identifies
2^32 = 4,294,967,296 locations.

22
Single Instruction Multiple Data System
§ A type of parallel computer
§ Single instruction: All processing units execute the same
instruction at any given clock cycle
§ Multiple data: Each processing unit can operate on a different
data element
§ This type of machine typically has an instruction dispatcher, a
very high-bandwidth internal interconnect, and a very large
array of very small-capacity instruction (execution) units.
§ Best suited for specialized problems characterized by a high
degree of regularity, such as image processing.
§ Synchronous (lockstep) and deterministic execution
§ Two varieties: Processor Arrays and Vector Pipelines
Examples – Processor Arrays: Connection Machine CM-2, MasPar MP-1, MP-2;
Vector Pipelines: IBM 9000, Cray C90, Fujitsu VP, NEC SX-2, Hitachi S820
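A minimal data-parallel sketch (an assumed example, not from the slides): the same operation is applied to every array element, which is exactly the pattern a SIMD processor array or a CPU vector unit exploits. The OpenMP simd hint asks the compiler to vectorise the loop.

/* Single instruction stream applied to many data elements: y = 2*x + y.
 * Compile with: gcc -O2 -fopenmp-simd simd_axpy.c -o simd_axpy */
#include <stdio.h>

#define N 1024

int main(void) {
    float x[N], y[N];
    for (int i = 0; i < N; i++) { x[i] = (float)i; y[i] = 1.0f; }

    #pragma omp simd
    for (int i = 0; i < N; i++)
        y[i] = 2.0f * x[i] + y[i];

    printf("y[0] = %f, y[%d] = %f\n", y[0], N - 1, y[N - 1]);
    return 0;
}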

23
Single Instruction Multiple Data System

[Diagram: SIMD processor array – array control unit, processor array, data
interface to a host computer, data in/out, and mass storage]

24
Multiple Instruction Single Data
§ A single data stream is fed into multiple processing units.
§ Each processing unit operates on data independently via independent
instruction streams.
§ Few actual examples of this class of parallel computer have ever existed. One
is the experimental Carnegie-Mellon computer
§ Many textbooks class systolic arrays as MISD architectures but this is
disputed – systolic arrays are used in some AI NPUs
§ Some conceivable uses might be:
a) multiple frequency filters operating on a single signal stream
b) multiple cryptography algorithms attempting to crack a single coded
message.

25
Multiple Instruction Multiple Data

§ Currently, the most common type of parallel computer.
Most modern computers fall into this category.
§ Multiple Instruction: every processor may be executing a
different instruction stream
§ Multiple Data: every processor may be working with a
different data stream
§ Execution can be synchronous or asynchronous,
deterministic or non-deterministic
§ Examples: most current supercomputers, networked
parallel computer "grids" and multi-processor SMP
computers, and anything with a multi-core CPU chip
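A minimal MIMD sketch (an assumed example, not from the slides): two threads on a multicore CPU execute different instruction streams on different data streams at the same time.

/* Different instructions on different data, concurrently.
 * Compile with: gcc mimd.c -pthread -o mimd */
#include <pthread.h>
#include <stdio.h>

#define N 1000

static void *sum_task(void *arg) {          /* instruction stream 1 */
    int *data = arg;
    long s = 0;
    for (int i = 0; i < N; i++) s += data[i];
    printf("sum = %ld\n", s);
    return NULL;
}

static void *max_task(void *arg) {          /* instruction stream 2 */
    int *data = arg;
    int m = data[0];
    for (int i = 1; i < N; i++) if (data[i] > m) m = data[i];
    printf("max = %d\n", m);
    return NULL;
}

int main(void) {
    static int a[N], b[N];                   /* two different data streams */
    for (int i = 0; i < N; i++) { a[i] = i; b[i] = 2 * i; }

    pthread_t t1, t2;
    pthread_create(&t1, NULL, sum_task, a);
    pthread_create(&t2, NULL, max_task, b);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return 0;
}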

26
Multiple Instruction Multiple Data – Distributed and Shared Memory
[Diagrams: MIMD with distributed memory – processors 1..p, each with its own
local memory, connected by an interconnection network; and MIMD with shared
memory – processors 1..p connected via an interconnection network to shared
memory modules 1..m]

27
Memory Architectures
Why Do Memory Architectures Matter?
§ There are multiple ways in which a parallel system might
access memory
§ The memory architecture in use can impact the behaviour of
the system in multiple ways:
a) Performance: bandwidth available to read and write data
from and to memory
b) Reliability: maintaining consistency of data for different
processing elements in the system
c) Security: control of access permissions to data in memory
29
Parallel Computer Memory Architectures
§ Broadly divided into three categories:
A. Shared Memory Architecture: multiple processing
elements access memory via a common local bus or switch
using a common address space
B. Distributed Memory Architecture: multiple processing
elements access memory over a fabric (network, bus,
switch) not always using a common address space
C. Hybrid Memory Architecture: Combines (A) and (B)

30
Shared Memory Architecture
§ Shared memory parallel computers vary widely,
but generally have in common the ability for all
processors to access all memory as a global
address space.
§ Multiple processors can operate independently
but share the same memory resources.
§ Changes in a memory location effected by one
processor are visible to all other processors.
§ Shared memory machines can be divided into
two main classes based upon memory access
times: UMA and NUMA (discussed later).
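A minimal shared-memory sketch (an assumed example, not from the slides): OpenMP threads in one process all address the same array in a single global address space, so no explicit data movement is needed.

/* All threads read the shared array directly; the reduction clause
 * combines their partial sums safely.
 * Compile with: gcc -fopenmp shared_sum.c -o shared_sum */
#include <omp.h>
#include <stdio.h>

#define N 1000000

static double a[N];   /* visible to every thread: shared memory */

int main(void) {
    for (int i = 0; i < N; i++) a[i] = 1.0;

    double total = 0.0;
    #pragma omp parallel for reduction(+:total)
    for (int i = 0; i < N; i++)
        total += a[i];

    printf("threads = %d, total = %f\n", omp_get_max_threads(), total);
    return 0;
}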

31
Distributed Memory Architecture
§ Distributed memory systems require a communication
network to connect inter-processor memory.
§ Processors have their own local memory. There is no
concept of global address space across all processors.
§ Because each processor has its own local memory, it
operates independently. Changes it makes to its local
memory have no effect on the memory of other processors.
Hence, the concept of cache coherency does not apply.
§ When a processor needs access to data in another
processor, it is usually the task of the programmer to
explicitly define how and when data is communicated.
Synchronization between tasks is likewise the programmer's
responsibility. The network “fabric” used for data transfer
varies widely, though it can be as simple as Ethernet.
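A minimal distributed-memory sketch (an assumed example, not from the slides): each MPI process has its own local memory, so data must be moved explicitly with messages. Here rank 0 sends a value to rank 1 over the fabric.

/* Explicit message passing between two processes.
 * Build/run (assuming an MPI installation), e.g.:
 *   mpicc dist.c -o dist && mpirun -np 2 ./dist */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        double x = 3.14;                     /* lives only in rank 0's memory */
        MPI_Send(&x, 1, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        double x;
        MPI_Recv(&x, 1, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("rank 1 received %f\n", x);   /* a copy now in rank 1's memory */
    }

    MPI_Finalize();
    return 0;
}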

32
Hybrid Memory Architecture
§ Clusters, clouds and supercomputers today employ both
shared and distributed memory architectures.
§ The shared memory component is usually a cache coherent
SMP (Symmetrical Multi-Processing) machine. Processors
on a given SMP can address that machine's memory as
global.
§ The distributed memory component is the networking of
multiple SMPs. SMPs know only about their own memory -
not the memory on another SMP. Therefore, network
communications are required to move data from one SMP to
another.
§ Current trends seem to indicate that this type of memory
architecture will continue to prevail and increase at the high
end of computing for the foreseeable future.
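A minimal hybrid sketch (an assumed example, not from the slides): the common pattern combines MPI message passing between SMP nodes with OpenMP threads sharing memory within each node.

/* MPI between processes/nodes, OpenMP threads within each process.
 * Build/run, e.g.: mpicc -fopenmp hybrid.c -o hybrid && mpirun -np 2 ./hybrid */
#include <mpi.h>
#include <omp.h>
#include <stdio.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Shared-memory parallelism inside one process (one SMP node) */
    #pragma omp parallel
    {
        printf("MPI rank %d, OpenMP thread %d of %d\n",
               rank, omp_get_thread_num(), omp_get_num_threads());
    }

    MPI_Finalize();   /* message passing ties the nodes together */
    return 0;
}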

33
Parallel Programming Models
Parallel Programming Models Overview
§ There are several parallel programming models in common use:
A. Shared Memory
B. Threads
C. Message Passing
D. Data Parallel
E. Hybrid
§ Parallel programming models exist as an abstraction above the hardware and
memory architectures.
§ Although it might not seem apparent, these models are NOT specific to a
particular type of machine or memory architecture. In fact, any of these models
can (theoretically) be implemented on any underlying hardware

35
Single Program Multiple Data (SPMD) Structure
§ Another programming structure we may use is the Single Program
Multiple Data (SPMD) structure.
§ In this structure, a single source program is written and each processor
executes its own copy of this program independently, and not in
synchrony.
§ This source program can be constructed so that parts of the program
are executed by certain computers and not others depending on the
identity of the computer.
§ For a master-slave structure, the programs could have parts for the
master and parts for slaves.
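A minimal SPMD sketch (an assumed example, not from the slides): every process runs the same MPI program and branches on its rank, so the master executes one part of the source and the slaves another.

/* One source program, many processes, behaviour selected by rank.
 * Build/run, e.g.: mpicc spmd.c -o spmd && mpirun -np 4 ./spmd */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (rank == 0) {
        /* master part of the single source program */
        printf("master coordinating %d slave processes\n", size - 1);
    } else {
        /* slave part, executed only by the other ranks */
        printf("slave %d doing its share of the work\n", rank);
    }

    MPI_Finalize();
    return 0;
}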

36
Multiple Program Multiple Data (MPMD) Structure
§ Within the MIMD classification, each processor will have its own
program to execute. This could be described as MPMD.
§ In this case, some of the programs to be executed could be copies of
the same program.
§ Typically, only two source programs are written, one for the designated
master processor, and one for the remaining processors, which are
called slave processors.
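As a sketch of how such a job might be launched (the program names here are illustrative assumptions), Open MPI's mpirun accepts a colon-separated MPMD specification, e.g. mpirun -np 1 ./master : -np 8 ./worker, which starts one copy of a master program and eight copies of a separate worker program within a single job.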

37
Summary
Summary
§ Parallelism Concepts and Context
§ Parallel versus Distributed Computing
§ Flynn’s Taxonomy
§ Memory Architectures
§ Parallel Programming Models

39
Reading Materials
Reading and References
§ M. J. Flynn, “Very high-speed computing systems,” in Proceedings of
the IEEE, vol. 54, no. 12, pp. 1901-1909, Dec. 1966, doi:
10.1109/PROC.1966.5273, URI:
https://ieeexplore.ieee.org/document/1447203
§ Sima, Dezsö, Terry J. Fountain and Péter Kacsuk. “Advanced computer
architectures - a design space approach,” Chapter 1, International
computer science series, Addison-Wesley, Reading, MA (1997)

41
