
CSE 455

Higher Performance
Computing
LECTURE 2: FLYNN’S TAXONOMY +
INTERCONNECTION NETWORKS
Agenda
• Motivation for parallel computing
• Flynn’s Taxonomy
• Interconnection networks
• Dynamic Networks
• Static Networks
Motivation for Parallel Computing
• A multiprocessor is expected to reach a faster speed
than the fastest single-processor system.
• A multiprocessor is more cost-effective than a high-
performance single processor.
• If a processor fails, the remaining processors should be
able to provide continued service, albeit with degraded
performance.
Four Decades of Computing
Motivation for Parallel Computing

• One of the clear trends in computing is the substitution of expensive and specialized parallel machines by the more cost-effective clusters of workstations.
• A cluster is a collection of stand-alone computers connected using some interconnection network.
• Additionally, the pervasiveness of the Internet created interest in network computing and, more recently, in grid computing. Grids are geographically distributed platforms.
Agenda
• Motivation for parallel computing
• Flynn’s Taxonomy
• Interconnection networks
• Dynamic Networks
• Static Networks
Flynn’s Taxonomy
• The most popular taxonomy of computer architecture was defined by Flynn. It is based on two types of information flow into a processor: instructions and data.
• The instruction stream is defined as the sequence of
instructions performed by the processing unit.
• The data stream is defined as the data traffic
exchanged between the memory and the processing
unit.
Flynn’s Taxonomy

▪ According to Flynn’s classification, either of the


instruction or data streams can be single or multiple.
▪ Computer architecture can be classified into the
following four distinct categories:
▪ Single-Instruction Single-Data (SISD);
▪ Single-Instruction Multiple-Data (SIMD);
▪ Multiple-Instruction Single-Data (MISD);
▪ Multiple-Instruction Multiple-Data (MIMD).
▪ Conventional single-processor von Neumann
computers are classified as SISD systems
Flynn’s Taxonomy

• SISD: single instruction stream, single data stream
• SIMD: single instruction stream, multiple data streams
• MISD: multiple instruction streams, single data stream
• MIMD: multiple instruction streams, multiple data streams



Single-Instruction Single-Data (SISD)
• Conventional single-processor von Neumann
computers are classified as SISD systems.
Single Instruction Multiple Data (SIMD)
• Consists of two parts:
  • a front-end von Neumann computer;
  • a processor array connected to the memory bus of the front end.
• Applies the same instruction to multiple data items.
SIMD example

(Figure: a single control unit drives n ALUs; data item x[i] is assigned to ALU i, and all ALUs execute the same instruction in lock step.)

for (i = 0; i < n; i++)
    x[i] += y[i];


SIMD ARCHITECTURE
• SIMD Scheme 1
• Each processor has its own local memory.
• Ex: The ILLIAC IV
SIMD ARCHITECTURE
• SIMD Scheme 2
• Processors and memory modules communicate with each other via an interconnection network.
• Ex: The BSP (Burroughs’ Scientific Processor)
SIMD drawbacks
◼ All ALUs are required to execute the same
instruction, or remain idle.
◼ In classic design, they must also operate
synchronously.
◼ The ALUs have no instruction storage.
◼ Efficient for large data parallel problems,
but not other types of more complex
parallel problems.



Graphics Processing Units (GPU)
◼ Real-time graphics application programming interfaces (APIs) use points, lines, and triangles to internally represent the surface of an object.



GPUs
◼ A graphics processing pipeline converts
the internal representation into an array of
pixels that can be sent to a computer
screen.

◼ Several stages of this pipeline


(called shader functions) are
programmable.
◼ Typically just a few lines of C code.



GPUs
◼ Shader functions are also implicitly
parallel, since they can be applied to
multiple elements in the graphics stream.

◼ GPUs can often optimize performance by using SIMD parallelism.
◼ The current generation of GPUs uses SIMD parallelism, although they are not pure SIMD systems.



Multiple Instruction Multiple Data (MIMD)
• Made of multiple independent processors (each with its own control unit and ALU) and multiple memory modules connected via some interconnection network.
• 2 broad categories:
• Shared memory
• Message passing
MIMD Architecture

(Figure: shared memory architecture vs. message passing architecture.)


MIMD Architecture – Shared Memory
• Shared Memory Organization
• Inter-processor coordination is accomplished by reading and
writing in a global memory shared by all processors.
• Typically consists of servers that communicate through a bus
and cache memory controller.
• It requires access control, synchronization, protection, and security.
Shared Memory Organization - Access Control

 Access control determines which process accesses are


possible to which resources.
 Access control models make the required check for
every access request issued by the processors to the
shared memory, against the contents of the access
control table.
 All disallowed access attempts and illegal processes
are blocked.
Shared Memory Organization - Synchronization

• Synchronization constraints limit the time of accesses


from sharing processes to shared resources.
• Appropriate synchronization ensures that the
information flows properly and ensures system
functionality.
Shared Memory Organization - Protection
• Sharing and protection are incompatible: sharing allows access, whereas protection restricts it.
• The simplest shared memory system consists of one
memory module that can be accessed from two
processors.
• Depending on the interconnection network, a shared
memory system leads to systems that can be classified
as:
• uniform memory access (UMA),
• nonuniform memory access (NUMA), and
• cache-only memory architecture (COMA).
Shared Memory Organization - UMA
• In the UMA system, a shared memory is accessible by
all processors through an interconnection network in
the same way a single processor accesses its memory.
• Therefore, all processors have equal access time to
any memory location
• The interconnection network used in the UMA can be a
single bus, multiple buses, a crossbar, or a multiport
memory.
UMA multicore system

(Figure 2.5: a UMA multicore system. The time to access all memory locations is the same for all the cores.)
Shared Memory Organization - NUMA, and COMA

• In the NUMA system, each processor has part of the


shared memory attached.
• The memory has a single address space. Therefore,
any processor could access any memory location
directly using its real address.
• In COMA, the shared memory consists of cache
memory, and data are required to migrate to the
processor requesting it.
NUMA multicore system

(Figure 2.6: a NUMA multicore system. A memory location a core is directly connected to can be accessed faster than a memory location that must be accessed through another chip.)
MIMD Architecture – Message Passing
• Message Passing Organization
• Each processor has access to its own local memory. No shared
memory
• Communications are performed via send-and-receive
operations. (data copy – consistency issues)
• Message-passing multiprocessors employ a variety of static
networks in local communications.
MIMD Architecture

• Programming in the shared memory model was easier but


designing in the message passing model provided scalability.
• The distributed-shared memory (DSM) architecture began to
appear in systems like the SGI Origin2000, and others.
• In DSM, memory is physically distributed
• The architecture behaves like a shared memory machine, but a
message passing architecture lives underneath the software.
• Thus, the DSM machine is a hybrid that takes advantage of both
design schools.
Distributed Memory System

(Figure 2.4: a distributed memory system.)


Agenda
• Motivation for parallel computing
• Flynn’s Taxonomy
• Interconnection networks
• Dynamic Networks
• Static Networks
1.5 Interconnection Networks (INs)

• Mode of Operation
– Synchronous:
• a single global clock is used by all components in the system
(lock-step manner).

– Asynchronous:
• No global clock required
• Hand shaking signals are used to coordinate the operation of
asynchronous systems.

1.5 Interconnection Networks (INs)

• Control Strategy
– Centralized: one central control unit is used to control
the operations of the components of the system.

– Decentralized: the control function is distributed


among different components in the system.

1.5 Interconnection Networks (INs)

• Switching Techniques
– Circuit switching: a complete path has to be
established prior to the start of communication
between a source and a destination.

– Packet switching: communication between a source and a destination takes place via messages divided into smaller entities, called packets.

1.5 Interconnection Networks (INs)

• Topology
– Describes how to connect processors and memories
to other processors and memories.
– Static: direct fixed links are established among nodes
to form a fixed network.
– Dynamic: connections are established when needed.

2.1 Interconnection Networks Taxonomy

Interconnection Network
– Static: 1-D, 2-D, hypercube (HC)
– Dynamic:
  • Bus-based: single bus, multiple bus
  • Switch-based: single-stage (SS), multistage (MS), crossbar

Agenda
• Motivation for parallel computing
• Flynn’s Taxonomy
• Interconnection networks
• Dynamic Networks
• Static Networks
2.2 Bus-Based Dynamic Interconnection
Networks

• Single Bus Systems


– Simplest way to connect multiprocessor systems.
– The use of local caches reduces the processor–memory traffic.
– The size of such a system typically varies between 2 and 50 processors.
– Single bus multiprocessors are inherently limited by:
  • the bandwidth of the bus;
  • only one processor can access the bus at a time;
  • only one memory access can take place at any given time.

2.2 Bus-Based Dynamic Interconnection
Networks

• Single Bus Systems

(Figure: processors p1 … pN, shared memory, and I/O connected to a single shared bus.)

2.2 Bus-Based Dynamic
Interconnection Networks
• Multiple Bus Systems
– Several parallel buses to interconnect multiple
processors and multiple memory modules.
– Many connection schemes are possible.
– Examples:
• Multiple bus with full bus–memory connection (MBFBMC).
• Multiple bus with single bus–memory connection (MBSBMC).
• Multiple bus with partial bus–memory connection (MBPBMC).
• Multiple bus with class-based memory connection (MBCBMC).

2.2 Bus-Based Dynamic Interconnection
Networks
• Multiple Bus Systems:
– Multiple Bus with Full Bus – Memory
Connection (MBFBMC).

(Figure: six processors and four memory modules, with every memory module connected to every bus.)

2.2 Bus-Based Dynamic
Interconnection Networks
• Multiple Bus Systems:
– Multiple Bus with Single Bus – Memory
Connection (MBSBMC).

(Figure: six processors and four memory modules, with each memory module connected to a single bus.)

2.2 Bus-Based Dynamic
Interconnection Networks
• Multiple Bus Systems:
– Multiple Bus with Partial Bus – Memory
Connection (MBPBMC).

(Figure: six processors and four memory modules, with each memory module connected to a subset of the buses.)

2.2 Bus-Based Dynamic
Interconnection Networks
• Multiple Bus Systems:
– Multiple Bus with Class-based Memory
Connection (MBCBMC).

(Figure: six processors and six memory modules grouped into three classes, with each class connected to a different subset of the buses.)


2.2 Bus-Based Dynamic
Interconnection Networks
• Bus Synchronization
– A bus can be synchronous:
• Time for any transaction is known in advance.
– A bus can be asynchronous:
• Depends on the availability of data and readiness of devices
to initiate bus transactions.
– Bus arbitration logic is required to resolve bus contention when more than one processor competes to access the bus in a single bus multiprocessor.
  • The process of passing bus mastership from one processor to another is called handshaking.
    – It requires a bus request and a bus grant.

2.2 Bus-Based Dynamic
Interconnection Networks
• Bus Synchronization
– Bus arbitration logic uses a predefined priority scheme:
• Random
• Simple rotating
• Equal priority
• Least Recently Used (LRU)

2.3 Switch-Based Interconnection
Networks
– Crossbar Networks
• Provide simultaneous connections among all its
inputs and all its outputs.
• A Switching Element (SE) is at the intersection of
any 2 lines extended horizontally or vertically
inside the switch.
• It is a non-blocking network, allowing multiple input–output connection patterns to be achieved simultaneously (see the sketch below).
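To make the non-blocking property concrete, here is a minimal C sketch (illustrative, not from the slides) that models an N×N crossbar as a matrix of switching elements; requests only conflict when they target the same memory module, so any one-to-one processor-to-memory pattern can be set up at once. Names such as connect_pm are hypothetical.

#include <stdio.h>
#include <stdbool.h>

#define N 4                          /* 4 processors, 4 memory modules */

static bool closed_sw[N][N];         /* closed_sw[i][j]: SE at row i, column j is closed */
static bool mem_busy[N];             /* memory module j already owned by some processor  */

/* Try to connect processor p to memory module m through SE (p, m). */
static bool connect_pm(int p, int m) {
    if (mem_busy[m])                 /* the only conflict: the module is already in use */
        return false;
    closed_sw[p][m] = true;
    mem_busy[m] = true;
    return true;
}

int main(void) {
    int perm[N] = {2, 0, 3, 1};      /* a permutation: every processor targets a distinct module */
    for (int p = 0; p < N; p++)
        printf("P%d -> M%d : %s\n", p + 1, perm[p] + 1,
               connect_pm(p, perm[p]) ? "connected" : "blocked");
    return 0;
}

Because the requested pattern is a permutation, every call succeeds, which is exactly the simultaneous-connection behavior described above.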

2.3 Switch-Based Interconnection
Networks
– Crossbar Networks
(Figure: an 8×8 crossbar connecting processors P1–P8 to memory modules M1–M8; each switching element can be set to a straight or a diagonal setting.)

Figure 2.7: (a) a crossbar switch connecting 4 processors (Pi) and 4 memory modules (Mj); (b) configuration of internal switches in a crossbar; (c) simultaneous memory accesses by the processors.


2.3 Switch-Based Interconnection
Networks
• Single-Stage Networks
– A single stage of SE exists between the inputs
and outputs of the network.
– The possible settings of a 2×2 SE are: straight, exchange, upper-broadcast, and lower-broadcast.

2.3 Switch-Based Interconnection
Networks
• Single-Stage Networks
– A well-known connection pattern is the Shuffle–Exchange.
– The shuffle and exchange functions can be defined on the m-bit address of an input, p_{m-1} p_{m-2} … p_1 p_0, as follows:
  • Shuffle(p_{m-1} p_{m-2} … p_1 p_0) = p_{m-2} p_{m-3} … p_1 p_0 p_{m-1} (a cyclic left shift of the address bits);
  • Exchange(p_{m-1} p_{m-2} … p_1 p_0) = p_{m-1} p_{m-2} … p_1 (NOT p_0) (the least significant bit complemented).
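As a quick check of these definitions, here is a small C sketch (illustrative, not from the slides) that applies shuffle (cyclic left shift) and exchange (complement of p_0) to a 3-bit address and reproduces the example used on the following slides, 101 → 011 → 110:

#include <stdio.h>

/* Shuffle: cyclic left shift of the m-bit address p_{m-1} ... p_1 p_0. */
static unsigned shuffle(unsigned p, int m) {
    unsigned msb = (p >> (m - 1)) & 1u;          /* bit p_{m-1} */
    return ((p << 1) | msb) & ((1u << m) - 1u);  /* rotate left by one bit */
}

/* Exchange: complement the least significant bit p_0. */
static unsigned exchange(unsigned p) {
    return p ^ 1u;
}

int main(void) {
    unsigned p = 5;                                        /* binary 101 */
    printf("shuffle(101)  = %u (binary 011)\n", shuffle(p, 3));   /* 3 */
    printf("shuffle(011)  = %u (binary 110)\n", shuffle(3, 3));   /* 6 */
    printf("exchange(101) = %u (binary 100)\n", exchange(p));     /* 4 */
    return 0;
}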

2.3 Switch-Based Interconnection
Networks
• Multistage Interconnection Networks
(MINs)
– A MIN consists of a number of stages each
consisting of a set of 2x2 SEs.
– Stages are connected to each other using an Inter-Stage Connection (ISC) pattern.
– MINs provide a number of simultaneous
paths between the processors and the
memory modules.

2.3 Switch-Based Interconnection
Networks
• Multistage Interconnection Networks
(MINs)
• In MINs, the routing of a message from a given source to a given destination is based on the destination address (self-routing).
• The destination address bits are scanned from left to right and the stages are traversed accordingly.
• If the bit in the destination address is 0, the message is routed to the upper output of the switch; if the bit is 1, the message is routed to the lower output of the switch (see the sketch below).
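The following minimal C sketch (illustrative, not from the slides) prints the switch output taken at each stage of a MIN with log2 N stages when the destination-address bits are scanned from left to right, following the rule above:

#include <stdio.h>

/* Destination-tag (self-routing): stage k examines destination bit (stages-1-k),
 * i.e., the address is scanned from the most significant bit to the least. */
static void route_min(unsigned dest, int stages) {
    for (int k = stages - 1; k >= 0; k--) {
        int bit = (dest >> k) & 1;
        printf("stage %d: destination bit %d -> %s output\n",
               stages - k, bit, bit ? "lower" : "upper");
    }
}

int main(void) {
    route_min(3 /* binary 011 */, 3);   /* e.g., routing to output 011 in an 8x8 MIN */
    return 0;
}

For destination 011 this yields upper, lower, lower at stages 1, 2, and 3, matching the left-to-right scan described above.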

2.3 Switch-Based Interconnection
Networks
• Multistage Networks (MINs)

(Figure: a multistage network with x stages of switches connected by inter-stage connection patterns ISC 1 … ISC x−1.)

2.3 Switch-Based Interconnection
Networks
• Shuffle-Exchange Network
– Construction: the inputs follow the shuffle function, e.g., P5 (101) → 3 (011) → 6 (110).
– Self-routing: scan the destination address from left to right, e.g., from 101 to 011.

2.3 Switch-Based Interconnection
Networks
• Banyan Network
– Self-routing: scan the destination address from right to left, e.g., from 101 to 011.

2.3 Switch-Based Interconnection
Networks
• Omega Network
– The inputs to each stage follow the shuffle interconnection pattern.
– Construction: shuffle function, e.g., P5 (101) → 3 (011) → 6 (110) → 5 (101).
– Routing: use one destination bit per stage, e.g., from 101 to 011.

2.3 Switch-Based Interconnection
Networks
• Blockage in Multistage Interconnection
Networks
– Blocking networks:
  • When an interconnection between a pair of input/output ports is currently established, a request for a new interconnection between two arbitrary unused input and output ports may or may not be possible to satisfy.

2.3 Switch-Based Interconnection
Networks
• Shuffle-Exchange Network (blocking example)
  – When the connection 101 → 011 is established:
    • 001 → 010 is not possible
    • 100 → 110 is possible

2.3 Switch-Based Interconnection
Networks
• Blockage in Multistage Interconnection
Networks
– Rearrangeable networks:
  • It is always possible to rearrange already-established connections to allow other connections to be established simultaneously.

2.3 Switch-Based Interconnection
Networks
• Blockage in Multistage Interconnection
Networks
– Rearrangeable networks
(Figure: an 8×8 rearrangeable network with inputs and outputs 000–111, shown in two configurations.)

2.3 Switch-Based Interconnection
Networks
• Blockage in Multistage Interconnection
Networks
– Non-blocking networks:
• In presence of a currently established connection
between any pair of input/output, it is always
possible to establish a connection between any
arbitrary unused pair of input/output.

Hesham El-Rewini & ADVANCED COMPUTER ARCHITECTURE AND


Mostafa Abd-El-Barr PARALLEL PROCESSING
2.3 Switch-Based Interconnection
Networks
(Figure: an 8×8 non-blocking switch-based network with inputs and outputs 000–111.)
2.5 Analysis and Performance Metrics

• Dynamic Networks
• The network cost is measured as the number of switching points.
• The delay (latency) is measured in terms of the input-to-output delay.
• A non-blocking network allows multiple output connection patterns (permutations) to be achieved simultaneously.
• A fault-tolerant system can be simply defined as a system that can still function even in the presence of faulty components inside the system.

2.5 Analysis and Performance
Metrics
• Dynamic Networks
Network        Delay       Cost          Blocking   Degree of FT
Bus            O(N)        O(1)          Yes        0
Multiple-bus   O(mN)       O(m)          Yes        m − 1
MIN            O(log N)    O(N log N)    Yes        0
Crossbar       O(1)        O(N²)         No         0

Agenda
• Motivation for parallel computing
• Flynn’s Taxonomy
• Interconnection networks
• Dynamic Networks
• Static Networks
2.4 Static Interconnection Networks
• Have fixed paths, unidirectional or bidirectional, between processors.
• Types:
  – Completely connected networks (CCNs): number of links O(N²), delay complexity O(1).
(Figure: a six-node completely connected network.)

2.4 Static Interconnection Networks

– Limited Connection Networks:


• Linear arrays
• Ring (Loop) networks
• Two-dimensional arrays
• Tree networks
• Cube network

Hesham El-Rewini & ADVANCED COMPUTER ARCHITECTURE AND


Mostafa Abd-El-Barr PARALLEL PROCESSING
2.4 Static Interconnection Networks

(Figure: examples of limited connection networks: a linear array, a ring (loop), a tree, a two-dimensional array, and a cube.)

2.4 Static Interconnection Networks
– Cube Connected Networks:
• Patterned after the n-cube structure
• In an n-cube, every processor is connected to n others
• Two vertices are connected if and only if the binary representations of their addresses differ in one and only one bit.
Ex: a 4-cube (figure).

2.4 Static Interconnection Networks
– Cube Connected Networks:
• The route from node i to node j can be found by XOR-ing the binary address representations of i and j.
• If the XOR operation results in a 1 in a given bit position, then the message has to be sent along the link that spans the corresponding dimension.
• E.g., S = 0101 → D = 1011: XOR = 1110, so the message can first be sent to 0111, 0001, or 1101 (one candidate per differing dimension); a sketch follows.
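As a sketch of this XOR-based routing (illustrative, not from the slides), the code below walks from the source toward the destination one differing dimension at a time, crossing the lowest differing dimension first; for S = 0101 and D = 1011 it visits 0111, 0011, and 1011. Helper names such as route_cube are hypothetical.

#include <stdio.h>

/* Print an n-bit address, most significant bit first. */
static void print_bits(unsigned v, int n) {
    for (int i = n - 1; i >= 0; i--)
        putchar(((v >> i) & 1u) ? '1' : '0');
}

/* Route from src to dst in an n-cube: cross every dimension where the
 * XOR of the two addresses has a 1 bit (lowest dimension first here). */
static void route_cube(unsigned src, unsigned dst, int n) {
    unsigned cur = src, diff = src ^ dst;
    print_bits(cur, n);
    for (int d = 0; d < n; d++) {
        if (diff & (1u << d)) {
            cur ^= (1u << d);        /* hop across dimension d */
            printf(" -> ");
            print_bits(cur, n);
        }
    }
    putchar('\n');
}

int main(void) {
    route_cube(0x5 /* 0101 */, 0xB /* 1011 */, 4);
    return 0;
}

This fixed low-to-high dimension order is just one of the valid routes; any order of the differing dimensions reaches the destination in the same number of hops.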

2.4 Static Interconnection Networks

– Mesh Connected Networks:
  (Figure: an example 3×3×2 mesh network showing routes from a source S to a destination D.)

2.5 Analysis and Performance Metrics

• Static Networks
• The degree of a node, d, is defined as the number of channels incident on the node.
• The diameter, D, of a network having N nodes is defined as the maximum, over all pairs of nodes, of the length of the shortest path between them (see the formula below).
• A network is said to be symmetric if it is isomorphic to itself with any node labeled as the origin.
• Cost means the total number of links in the network.
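Stated compactly in standard notation (a restatement, not taken from the slides), with \delta(u, v) the length of a shortest path between nodes u and v:

D = \max_{u \neq v} \delta(u, v)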

2.5 Analysis and Performance
Metrics
• Static Networks

Network        Degree (d)   Diameter (D)      Cost (no. of links)   Symmetry   Worst delay
CCNs           N − 1        1                 N(N − 1)/2            Yes        1
Linear array   2            N − 1             N − 1                 No         N
Binary tree    3            2(log2 N − 1)     N − 1                 No         log2 N
n-cube         log2 N       log2 N            nN/2                  Yes        log2 N
2D-mesh        4            2(n − 1)          2(N − n)              No         N
k-ary n-cube   2n           n⌊k/2⌋            n × N                 Yes        k × log2 N

2.6 Summary

• Different topologies used for interconnecting multiprocessors were discussed.
• A taxonomy for interconnection networks based on their topology was introduced.
• Dynamic and static interconnection schemes were studied.
• A number of basic performance aspects related to both dynamic and static interconnection networks were introduced.

References
• Hesham El-Rewini and Mostafa Abd-El-Barr, “Advanced Computer Architecture and Parallel Processing”, chapters 1 and 2.
• Peter S. Pacheco and Matthew Malensek, “An Introduction to Parallel Programming”, 2nd ed., chapter 2.
