
Module #1

Introduction to Parallel Processing


Professor Mostafa Abd-El-Barr

Fall Term 2024-2025



Outline
❑ Introduction to Parallelism
❑ Multiprocessor Interconnection Networks
❑ Introduction to Parallel Processing



Introduction to Parallelism
❑ Definition “Parallelism”
The ability to execute different parts of a program concurrently on
different processors.
❑ Goal: to shorten the overall execution time.

Measures of Performance

• To computer scientists: speedup, execution time.


• To applications people: size of problem, accuracy of solution,
etc.
Speedup of Algorithm
✓ Speedup of an algorithm = sequential execution time / execution time on p processors
(for the same data set).

(Figure: speedup plotted against the number of processors p.)
What Speedups Can You Get?
✓ Linear speedup
– implicitly means a 1-to-1 speedup per processor.
– (almost always) as good as you can do.
✓ Sub-linear speedup: this is the more typical case, due to the overheads of
startup, synchronization, communication, etc.

(Figure: speedup versus p, showing the linear bound and the actual, sub-linear curve.)
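
For example (illustrative numbers): a program that takes 100 s sequentially and 30 s on 4
processors achieves a speedup of 100/30 ≈ 3.3, which is sub-linear, since linear speedup on
4 processors would be exactly 4.
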
Scalability
✓ Roughly speaking, a program is said to scale to a certain number of
processors p, if going from p-1 to p processors results in some
acceptable improvement in speedup (for instance, an increase of 0.5).
Amdahl’s Law
✓ If 1/s of the program is sequential, then you can never get a
speedup better than s.
– (Normalized) sequential execution time = 1/s + (1- 1/s) = 1
– Best parallel execution time on p processors = 1/s + (1 - 1/s) /p
– When p goes to infinity, parallel execution =1/s
– Speedup = s.
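
A quick worked example (illustrative numbers): if 10% of a program is sequential
(1/s = 0.1, so s = 10), then on p = 100 processors the best normalized parallel time is
0.1 + 0.9/100 = 0.109, giving a speedup of 1/0.109 ≈ 9.2. That is already close to the
limit of s = 10, no matter how many more processors are added.
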

Why keep something sequential?

✓ Some parts of the program are not parallelizable (because of dependences).
✓ Some parts may be parallelizable, but the overhead outweighs the gain in speedup.
Fundamental Assumption
• Processors execute independently: no control
over order of execution between processors

When can 2 statements execute in parallel?

• Possibility 1
Processor 1: statement1;
Processor 2: statement2;
• Possibility 2
Processor 1: statement2;
Processor 2: statement1;
When can 2 statements execute in parallel?
• Their order of execution must not matter!
• In other words,
statement1; statement2;
must be equivalent to
statement2; statement1;
Example 1:
a = 1;
b = a;
• Statements cannot be executed in parallel (statement 2 reads the value statement 1 writes).
Example 2:
a = 1;
a = 2;
• Statements cannot be executed in parallel (both write a, so the final value depends on their order).
Types of Dependence
Suppose we have two statements, S1 and S2.
1. True dependence

S2 has a true dependence on S1 iff S2 reads a value written by S1

2. Anti-dependence

S2 has an anti-dependence on S1 iff S2 writes a value read by S1.

3. Output Dependence

S2 has an output dependence on S1 iff S2 writes a variable written by S1.
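
A minimal C sketch of the three dependence types (the statement labels and variable
names are illustrative, not from the slides):

#include <stdio.h>

int main(void) {
    int a, b;

    /* True dependence: S2 reads the value S1 wrote. */
    a = 1;              /* S1 */
    b = a;              /* S2: must run after S1 */

    /* Anti-dependence: S2 overwrites a value S1 read. */
    b = a + 1;          /* S1: reads a */
    a = 2;              /* S2: must not run before S1 */

    /* Output dependence: S2 writes a variable S1 also wrote. */
    a = 3;              /* S1 */
    a = 4;              /* S2: the final value must be S2's */

    printf("a = %d, b = %d\n", a, b);
    return 0;
}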


When can 2 statements execute in parallel?
S1 and S2 can execute in parallel iff
there are no dependences between S1 and S2
– true dependences
– anti-dependences
– output dependences
Examples:

for(i=0; i<100; i++)
  a[i] = i;
→ No dependences. Iterations can be executed in parallel.

for(i=0; i<100; i++)
  a[i] = a[i] + 100;
→ No dependences. Loop is still parallelizable.

for(i=0; i<100; i++) {
  a[i] = i;
  b[i] = 2*i;
}
→ No dependences. Iterations and statements can be executed in parallel.

for(i=0; i<100; i++)
  a[i] = f(a[i-1]);
→ Dependence between a[i] and a[i-1]. Loop iterations are not parallelizable.
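
As an illustration, the dependence-free loops above could run their iterations
concurrently; a minimal sketch using OpenMP (one possible mechanism, not covered in
these slides):

#include <omp.h>

void add100(int *a, int n) {
    /* No iteration reads or writes another iteration's a[i],
       so the iterations may execute in parallel. */
    #pragma omp parallel for
    for (int i = 0; i < n; i++)
        a[i] = a[i] + 100;
}

The last loop above, a[i] = f(a[i-1]), must not be annotated this way: iteration i reads
the value iteration i-1 writes, a true dependence across iterations.
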
Interconnection Networks
✓ Interconnection Networks Taxonomy (by topology):
o Static networks: one-dimensional (1D), two-dimensional (2D), or hypercube (HC).
o Dynamic networks:
▪ Bus-based: single bus or multiple buses.
▪ Switch-based: single-stage (SS), multi-stage (MS), or crossbar.
Interconnection Networks
✓ Multiprocessor interconnection networks (INs) can be classified based on a number of
criteria:
(1) Mode of operation (synchronous versus asynchronous),
(2) Control Strategy (centralized versus decentralized),
(3) Switching Techniques (Circuit versus packet), and
(4) Topology (static versus dynamic).

✓ An interconnection network could be either static or dynamic:


1. Dynamic Networks
o Dynamic network connections are established on the fly as needed.
o Dynamic networks can be classified based on interconnection scheme as bus-based versus
switch-based.
▪ Bus-based networks can further be classified as single bus or multiple buses.
▪ Switch-based dynamic networks can be classified as single-stage (SS), multi-stage (MS), or
crossbar networks.
2. Static Networks
o Static network connections are fixed links.
o Static networks can be classified as
▪ one-dimensional (1D),
▪ two-dimensional (2D), or
▪ hypercube (HC).
Interconnection Networks
❑ Topology-wise, shared memory systems can be designed using the following:
✓ Single Bus Systems
o A single bus is considered the simplest way to connect multiprocessor systems.
o The figure shows an illustration of a single bus system.
(Figure: a single bus connecting processors P1, P2, ..., PN to the shared memory and I/O.)

✓ Consists of N processors, each having its own cache, connected by a shared bus.
✓ The use of local caches reduces the processor-memory traffic.
✓ All processors communicate with a single shared memory.
✓ The typical size of such a system varies between 2 and 50 processors.
✓ The actual size is determined by the traffic per processor and the bus bandwidth (defined as the maximum
rate at which the bus can propagate data once transmission has started).
✓ The single bus network complexity, measured in terms of the number of buses used, is O(1), while the time
complexity, measured in terms of the input-to-output delay, is O(N).
Machine Name        | Max # processors | Processor    | Clock rate | Max memory | Bandwidth
HP 9000 K640        | 4                | PA-8000      | 180 MHz    | 4,096 MB   | 960 MB/sec
IBM RS/6000 R40     | 8                | PowerPC 604  | 112 MHz    | 2,048 MB   | 1,800 MB/sec
Sun Enterprise 6000 | 30               | UltraSPARC 1 | 167 MHz    | 30,720 MB  | 2,600 MB/sec
Interconnection Networks
✓ Multiple Bus Systems
o A multiple-bus multiprocessor system uses several parallel buses to interconnect multiple
processors and multiple memory modules.
o Among the possibilities are
▪ multiple-bus with full bus-memory connection (MBFBMC),
▪ multiple-bus with single bus-memory connection (MBSBMC),
▪ multiple-bus with partial bus-memory connection (MBPBMC), and
▪ multiple-bus with class-based memory connection (MBCBMC).
o An illustration of the multiple-bus organization is shown below.

(Figure: N processors and M memory modules M1 ... Mj ... MM interconnected by several parallel buses.)

Connection Type | # Connections | Load on bus i
MBFBMC          | B(N + M)      | N + M
MBSBMC          | BN + M        | N + M_i (M_i = memory modules attached to bus i)
MBPBMC          | B(N + M/g)    | N + M/g
MBCBMC          | BN + ...      | N + ...
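
For instance (illustrative numbers): an MBFBMC configuration with B = 4 buses, N = 16
processors, and M = 8 memory modules requires B(N + M) = 4 × 24 = 96 connections, and
each bus carries a load of N + M = 24.
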
Interconnection Networks
✓ Single-Stage (SS) Networks
o A 2 × 2 SE supports four settings: straight, exchange, upper broadcast, and lower broadcast.
o The Shuffle (S) and Exchange (E) operations on an m-bit address are defined as:

$S(p_{m-1} p_{m-2} \ldots p_1 p_0) = p_{m-2} p_{m-3} \ldots p_1 p_0 p_{m-1}$

$E(p_{m-1} p_{m-2} \ldots p_1 p_0) = p_{m-1} p_{m-2} \ldots p_1 \overline{p_0}$

o Example
In an 8-input single-stage Shuffle-Exchange network, if the source is 0 (000) and the destination
is 6 (110), then the following is the required sequence of Shuffle/Exchange operations
and circulations of data:

E (000) → 1(001) → S (001) → 2(010) → E (010) → 3(011) → S (011) → 6(110)

The network complexity of the single stage interconnection network is O(N) and the time complexity is O(N).
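
A small C sketch of the S and E operations, re-tracing the example above (the function
names are mine, not from the slides):

#include <stdio.h>

/* Shuffle: cyclic left shift of an m-bit address. */
unsigned shuffle(unsigned p, int m) {
    unsigned msb = (p >> (m - 1)) & 1u;
    return ((p << 1) | msb) & ((1u << m) - 1u);
}

/* Exchange: complement the least significant bit. */
unsigned exchange(unsigned p) {
    return p ^ 1u;
}

int main(void) {
    int m = 3;            /* 8-input network */
    unsigned x = 0;       /* source 0 (000), destination 6 (110) */
    x = exchange(x);   printf("E -> %u\n", x);   /* 001 = 1 */
    x = shuffle(x, m); printf("S -> %u\n", x);   /* 010 = 2 */
    x = exchange(x);   printf("E -> %u\n", x);   /* 011 = 3 */
    x = shuffle(x, m); printf("S -> %u\n", x);   /* 110 = 6 */
    return 0;
}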



Interconnection Networks
✓ Multiple-Stage (MS) Networks
o Example: The Shuffle-Exchange Network (SEN)
(Figure: an 8 × 8 shuffle-exchange MIN; inputs 000 through 111 on the left are connected to
outputs 000 through 111 on the right through stages of 2 × 2 SEs.)
✓ The figure shows an example of an 8 × 8 MIN that uses the 2 × 2 SEs described before.
✓ This network is known in the literature as the Shuffle-exchange network (SEN).
✓ The settings of the SEs in the figure illustrate how a number of paths (but NOT all) can be established
simultaneously in the network.
✓ Example:
o The figure shows how three simultaneous paths connecting the three pairs of input/output can be
established.
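
One standard way to pick the SE settings for a given input/output pair in a multistage
shuffle-exchange network is destination-tag routing, which the slides do not spell out; a
minimal C sketch under the usual Omega-network formulation:

#include <stdio.h>

int main(void) {
    int n = 3;                    /* log2(8) stages in an 8 x 8 network */
    unsigned src = 0, dst = 6;    /* route 000 -> 110 */
    unsigned x = src, mask = (1u << n) - 1u;
    for (int k = n - 1; k >= 0; k--) {
        /* Inter-stage shuffle: cyclic left shift of the address. */
        x = ((x << 1) | ((x >> (n - 1)) & 1u)) & mask;
        /* SE setting: straight or exchange, so that the low bit
           matches the next destination bit (the "tag"). */
        x = (x & ~1u) | ((dst >> k) & 1u);
        printf("after stage %d: %u\n", n - 1 - k, x);
    }
    return 0;                     /* x now equals dst */
}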
Interconnection Networks
o Example: The Banyan Network

(Figure: an 8 × 8 Banyan network; inputs 000 through 111 reach outputs 000 through 111
through three stages of 2 × 2 switches, numbered 1 through 12, four per stage.)



Interconnection Networks
✓ The Crossbar Switch
o Network complexity, measured in terms of the number of switching points, is O(N²).
o Time complexity, measured in terms of the input-to-output delay, is O(1).

(Figure: an N × N crossbar; each processor-cache pair (P-C) on a row can reach any memory
module M on a column through its own switching point.)

Interconnection Networks
❑ Topology-wise, message passing INs can be divided into static, dynamic, or random.
✓ Static networks form all connections when the system is designed rather than when the
connection is needed.
✓ In a static network, messages must be routed along established links.
✓ Dynamic INs establish a connection between two or more nodes on the fly as messages are
routed along the links.
✓ In either static or dynamic networks, a single message may have to hop through intermediate
processors on its way to its destination.
✓ Therefore, the ultimate performance of an interconnection network is greatly influenced by
the number of hops taken to traverse the network.
✓ The random network is the most general and widespread, because it is the interconnection
network of the Internet.
✓ There is no regularity in the topology, hence the name "random"; connections are added and
dropped as needed.
✓ The number of hops in a path from source to destination node is equal to the number of point-
to-point links a message must traverse to reach its destination.



Interconnection Networks
✓ Cube Networks
o The interconnection pattern used in the cube network complements the ith address bit and is defined as follows:

$C_i(p_{m-1} p_{m-2} \ldots p_{i+1} p_i p_{i-1} \ldots p_1 p_0) = p_{m-1} p_{m-2} \ldots p_{i+1} \overline{p_i} p_{i-1} \ldots p_1 p_0$

o Consider a 3-bit address (N = 8). Then C0 connects each node to the node whose address
differs in bit 0 (0-1, 2-3, 4-5, 6-7), while C1 connects nodes whose addresses differ in
bit 1 (0-2, 1-3, 4-6, 5-7).
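
A one-line C sketch of the cube function (the helper name is mine):

/* Cube connection C_i: complement bit i of the node address. */
unsigned cube(unsigned p, int i) {
    return p ^ (1u << i);
}

For example, cube(0, 0) == 1 and cube(0, 1) == 2, matching the C0 and C1 patterns above.
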

Interconnection Networks
✓ The table shows a performance comparison among a number of different dynamic INs.
✓ In this table, m represents the number of buses, while N represents the number
of processors (memory modules) or inputs/outputs of the network.

Network      | Delay    | Cost (Complexity)
Bus          | O(N)     | O(1)
Multiple-bus | O(mN)    | O(m)
MINs         | O(log N) | O(N log N)

❑ Consider the following popular static topologies: (a) linear array, (b) ring, (c) mesh, (d) tree,
(e) hypercube.

❑ The following definitions are used in this connection:


✓ The degree of a network is defined as the maximum number of links (channels) connected to any node in
the network.
✓ The diameter of a network is defined as the maximum, over all pairs of nodes, of the length of
the shortest path between them.
✓ The Degree of a node, d, is defined as the number of channels incident on the node.
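
As worked examples (standard results, not listed on this slide): a linear array of N nodes
has degree 2 and diameter N - 1; a ring has degree 2 and diameter ⌊N/2⌋; a √N × √N mesh
has degree 4 and diameter 2(√N - 1); an n-cube has degree n and diameter n = log2 N.
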
Interconnection Networks

o The hypercube is referred to as a logarithmic architecture.


o This is because the maximum number of links a message has to traverse in order to reach its
destination in an n-cube containing N = 2^n nodes is log2 N = n links.
o One of the desirable features of hypercube networks is the recursive nature of their
construction: an n-cube can be constructed from two (n-1)-cubes, i.e., two sub-cubes each of degree (n-1).
o The 4-cube shown in the Figure is constructed from two sub-cubes each of degree three.
o The construction of the 4-cube out of the two 3-cubes requires an increase in the degree of each node.
o It is worth mentioning that the Intel iPSC is an example of hypercube-based commercially
available multiprocessor systems.
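
A minimal C sketch of routing in an n-cube by fixing the differing address bits one
dimension at a time (often called e-cube routing; the node addresses are illustrative):

#include <stdio.h>

int main(void) {
    int n = 4;                     /* 4-cube: N = 2^4 = 16 nodes */
    unsigned src = 0x0, dst = 0xB, cur = src;
    /* Complement, one at a time, each bit where cur and dst differ;
       the hop count equals the Hamming distance, at most n = log2 N. */
    for (int i = 0; i < n; i++) {
        if (((cur ^ dst) >> i) & 1u) {
            cur ^= (1u << i);      /* traverse the dimension-i link */
            printf("hop on dimension %d -> node %u\n", i, cur);
        }
    }
    return 0;                      /* cur == dst after at most n hops */
}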
Interconnection Networks
✓ Mesh-connected Networks
o An n-dimensional mesh can be defined as an interconnection structure that has
$K_0 \times K_1 \times \cdots \times K_{n-1}$ nodes, where n is the number of dimensions of
the network and $K_i$ is the radix of dimension i.
o The figure shows an example of a 3 × 3 × 2 mesh network.


o An advantage of mesh interconnection networks is that they are scalable.
o Larger meshes can be obtained from smaller ones without changing the node degree
o Examples include MPP from Goodyear Aerospace, Paragon from Intel, and J-Machine from MIT.

Introduction to Parallel Processing
I. Flynn’s Taxonomy of Computer Architecture
1. The most popular taxonomy of computer architecture was defined by Flynn in 1966.
2. Flynn’s classification scheme is based on the notion of a stream of information.
3. Two types of information flow into a processor: instructions and data.
4. Instruction stream: is defined as the sequence of instructions performed by the processing unit.
5. Data stream: is defined as the data traffic exchanged between the memory and the processing unit.
6. Flynn’s taxonomy comprises the following four distinct categories:
▪ Single-Instruction Single-Data streams (SISD).
▪ Single-Instruction Multiple-Data streams (SIMD).
▪ Multiple-Instruction Single-Data streams (MISD).
▪ Multiple-Instruction Multiple-Data streams (MIMD).
Introduction to Parallel Processing
SIMD Architecture
✓ There are two main configurations that have been used in SIMD machines.
1. The first scheme:
(a) each processor has its own local memory.
(b) Processors can communicate with each other through the interconnection network.
(c) If the interconnection network does not provide direct connection between a given pair of
processors, then this pair can exchange data via an intermediate processor.
(d) The ILLIAC IV used such an interconnection scheme.
(e) The interconnection network in the ILLIAC IV allowed each processor to communicate
directly with four neighboring processors in an 8 × 8 matrix pattern, such that the ith
processor can communicate directly with the (i-1)th, (i+1)th, (i-8)th, and (i+8)th processors.
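
A small C sketch of this neighbor pattern; treating the indices modulo 64 (wraparound)
is an assumption here, since the slides do not say what happens at the edges of the array:

/* Neighbors of processor i in an 8 x 8 ILLIAC-IV-style array:
   (i-1), (i+1), (i-8), and (i+8), taken modulo 64 (assumed). */
void neighbors(int i, int out[4]) {
    const int N = 64;
    out[0] = (i - 1 + N) % N;
    out[1] = (i + 1) % N;
    out[2] = (i - 8 + N) % N;
    out[3] = (i + 8) % N;
}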
Introduction to Parallel Processing
2. The second SIMD scheme:
(a) processors and memory modules communicate with each other via the interconnection network.
(b) Two processors can transfer data between each other via intermediate memory module(s) or
possibly via intermediate processor(s).
(c) The BSP (Burroughs’ Scientific Processor) used the second SIMD scheme.

(Figure: the second SIMD scheme; the control unit drives processors P1 ... Pn, which access
memory modules M1 ... Mn through the interconnection network.)



Introduction to Parallel Processing
MIMD Architecture
✓ MIMD parallel architectures are made of multiple processors and multiple memory modules
connected via some interconnection network.
✓ They fall into two broad categories:
1. Shared Memory System
✓ Typically accomplishes inter-processor coordination through a global memory shared by all
processors.
✓ These are typically server systems that communicate through a bus and cache memory controller.
✓ The bus/cache architecture alleviates the need for expensive multiported memories and interface
circuitry as well as the need to adopt a message-passing paradigm when developing application
software.
✓ In shared memory systems, memory access is balanced, i.e., each processor has equal opportunity to
read/write to memory.
✓ Therefore, these systems are also called SMP (Symmetric Multiprocessor) systems.
✓ Commercial examples of SMPs are Sun Microsystems multiprocessor servers, and Silicon Graphics
Inc. multiprocessor servers.
Introduction to Parallel Processing
2. Message Passing System (distributed memory)
✓ typically combines local memory and processor at each node of the interconnection network.
✓ There is no global memory so it is necessary to move data from one local memory to another
by means of message passing.
✓ This is typically done by a Send/Receive pair of commands, which must be written into the
application software by a programmer.
✓ Programmers must learn the message-passing paradigm, which involves data copying and
dealing with consistency issues.
✓ Commercial examples include the nCUBE, iPSC/2, and various Transputer-based systems.
✓ These systems eventually gave way to Internet-connected systems whereby the
processor/memory nodes were either Internet servers or clients on individuals’ desktops.
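
As an illustration of the Send/Receive style, here is a minimal sketch using MPI (a later
standard, chosen for the example; it is not mentioned in the slides):

#include <mpi.h>
#include <stdio.h>

/* Two-process message passing: rank 0 sends an integer to rank 1.
   There is no global memory; the data moves only as a message. */
int main(int argc, char **argv) {
    int rank, value = 42;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    if (rank == 0) {
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
        printf("rank 1 received %d\n", value);
    }
    MPI_Finalize();
    return 0;
}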



References
▪ Textbook Chapters 1 & 2.

