CSCI 8150

Advanced Computer Architecture

Hwang, Chapter 1
Parallel Computer Models
1.2 Multiprocessors and
Multicomputers
Categories of Parallel Computers
Considering their architecture only, there are
two main categories of parallel computers:
systems with shared common memories, and
systems with unshared distributed memories.
Shared-Memory Multiprocessors
Shared-memory multiprocessor models:
Uniform-memory-access (UMA)
Nonuniform-memory-access (NUMA)
Cache-only memory architecture (COMA)
These systems differ in how the memory and
peripheral resources are shared or
distributed.
The UMA Model - 1
Physical memory uniformly shared by all
processors, with equal access time to all
words.
Processors may have local cache memories.
Peripherals also shared in some fashion.
Tightly coupled systems use a common bus,
crossbar, or multistage network to connect
processors, peripherals, and memories.
Many manufacturers have multiprocessor
(MP) extensions of uniprocessor (UP) product
lines.
The UMA Model - 2
Synchronization and communication among
processors achieved through shared
variables in common memory.
Symmetric MP systems – all processors have
access to all peripherals, and any processor
can run the OS and I/O device drivers.
Asymmetric MP systems – not all peripherals
accessible by all processors; kernel runs only
on selected processors (master); others are
called attached processors (AP).
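Example: Shared-Variable Synchronization
A minimal sketch of the shared-variable coordination just described, using Python threads as a stand-in for UMA processors sharing one address space; the counter, lock, and thread count are illustrative assumptions, not part of Hwang's text.

# Minimal sketch: Python threads stand in for processors sharing one
# address space; a lock serializes updates to a variable in "common memory".
import threading

shared_counter = 0                 # shared variable in common memory
lock = threading.Lock()            # synchronization through a shared primitive

def processor(n_increments):
    global shared_counter
    for _ in range(n_increments):
        with lock:                 # critical section: one processor at a time
            shared_counter += 1

threads = [threading.Thread(target=processor, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(shared_counter)              # 40000: every update is visible to all threads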
The UMA Multiprocessor Model
[Figure: processors P1, P2, …, Pn connected through a system interconnect (bus, crossbar, or multistage network) to shared memories SM1 … SMm and shared I/O.]
Example: Performance Calculation
Consider two loops. The first loop adds
corresponding elements of two N-element
vectors to yield a third vector. The second
loop sums elements of the third vector.
Assume each add/assign operation takes 1
cycle, and ignore time spent on other actions
(e.g. loop counter incrementing/testing,
instruction fetch, etc.). Assume
interprocessor communication requires k
cycles.
On a sequential system, each loop will require N cycles, for a total of 2N cycles of execution time.
Example: Performance Calculation
On an M-processor system, we can partition each
loop into M parts, each having L = N / M add/assigns
requiring L cycles. The total time required is thus
2L. This leaves us with M partial sums that must be
totaled.
Computing the final sum from the M partial sums
requires l = log2(M) additions, each requiring k
cycles (to access a non-local term) and 1 cycle (for
the add/assign), for a total of l × (k+1) cycles.
The parallel computation thus requires
2N / M + (k + 1) log2(M) cycles.
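Example: Sketch of the Parallel Sum
A minimal sketch of the algorithm above, assuming M divides N and M is a power of two. It runs sequentially and merely counts cycles under the stated cost model (1 cycle per add/assign, k cycles per non-local access), so the function name and structure are illustrative only.

# Simulates the partitioned vector add plus log2(M) tree reduction and
# counts cycles according to the formula 2N/M + (k + 1) log2(M).
def parallel_sum_cycles(a, b, M, k):
    N = len(a)
    assert N % M == 0 and M & (M - 1) == 0   # M divides N, M is a power of two
    L = N // M                               # add/assigns per processor

    parts = []
    for p in range(M):                       # conceptually, all M run at once
        lo, hi = p * L, (p + 1) * L
        c = [a[i] + b[i] for i in range(lo, hi)]   # first loop: L cycles
        parts.append(sum(c))                       # second loop: L cycles
    cycles = 2 * L                           # the M partitions run in parallel

    while len(parts) > 1:                    # log2(M) combining rounds
        parts = [parts[i] + parts[i + 1] for i in range(0, len(parts), 2)]
        cycles += k + 1                      # k to fetch a remote term, 1 to add

    return parts[0], cycles

total, cycles = parallel_sum_cycles(list(range(1024)), list(range(1024)), M=8, k=200)
print(total, cycles)                         # cycles = 2*128 + 3*201 = 859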
Example: Performance Calculation
Assume N = 2^20.
Sequential execution requires 2N = 2^21 cycles.
If processor synchronization requires k = 200
cycles, and we have M = 256 processors, parallel
execution requires
2N / M + (k + 1) log2(M)
= 2^21 / 2^8 + 201 × 8
= 2^13 + 1608 = 9800 cycles
Comparing results, the parallel solution is 214 times
faster than the sequential, with the best theoretical
speedup being 256 (since there are 256
processors). Thus the efficiency of the parallel
solution is 214 / 256 = 83.6 %.
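Example: Checking the Arithmetic
A quick check of the numbers above (a throwaway sketch; the variable names are mine):

# N = 2**20 elements, M = 256 processors, k = 200 cycles of synchronization.
from math import log2

N, M, k = 2**20, 256, 200
t_seq = 2 * N                                  # 2,097,152 cycles
t_par = 2 * N // M + (k + 1) * int(log2(M))    # 8192 + 1608 = 9800 cycles
speedup = t_seq / t_par                        # ~214
efficiency = speedup / M                       # ~0.836

print(t_par, round(speedup), f"{efficiency:.1%}")   # 9800 214 83.6%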
The NUMA Model - 1
Shared memories, but access time depends on the
location of the data item.
The shared memory is distributed among the
processors as local memories, but each of these is
still accessible by all processors (with varying
access times).
Memory access is fastest from the locally-connected
processor, with the interconnection network adding
delays for other processor accesses.
Additionally, there may be global memory in a
multiprocessor system, with two separate
interconnection networks, one for clusters of
processors and their cluster memories, and another
for the global shared memories.
Shared Local Memories
[Figure: each processor P1 … Pn has an attached local memory LM1 … LMn; every local memory is reachable by all processors through the interconnection network.]
Hierarchical Cluster Model
[Figure: clusters of processors (P), each cluster with its own cluster interconnection network (CIN) and cluster shared memories (CSM); the clusters connect through a global interconnection network to the global shared memories (GSM).]
The COMA Model
In the COMA model, processors only have
cache memories; the caches, taken
together, form a global address space.
Each cache has an associated directory that
aids remote machines in their lookups;
hierarchical directories may exist in
machines based on this model.
Initial data placement is not critical, as cache
blocks will eventually migrate to where they
are needed.
Cache-Only Memory Architecture
[Figure: processors (P), each with a cache (C) and cache directory (D), connected by an interconnection network; the caches together form the global address space.]
Other Models
There can be other models used for
multiprocessor systems, based on a
combination of the models just presented.
For example:
cache-coherent non-uniform memory access
(each processor has a cache directory, and the
system has a distributed shared memory)
cache-coherent cache-only model (processors
have caches, no shared memory, caches must be
kept coherent).
Multicomputer Models
Multicomputers consist of multiple computers, or
nodes, interconnected by a message-passing
network.
Each node is autonomous, with its own processor
and local memory, and sometimes local peripherals.
The message-passing network provides point-to-
point static connections among the nodes.
Local memories are not shared, so traditional
multicomputers are sometimes called no-remote-
memory-access (or NORMA) machines.
Inter-node communication is achieved by passing
messages through the static connection network.
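Example: Message-Passing (NORMA) Programming Model
A minimal sketch of the programming model rather than the hardware: two OS processes with private memories exchange data only by sending messages over a point-to-point channel. The Pipe, ranks, and sample data are illustrative assumptions.

# Two "nodes" (OS processes) share no memory; they cooperate only by
# passing messages over a point-to-point link (a Pipe standing in for a
# static network channel).
from multiprocessing import Process, Pipe

def node(rank, conn, data):
    local = sum(data)              # compute on the node's private local memory
    if rank == 1:
        conn.send(local)           # ship the partial result to node 0
    else:
        remote = conn.recv()       # wait for the other node's message
        print("total =", local + remote)

if __name__ == "__main__":
    end0, end1 = Pipe()            # one point-to-point channel
    p0 = Process(target=node, args=(0, end0, [1, 2, 3, 4]))
    p1 = Process(target=node, args=(1, end1, [5, 6, 7, 8]))
    p0.start(); p1.start()
    p0.join(); p1.join()           # prints: total = 36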
Generic Message-Passing Multicomputer
[Figure: autonomous nodes, each consisting of a processor (P) and local memory (M), connected point-to-point by a message-passing interconnection network.]
Multicomputer Generations
Each multicomputer uses routers and channels in its interconnection network, and heterogeneous systems may involve mixed node types with uniform data representation and communication protocols.
First generation: hypercube architecture, software-
controlled message switching, processor boards.
Second generation: mesh-connected architecture,
hardware message switching, software for medium-
grain distributed computing.
Third generation: fine-grained distributed
computing, with each VLSI chip containing the
processor and communication resources.