Introduction to Distributed Memory
Week 12 & 13
Muhammad Imran Saeed
Lecturer & Researcher AI (Machine Learning and Deep Learning)
Department of Software Engineering, Faculty of Computing
Mohammad Ali Jinnah University Karachi
PhD Scholar (FAST-NUCES-KHI)
[email protected] FALL 2024
MAJU-Mohammad Ali Jinnah University
Mohammad Ali Jinnah University, 22 East Block-6 PECHS
Karachi 75400 Sindh, Pakistan.
+92 (21) -111-87-87-87
Email Address: info@jinnah.edu
Website: www.jinnah.edu
Basic Concepts of DSM
A DSM system provides a logical abstraction of shared
memory which is built using a set of interconnected
nodes having physically distributed memories.
Types of DSM:
-Hardware level DSM
-Software level DSM
-Hybrid level DSM
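The idea of one logical address space built on physically separate memories can be illustrated with a toy model. This is only a sketch: the class `ToyDSM`, its method names, and the block-partitioned address layout are assumptions made for illustration, not the interface of any real DSM system.

```python
# Toy model of a DSM address space (hypothetical names, not a real DSM API).
class ToyDSM:
    def __init__(self, num_nodes, words_per_node):
        self.words_per_node = words_per_node
        # Each node owns its own physically separate memory.
        self.node_memory = [[0] * words_per_node for _ in range(num_nodes)]

    def locate(self, address):
        """Map a global address to (node id, local offset)."""
        return divmod(address, self.words_per_node)

    def store(self, address, value):
        node, offset = self.locate(address)
        self.node_memory[node][offset] = value

    def load(self, address):
        node, offset = self.locate(address)
        return self.node_memory[node][offset]


dsm = ToyDSM(num_nodes=4, words_per_node=1024)
dsm.store(2050, 42)        # the program only sees a global address
print(dsm.locate(2050))    # (2, 2): the word actually lives on node 2
print(dsm.load(2050))      # 42
```

The caller uses plain load and store on a single address range; which node holds the data is hidden behind `locate`.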
Advantages of DSM
• Simple Abstraction
• Improved portability of distributed application programs
• Provides better performance in some applications
• Large memory space at no extra cost
• Simpler to program than message passing systems
Comparison of IPC paradigms
DSM
• Single shared address space
• Communicate and synchronize using load/store
• Can support message passing
Message Passing
• Send / Receive
• Communication + synchronization
• Can support shared memory
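The two paradigms can be contrasted with a small sketch that uses threads in one process as a stand-in for nodes. This is an assumption made to keep the example runnable; a real DSM or message passing system works across machines. The names `add_shared`, `producer`, `consumer`, and `mailbox` are hypothetical.

```python
import queue
import threading

# Shared-memory style: threads communicate through a shared variable
# (load/store) and need a separate lock for synchronization.
counter = {"value": 0}
lock = threading.Lock()

def add_shared(n):
    for _ in range(n):
        with lock:
            counter["value"] += 1

# Message-passing style: send/receive carries both the data and the
# synchronization (get() blocks until a message arrives).
mailbox = queue.Queue()

def producer(n):
    for i in range(n):
        mailbox.put(i)      # send
    mailbox.put(None)       # end-of-stream marker (a convention of this sketch)

def consumer(result):
    total = 0
    while True:
        msg = mailbox.get() # receive
        if msg is None:
            break
        total += msg
    result.append(total)

workers = [threading.Thread(target=add_shared, args=(1000,)) for _ in range(4)]
for t in workers:
    t.start()
for t in workers:
    t.join()

received = []
p = threading.Thread(target=producer, args=(100,))
c = threading.Thread(target=consumer, args=(received,))
p.start()
c.start()
p.join()
c.join()

print(counter["value"])  # 4000
print(received[0])       # 4950
```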
Hardware Architecture
• On chip memory
• Bus based multiprocessor
• Ring based multiprocessor
• Switched multiprocessor
Hardware Architecture
Comparison Table:
Feature                  On-Chip Memory        Bus-Based Multiprocessor   Ring-Based Multiprocessor  Switched Multiprocessor
Communication Mechanism  Direct access         Shared bus                 Ring topology              Switch-based interconnect
Scalability              Limited by chip size  Poor                       Moderate                   Excellent
Speed                    Very fast             Moderate (bus contention)  Moderate to high           High
Complexity               Simple                Simple                     Moderate                   High
Cost                     Low                   Low                        Medium                     High
Hardware Architecture: On-chip memory
• The CPU portion of the chip has address and data lines that
connect directly to the memory portion.
• Such chips are used in cars, appliances, and even toys.
Bus-based multiprocessor
• All CPUs are connected to one bus (backplane).
• Memory and peripherals are accessed via the shared bus.
The system looks the same from any processor.
Non-uniform memory access (NUMA)
• Non-uniform memory access (NUMA) is a computer
memory design used in multiprocessing, where the
memory access time depends on the memory
location relative to the processor.
• Under NUMA, a processor can access its own local
memory faster than non-local memory.
• The benefits of NUMA are limited to particular
workloads, notably on servers where the data is often
associated strongly with certain tasks or users.
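A small cost model shows why NUMA pays off only when most accesses stay local. The latency figures below are hypothetical placeholders chosen for easy arithmetic; real values depend on the machine.

```python
# Hypothetical latencies in nanoseconds; real values depend on the machine.
LOCAL_NS = 100
REMOTE_NS = 300

def access_time(cpu_node, memory_node):
    """NUMA: cost depends on where the memory sits relative to the CPU."""
    return LOCAL_NS if cpu_node == memory_node else REMOTE_NS

def average_access_time(local_fraction):
    """Average cost when the given fraction of accesses hit local memory."""
    return local_fraction * LOCAL_NS + (1 - local_fraction) * REMOTE_NS

print(access_time(0, 0))         # 100 (local)
print(access_time(0, 1))         # 300 (remote)
print(average_access_time(0.5))  # 200.0
print(average_access_time(1.0))  # 100.0
```

A workload whose data stays with the task that uses it pushes `local_fraction` toward 1, which is the case the last bullet describes.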
Uniform memory access (UMA)
• Uniform memory access (UMA) is a shared memory architecture
used in parallel computers. All the processors in the UMA model
share the physical memory uniformly.
• In a UMA architecture, access time to a memory location is
independent of which processor makes the request or which
memory chip contains the transferred data.
• In the UMA architecture, each processor may use a private
cache. Peripherals are also shared in some fashion.
• The UMA model is suitable for general purpose and time-sharing
applications by multiple users. It can be used to speed up the
execution of a single large program in time-critical applications.
• Difference between the two multiprocessors
Bus-Based Multiprocessors          Ring-Based Multiprocessors
Tightly coupled, with the CPUs     Machines can be much more loosely
normally in a single rack.         coupled, and this loose coupling
                                   can affect their performance.
Has a separate global memory.      Has no separate global memory.
Design Issues in DSM
DSM Design Issues
• Granularity of sharing
• Structure of data
• Consistency models
• Coherence protocols
Granularity
- False Sharing
- Thrashing
Thrashing
- False sharing
- Techniques to reduce thrashing:
  - Application-controlled locks
  - Pin the block to a node for a specific time
  - Customize the algorithm to the shared-data usage pattern
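How granularity leads to false sharing and thrashing can be seen in a short simulation. It assumes a simple single-owner scheme in which a block migrates to whichever node writes to it; the function name `count_block_transfers` and the addresses are hypothetical.

```python
def count_block_transfers(accesses, block_size):
    """Count how often a block must move between nodes.

    accesses: list of (node, address) writes in program order.
    A block migrates whenever a node writes to a block it does not
    currently own (single-owner migration, assumed for this sketch).
    """
    owner = {}      # block number -> node that currently holds it
    transfers = 0
    for node, address in accesses:
        block = address // block_size
        if block in owner and owner[block] != node:
            transfers += 1
        owner[block] = node
    return transfers


# Node 0 only writes address 0 and node 1 only writes address 8,
# so no data is actually shared between them.
accesses = [(0, 0), (1, 8)] * 5

print(count_block_transfers(accesses, block_size=16))  # 9
print(count_block_transfers(accesses, block_size=8))   # 0
```

With 16-word blocks both addresses fall in the same block, so it bounces between the nodes on almost every write (false sharing that causes thrashing). With 8-word blocks each node keeps its own block and nothing moves.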
Processor consistency
- Adheres to PRAM consistency.
- Adds a memory coherence constraint.
- The order in which memory operations are seen by two
processors need not be identical, but the order of writes
issued by each processor must be preserved.
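The write-order rule can be checked mechanically. The sketch below tests only that each processor's writes appear in issue order in an observer's view; it does not check the coherence constraint. The function name and the sample writes are hypothetical.

```python
def preserves_program_order(issued, observed):
    """Check that an observed sequence keeps each processor's own write order.

    issued:   dict mapping processor -> list of writes in issue order
    observed: list of (processor, write) pairs as seen by one observer
    """
    for proc, writes in issued.items():
        seen = [w for p, w in observed if p == proc]
        if seen != writes:
            return False
    return True


issued = {"P1": ["a=1", "a=2"], "P2": ["b=1"]}

# Two observers may interleave P1 and P2 differently ...
view_x = [("P1", "a=1"), ("P2", "b=1"), ("P1", "a=2")]
view_y = [("P2", "b=1"), ("P1", "a=1"), ("P1", "a=2")]
# ... but neither may see P1's writes out of order.
view_bad = [("P1", "a=2"), ("P1", "a=1"), ("P2", "b=1")]

print(preserves_program_order(issued, view_x))    # True
print(preserves_program_order(issued, view_y))    # True
print(preserves_program_order(issued, view_bad))  # False
```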
Properties of the Weak Consistency Model
- Accesses to synchronization variables are sequentially
consistent.
- Access to a synchronization variable is allowed only after
all previous writes have completed everywhere.
- No read or write data access is allowed until all previous
accesses to synchronization variables have been performed.
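These properties can be pictured with a toy node that buffers its writes and makes them visible only at a synchronization point. This is a simplified sketch under stated assumptions: the class `WeakNode`, the dictionary standing in for globally visible memory, and the `synchronize` method are all hypothetical, and concurrency is not modeled.

```python
class WeakNode:
    """One node under a toy weak-consistency model (hypothetical names)."""

    def __init__(self, shared):
        self.shared = shared    # stands in for the globally visible memory
        self.pending = {}       # local writes not yet propagated

    def write(self, key, value):
        self.pending[key] = value   # visible only to this node for now

    def read(self, key):
        # A node sees its own pending writes first, then the shared copy.
        return self.pending.get(key, self.shared.get(key))

    def synchronize(self):
        # Access to the synchronization variable: all earlier writes
        # are completed everywhere before the node proceeds.
        self.shared.update(self.pending)
        self.pending.clear()


shared = {}
a, b = WeakNode(shared), WeakNode(shared)

a.write("x", 1)
print(a.read("x"))  # 1: the writer sees its own write
print(b.read("x"))  # None: the write has not been propagated yet
a.synchronize()
print(b.read("x"))  # 1: visible everywhere after synchronization
```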
Lecture End