Distributed Shared Memory
Outline
Introduction
Consistency Models
DSM Algorithms
Conclusion
Introduction
Shared Memory
[Figure: several CPUs attached to a single shared memory]
Multiprocessors
Tightly-Coupled
Shared Memory architecture
[Figure: CPUs, each with its own local memory, connected by a network]
Multicomputers
Loosely-Coupled
Distributed Memory
Shared Memory
Distributed Memory
Cache Coherent
Lack of scalability
Scalable performance
Expensive to build
Distributed-Shared Memory Architecture
Definition
General Characteristics
Hybrid architecture
[Figure: processors P1, P2, ..., Pn, each with a local memory (M1, M2, ..., Mn) and an MM unit, connected by an interconnection network]
Distributed-Shared Memory Architecture (contd.)
General Characteristics
Covert Communication operations
Heterogeneous Nodes
Communication is still needed to exchange data, but it is hidden
from the programmer (inter-process communication transparency)
Cache-coherent
Distributed-Shared Memory Architecture (contd.)
Advantages
Programs written for shared memory multiprocessors can be run on DSM systems with
minimal changes
Disadvantages
Distributed-Shared Memory Architecture (contd.)
Best suited
Less appropriate
When data is accessed by request
Property | DSM | Message Passing
Marshalling | |
Address space | |
Data representation | Uniform | Heterogeneous
Synchronization | |
Process execution | Non-overlapping lifetimes |
Communication cost | Invisible | Obvious
Replacement Strategies
Synchronization Primitives
Consistency Models
Definition
A memory consistency model for a shared address space specifies constraints on the order in which
memory operations must appear to be performed (i.e. to become visible to the processors) with respect to
one another.
Strict consistency: any read to a memory location returns the value stored by the most recent write
operation to that address, irrespective of the locations of the processors performing the
read and the write operations.
Sequential consistency: a system is sequentially consistent if the result of any execution is the same as if the operations of all processors were
executed in some sequential order, and the operations of each individual processor
appear in this sequence in the order specified by its program. (Leslie Lamport)
Definition restated: Sequential consistency requires that a shared memory multiprocessor
appear to be a multiprogramming uniprocessor system to any program running on it.
Example 1
P1:
    Data = 2000
    Head = 1
P2:
    while (Head == 0) {}
    ... = Data
Example 2
Initially A = B = 0
P1:
    A = 1
P2:
    if (A == 1)
        B = 1
P3:
    if (B == 1)
        register = A
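Example 2 can be checked by brute force. The sketch below is not from the slides: it assumes each read and each write is one atomic step against a single memory, enumerates every interleaving that preserves each processor's program order, and collects the values P3's register can end up with. The names STEPS, interleavings and run are hypothetical.

```python
from itertools import permutations

# Atomic steps of Example 2. P2 and P3 each perform a read followed by a
# conditional action; splitting them this way is an assumption of this sketch.
STEPS = [("P1", 0), ("P2", 0), ("P2", 1), ("P3", 0), ("P3", 1)]


def interleavings():
    """Yield every global order that keeps each processor's program order."""
    for order in permutations(STEPS):
        if (order.index(("P2", 0)) < order.index(("P2", 1))
                and order.index(("P3", 0)) < order.index(("P3", 1))):
            yield order


def run(order):
    """Execute one interleaving against a single memory; return P3's register."""
    mem = {"A": 0, "B": 0}
    read = {}
    register = None  # None means P3 never executed "register = A"
    for step in order:
        if step == ("P1", 0):
            mem["A"] = 1
        elif step == ("P2", 0):
            read["P2"] = mem["A"]
        elif step == ("P2", 1) and read["P2"] == 1:
            mem["B"] = 1
        elif step == ("P3", 0):
            read["P3"] = mem["B"]
        elif step == ("P3", 1) and read["P3"] == 1:
            register = mem["A"]
    return register


outcomes = {run(order) for order in interleavings()}
print(outcomes)
```

Under these assumptions the register is either never assigned or holds 1; no sequentially consistent interleaving lets P3 observe B == 1 and then read A as 0.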
Writes done by a single processor are seen by all other processors in the order in
which they were written on that processor.
Writes from different processors may be seen in a different order by different
processors.
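The two rules above can be expressed as a check on what each observer sees. This is a minimal sketch under assumed names (issued, view, respects_writer_order are hypothetical): every observer may see its own interleaving of the writes, as long as the writes of any single processor stay in the order that processor issued them.

```python
def respects_writer_order(issued, view):
    """Return True if 'view' keeps each writer's writes in program order.

    issued: dict mapping a processor name to its writes in program order.
    view:   the order in which one observing processor saw the writes.
    """
    position = {write: i for i, write in enumerate(view)}
    for writes in issued.values():
        seen = [position[w] for w in writes if w in position]
        if seen != sorted(seen):
            return False
    return True


issued = {"P1": ["a1", "a2"], "P2": ["b1", "b2"]}

view_p3 = ["a1", "b1", "a2", "b2"]   # one observer's order
view_p4 = ["b1", "b2", "a1", "a2"]   # a different order, still allowed
view_bad = ["a2", "a1", "b1", "b2"]  # P1's writes swapped: not allowed

print(respects_writer_order(issued, view_p3))
print(respects_writer_order(issued, view_p4))
print(respects_writer_order(issued, view_bad))
```

Note that view_p3 and view_p4 disagree on how writes from P1 and P2 interleave, which sequential consistency would forbid but these rules permit.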
Weak consistency with two types of synchronization operations: acquire and release
Each type of operation is guaranteed to be processor consistent
DSM Algorithms
Server
Write request: update the data and send an acknowledgement to the client
Implementation
If an application's request to access shared data fails repeatedly, a failure condition is sent to the application
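A minimal sketch of this server and client exchange, assuming an in-process transport that may drop requests. The class names, the max_retries parameter and the flaky helper are hypothetical; the slides do not prescribe a retry count or a message format.

```python
class CentralServer:
    """Holds all shared data and services read and write requests."""

    def __init__(self):
        self.data = {}

    def handle(self, request):
        kind, addr, value = request
        if kind == "read":
            return ("value", self.data.get(addr))
        if kind == "write":
            self.data[addr] = value      # update the data
            return ("ack", None)         # acknowledge to the client
        raise ValueError(f"unknown request kind: {kind}")


class Client:
    """Forwards every access to the server and retries when no reply arrives."""

    def __init__(self, send, max_retries=3):
        self.send = send                 # callable: request -> reply or None
        self.max_retries = max_retries

    def _request(self, request):
        for _ in range(self.max_retries):
            reply = self.send(request)
            if reply is not None:
                return reply
        # Repeated failure: report a failure condition to the application.
        raise RuntimeError("shared data access failed")

    def read(self, addr):
        return self._request(("read", addr, None))[1]

    def write(self, addr, value):
        self._request(("write", addr, value))


def flaky(server, drops):
    """Transport that loses the first 'drops' requests (for illustration)."""
    remaining = {"n": drops}

    def send(request):
        if remaining["n"] > 0:
            remaining["n"] -= 1
            return None
        return server.handle(request)

    return send


server = CentralServer()
client = Client(flaky(server, drops=2))
client.write(16, 42)
print(client.read(16))
```

Retrying a write is harmless here because each write stores an absolute value; a protocol with non-idempotent requests would need some way to detect duplicates.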
Client
Possible solutions
Migration Request
Advantages
Takes advantage of the locality of reference
No communication costs are incurred when a process accesses data held locally
Data Block
DSM can be integrated with the virtual memory of the OS at each node:
The size of the block is chosen to be equal to a virtual memory page or a multiple thereof
A locally held shared memory page can be mapped into the application's virtual address space
Access to data items on data blocks not held locally triggers a page fault
When a data block is migrated away, it is removed from every local address space it was mapped into
Pages can thrash between hosts; to minimize thrashing, set a minimum time for data objects to reside at a node
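The migration behaviour described above can be sketched as follows. This is an illustration under stated assumptions, not the slides' implementation: time is a logical counter that ticks once per access, a non-local access stands in for a page fault, and MigrationDSM, RetryLater and min_residence are hypothetical names.

```python
class RetryLater(Exception):
    """Raised when a block has not yet stayed at its node for the minimum time."""


class MigrationDSM:
    def __init__(self, nodes, min_residence=3):
        self.store = {node: {} for node in nodes}  # node -> {block: data}
        self.holder = {}                           # block -> node holding it
        self.arrived = {}                          # block -> arrival time
        self.min_residence = min_residence
        self.clock = 0                             # logical time, one tick per access
        self.migrations = 0

    def create(self, block, node, data):
        self.store[node][block] = data
        self.holder[block] = node
        self.arrived[block] = self.clock

    def access(self, node, block):
        self.clock += 1
        current = self.holder[block]
        if current != node:                        # not held locally: "page fault"
            if self.clock - self.arrived[block] < self.min_residence:
                raise RetryLater(block)            # anti-thrashing rule
            data = self.store[current].pop(block)  # unmap at the old node
            self.store[node][block] = data         # map at the faulting node
            self.holder[block] = node
            self.arrived[block] = self.clock
            self.migrations += 1
        return self.store[node][block]             # local access, no communication


dsm = MigrationDSM(["n1", "n2"], min_residence=3)
dsm.create("b", "n1", "payload")
dsm.access("n1", "b")          # local hit
try:
    dsm.access("n2", "b")      # too early: block must stay at n1 a bit longer
except RetryLater:
    pass
dsm.access("n2", "b")          # now the block migrates to n2
print(dsm.holder["b"], dsm.migrations)
```

A larger min_residence reduces thrashing at the price of making the other node wait longer for the block.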
Multiple nodes can have read access, or one node can have write access
Block Request
IVY: the owner node of a data object knows all nodes that have copies
Data Block
Multicast invalidate
Advantages
The read-replication can lead to substantial performance improvements if the ratio of reads to writes is
large
Disadvantages
Write operations might be more expensive since replicas may have to be invalidated or updated to maintain
consistency
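A minimal sketch of read replication with write-invalidate, assuming whole-block writes and an owner that tracks a copy set. The names ReadReplicationDSM, copyset and invalidations are hypothetical, and the network is modelled as direct dictionary access.

```python
class ReadReplicationDSM:
    def __init__(self, nodes):
        self.cache = {node: {} for node in nodes}  # node -> {block: value}
        self.owner = {}                            # block -> owning node
        self.copyset = {}                          # block -> nodes holding a copy
        self.invalidations = 0

    def create(self, block, node, value):
        self.cache[node][block] = value
        self.owner[block] = node
        self.copyset[block] = {node}

    def read(self, node, block):
        if block not in self.cache[node]:          # read fault: fetch a copy
            owner = self.owner[block]
            self.cache[node][block] = self.cache[owner][block]
            self.copyset[block].add(node)          # owner records the new reader
        return self.cache[node][block]

    def write(self, node, block, value):
        # Invalidate every other copy before the write becomes visible.
        for other in self.copyset[block] - {node}:
            self.cache[other].pop(block, None)
            self.invalidations += 1
        self.owner[block] = node                   # the writer becomes the owner
        self.copyset[block] = {node}
        self.cache[node][block] = value


dsm = ReadReplicationDSM(["n1", "n2", "n3"])
dsm.create("b", "n1", 1)
dsm.read("n2", "b")
dsm.read("n3", "b")            # three nodes now hold read copies
dsm.write("n2", "b", 2)        # invalidates the copies at n1 and n3
print(dsm.read("n1", "b"), dsm.invalidations)
```

Reads after the first fault are served locally, which is where the benefit for read-heavy workloads comes from; each write pays one invalidation per outstanding copy.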
DSM Algorithms (contd.)
The Full-Replication Algorithm
sequencer
Multiple nodes have both read and write access to shared data blocks
write
update
Issues
Solution
Clients
The sequencer assigns a sequence number and sends the write request to all sites that have copies
A gap in sequence numbers indicates a missing write request: the node asks for retransmission of the missing write
requests
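The sequencer scheme can be sketched as below. This is an illustration under assumptions: the sequencer keeps a log so it can retransmit, replicas apply writes strictly in sequence order, and the names Sequencer, Replica, submit, retransmit and deliver are hypothetical.

```python
class Sequencer:
    """Assigns a global sequence number to every write request."""

    def __init__(self):
        self.log = []                       # kept so lost writes can be resent

    def submit(self, addr, value):
        message = (len(self.log), addr, value)
        self.log.append(message)
        return message                      # would be multicast to all replicas

    def retransmit(self, seq):
        return self.log[seq]


class Replica:
    """Holds a full copy of the shared data and applies writes in order."""

    def __init__(self, sequencer):
        self.sequencer = sequencer
        self.memory = {}
        self.expected = 0                   # next sequence number to apply
        self.retransmissions = 0

    def deliver(self, message):
        seq = message[0]
        if seq < self.expected:
            return                          # duplicate, already applied
        while self.expected < seq:          # gap: a write request is missing
            self._apply(self.sequencer.retransmit(self.expected))
            self.retransmissions += 1
        self._apply(message)

    def _apply(self, message):
        seq, addr, value = message
        self.memory[addr] = value
        self.expected = seq + 1


sequencer = Sequencer()
a, b = Replica(sequencer), Replica(sequencer)
m0 = sequencer.submit("x", 1)
m1 = sequencer.submit("x", 2)
m2 = sequencer.submit("y", 3)
for message in (m0, m1, m2):
    a.deliver(message)
b.deliver(m0)
b.deliver(m2)                               # m1 was lost on the way to b
print(a.memory == b.memory, b.retransmissions)
```

Because every replica applies the same writes in the same order, all copies converge to the same contents even when individual messages are lost.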
DSM Algorithms
Performance Measure
Parameters
p: cost of sending or receiving a short packet
P: cost of sending or receiving a data block; P/p is assumed to be 20
S:
r: read/write ratio
f: probability of an access fault on a
f':
Conclusion