Lecture 5 Slides
Peter Druschel
with thanks to Deepak Garg
Today’s lecture
● What consistency property do locks provide?
  – By themselves, none! Locks only synchronize access, not data!
● Locks coupled with additional mechanisms provide consistency:
  – Release consistency: locks + coupled data synchronization
  – Sequential consistency: locks + coupled data synchronization + correct use of locks
● Concepts:
  – Granularity of locks
  – Correct use of locks
  – Update propagation: eager, lazy
  – Release consistency
  – Relaxed memory (brief)
● Many ideas are directly relevant to your project!
Distributed Shared Memory
[Figure: machines with local memories connected by a network]
● Local STs and LDs on a machine access local memory.
● How to keep the memories consistent?
Sequential consistency (SC)
[Figure: machines with local memories connected by a network]
● There appears to be a total order on LDs and STs across machines that agrees with each machine's local order of events.
Sequential consistency (SC)
● There appears to be a total order on LDs and STs across machines that agrees with each machine's local order of events.
● Very strong, easy to program with …
● … but generally needs a lot of synchronization …
● … which is unattainable together with availability and partition tolerance (CAP theorem).
Ex. Implementation: Global lock
● One global lock for all addresses in memory.
● To access memory, a machine must:
  – Acquire the lock (wait until it is free)
  – Operate on local memory
  – Release the lock
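The global-lock recipe above can be sketched in a few lines. `GlobalLockDSM` and its dict-backed memory are illustrative stand-ins, not part of the lecture:

```python
import threading

class GlobalLockDSM:
    """Toy DSM with one global lock: sequentially consistent, but no concurrency.
    A single shared dict stands in for all machines' memories."""

    def __init__(self):
        self._lock = threading.Lock()   # the one global lock for all addresses
        self._memory = {}               # address -> value

    def load(self, addr):
        with self._lock:                # acquire, operate, release
            return self._memory.get(addr)

    def store(self, addr, value):
        with self._lock:
            self._memory[addr] = value

dsm = GlobalLockDSM()
dsm.store("x", 1)
print(dsm.load("x"))  # -> 1
```

The total order on loads and stores is exactly the order in which machines hold the lock; the cost is that only one machine makes progress at a time.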
Ex. Implementation: Global lock
● Attains SC. Why?
  – The total order is determined by the lock's movement.
● However, there's a big problem. What is it?
  – Only one machine can access memory at a time. No concurrency!
Variable lock granularity
[Figure: per-object locks over the machines' memories]
● One lock per (logical) object.
● An object is a unit of atomic update defined by the application: a set of locations with a shared invariant. It may span pages.
● Correct use of locks: one lock per object, and every access to an object is bracketed by acquire and release of that object's lock.
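Correct bracketing can be sketched as follows; the `LockedObject` wrapper and the bank-account object with its two-field invariant are invented for illustration:

```python
import threading

class LockedObject:
    """One lock per logical object (a set of fields with a shared invariant).
    Correct use: every access is bracketed by acquire/release of this lock."""
    def __init__(self, **fields):
        self.lock = threading.Lock()
        self.fields = dict(fields)

# Example object: an account whose invariant spans two locations.
account = LockedObject(balance=100, audit_total=100)

def deposit(obj, amount):
    with obj.lock:                          # acquire ... release brackets the access
        obj.fields["balance"] += amount
        obj.fields["audit_total"] += amount # invariant: balance == audit_total

deposit(account, 50)
print(account.fields["balance"])  # -> 150
```

Because both fields belong to one object under one lock, no other thread can observe a state where the invariant is broken mid-update.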
Variable lock granularity
[Figure: a sequentially inconsistent execution despite per-object locks]
● Is this (sequentially inconsistent) execution possible?
  – Yes!
● Why?
  – Local updates made under a lock may propagate only after the release.
Locks alone provide no consistency
[Figure: a correctly synchronized program in which P1 performs W(x, a) and W(x, b); a release-acquire dependency on x's lock connects P1 to the next acquirer; x denotes an object; update propagation happens at the lock transfer]
● Requirement for consistency: propagate updates when a lock is acquired by another party.
Update propagation timing
● Requirement for consistency: propagate the latest updates when a lock is acquired by another party.
● With proper locking, this implies SC.
● Two broad update propagation approaches:
  – Eager: push updates to all threads when releasing the lock.
    Why all threads? We don't know who will take the lock next!
  – Lazy: pull updates from the previous owner when acquiring the lock.
● Which one looks more efficient?
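Both approaches can be sketched with toy replicas. `Replica`, `release_eager`, and `acquire_lazy` are illustrative names; a real DSM tracks far more state:

```python
class Replica:
    """One machine's view of DSM memory (illustrative, not from the lecture)."""
    def __init__(self):
        self.copy = {}    # local copy of memory: address -> value
        self.dirty = {}   # updates made under the lock, not yet propagated

def release_eager(owner, all_replicas):
    """Eager: at release, push updates to every replica,
    since we don't know who will take the lock next."""
    for r in all_replicas:
        r.copy.update(owner.dirty)
    owner.dirty.clear()

def acquire_lazy(new_owner, prev_owner):
    """Lazy: at acquire, pull updates from the previous lock owner only."""
    new_owner.copy.update(prev_owner.dirty)

# Eager: one write is pushed to everyone at release.
a, b, c = Replica(), Replica(), Replica()
a.dirty["x"] = 1
release_eager(a, [b, c])      # b and c both see x = 1

# Lazy: a later write reaches only the next acquirer.
a.dirty["x"] = 2
acquire_lazy(b, a)            # only b sees x = 2; c still has x = 1
```

Lazy propagation sends updates only to replicas that actually take the lock, which is why it is usually the more efficient choice.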
Eager propagation (Example 1)
[Figure: at lock release, updates are pushed to all other threads]
Eager propagation (Example 2)
[Figure: P1 performs W(x, a) and W(x, b) but never releases the lock; no release, no propagation]
Lazy propagation (Example)
[Figure: at lock acquire, updates are pulled from the previous owner]
Lazy propagation in DSM
● Typically implemented in a language-transparent manner.
● The DSM is informed of lock acquires and releases.
● The DSM does not know which locks protect which objects.
● The DSM learns of reads and writes only at memory-page granularity.
● All threads on a given computer share physical memory.
Lazy propagation with >1 object
[Figure: the releaser has updated objects x and y under different locks; another party acquires one of the locks]
Q: Which updates should be propagated?
  1) Only those protected by the acquired lock, {x}, or
  2) All local changes, {x, y}?
Lock granularity vs page granularity
[Figure: objects x and y on pages r1 and r2; P1 performs W(y, a) then W(x, b); at the lock transfer, 1. the invalidate set is {r1} … 3. the acquirer pulls {x, y}]
Lazy propagation: Transitivity
[Figure: object x on page r1, object y on page r2; P1 writes y, then P2 (after acquiring from P1) writes x, then P3 acquires from P2 and performs R(x), R(y)]
Q: What is the correct invalidate set?
  1) Only pages updated by P2, {r1}, or
  2) Pages updated by P2 and P1 (that P2 knows of), {r1, r2}?
Lazy propagation: Transitivity
[Figure: same execution; P3's acquire must invalidate both r1 and r2]
Rule: After P2 acquires the lock from P1, P2 should know of all updates P1 had seen at the time of the acquire.
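The rule can be sketched as a transitively merged set of write notices. `Node` and its `known` set are illustrative names, loosely in the spirit of LRC's write notices:

```python
class Node:
    """One machine in a lazy-propagation DSM (illustrative sketch)."""
    def __init__(self, name):
        self.name = name
        self.known = set()   # write notices this node knows of: (page, writer)

    def write(self, page):
        self.known.add((page, self.name))

    def acquire_from(self, releaser):
        """Transitivity rule: learn everything the releaser had seen,
        not just the releaser's own writes."""
        self.known |= releaser.known

p1, p2, p3 = Node("P1"), Node("P2"), Node("P3")
p1.write("r2")            # P1 updates y, which lives on page r2
p2.acquire_from(p1)       # lock moves P1 -> P2
p2.write("r1")            # P2 updates x, which lives on page r1
p3.acquire_from(p2)       # lock moves P2 -> P3
print(sorted(pg for pg, _ in p3.known))  # -> ['r1', 'r2']  (P3's invalidate set)
```

Because P2 merged P1's notices at its own acquire, P3 learns of P1's write to r2 transitively, giving the correct invalidate set {r1, r2}.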
Locks + Update Propagation:
Summary
● Lock granularity defines the application's unit of atomic update.
  – One lock for every object.
● For consistency, the DSM transfers knowledge of all known updates at every lock transfer.
  – This implies SC if the app uses locks correctly.
● Actual update data can be propagated selectively and on demand.
(Lazy) Release consistency
● The consistency guarantee obtained from locks + update propagation can be defined formally.
● Release consistency (RC):1 the semantics of locks with eager propagation. (In full generality, without the assumption that the app uses locks correctly.)
● Lazy release consistency (LRC):2 the semantics of locks with lazy propagation.
● Both are weaker than SC: for SC, the app must additionally use locks correctly.
1,2 The actual definitions of RC and LRC are more complex.
RC and LRC are (slightly) different
for improperly synchronized programs
[Figure: an improperly synchronized execution with P1: W(x, a) and P2: W(x, b); eager propagation (RC) and lazy propagation (LRC) can yield different outcomes]
Example: Multi-writers on a page
(x and y lie on the same page r)

P1:            P2:            P3:
lock(lx)       lock(ly)       lock(lx)
for (…) {      for (…) {      lock(ly)
  x = x + 1      y = y + 1    print(x, y)
}              }              unlock(ly)
unlock(lx)     unlock(ly)     unlock(lx)

● Without diff-based updates (Ivy): this code thrashes; there is almost no parallel execution. Why?
  – Only one of P1, P2 can be allowed to write r at a time.
  – P3 can only read r after fetching the latest version.
  – The entire page r is transferred at every iteration.
Example: Multi-writers on a page
(x and y lie on the same page r)

P1:            P2:            P3:
lock(lx)       lock(ly)       lock(lx)
for (…) {      for (…) {      lock(ly)
  x = x + 1      y = y + 1    print(x, y)
}              }              unlock(ly)
unlock(lx)     unlock(ly)     unlock(lx)

● With diff-based updates (TreadMarks): P1 and P2 run in parallel as expected.
  – P1 and P2 update x and y in their own copies of the page.
  – P3 pulls the updated x from P1 and the updated y from P2. Due to the granularity of the diffs, the update of y from P2 does not overwrite the updated x from P1.
  – Only x and y are transferred to P3 when P3 reads them.
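The diff mechanism can be sketched with a pristine "twin" copy of each page, in the spirit of TreadMarks' twin/diff scheme. The byte layout (x at offset 0, y at offset 1) is invented for illustration:

```python
def make_twin(page):
    """Pristine copy taken before a node's first write to the page."""
    return bytearray(page)

def diff(twin, page):
    """Byte-granularity diff: the offsets where the page changed since the twin."""
    return {i: page[i] for i in range(len(page)) if page[i] != twin[i]}

def apply_diff(page, d):
    for i, b in d.items():
        page[i] = b

# Shared page r holds x at offset 0 and y at offset 1.
base = bytearray([0, 0])
p1_page, p2_page = bytearray(base), bytearray(base)
p1_twin, p2_twin = make_twin(p1_page), make_twin(p2_page)

p1_page[0] = 7            # P1 writes x in its own copy of r
p2_page[1] = 9            # P2 writes y in its own copy of r

p3_page = bytearray(base)
apply_diff(p3_page, diff(p1_twin, p1_page))   # pull P1's diff: {0: 7}
apply_diff(p3_page, diff(p2_twin, p2_page))   # pull P2's diff: {1: 9}
print(list(p3_page))      # -> [7, 9]; neither diff overwrote the other
```

Because each writer's diff covers only the bytes it actually changed, two writers on the same page compose cleanly at the reader.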
Ivy vs TreadMarks
Attribute                   | Ivy                            | TreadMarks
----------------------------+--------------------------------+---------------------------------
Lock granularity            | Page                           | Application defined (object)
Same-page concurrency       | Multi-reader (strong inv.)     | Multi-reader, multi-writer
Lock acquisition trigger    | Page fault                     | Application defined (rel, acq)
Update propagation method   | Lazy                           | Lazy (all updates), data on
                            |                                | demand
Update data propagation     | Page                           | Individual bytes on a page
granularity                 |                                | (page diffs)
Update propagation trigger  | List of modified pages: N/A    | List of modified pages: lock
                            | (exactly the faulting page);   | acquire; modification data:
                            | modification data: page fault  | page fault (on later
                            | on acquire                     | read/write)
Consistency model           | Strong page coherence          | LRC. If app uses locks
                            | (per-page sequential           | correctly, then SC at object
                            | consistency)                   | granularity.
Aside: What happened to DSM?
● DSM developed synchronization techniques for different consistency models, which are broadly useful.
● The DSM abstraction is useful for porting existing applications to a distributed setting.
● Other distributed programming models, like MapReduce or message passing, are more widely used for applications developed today.
Aside: Relaxed consistency
● A relaxed memory model is obtained by relaxing the requirements of SC.
● It may allow reordering of accesses on a processor.
● It typically assumes some synchronization mechanism, e.g., locks or barriers.
● Different relaxations are possible; they yield different consistency models.
● RC and LRC are two examples.
Relaxed execution example
[Figure: an execution in which accesses 1, 3, 2 complete out of program order]
● Relaxed memory is common in microprocessors.
  – CPU-local caches, memory
  – Simplifies processor implementation, higher throughput
Summary
● Relaxed consistency models allow for more efficient, more concurrent, more flexible programs
  – … but rely on the programmer to synchronize correctly.
● (Lazy) release consistency (RC/LRC): the consistency obtained by propagating all known updates at every lock exchange.
● Sequential consistency (SC): LRC + correct lock use by the app.
● Future lectures:
  – Causal consistency
  – Eventual consistency