DS CH6 - Consistency and Replication
Introduction to Distributed Systems
Text Book: Andrew S. Tanenbaum and Maarten Van Steen, Distributed Systems: Principles and Paradigms, 2nd Ed.
Chapter Outline
Reasons for Replication
Two major reasons: reliability and performance
• Reliability
– If a file is replicated, we can switch to another replica when the one we are using crashes
– Replicas also give better protection against corrupted data, similar to mirroring in non-distributed systems
• Performance
– Replication helps when the system has to scale in size and in geographical area
– Placing a copy of the data close to the processes that use it reduces access time and improves performance; for example, a Web server that is accessed by thousands of clients from all over the world
• Caching is strongly related to replication; it is normally done by clients
Replication as Scaling Technique
• Replication and caching are widely applied as scaling
techniques
– processes can use local copies and limit access time and
traffic. However, we need to keep the copies consistent;
but this may
Consistency Models
A Consistency Model is a contract between processes
and the data store.
…Cont’d
Data-Centric Consistency Models
– Strict consistency
– Sequential consistency
– Linearizability
– Causal consistency
– FIFO consistency
– Weak consistency
– Release consistency
– Entry consistency
Summary of Data-Centric Consistency Models
(a) Consistency models not using synchronization operations
Consistency       Description
Strict            Absolute time ordering of all shared accesses matters.
Linearizability   All processes must see all shared accesses in the same order. Accesses are furthermore ordered according to a (nonunique) global timestamp.
Sequential        All processes see all shared accesses in the same order. Accesses are not ordered in time.
Causal            All processes see causally-related shared accesses in the same order.
FIFO              All processes see writes from each other in the order they were issued. Writes from different processes may not always be seen in that order.
(b) Consistency models with synchronization operations
Consistency       Description
Weak              Shared data can be counted on to be consistent only after a synchronization is done.
Release           Shared data are made consistent when a critical region is exited.
Entry             Shared data pertaining to a critical region are made consistent when a critical region is entered.
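The difference between the sequential and FIFO rows can be made concrete with a small check. The following is only a simplified sketch that looks at writes alone: each write is assumed to be a (writer, sequence number) pair, and a "view" is the order in which one process observed the writes. The function names are illustrative and do not come from the textbook.

```python
# Simplified sketch: each write is a (writer, sequence_number) pair, and a
# "view" is the order in which one process observed the writes.
# All names here are illustrative assumptions, not from the textbook.

def fifo_respected(views):
    """True if every view shows each writer's writes in issue order."""
    for view in views:
        last_seq = {}
        for writer, seq in view:
            if seq <= last_seq.get(writer, 0):
                return False
            last_seq[writer] = seq
    return True


def same_order_everywhere(views):
    """True if all processes observed the writes in one identical order."""
    return all(view == views[0] for view in views)


def sequential_like(views):
    """Writes-only approximation of sequential consistency: one agreed order
    that also respects each writer's issue order."""
    return same_order_everywhere(views) and fifo_respected(views)


# P1 issues writes 1 and 2; P2 issues write 1.
view_p3 = [("P1", 1), ("P2", 1), ("P1", 2)]
view_p4 = [("P2", 1), ("P1", 1), ("P1", 2)]

print(fifo_respected([view_p3, view_p4]))   # True
print(sequential_like([view_p3, view_p4]))  # False
```

Here P3 and P4 disagree on where P2's write falls, which FIFO consistency permits but sequential consistency does not.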
…Cont’d
Client-Centric Consistency Models
– Eventual Consistency
• In many applications, few processes (or a single process) update the data while many read it, so there are no write-write conflicts; we only need to handle read-write conflicts, e.g., a DNS server or a Web site
• For such applications, it is even acceptable for readers to see old versions of the data (e.g., cached versions of a Web page) until the new version is propagated
• With eventual consistency, it is only required that updates are guaranteed to gradually propagate to all replicas
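A minimal sketch of this behaviour, assuming a single writer that tags each update with an increasing version number and a hypothetical `anti_entropy_round` helper that copies newer entries between replicas. The class and function names are invented for illustration; real systems propagate updates in many different ways.

```python
# Sketch of eventual consistency with a single writer (no write-write
# conflicts). Replica and anti_entropy_round are hypothetical names.

class Replica:
    def __init__(self, name):
        self.name = name
        self.store = {}  # key -> (version, value)

    def write(self, key, value, version):
        """Keep the entry only if it is newer than what we already hold."""
        current = self.store.get(key)
        if current is None or version > current[0]:
            self.store[key] = (version, value)

    def read(self, key):
        entry = self.store.get(key)
        return None if entry is None else entry[1]

    def sync_from(self, other):
        """Pull every entry from another replica, keeping the newer versions."""
        for key, (version, value) in other.store.items():
            self.write(key, value, version)


def anti_entropy_round(replicas):
    """Each replica pulls from every other replica once."""
    for target in replicas:
        for source in replicas:
            if target is not source:
                target.sync_from(source)


replicas = [Replica("a"), Replica("b"), Replica("c")]
replicas[0].write("page", "v1", 1)
anti_entropy_round(replicas)

replicas[0].write("page", "v2", 2)   # update reaches one replica only
stale = replicas[2].read("page")     # a reader may still see the old version
anti_entropy_round(replicas)
fresh = replicas[2].read("page")     # after propagation all replicas agree
print(stale, fresh)
```

Between the write and the next propagation round, a reader at replica "c" sees the old version, which is exactly what eventual consistency tolerates.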
…Cont’d
Client-Centric Consistency Models
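1. Monotonic reads
• If a process reads the value of a data item, any later read of that item by the same process returns the same value or a more recent one.
• Example: reading (not modifying) incoming mail while you are on the move. Each time you connect to a different e-mail server, that server fetches (at least) all the updates from the server you previously visited.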
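The e-mail example can be sketched as follows. This is a toy model under stated assumptions: `MailServer`, `MobileClient`, and `fetch_updates_from` are hypothetical names, and a real system would track read sets or version vectors rather than copy whole mailboxes.

```python
# Sketch of monotonic reads for the mobile e-mail example.
# MailServer and MobileClient are hypothetical, illustrative classes.

class MailServer:
    def __init__(self, name):
        self.name = name
        self.messages = []  # message ids in arrival order

    def deliver(self, message):
        if message not in self.messages:
            self.messages.append(message)

    def fetch_updates_from(self, other):
        """Bring in every message the other server has that we lack."""
        for message in other.messages:
            self.deliver(message)


class MobileClient:
    def __init__(self):
        self.last_server = None  # server used for the previous read

    def read_inbox(self, server):
        # Before serving the read, the new server catches up with the
        # server this client visited previously.
        if self.last_server is not None and self.last_server is not server:
            server.fetch_updates_from(self.last_server)
        self.last_server = server
        return list(server.messages)


home, away = MailServer("home"), MailServer("away")
home.deliver("m1")
home.deliver("m2")
away.deliver("m3")

client = MobileClient()
first = client.read_inbox(home)
second = client.read_inbox(away)
print(first, second)
```

The second read, served by a different server, never shows less than the first one did.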
…Cont’d
2. Monotonic write
…Cont’d
3. Read your writes
…Cont’d
4. Writes follow reads:
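• Updates are propagated as the result of previous read operations: a write by a process that follows its own read of a data item takes place on the same or a more recent value of that item.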
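One way to picture this guarantee is a client session that remembers which writes it has read and makes any replica apply them before accepting a new write. The sketch below is an assumption-laden illustration: `Replica`, `Session`, and the write identifiers are invented names, and a list lookup stands in for real version tracking.

```python
# Sketch of writes-follow-reads using a per-client read set.
# Replica and Session are hypothetical, illustrative classes.

class Replica:
    def __init__(self, name):
        self.name = name
        self.log = []  # ids of applied writes, in order

    def apply(self, write_id):
        if write_id not in self.log:
            self.log.append(write_id)


class Session:
    def __init__(self):
        self.read_set = []  # writes whose effects this client has read

    def read(self, replica):
        for write_id in replica.log:
            if write_id not in self.read_set:
                self.read_set.append(write_id)
        return list(replica.log)

    def write(self, replica, write_id):
        # The replica first applies everything this client has read,
        # so the new write lands on the same or a more recent state.
        for seen in self.read_set:
            replica.apply(seen)
        replica.apply(write_id)


r1, r2 = Replica("r1"), Replica("r2")
r1.apply("post")            # an article exists only at r1

session = Session()
session.read(r1)            # the client reads the article at r1
session.write(r2, "reply")  # then posts a reply at r2
print(r2.log)
```

Any reader of `r2` now sees the reply only together with the article it responds to.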
Consistency Protocols
1. Primary-based Protocols
2. Replicated-write Protocols
3. Cache-coherence Protocols
Quiz 1 (5%)