Consistency and
Replication
Chapter 7
Reasons for Replication
Replication and Scalability
• Replication is a widely-used scalability technique: think of
Web clients and Web proxies.
• When systems scale, the first problems to surface are those
associated with performance – as systems get bigger
(e.g., more users), they often get slower.
• Replicating the data and moving it closer to where it is
needed helps to solve this scalability problem.
• A problem remains: how to efficiently synchronize all of
the replicas created to solve the scalability issue?
• Dilemma: adding replicas improves scalability, but incurs
the (oftentimes considerable) overhead of keeping the
replicas up-to-date!!!
• As we shall see, the solution often results in a relaxation of
the consistency constraints.
Replication and Consistency
• But if there are many replicas of the same thing,
how do we keep all of them up-to-date? How
do we keep the replicas consistent?
• Consistency can be achieved in a number of
ways. We will study a number of consistency
models, as well as protocols for implementing
the models.
• So, what’s the catch?
– It is not easy to keep all those replicas consistent.
Data-centric Consistency Models
Strict Consistency
Causal Consistency (1)
Causal Consistency (2)
Weak Consistency Properties
• The three properties of Weak Consistency:
1. Accesses to synchronization variables associated
with a data-store are sequentially consistent.
2. No operation on a synchronization variable is
allowed to be performed until all previous writes
have been completed everywhere.
3. No read or write operation on data items is allowed
to be performed until all previous operations on
synchronization variables have been performed.
Weak Consistency: What It Means
• So …
• By performing a synchronization, a process can force the value
it has just written out to all the other replicas.
• Likewise, by synchronizing before a read, a process can be sure
it is getting the most recently written value.
• In essence, the weak consistency models enforce consistency on a
group of operations, as opposed to individual reads and writes
(as is the case with strict, sequential, causal and FIFO
consistency).
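A minimal sketch of this idea (not from the slides; the WeakStore class, its sync() method, and the one-replica-per-process simplification are all assumptions made for illustration): writes stay local until the process synchronizes, at which point they are pushed to every replica.

```python
# Toy model of weak consistency: process i uses replica i as its local copy.
class WeakStore:
    def __init__(self, num_replicas=3):
        self.replicas = [{} for _ in range(num_replicas)]        # item -> value
        self.pending = {i: {} for i in range(num_replicas)}      # buffered local writes

    def write(self, proc, item, value):
        # a write is applied only locally; other replicas see it after sync()
        self.pending[proc][item] = value
        self.replicas[proc][item] = value

    def read(self, proc, item):
        # may return a stale value if no synchronization was done
        return self.replicas[proc].get(item)

    def sync(self, proc):
        # all previous writes by this process are completed everywhere
        for item, value in self.pending[proc].items():
            for replica in self.replicas:
                replica[item] = value
        self.pending[proc].clear()

store = WeakStore()
store.write(proc=0, item="x", value=42)
print(store.read(proc=1, item="x"))   # None: replica 1 has not seen the write yet
store.sync(proc=0)                    # force the written value out to all replicas
print(store.read(proc=1, item="x"))   # 42
```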
Grouping Operations
• Necessary criteria for correct synchronization:
• An acquire access of a synchronization variable is not allowed
to perform with respect to a process until all updates to the
guarded shared data have been performed with respect to that
process.
• Before an exclusive mode access to a synchronization variable
by a process is allowed to perform with respect to that process,
no other process may hold the synchronization variable, not even
in nonexclusive mode.
• After an exclusive mode access to a synchronization variable has
been performed, any other process’ next nonexclusive mode
access to that synchronization variable may not be performed
until it has performed with respect to that variable’s owner.
Entry Consistency
• At an acquire, all remote changes to guarded data
must be brought up to date.
• Before a write to a data item, a process must ensure
that no other process is trying to write at the same time.
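A rough sketch of entry consistency (illustrative only; the GuardedItem class and its acquire/release methods are invented for this example): each data item has its own synchronization variable, an acquire pulls in the owner's latest copy, and exclusive access is required before writing.

```python
import threading

class GuardedItem:
    """One synchronization variable guarding one data item."""
    def __init__(self, value=None):
        self.lock = threading.Lock()   # stands in for the synchronization variable
        self.owner_value = value       # the owner's (authoritative) copy

    def acquire(self, local_cache, name, exclusive=False):
        # at an acquire, all remote changes to the guarded data are brought in
        self.lock.acquire()
        local_cache[name] = self.owner_value
        self.exclusive = exclusive     # exclusive mode => the caller intends to write

    def release(self, local_cache, name):
        # after an exclusive access, push the local update back to the owner
        if self.exclusive:
            self.owner_value = local_cache[name]
        self.lock.release()

guard_x = GuardedItem(value=0)
cache = {}                                     # this process's local copies

guard_x.acquire(cache, "x", exclusive=True)    # no other writer can hold the guard
cache["x"] += 1                                # safe: exclusive access to x
guard_x.release(cache, "x")                    # update visible to the next acquirer
```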
More Client-Centric Consistency
• How fast should updates (writes) be made
available to read-only processes?
– Think of most database systems: mainly read.
– Think of the DNS: write-write conflicts do not
occur, only read-write conflicts.
– Think of WWW: as with DNS, except that heavy
use of client-side caching is present: even the
return of stale pages is acceptable to most users.
• These systems all exhibit a high degree of
acceptable inconsistency … with the replicas
gradually becoming consistent over time.
Toward Eventual Consistency
• The only requirement is that all replicas will
eventually be the same.
• All updates must be guaranteed to propagate to
all replicas … eventually!
• This works well if every client always updates
the same replica.
• Things become a little more difficult if the clients are
mobile.
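A toy illustration of eventual consistency (the gossip_round function is an assumption, a simple anti-entropy pass rather than a specific protocol from the slides): replicas exchange timestamped values and keep the newest, so once updates stop, all copies converge.

```python
import itertools

def gossip_round(replicas):
    # every pair of replicas exchanges state; the newest write wins
    for a, b in itertools.combinations(replicas, 2):
        for item in set(a) | set(b):
            va = a.get(item, (None, -1))       # (value, timestamp)
            vb = b.get(item, (None, -1))
            newest = va if va[1] >= vb[1] else vb
            a[item] = b[item] = newest

# three replicas; a client updated "x" at replica 0 only (timestamp 1)
replicas = [{"x": ("new", 1)}, {"x": ("old", 0)}, {"x": ("old", 0)}]
gossip_round(replicas)
print([r["x"][0] for r in replicas])   # ['new', 'new', 'new'] - the replicas converged
```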
Eventual Consistency
Monotonic Writes (1)
• In a monotonic-write consistent store, the following
condition holds:
– A write operation by a process on a data item x is
completed before any successive write operation on x
by the same process.
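A small sketch of how a replica might enforce this guarantee (illustrative; the Replica class and the per-client write log are assumptions): before applying a client's n-th write, the replica first replays any earlier writes by that client it has not yet seen.

```python
class Replica:
    def __init__(self):
        self.data = {}
        self.applied = {}            # client_id -> how many of its writes are applied here

    def apply_write(self, client_id, seq, log):
        # complete all of this client's earlier writes before write number `seq`
        done = self.applied.get(client_id, 0)
        for n in range(done, seq):   # replay writes done+1 .. seq (log is 0-indexed)
            item, value = log[n]
            self.data[item] = value
        self.applied[client_id] = seq

# client C1 issues two writes, then (e.g. after moving) contacts a fresh replica
log_c1 = [("x", 1), ("x", 2)]            # C1's writes, in order
r = Replica()
r.apply_write("C1", seq=2, log=log_c1)   # replica replays write 1 before write 2
print(r.data["x"])                       # 2 - earlier writes completed first
```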
Monotonic Writes (2)
Read Your Writes (2)
(b) A data store that does not provide read-your-writes consistency
Writes Follow Reads (1)
(a) A writes-follow-reads consistent data store
Writes Follow Reads (3)
Replica Placement Types
• There are three types of replica:
1. Permanent replicas: tend to be small in number,
organized as COWs (Clusters of Workstations) or
mirrored systems.
2. Server-initiated replicas: used to enhance
performance, created at the initiative of the owner of
the data-store. Typically used by web hosting
companies to place replicas geographically close
to where they are needed most. (Often referred to
as “push caches”.)
3. Client-initiated replicas: created as a result of client
requests – think of browser caches (see the sketch after
this list). Works well assuming, of course, that the
cached data does not go stale too soon.
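A minimal sketch of a client-initiated replica (illustrative; the ClientCache class and its TTL policy are assumptions, not a specific browser mechanism): entries are served locally until they are considered stale, then refetched from the origin server.

```python
import time

class ClientCache:
    def __init__(self, fetch, ttl_seconds=60.0):
        self.fetch = fetch               # function that contacts the origin server
        self.ttl = ttl_seconds           # how long a cached entry is considered fresh
        self.entries = {}                # key -> (value, time_cached)

    def get(self, key):
        hit = self.entries.get(key)
        if hit is not None and time.time() - hit[1] < self.ttl:
            return hit[0]                # fresh enough: serve the local replica
        value = self.fetch(key)          # stale or missing: go back to the origin
        self.entries[key] = (value, time.time())
        return value

cache = ClientCache(fetch=lambda url: f"<page for {url}>", ttl_seconds=5.0)
print(cache.get("example.org/index.html"))   # fetched from the origin
print(cache.get("example.org/index.html"))   # served from the client-side replica
```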
Server-Initiated Replicas
Primary-Based Protocols
Remote-Write Protocols
Local-Write Protocols Example
Local-Write Protocols
Active Replication
• A special process carries out the update
operations at each replica.
• Lamport’s timestamps can be used to achieve
total ordering, but this does not scale well
within distributed systems.
• An alternative is to use a sequencer: a process that assigns a
unique sequence number to each update, which is then
propagated to all replicas (see the sketch after this list).
• This can lead to another problem: replicated
invocations.
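A small sketch of the sequencer approach (illustrative names, not the slides' implementation): the sequencer numbers each update, and replicas buffer out-of-order updates so that every replica applies the same total order.

```python
import heapq

class Sequencer:
    def __init__(self):
        self.next_id = 0

    def assign(self, update):
        self.next_id += 1
        return (self.next_id, update)    # (sequence number, operation)

class Replica:
    def __init__(self):
        self.state = []
        self.expected = 1
        self.pending = []                # min-heap of updates that arrived early

    def deliver(self, numbered_update):
        heapq.heappush(self.pending, numbered_update)
        # apply updates strictly in sequence-number order
        while self.pending and self.pending[0][0] == self.expected:
            _, update = heapq.heappop(self.pending)
            self.state.append(update)
            self.expected += 1

seq = Sequencer()
u1, u2 = seq.assign("set x=1"), seq.assign("set x=2")
r = Replica()
r.deliver(u2)        # arrives early: buffered until update 1 is applied
r.deliver(u1)        # now both are applied, in sequence order
print(r.state)       # ['set x=1', 'set x=2'] - the same order at every replica
```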
Replicated Invocations: The Problem
Quorum-Based Protocols
• With N replicas, a read quorum of NR servers and a write quorum
of NW servers must satisfy:
– NR + NW > N (prevents read-write conflicts)
– NW > N/2 (prevents write-write conflicts)
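A tiny worked check of the two constraints (the valid_quorum helper and the example quorum sizes are illustrative): any read/write quorum pair that satisfies both conditions guarantees that every read overlaps the latest write and that writes overlap each other.

```python
def valid_quorum(n, nr, nw):
    # NR + NW > N rules out read-write conflicts; NW > N/2 rules out write-write conflicts
    return nr + nw > n and nw > n / 2

N = 12
print(valid_quorum(N, nr=3, nw=10))   # True:  3 + 10 > 12 and 10 > 6
print(valid_quorum(N, nr=7, nw=6))    # False: 6 is not > 6, so two writes might not overlap
print(valid_quorum(N, nr=1, nw=12))   # True:  read-one / write-all
```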