7 Consistency
Distributed Systems
Reasons for Replication
• Data are replicated for
– the reliability of the system
• Servers are replicated for performance
– Scaling in numbers
– Scaling in geographical area
• Dilemma
– Gain in performance
– Cost of maintaining replication
• Keep the replicas up to date and ensure consistency
Data-Centric Consistency Models (1)
• Consistency is often discussed in the context of read and write operations on
– shared memory, shared databases, shared files
Continuous Consistency
• Defines three independent axes of inconsistency
– Deviation in numerical values between replicas
• E.g., the number and the values of updates
Measuring Inconsistency: An Example
Conit Granularity
Sequential Consistency
• The symbols for read and write operations
Causal Consistency
• For a data store to be considered causally consistent, it
is necessary that the store obeys the following condition
– Writes that are potentially causally related
• must be seen by all processes in the same order.
– Concurrent writes
• may be seen in a different order on different machines.
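The distinction between causally related and concurrent writes can be sketched in code. The slides do not say how causality is tracked; the sketch below assumes vector clocks, one common technique, and the names `happened_before`, `concurrent`, and the sample clocks are hypothetical.

```python
# Assumption: each write carries a vector clock with one counter per process.

def happened_before(a, b):
    """True if the write stamped a potentially causes the write stamped b."""
    return a != b and all(x <= y for x, y in zip(a, b))

def concurrent(a, b):
    """True if neither write can have influenced the other."""
    return a != b and not happened_before(a, b) and not happened_before(b, a)

# Two processes P1 and P2; clocks are (P1 counter, P2 counter).
w1 = (1, 0)  # P1 writes x
w2 = (1, 1)  # P2 writes x after reading w1, so w1 potentially causes w2
w3 = (2, 0)  # P1 writes again without having seen w2
```

Under this sketch every process must see w1 before w2, while w2 and w3 are concurrent and may be seen in either order.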
Another Example
Grouping Operations
• Sequential and causal consistency are defined at the
level of read and write operations
– However, in practice, such granularity does not match
the granularity provided by the application
• Concurrency is often controlled by synchronization methods
such as mutual exclusion and transactions
Entry Consistency
• It requires
– the programmer to use acquire and release at the start
and end of each critical section, respectively.
– each ordinary shared variable to be associated with some
synchronization variable
Consistency vs. Coherence
• Consistency deals with a set of processes operating
on
– a set of data items (they may be replicated)
– This set is consistent if it adheres to the rules defined
by the model
• Coherence deals with a set of processes operating on
– a single data item that is replicated at many places
– It is coherent if all copies abide by the rules defined
by the model
Eventual Consistency
• In many distributed systems such as DNS and World Wide Web,
– updates on shared data can only be done by one or a small group of
processes
– most processes only read shared data
– a high degree of inconsistency can be tolerated
• Eventual consistency
– If no updates take place for a long time, all replicas will gradually
become consistent
– Clients rarely notice inconsistencies as long as they always access the same replica
• However, in some cases, clients may access different replicas
– E.g., a mobile user moves to a different location
• Client-centric consistency:
– Guarantee the consistency of access for a single client
Monotonic-Read Consistency
• A data store is said to provide monotonic-read
consistency if the following condition holds:
– If a process reads the value of a data item x, then
– any successive read operation on x by that process
will always return
• that same value or
• a more recent value
• In other words,
– if a process has seen a value of x at time t, it will
never see an older version of x at any later time
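A minimal sketch of how a client could enforce monotonic reads on its own, assuming each replica exposes a version number for x. The classes `Replica` and `ReaderSession` are hypothetical and not part of the slides.

```python
class Replica:
    """A local copy of x with a version number (assumed to grow with each write)."""
    def __init__(self, version, value):
        self.version = version
        self.value = value

class ReaderSession:
    """Tracks the newest version of x this process has read so far."""
    def __init__(self):
        self.last_read_version = 0

    def read(self, replica):
        # Refuse replicas that are older than what was already seen.
        if replica.version < self.last_read_version:
            raise RuntimeError("replica is older than a value already read")
        self.last_read_version = replica.version
        return replica.value
```

A real store would more likely bring the stale replica up to date before answering rather than reject the read; the rejection only makes the condition visible.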
An Example
• Notations
– xi[t]: the version of x at local copy Li at time t
– WS(xi[t]): the set of all writes at Li on x since initialization
Monotonic-Write Consistency
• In a monotonic-write consistent store, the
following condition holds:
– A write operation by a process on a data item x is
completed before
• any successive write operation on x by the same process.
• In other words
– A write on a copy of x is performed only if this
copy is brought up to date by means of
• any preceding write on x, which may take place at other
copies, by the same process
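The "brought up to date" condition can be sketched as follows, assuming the client remembers its own writes on x and replays any that a copy is missing. `Replica`, `WriterSession`, and the write identifiers are hypothetical names for illustration.

```python
class Replica:
    """A copy of x, represented by the ordered log of writes applied to it."""
    def __init__(self):
        self.log = []

    def apply(self, write):
        if write not in self.log:
            self.log.append(write)

class WriterSession:
    """Remembers this process's writes on x in the order they were issued."""
    def __init__(self):
        self.my_writes = []

    def write(self, replica, write):
        # Bring the copy up to date with all preceding writes by this process,
        # wherever they originally took place, before applying the new one.
        for earlier in self.my_writes:
            replica.apply(earlier)
        replica.apply(write)
        self.my_writes.append(write)
```

If the process writes w1 at copy L1 and then w2 at copy L2, the log of L2 ends up as w1 followed by w2, so the order of the process's writes is preserved.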
An Example
Read-Your-Writes Consistency
• A data store is said to provide read-your-writes
consistency, if the following condition holds:
– The effect of a write operation by a process on
data item x
• will always be seen by a successive read operation on x
by the same process.
• In other words,
– A write operation is always completed before a
successive read operation by the same process
• no matter where that read takes place
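Expressed with write sets, the condition is a subset test: a replica may serve a read only if it has already applied every write the process issued. The function name `can_serve_read` and the write identifiers are assumptions made for this sketch.

```python
def can_serve_read(replica_writes, session_writes):
    """True if the replica has applied all writes issued by this process."""
    return set(session_writes) <= set(replica_writes)

# The process wrote w1 earlier and now reads at two different replicas.
session = ["w1"]
replica_a = ["w0", "w1", "w2"]  # has seen w1: the read is safe here
replica_b = ["w0", "w2"]        # has not seen w1: reading here would violate the guarantee
```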
An Example
Writes-Follow-Reads Consistency
• A data store is said to provide writes-follow-reads
consistency, if the following holds:
– A write operation by a process on a data item x following
a previous read operation on x by the same process
• is guaranteed to take place on the same value of x that was read,
or on a more recent one.
• In other words,
– Any successive write operation by a process on a data
item x will be performed on a copy of x that
• is up to date with the value most recently read by that process
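A minimal sketch of the check, assuming each copy of x carries a version number and the process remembers the version it last read. The function `write_after_read` and the dictionary layout are hypothetical.

```python
def write_after_read(replica, version_read, new_value):
    """Apply a write only on a copy at least as recent as the value read.

    replica is a dict with 'version' and 'value' keys (an assumed layout).
    Returns the version produced by the write.
    """
    if replica["version"] < version_read:
        raise RuntimeError("copy has not yet seen the value this process read")
    replica["version"] += 1
    replica["value"] = new_value
    return replica["version"]
```

For example, a reply to a posting should only be written to a copy that already holds the posting that was read.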
An Example
A writes-follow-reads consistent
data store.
Replica Management
• Two key issues for distributed systems that
support replication
– Where, when, and by whom replicas should be
placed? Divided into two subproblems:
• Replica server placement: finding the best location
to place a server that can host a data store
• Content placement: find the best server for placing
content
Content Replication and Placement
Server-Initiated Replicas
• Observe the client access pattern and dynamically add or remove
replicas to improve the performance
• One example algorithm
– Count the access requests for file F, per client and per combining point
– If the number of requests drops significantly, delete the replica of F
– If many requests come from one combining point, replicate F at that point
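The example algorithm above can be sketched as a decision function. The two thresholds `delete_below` and `replicate_above` are hypothetical parameters; the slides only say "drops significantly" and "a lot of requests" without giving values.

```python
from collections import Counter

def plan_replica(requests, delete_below, replicate_above):
    """Decide what a server should do with its replica of file F.

    requests: one entry per access to F, naming the combining point it came from.
    Returns ("delete", None), ("replicate", point), or ("keep", None).
    """
    counts = Counter(requests)
    total = sum(counts.values())
    if total == 0 or total < delete_below:
        return ("delete", None)          # demand has dropped: remove the replica
    point, hits = counts.most_common(1)[0]
    if hits > replicate_above:
        return ("replicate", point)      # one point dominates: copy F there
    return ("keep", None)
```

For instance, `plan_replica(["P"] * 8 + ["Q"] * 2, 3, 5)` suggests replicating F at combining point P.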
Client-Initiated Replicas
• Mainly deals with client cache,
– i.e., a local storage facility that is used by a client
to temporarily store a copy of the data it has just
requested
• The cached data may be outdated
– Let the client check the version of the data
• Multiple clients may use the same cache
– Data requested by one client may be useful to other
clients as well, e.g., DNS look-up
– This can also improve the chance of cache hit
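A minimal sketch of a cache shared by several clients that checks the version of an entry before using it. The classes `Origin` and `SharedCache` and their methods are hypothetical; a real system such as DNS uses expiry times rather than a version check on every read.

```python
class Origin:
    """The server holding the authoritative data, with a version per key."""
    def __init__(self):
        self.items = {}  # key -> (version, value)

    def put(self, key, value):
        version = self.items[key][0] + 1 if key in self.items else 1
        self.items[key] = (version, value)

    def version_of(self, key):
        return self.items[key][0]

    def fetch(self, key):
        return self.items[key]

class SharedCache:
    """A cache used by several clients; entries are validated by version."""
    def __init__(self, origin):
        self.origin = origin
        self.entries = {}
        self.hits = 0

    def read(self, key):
        cached = self.entries.get(key)
        if cached is not None and cached[0] == self.origin.version_of(key):
            self.hits += 1               # cached copy is still current
            return cached[1]
        self.entries[key] = self.origin.fetch(key)  # missing or outdated
        return self.entries[key][1]
```

A second client reading the same key through the same cache gets a hit, which is the benefit of sharing noted above.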
Content Distribution
• Deals with the propagation of updates to all
relevant replicas
• Two key questions
– What to propagate (state vs. operations)
• Propagate only a notification of an update.
• Transfer data from one copy to another.
• Propagate the update operation to other copies.
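The three options differ in what a replica receives and what it must do with it. The sketch below contrasts them on a single copy; the class `Copy` and its handler names are hypothetical.

```python
class Copy:
    """One replica of a data item, reacting to three kinds of propagation."""
    def __init__(self, value):
        self.value = value
        self.valid = True

    def on_invalidate(self):
        # Notification only: the copy learns it is outdated, nothing more.
        self.valid = False

    def on_state(self, new_value):
        # Data transfer: the new state replaces the old one.
        self.value = new_value
        self.valid = True

    def on_operation(self, operation):
        # Operation shipping: the copy recomputes the new state itself.
        self.value = operation(self.value)
        self.valid = True
```

A notification is the smallest message but leaves the copy unusable until it fetches the data; shipping the operation keeps messages small at the price of computation at every replica.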
Continuous Consistency Protocols (1)
• Bounding numerical deviation
– The number of unseen updates, the absolute numerical value, or
the relative numerical value
– E.g., the value of a local copy of x will never deviate from the
real value of x by more than a threshold
• Here we focus on the number of unseen updates
– i.e., the total number of unseen updates to a server shall never
exceed a threshold δ
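A minimal sketch of bounding the number of unseen updates, simplified to a single writing server that forwards its updates whenever a peer would otherwise fall more than δ updates behind. With several writers the bound has to be divided among them, which this sketch does not attempt. The class `Writer` and its fields are hypothetical.

```python
class Writer:
    """A server accepting writes and tracking what each peer has not yet seen."""
    def __init__(self, peers, delta):
        self.delta = delta
        self.pending = {p: [] for p in peers}    # updates not yet seen by each peer
        self.delivered = {p: [] for p in peers}  # updates already pushed to each peer

    def write(self, update):
        for peer, queue in self.pending.items():
            queue.append(update)
            if len(queue) > self.delta:
                # The bound would be violated: push everything to this peer.
                self.delivered[peer].extend(queue)
                queue.clear()
```

After every call to `write`, no peer has more than `delta` unseen updates from this writer.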
Continuous Consistency Protocols (2)
• Bounding staleness deviation
– Each server i keeps, for every other server j, a time T(j), meaning that
server i has seen all writes submitted to server j up to time T(j)
– Let T be the local time at server i. If server i notices that T - T(j)
exceeds a threshold, it pulls the writes from server j
• Bounding ordering deviation
– Each server keeps a queue of tentative, uncommitted writes
– If the length of this queue exceeds a threshold,
• the server will stop accepting new writes and
• negotiate with other servers in which order its writes should be
executed, i.e., enforce a globally consistent order of tentative writes
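Both checks reduce to comparing a local measurement against a threshold. The helper names and parameters below are hypothetical, and the negotiation of a global order itself is left out.

```python
def servers_to_pull(now, last_seen, max_staleness):
    """Servers j whose writes should be pulled because now - T(j) is too large.

    last_seen maps each server j to T(j), the time up to which its writes are known.
    """
    return [j for j, t_j in last_seen.items() if now - t_j > max_staleness]

def must_negotiate_order(tentative_writes, max_tentative):
    """True if the queue of uncommitted writes has grown past the threshold."""
    return len(tentative_writes) > max_tentative
```

For example, with local time 100, known times 95 for server A and 60 for server B, and a staleness threshold of 30, only B needs to be contacted.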
Local-Write Protocols
Replicated-Write Protocols (1)
• Active replication
– Updates are propagated by means of the write operation
that causes the update
• The challenge is that the operations have to be
carried out in the same order everywhere
– Need a totally-ordered multicast mechanism such as the
one based on Lamport’s logical clocks
• However, this algorithm is expensive and does not scale
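Why the order matters can be seen with two operations that do not commute. The sketch below assumes each operation is tagged with a Lamport timestamp and the sender's identifier, and that ties are broken by that identifier. It leaves out the acknowledgment phase that a real totally-ordered multicast needs before delivery. All names are hypothetical.

```python
def deposit(balance):
    return balance + 100

def double(balance):
    return balance * 2

def apply_in_arrival_order(initial, operations):
    value = initial
    for _, _, operation in operations:
        value = operation(value)
    return value

def apply_in_total_order(initial, operations):
    """Apply operations sorted by (Lamport timestamp, sender id)."""
    value = initial
    for _, _, operation in sorted(operations, key=lambda o: (o[0], o[1])):
        value = operation(value)
    return value

# The same two operations reach the two replicas in different orders.
arrival_at_r1 = [(1, "P1", deposit), (1, "P2", double)]
arrival_at_r2 = [(1, "P2", double), (1, "P1", deposit)]
```

Applied in arrival order the replicas end with different values; applied in timestamp order both reach the same one.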
Quorum-Based Protocols