Consistency and Replication in Distributed System

The document discusses various consistency models for replicated data in distributed systems. It begins by explaining why data replication is used for reliability and performance reasons and the need to maintain consistency between replicas when one is modified. It then defines consistency models as contracts that specify how data stores and processes interact. The rest of the document describes several specific consistency models including strict, sequential, causal, FIFO, weak, release, and entry consistency. It also covers client-centric models like monotonic read/write consistency and eventual consistency. Consistency models provide rules to determine when updates become visible across replicas.

Uploaded by

Sunita Sahu
Copyright
© All Rights Reserved

Why replicate?

 Data replication is a common technique in distributed systems.
 Reliability
◦ If one replica is unavailable or crashes, use another
◦ Protect against corrupted data
 Performance
◦ Scale with the size of the distributed system (replicated web servers)
◦ Scale in geographically distributed systems (web proxies)

 Key issue: the consistency of replicated data must be maintained
◦ If one copy is modified, the others become inconsistent
What is a Consistency Model?

 A “consistency model” is a CONTRACT between a DS data-store and its processes.

 If the processes agree to the rules, the data-store will perform properly and as advertised.

Data-Centric Consistency Models
 Strict Consistency
 Linearizability and Sequential Consistency
 Causal Consistency
 FIFO Consistency
 Weak Consistency
 Release Consistency
 Entry Consistency

Consistency Model Diagram Notation
 Wi(x)a – a write by process ‘i’ to item ‘x’ with a value of ‘a’. That is, ‘x’ is set to ‘a’. (Note: the process is often shown as ‘Pi’.)

 Ri(x)b – a read by process ‘i’ from item ‘x’ producing the value ‘b’. That is, reading ‘x’ returns ‘b’.

 Time moves from left to right in all diagrams.
Strict Consistency Diagrams
Condition: any read on a data item x returns a value corresponding to the result of the most recent write on x.

Disadvantage: assumes the existence of an absolute global time.

Diagrams (not shown): one permitted and one not-permitted event sequence.

Sequential Consistency
Sequential consistency is weaker than strict consistency.
◦ Assumes all operations are executed in some sequential order and each process issues operations in program order
 Any valid interleaving is allowed
 All processes agree on the same interleaving
 Each process preserves its program order
 Nothing is said about the “most recent write”

Diagrams (not shown): one permitted and one not-permitted event sequence.
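Sequential consistency can be checked mechanically: a proposed single interleaving is valid if it preserves every process's program order and every read returns the latest preceding write. A minimal sketch in Python (the function name and tuple encoding of operations are assumptions for illustration, not part of the slides):

```python
# An operation is a tuple (process, kind, item, value), e.g. (1, 'W', 'x', 'a').

def is_sequentially_consistent(global_order, program_orders):
    # 1. The interleaving must preserve each process's program order.
    positions = {op: i for i, op in enumerate(global_order)}
    for ops in program_orders.values():
        for earlier, later in zip(ops, ops[1:]):
            if positions[earlier] > positions[later]:
                return False
    # 2. Every read must return the value of the latest preceding write.
    current = {}
    for proc, kind, item, value in global_order:
        if kind == 'W':
            current[item] = value
        elif current.get(item) != value:
            return False
    return True
```

Note that the check only asks whether *some* legal sequential order exists; nothing requires reads to see the chronologically most recent write.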


Causal consistency
 Causally related writes must be seen by all processes in the same order.
◦ Concurrent writes may be seen in different orders on different machines

Diagrams (not shown): one not-permitted and one permitted event sequence.
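One common way to implement this rule is causal delivery with vector clocks: a replica buffers an incoming write until every write that causally precedes it has been applied locally, while concurrent writes may be applied in either order. A sketch (the class and message format are assumptions for illustration):

```python
# Each write carries (sender, vector_clock), where the vector clock counts
# the writes the sender had seen from every process when it issued the write.

class CausalReplica:
    def __init__(self, n):
        self.clock = [0] * n   # writes applied here, per originating process
        self.buffer = []       # writes whose causal dependencies are missing
        self.store = {}

    def _ready(self, sender, vc):
        # Next-in-sequence from the sender, and all causal dependencies
        # on other processes already applied at this replica.
        return (vc[sender] == self.clock[sender] + 1 and
                all(vc[k] <= self.clock[k]
                    for k in range(len(vc)) if k != sender))

    def receive(self, sender, vc, item, value):
        self.buffer.append((sender, vc, item, value))
        progress = True
        while progress:                    # drain everything now deliverable
            progress = False
            for msg in list(self.buffer):
                s, v, it, val = msg
                if self._ready(s, v):
                    self.store[it] = val
                    self.clock[s] += 1
                    self.buffer.remove(msg)
                    progress = True
```

For example, a write that causally depends on an as-yet-unseen write stays buffered until the earlier write arrives, whatever order the network delivers them in.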


FIFO Consistency
 Necessary Condition:
Writes done by a single process are seen by all other
processes in the order in which they were issued, but
writes from different processes may be seen in a
different order by different processes.

Diagram (not shown): a valid sequence of events for FIFO consistency.
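The FIFO condition can be enforced with nothing more than a per-sender sequence number and a hold-back queue; no coordination between different senders is needed. A sketch (class and method names are assumptions):

```python
class FifoReplica:
    def __init__(self):
        self.next_seq = {}   # sender -> next expected sequence number
        self.pending = {}    # sender -> {seq: (item, value)} held back
        self.store = {}

    def receive(self, sender, seq, item, value):
        self.pending.setdefault(sender, {})[seq] = (item, value)
        expected = self.next_seq.get(sender, 0)
        # Apply as many consecutive writes from this sender as possible;
        # writes from other senders are never delayed by this one.
        while expected in self.pending[sender]:
            it, val = self.pending[sender].pop(expected)
            self.store[it] = val
            expected += 1
        self.next_seq[sender] = expected
```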

Weak Consistency
 The weak consistency model says that not all applications need to see all writes: the requirement of a common order is relaxed.
 This model ensures consistency over a group of operations, as opposed to the individual reads and writes of the other consistency models.
 It introduces the concept of a synchronization variable “S” associated with a particular data store.
 The process acquiring this synchronization variable is allowed to access the critical region, and must release the variable when it leaves.
 The synchronization variable for a data store assures the process that the value it holds is the most recently written value of that data store.

Weak Consistency
Properties:
 Accesses to synchronization variables associated with a data store are sequentially consistent.

 No operation on a synchronization variable is allowed to be performed until all previous writes have been completed everywhere.

 No read or write operation on data items is allowed to be performed until all previous operations on synchronization variables have been performed.
Weak Consistency
 The following criteria must be met:
 1. An acquire of a synchronization variable cannot be performed by a process until all updates to the shared data have been performed.

 2. No other process may hold the synchronization variable, not even in non-exclusive mode, before an exclusive access to the synchronization variable is allowed to proceed.

 3. After an exclusive mode access to a synchronization variable has been performed, any other process’ next non-exclusive mode access to that synchronization variable may not be performed until it has performed with respect to that variable’s owner.
Weak Consistency

Diagram (not shown): a valid event sequence for weak consistency.
Release Consistency
 In the weak consistency model, no distinction is made between acquiring the synchronization variable for read operations and for write operations on the data store.

 Release consistency introduces two synchronization operations, acquire and release:

 1. Acquire: assures that the local copies are consistent with the remote ones.

 2. Release: assures that updates to the data store are propagated to the other local copies of the store.
Release Consistency
Rules:

 Before a read or write operation on shared data is performed, all previous acquires done by the process must have completed successfully.

 Before a release is allowed to be performed, all previous reads and writes by the process must have completed.

 Accesses to synchronization variables are FIFO consistent (sequential consistency is not required).

Release Consistency

Diagram (not shown): a valid event sequence for release consistency.

Entry Consistency
 In entry consistency, an acquire and a release are used for individual data items rather than the entire data store.

 Data consistency is offered at entry to the critical region. The following two main conditions must be met:
 1. At an acquire, the local copy of the data item must be brought up to date with the remote ones.

 2. Before writing to a data item, a process must ensure that no other process is writing at the same time; thus the synchronization variable must be held in exclusive mode.
Entry Consistency (2)

Diagram (not shown): a valid event sequence for entry consistency.

Summary of Data-centric Consistency Models

(a) Models not using synchronization operations:

Strict – Absolute time ordering of all shared accesses matters.
Linearizability – All processes must see all shared accesses in the same order. Accesses are furthermore ordered according to a (non-unique) global timestamp.
Sequential – All processes see all shared accesses in the same order. Accesses are not ordered in time.
Causal – All processes see causally-related shared accesses in the same order.
FIFO – All processes see writes from each other in the order they were issued. Writes from different processes may not always be seen in that order.

(b) Models with synchronization operations:

Weak – Shared data can be counted on to be consistent only after a synchronization is done.
Release – Shared data are made consistent when a critical region is exited.
Entry – Shared data pertaining to a critical region are made consistent when a critical region is entered.
Client-centric Consistency Models
 These data stores are characterized by the lack of simultaneous updates.
 Most operations involve reading data. Such data stores offer a very weak consistency model, called eventual consistency.
 In many database systems, most processes hardly ever perform update operations; they mostly read data from the database. Only one, or very few, processes perform update operations.
 The question then is how fast updates should be made available to the processes that only read.
 Another example is the World Wide Web: in virtually all cases, Web pages are updated by a single authority.
Eventual Consistency

Diagram (not shown): a mobile user accessing different replicas of a distributed database has problems with eventual consistency.
Client-Centric Consistency

• Four models in client-centric consistency:

– Monotonic Read Consistency

– Monotonic Write Consistency

– Read-your-writes Consistency

– Writes-follows-reads Consistency

Client-Centric Consistency
• xi[t] – the version of data item x at copy Li at time t

• WS(xi[t]) – the set of write operations at Li that have led to version xi of x

• When these operations have later (at time t2) also been performed on x at copy Lj, this is written WS(xi[t1]; xj[t2])

Monotonic Reads
If a process reads the value of a data item x, any
successive read operations on x by that process will
always return that same value or a more recent value.

Diagrams (not shown):
a) A monotonic-read consistent data store.
b) A data store that does not provide monotonic reads.

Example: a distributed email database.
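Monotonic reads are commonly provided as a session guarantee: the client remembers the highest version it has read of each item and refuses (or redirects) a read from a replica that has not yet caught up. A minimal sketch, where the class name and the replica's item -> (version, value) format are assumptions:

```python
class MonotonicReadSession:
    def __init__(self):
        self.seen = {}   # item -> highest version read so far

    def read(self, replica, item):
        version, value = replica[item]   # replica maps item -> (version, value)
        if version < self.seen.get(item, -1):
            # This replica is behind what the session has already seen.
            raise RuntimeError("replica is stale for this session")
        self.seen[item] = version
        return value
```

In the email example, once the user's mailbox session has seen a message at one replica, a later read at a lagging replica is rejected rather than silently "losing" the message.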

Monotonic Writes
A write operation by a process on a data item x is completed before any successive write operation on x by the same process.

Diagrams (not shown):
a) A monotonic-write consistent data store.
b) A data store that does not provide monotonic-write consistency.

Example: the MW-guarantee could be used by a text editor when editing replicated files.

Read Your Writes
The effect of a write operation by a process on data item x will always be seen by a successive read operation on x by the same process.

Diagrams (not shown):
a) A data store that provides read-your-writes consistency.
b) A data store that does not.

Example: updating passwords.

Writes Follow Reads
A write operation by a process on a data item x following a previous read operation on x by the same process is guaranteed to take place on the same or a more recent value of x than was read.

Diagrams (not shown):
a) A writes-follow-reads consistent data store.
b) A data store that does not provide writes-follow-reads consistency.

Example: a replicated bulletin board database.


Client-centric Models: Summary
 Monotonic read
If a process reads the value of a data item x, any successive
read operation on x by that process will always return that
same value or a more recent value
 Monotonic write
A write operation by a process on a data item x is
completed before any successive write operation on x by the
same process
 Read your writes
The effect of a write operation by a process on a data item
x will always be seen by a successive read operation on x by
the same process
 Writes follow reads
A write operation by a process on a data item x following a
previous read operation on x by the same process is
guaranteed to take place on the same or a more recent value of
x than was read

Replica Placement

Diagram (not shown): the logical organization of different kinds of copies of a data store into three concentric rings.
Replica Placement Types
 There are three types of replica:
1. Permanent replicas: tend to be small in number, organized as COWs (Clusters of Workstations) or mirrored systems.
2. Server-initiated replicas: used to enhance performance, created at the initiative of the owner of the data store. Typically used by web hosting companies to place replicas geographically close to where they are needed most. (Often referred to as “push caches”.)
3. Client-initiated replicas: created as a result of client requests – think of browser caches. Works well assuming, of course, that the cached data does not go stale too soon.

Replica Placement
 Permanent replicas
◦ Initial set of replicas. Created and maintained by DDS-owner(s)
◦ Writes are allowed
◦ E.g., web mirrors
 Server-initiated replicas
◦ Enhance performance
◦ Not maintained by owner of DDS
◦ Placed close to groups of clients
 Manually
 Dynamically
 Client-initiated replicas
◦ Client caches
◦ Temporary
◦ Owner not aware of replica
◦ Placed closest to a client
◦ Maintained by host (often the client)
Permanent Replicas
• Permanent Replicas
– Initial set of replicas that constitutes a
distributed data store

– Typically, their number is small

• Example: Web site

Server-Initiated Replicas
• Server-Initiated Replicas
– Created at the initiative of the owner of the data
store
– Exist to enhance performance
• Work Scheme
How to decide where and when replicas should be created or deleted?
– Each server keeps track of access counts per file,
and where access requests come from.
– When a server Q decides to reevaluate the
placement of the files it stores, it checks the
access count for each file.
– If the total number of access requests for F at Q
drops below the deletion threshold del (Q,F), it
will delete F unless it is the last copy.
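The reevaluation step can be sketched as follows. The deletion threshold del(Q, F) comes from the scheme above; the replication threshold for popular files is an assumed extension of the same counting idea, included only to show where extra replicas would be created:

```python
def reevaluate(files, access_counts, del_threshold, rep_threshold, copies):
    """Decide, per file on server Q, whether to delete or replicate it."""
    to_delete, to_replicate = [], []
    for f in files:
        count = access_counts.get(f, 0)
        if count < del_threshold and copies.get(f, 1) > 1:
            to_delete.append(f)      # too few requests, and not the last copy
        elif count > rep_threshold:
            to_replicate.append(f)   # popular: consider an extra replica
    return to_delete, to_replicate
```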
Client-Initiated Replicas
• Work Scheme
– When a client wants access to some data, it connects to the nearest copy of the data store, from which it fetches the data it wants to read.
– When most operations involve only reading data,
performance can be improved by letting
the client store requested data in a nearby cache.
– The next time that same data needs to be read, the
client can simply fetch it from this local
cache.

• Client-Initiated Replicas
– Created at the initiative of clients
– Commonly known as client caches
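A client-initiated replica is essentially a read-through cache in front of the nearest copy of the data store. A minimal sketch (names assumed); a real client cache would also bound its size and expire stale entries:

```python
class ClientCache:
    """Client-initiated replica: caches data fetched from the nearest copy."""

    def __init__(self, fetch):
        self.fetch = fetch   # function that contacts the nearest replica
        self.cache = {}

    def read(self, item):
        if item not in self.cache:
            self.cache[item] = self.fetch(item)   # first read goes remote
        return self.cache[item]                   # repeat reads stay local
```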
