Chapter Five

Distributed system Lecture Note

Uploaded by

habtamu molla
Copyright
© © All Rights Reserved
Replication and Consistency
Chapter Five
Reasons for Replication
• There are two primary reasons for replicating data: reliability and performance.
• If a file system has been replicated, it may be possible to continue working after one replica crashes.
• Also, by maintaining multiple copies, it becomes possible to provide better protection against corrupted data.
• For example, imagine there are three copies of a file and every read and write operation is performed on each copy.
• We can safeguard ourselves against a single failing write operation by considering the value that is returned by at least two copies as being the correct one.
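The three-copy example can be sketched in a few lines. This is only an illustration and not part of the original slides: the helper name `majority_read` and the sample values are assumptions, and a real system would also have to repair the copy that disagrees.

```python
from collections import Counter

def majority_read(values):
    """Return the value reported by more than half of the copies.

    `values` holds one read result per replica (hypothetical helper).
    """
    value, count = Counter(values).most_common(1)[0]
    if 2 * count <= len(values):
        raise ValueError("no value is held by a majority of copies")
    return value

# Three copies of a file; the write of "v2" failed on the third copy.
copies = ["v2", "v2", "v1"]
print(majority_read(copies))
```

With three copies, one bad copy is outvoted by the other two; if all three disagree, the sketch refuses to answer rather than guess.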
Cont..
• The other reason for replicating data is performance.
• Replication for performance is important when the distributed system needs to scale in numbers and geographical area.
• Scaling in numbers occurs, for example, when an increasing number of processes needs to access data that are managed by a single server.
• In that case, performance can be improved by replicating the server and subsequently dividing the work.
Cont..
• Scaling with respect to the size of a geographical area may also require replication.
• The basic idea is that by placing a copy of data in the proximity of the process using them, the time to access the data decreases.
• As a consequence, the performance as perceived by that process increases.
Cont..
• The problem with replication is that having multiple copies may lead to consistency problems.
• Whenever a copy is modified, that copy becomes different from the rest.
• Consequently, modifications have to be carried out on all copies to ensure consistency.
• Exactly when and how those modifications need to be carried out determines the price of replication.
Consistency
• A collection of copies is consistent when a read operation performed at any copy always returns the same result.
• Consequently, when an update operation is performed on one copy, the update should be propagated to all copies before a subsequent operation takes place, no matter at which copy that operation is initiated or performed.
Cont..
• In essence, this means that all replicas first need to reach agreement on when exactly an update is to be performed locally.
• For example, replicas may need to decide on a global ordering of operations using Lamport timestamps, or let a coordinator assign such an order.
• Global synchronization simply takes a lot of communication time, especially when replicas are spread across a wide-area network.
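A global ordering based on Lamport timestamps can be sketched as follows. This is a simplified illustration under stated assumptions: the class `LamportClock`, its method names, and the tie-break on process id are not from the slides, and message passing is simulated by direct calls.

```python
class LamportClock:
    """Logical clock for one process (hypothetical sketch)."""

    def __init__(self, pid):
        self.pid = pid
        self.time = 0

    def stamp(self):
        """Advance the clock for a local update and return (time, pid)."""
        self.time += 1
        return (self.time, self.pid)

    def observe(self, stamp):
        """Catch up with the clock value carried by an incoming message."""
        self.time = max(self.time, stamp[0])

a, b = LamportClock(1), LamportClock(2)
u1 = (a.stamp(), "x = 1")   # issued by process 1
u2 = (b.stamp(), "x = 2")   # issued concurrently by process 2
b.observe(u1[0])            # process 2 receives u1
u3 = (b.stamp(), "x = 3")   # issued after seeing u1

# Each replica sorts by (time, pid), whatever the arrival order was.
replica_a = sorted([u1, u2, u3])
replica_b = sorted([u3, u1, u2])
print(replica_a == replica_b)
```

Sorting on the pair (time, pid) gives every replica the same total order; the process id only breaks ties between concurrent updates. Every update still has to reach every replica before it can be applied, which is where the communication cost comes from.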
Cont..
Figure: Organization of a distributed remote object shared by two different clients.
Cont..
(a) A remote object capable of handling concurrent invocations on its own.
(b) A remote object for which an object adapter is required to handle concurrent invocations.
Two Consistency Models
• Data-Centric Model
  • A defined consistency is experienced by all clients, i.e. we must provide a system-wide consistent view of the data store.
    • All clients see the same sequence of operations at each replica, hopefully with only minor delay.
    • This is hard to achieve in a distributed system without additional synchronization.
• Client-Centric Model
  • If there are no concurrent updates (or only a few compared to the number of reads), we can weaken the consistency requirements.
  • The consistency of the data store only reflects one client’s view.
    • Different clients might see different sequences of operations at their replicas.
    • Relatively easy to implement.
1. DATA-CENTRIC CONSISTENCY MODELS
• A data store may be physically distributed across multiple machines.
• In particular, each process that can access data from the store is assumed to have a local (or nearby) copy of the entire store available.
• Write operations are propagated to the other copies, as shown in the next figure.
• A data operation is classified as a write operation when it changes the data, and is otherwise classified as a read operation.
Cont..
Consistency Model: a contract between processes and the data store.
Figure: The general organization of a logical data store, physically distributed and replicated across multiple processes.
Cont..
• In the absence of a global clock, it is difficult to define precisely which write operation is the last one.
• As an alternative, we need to provide other definitions, leading to a range of consistency models.
• Each model effectively restricts the values that a read operation on a data item can return.
1.1 Strict Consistency
• Any read on a data item x returns a value corresponding to the result of the most recent write on x.
• Figure: Behaviour of two processes operating on the same data item.
  (a) A strictly consistent store.
  (b) A store that is not strictly consistent.
• A problem: implementation requires absolute global time.
• Another problem: a solution may be physically impossible.
1.2 Sequential Consistency
• In the following, we will use a special notation in which we draw the operations of a process along a time axis.
• The time axis is always drawn horizontally, with time increasing from left to right.
• The symbols Wi(x)a and Ri(x)b mean that a write by process Pi to data item x with the value a and a read from that item by Pi returning b have been done, respectively.
• We assume that each data item is initially NIL.
Cont..
• Sequential consistency is an important data-centric consistency model, which was first defined by Lamport (1979) in the context of shared memory for multiprocessor systems.
• In general, a data store is said to be sequentially consistent when it satisfies the following condition:
  • The result of any execution is the same as if the (read and write) operations by all processes on the data store were executed in some sequential order and the operations of each individual process appear in this sequence in the order specified by its program.
Cont..
(a) A sequentially consistent data store.
(b) A data store that is not sequentially consistent.
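The definition of sequential consistency can be turned into a small brute-force check. This is a sketch for tiny histories only: the function name `sequentially_consistent` and the two example histories are illustrative assumptions (the original figures are not reproduced here), and the search is exponential in the number of operations.

```python
def sequentially_consistent(histories):
    """Return True if some interleaving explains every read.

    `histories` has one list per process; each entry is a tuple
    (kind, item, value) with kind "W" or "R", in program order.
    Items start out as None, standing in for NIL.
    """
    def search(positions, store):
        if all(p == len(h) for p, h in zip(positions, histories)):
            return True  # every operation has been placed
        for i, history in enumerate(histories):
            p = positions[i]
            if p == len(history):
                continue
            kind, item, value = history[p]
            if kind == "R" and store.get(item) != value:
                continue  # this read cannot come next
            next_store = {**store, item: value} if kind == "W" else store
            next_positions = positions[:i] + (p + 1,) + positions[i + 1:]
            if search(next_positions, next_store):
                return True
        return False

    return search((0,) * len(histories), {})

# P1 writes a, P2 writes b; P3 and P4 both read b and then a.
ok = [
    [("W", "x", "a")],
    [("W", "x", "b")],
    [("R", "x", "b"), ("R", "x", "a")],
    [("R", "x", "b"), ("R", "x", "a")],
]
# Same writes, but P4 reads a and then b, the opposite order from P3.
bad = ok[:3] + [[("R", "x", "a"), ("R", "x", "b")]]
print(sequentially_consistent(ok), sequentially_consistent(bad))
```

The first history is accepted because all readers can agree that the write of b came before the write of a. The second is rejected because P3 and P4 would need the two writes in opposite orders.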
1.3 Linearizability Consistency
• The result of any execution is the same as if the (read and write) operations by all processes on the data store were executed in some sequential order and the operations of each individual process appear in this sequence in the order specified by its program.
• In addition, if TS(OP1(x)) < TS(OP2(y)), where TS denotes the timestamp of an operation, then operation OP1(x) should precede OP2(y) in this sequence.
1.4 Causal Consistency
• The causal consistency model represents a weakening of sequential consistency in that it makes a distinction between events that are potentially causally related and those that are not.
• If event b is caused or influenced by an earlier event a, causality requires that everyone else first see a, then see b.
• Suppose that process P1 writes a data item x. Then P2 reads x and writes y.
• Here the reading of x and the writing of y are potentially causally related because the computation of y may have depended on the value of x as read by P2 (i.e., the value written by P1).
Cont..
• For a data store to be considered causally consistent, it is necessary that the store obeys the following condition:
  • Writes that are potentially causally related must be seen by all processes in the same order. Concurrent writes may be seen in a different order on different machines.
Cont..
(a) A violation of a causally-consistent store.
(b) A correct sequence of events in a causally-consistent store.
Cont..
• Implementing causal consistency requires keeping track of which processes have seen which writes.
• It effectively means that a dependency graph of which operation is dependent on which other operations must be constructed and maintained.
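One common way to track "who has seen which writes" is a vector clock per replica. The sketch below is an assumption, not the method prescribed by the slides: the names `can_deliver` and `CausalReplica` are hypothetical, and each write carries the vector timestamp of its sender at the time of writing.

```python
def can_deliver(local, sender, stamp):
    """A write is deliverable when it is the sender's next write and
    everything the sender had seen is already applied locally."""
    return all(
        stamp[k] == local[k] + 1 if k == sender else stamp[k] <= local[k]
        for k in range(len(local))
    )

class CausalReplica:
    def __init__(self, n_processes):
        self.clock = [0] * n_processes  # writes applied per process
        self.applied = []               # writes in the order applied
        self.pending = []               # writes waiting for dependencies

    def receive(self, sender, stamp, write):
        self.pending.append((sender, stamp, write))
        progress = True
        while progress:                 # retry buffered writes
            progress = False
            for msg in list(self.pending):
                s, v, w = msg
                if can_deliver(self.clock, s, v):
                    self.clock[s] = v[s]
                    self.applied.append(w)
                    self.pending.remove(msg)
                    progress = True

# Process 0 writes x; process 1 reads x and then writes y.
r = CausalReplica(3)
r.receive(1, [1, 1, 0], "y = b")   # arrives first, depends on x = a
r.receive(0, [1, 0, 0], "x = a")   # dependency arrives
print(r.applied)
```

The write to y is held back until the write to x it depends on has been applied, so this replica sees the two causally related writes in the required order.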
1.5 FIFO Consistency
• Necessary condition:
  • Writes done by a single process are seen by all other processes in the order in which they were issued,
  • but writes from different processes may be seen in a different order by different processes.
• Guarantee:
  • Writes from a single source must arrive in order.
  • No other guarantees.
• Easy to implement.
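The claim that FIFO consistency is easy to implement can be illustrated with per-sender sequence numbers. This is a minimal sketch under assumptions: the class `FifoReplica` and its fields are hypothetical, and each sender is assumed to number its writes 0, 1, 2, and so on.

```python
class FifoReplica:
    def __init__(self):
        self.next_seq = {}   # sender -> next expected sequence number
        self.held = {}       # (sender, seq) -> write that arrived early
        self.applied = []    # writes in the order they were applied

    def receive(self, sender, seq, write):
        self.held[(sender, seq)] = write
        expected = self.next_seq.get(sender, 0)
        # Apply this sender's writes for as long as the next one is here.
        while (sender, expected) in self.held:
            self.applied.append(self.held.pop((sender, expected)))
            expected += 1
        self.next_seq[sender] = expected

r = FifoReplica()
r.receive("P1", 1, "x = b")   # arrives early, held back
r.receive("P2", 0, "y = c")   # other sender, applied at once
r.receive("P1", 0, "x = a")   # releases both writes of P1 in order
print(r.applied)
```

Only writes from the same sender are ordered; writes from different senders are applied in whatever order they happen to arrive.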
1.6 Entry Consistency
• Consistency combined with “mutual exclusion”
• Each shared data item is associated with a synchronization variable S
• S has a current owner (who has exclusive access to the associated data, which is guaranteed up-to-date)
• Process P enters a critical section: Acquire(S)
  • retrieve the ownership of S
  • the associated variables are made consistent
• Propagation of updates at the next
Cont..
Summary of Consistency Models
• Strict: Absolute time ordering of all shared accesses matters.
• Linearizability: All processes see all shared accesses in the same order. Accesses are furthermore ordered according to a (nonunique) global timestamp.
• Sequential: All processes see all shared accesses in the same order. Accesses are not ordered in time.
• Causal: All processes see causally-related shared accesses in the same order.
Cont..
• FIFO: All processes see writes from each other in the order they were performed. Writes from different processes may not always be seen in the same order by other processes.
• Entry: Shared data associated with a synchronization variable are made consistent when a critical section is entered.
2. CLIENT-CENTRIC CONSISTENCY MODELS
• An important assumption of the data-centric models is that concurrent processes may be simultaneously updating the data store, and that it is necessary to provide consistency in the face of such concurrency.
• For example, in the case of object-based entry consistency, the data store guarantees that when an object is called, the calling process is provided with a copy of the object that reflects all changes to the object that have been made so far, possibly by other processes.
• During the call, it is also guaranteed that no other process can interfere; that is, mutually exclusive access is provided to the calling process.
Cont..
• The data stores we consider are characterized by the lack of simultaneous updates, or when such updates happen, they can easily be resolved.
• Most operations involve reading data.
• These data stores offer a very weak consistency model, called eventual consistency.
• By introducing special client-centric consistency models, it turns out that many inconsistencies can be hidden in a relatively cheap way.
2.1 Eventual Consistency
• There are many examples in which concurrency appears only in a restricted form.
• For example, in many database systems, most processes hardly ever perform update operations; they mostly read data from the database.
• Only one, or very few, processes perform update operations.
• The question then is how fast updates should be made available to processes that only read.
Cont..
• If no updates take place for a long time, all replicas will gradually become consistent.
• This form of consistency is called eventual consistency.
• Data stores that are eventually consistent thus have the property that in the absence of updates, all replicas converge toward identical copies of each other.
• Eventual consistency essentially requires only that updates are guaranteed to propagate to all replicas.
• Write-write conflicts are often relatively easy to solve when assuming that only a small group of processes can perform updates.
• Eventual consistency is therefore often cheap to implement.
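Convergence in the absence of updates can be sketched with a simple exchange between replicas. The sketch rests on assumptions that are not in the slides: conflicts are resolved by "last writer wins" on a unique timestamp, and the names `Replica`, `sync_from`, and `anti_entropy` are hypothetical.

```python
class Replica:
    def __init__(self):
        self.data = {}  # item -> (timestamp, value)

    def write(self, item, value, timestamp):
        """Keep the value with the highest timestamp (last writer wins)."""
        current = self.data.get(item)
        if current is None or timestamp > current[0]:
            self.data[item] = (timestamp, value)

    def sync_from(self, other):
        """Pull every entry the other replica holds."""
        for item, (timestamp, value) in other.data.items():
            self.write(item, value, timestamp)

def anti_entropy(replicas):
    """One round in which every replica pulls from every other one."""
    for a in replicas:
        for b in replicas:
            a.sync_from(b)

r1, r2, r3 = Replica(), Replica(), Replica()
r1.write("x", "old", timestamp=(1, "r1"))   # (counter, writer id)
r2.write("x", "new", timestamp=(2, "r2"))
r3.write("y", "other", timestamp=(1, "r3"))

anti_entropy([r1, r2, r3])                   # no new updates arrive
print(r1.data == r2.data == r3.data)
```

Until the exchange runs, a reader may still get "old" from the first replica. Eventual consistency only promises that the copies become identical once updates stop.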
Cont..
• Eventually consistent data stores work fine as long as clients always access the same replica.
• However, problems arise when different replicas are accessed over a short period of time.
Cont..
• This example is typical for eventually-consistent data stores and is caused by the fact that users may sometimes operate on different replicas.
• The problem can be alleviated by introducing client-centric consistency.
• In essence, client-centric consistency provides guarantees for a single client concerning the consistency of accesses to a data store by that client.
• No guarantees are given concerning concurrent accesses by different clients.
2.2 Monotonic Reads
• The first client-centric consistency model is that of monotonic reads.
• A data store is said to provide monotonic-read consistency if the following condition holds:
  • If a process reads the value of a data item x, any successive read operation on x by that process will always return that same value or a more recent value. (Example: e-mail)
• In other words, monotonic-read consistency guarantees that if a process has seen a value of x at time t, it will never see an older version of x at a later time.
Cont..
• As an example where monotonic reads are useful, consider a distributed email database.
(a) A monotonic-read consistent data store.
(b) A data store that does not provide monotonic reads.
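One way to enforce monotonic reads is to let the client remember which writes its earlier reads reflected. The sketch below is an assumption rather than the slides' own mechanism: writes carry unique ids, and the names `Replica`, `Session`, and `read_set` are hypothetical.

```python
class Replica:
    def __init__(self):
        self.applied = set()   # ids of writes applied at this replica
        self.value = None

    def apply(self, write_id, value):
        self.applied.add(write_id)
        self.value = value

class Session:
    """Per-client state that travels with the client between replicas."""

    def __init__(self):
        self.read_set = set()  # writes reflected in earlier reads

    def read(self, replica):
        if not self.read_set <= replica.applied:
            raise RuntimeError("replica is behind this client's earlier reads")
        self.read_set |= replica.applied
        return replica.value

l1, l2 = Replica(), Replica()
l1.apply("w1", "1 message"); l1.apply("w2", "2 messages")
l2.apply("w1", "1 message")            # w2 has not reached L2 yet

mailbox = Session()
print(mailbox.read(l1))                # sees both messages at L1
try:
    mailbox.read(l2)                   # L2 would show an older mailbox
except RuntimeError as err:
    print("refused:", err)
l2.apply("w2", "2 messages")           # propagation (simulated)
print(mailbox.read(l2))
```

Refusing the read is the simplest reaction; a real store could instead forward the request or first bring the replica up to date.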
2.3 Monotonic Writes
• In many situations, it is important that write operations are propagated in the correct order to all copies of the data store.
• This property is expressed in monotonic-write consistency.
• In a monotonic-write consistent store, the following condition holds:
  • A write operation by a process on a data item x is completed before any successive write operation on x by the same process. (Example: software updates)
Cont..
• The essence of FIFO consistency is that write operations by the same process are performed in the correct order everywhere.
• This ordering constraint also applies to monotonic writes, except that we are now considering consistency only for a single process instead of for a collection of concurrent processes.
• Consider, for example, a software library. In many cases, updating such a library is done by replacing one or more functions, leading to a next version.
Cont..
• To ensure monotonic-write consistency, it is necessary that the previous write operation at local copy L1 has already been propagated to local copy L2.
(a) A monotonic-write consistent data store.
(b) A data store that does not provide monotonic-write consistency.
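The requirement that earlier writes reach L2 before a new write is accepted there can be sketched with a per-client write set. This is a hedged illustration: the unique write ids and the names `Replica`, `Session`, and `write_set` are assumptions, and propagation between copies is simulated by a direct call.

```python
class Replica:
    def __init__(self):
        self.applied = []      # ids of applied writes, in order
        self.value = None

    def apply(self, write_id, value):
        self.applied.append(write_id)
        self.value = value

class Session:
    def __init__(self):
        self.write_set = []    # ids of this client's earlier writes

    def write(self, replica, write_id, value):
        missing = [w for w in self.write_set if w not in replica.applied]
        if missing:
            raise RuntimeError(f"replica has not yet applied {missing}")
        replica.apply(write_id, value)
        self.write_set.append(write_id)

l1, l2 = Replica(), Replica()
client = Session()
client.write(l1, "w1", "library v1")
try:
    client.write(l2, "w2", "library v2")   # w1 has not reached L2 yet
except RuntimeError as err:
    print("refused:", err)
l2.apply("w1", "library v1")               # propagation from L1 (simulated)
client.write(l2, "w2", "library v2")       # now accepted
print(l2.applied)
```

In the software library example, this keeps a replica from installing an update on top of a version it has never received.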
2.4 Read Your Writes
• A client-centric consistency model that is closely related to monotonic reads is as follows.
• A data store is said to provide read-your-writes consistency if the following condition holds:
  • The effect of a write operation by a process on data item x will always be seen by a successive read operation on x by the same process. (Example: editing a web page)
• In other words, a write operation is always completed before a successive read operation by the same process, no matter where that read operation takes place.
Cont..
• The absence of read-your-writes consistency is sometimes experienced when updating Web documents and subsequently viewing the effects.
• Similar effects occur when updating passwords.
(a) A data store that provides read-your-writes consistency.
(b) A data store that does not.
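The password example can be sketched by letting the client carry the ids of its own writes. As before, this is an assumption for illustration: the names `Replica`, `Session`, and `write_set` are hypothetical, and propagation is simulated by a direct call.

```python
class Replica:
    def __init__(self):
        self.applied = set()   # ids of writes applied at this replica
        self.value = None

    def apply(self, write_id, value):
        self.applied.add(write_id)
        self.value = value

class Session:
    def __init__(self):
        self.write_set = set() # ids of writes issued by this client

    def write(self, replica, write_id, value):
        replica.apply(write_id, value)
        self.write_set.add(write_id)

    def read(self, replica):
        if not self.write_set <= replica.applied:
            raise RuntimeError("replica has not seen this client's writes")
        return replica.value

l1, l2 = Replica(), Replica()
user = Session()
user.write(l1, "w1", "new password")
try:
    user.read(l2)                      # L2 still holds the old state
except RuntimeError as err:
    print("refused:", err)
l2.apply("w1", "new password")         # propagation from L1 (simulated)
print(user.read(l2))
```

Without the check, the read at L2 would silently return the old password, which is exactly the effect described above.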
2.5 Writes Follow Reads
• The last client-centric consistency model is one in which updates are propagated as the result of previous read operations.
• A data store is said to provide writes-follow-reads consistency if the following holds:
  • A write operation by a process on a data item x following a previous read operation on x by the same process is guaranteed to take place on the same or a more recent value of x that was read.
• In other words, any successive write operation by a process on a data item x will be performed on a copy of x that is up to date with the value most recently read by that process.
Cont..
• Writes-follow-reads consistency can be used to guarantee that users of a network newsgroup see a posting of a reaction to an article only after they have seen the original article.
(a) A writes-follow-reads consistent data store.
(b) A data store that does not provide writes-follow-reads consistency.
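The newsgroup example can be sketched by checking the client's read set before accepting a write. This is an illustrative assumption, not the slides' own mechanism: writes carry unique ids, and the names `Replica`, `Session`, and `read_set` are hypothetical.

```python
class Replica:
    def __init__(self):
        self.applied = []      # ids of applied writes, in order
        self.value = None

    def apply(self, write_id, value):
        self.applied.append(write_id)
        self.value = value

class Session:
    def __init__(self):
        self.read_set = set()  # writes reflected in earlier reads

    def read(self, replica):
        self.read_set |= set(replica.applied)
        return replica.value

    def write(self, replica, write_id, value):
        if not self.read_set <= set(replica.applied):
            raise RuntimeError("replica lacks writes this client has read")
        replica.apply(write_id, value)

l1, l2 = Replica(), Replica()
l1.apply("w1", "article")              # original posting, only at L1 so far

reader = Session()
reader.read(l1)                        # the client has seen the article
try:
    reader.write(l2, "w2", "reaction") # L2 does not hold the article yet
except RuntimeError as err:
    print("refused:", err)
l2.apply("w1", "article")              # propagation from L1 (simulated)
reader.write(l2, "w2", "reaction")
print(l2.applied)
```

At L2 the reaction is therefore stored only after the article it responds to.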
Client Centric Consistency model summary
• Monotonic read: If a process reads the value of a data item x, any successive read operation on x by that process will always return that same value or a more recent value.
• Monotonic write: A write operation by a process on a data item x is completed before any successive write operation on x by the same process.
• Read your writes: The effect of a write operation by a process on a data item x will always be seen by a successive read operation on x by the same process.
• Writes follow reads: A write operation by a process on a data item x following a previous read operation on x by the same process is guaranteed to take place on the same or a more recent value of x that was read.
The End

Thanks
Quiz 1 (10%)
Take out a piece of paper and write your name.
1) What is the difference between data-centric and client-centric consistency models?
2) Write the name of each of the following models and the difference between them.
1)

2)
