Chapter Five
Consistency
Reasons for Replication
There are two primary reasons for replicating
data: reliability and performance.
If a file system has been replicated it may be
possible to continue working after one replica
crashes.
Also, by maintaining multiple copies, it becomes
possible to provide better protection against
corrupted data.
For example, imagine there are three copies of a
file and every read and write operation is
performed on each copy.
We can safeguard ourselves against a single,
failing write operation by considering the value
that is returned by at least two copies as being
the correct one.
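The three-copy example above can be sketched as a majority vote over the values read from each copy. This is a minimal illustration, not a full protocol: the function name majority_read and the sample values are assumptions made for this sketch.

```python
from collections import Counter


def majority_read(values):
    """Return the value reported by a strict majority of copies.

    `values` holds one value per copy (hypothetical input for this sketch).
    Raises ValueError when no value is held by more than half of the copies.
    """
    value, count = Counter(values).most_common(1)[0]
    if count * 2 <= len(values):
        raise ValueError("no majority among copies")
    return value


# Three copies of a file; the write to the third copy failed and left old data.
copies = ["new", "new", "old"]
result = majority_read(copies)
```

With three copies, one failed write still leaves two copies that agree, so the read can return their value.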
The other reason for replicating data is
performance.
Replication for performance is important
when the distributed system needs to
scale in numbers and geographical area.
Scaling in numbers occurs, for example,
when an increasing number of processes
needs to access data that are managed
by a single server.
In that case, performance can be
improved by replicating the server and
subsequently dividing the work.
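One simple way to divide the work over replicated servers is round-robin dispatch. The sketch below assumes a hypothetical front end (ReplicatedService) and made-up server names; real systems may balance load in other ways.

```python
from itertools import cycle


class ReplicatedService:
    """Hypothetical front end that spreads requests over server replicas."""

    def __init__(self, replicas):
        # cycle() repeats the replica list endlessly, in order.
        self._next_replica = cycle(replicas)

    def pick(self):
        """Return the replica that should handle the next request."""
        return next(self._next_replica)


service = ReplicatedService(["server-a", "server-b", "server-c"])
# Six incoming requests are spread evenly: each replica handles two.
assignments = [service.pick() for _ in range(6)]
```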
Scaling with respect to the size of a
geographical area may also require
replication.
The basic idea is that by placing a copy
of data in the proximity of the process
using them, the time to access the data
decreases.
As a consequence, the performance as
perceived by that process increases.
The problem with replication is that
having multiple copies may lead to
consistency problems.
Whenever a copy is modified, that copy
becomes different from the rest.
Consequently, modifications have to be
carried out on all copies to ensure
consistency.
Exactly when and how those
modifications need to be carried out
determines the price of replication.
Consistency
A collection of copies is consistent when a
read operation performed at any copy
always returns the same result.
Consequently, when an update operation
is performed on one copy, the update
should be propagated to all copies before
a subsequent operation takes place,
no matter at which copy that operation
is initiated or performed.
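The requirement above can be sketched as a write that returns only after every copy has applied the update. This is an in-process illustration under simplifying assumptions: the classes Replica and ReplicaGroup are hypothetical, and failures, concurrency, and network delay are ignored.

```python
class Replica:
    """One copy of the data, held in a local dictionary."""

    def __init__(self):
        self.store = {}

    def apply(self, key, value):
        self.store[key] = value

    def read(self, key):
        return self.store.get(key)


class ReplicaGroup:
    """Hypothetical group that propagates each update to all copies."""

    def __init__(self, count):
        self.replicas = [Replica() for _ in range(count)]

    def write(self, key, value):
        # The write completes only after every copy has applied the update,
        # so no later operation can observe a stale copy.
        for replica in self.replicas:
            replica.apply(key, value)

    def read(self, key, at):
        """Read from the copy with index `at`."""
        return self.replicas[at].read(key)


group = ReplicaGroup(3)
group.write("x", 1)
# A read at any copy now returns the same result.
reads = [group.read("x", at=i) for i in range(3)]
```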
In essence, this means that all replicas
first need to reach agreement on when
exactly an update is to be performed
locally.
For example, replicas may need to
decide on a global ordering of operations
using Lamport timestamps, or let a
coordinator assign such an order.
Global synchronization simply takes a lot
of communication time, especially when
replicas are spread across a wide-area
network.
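The global ordering with Lamport timestamps mentioned above can be sketched as follows. Each update carries a (time, replica id) stamp, and every replica applies updates sorted by that stamp. The names LamportClock and apply_in_global_order are assumptions for this sketch, and it assumes each replica already holds all updates before applying them; a real protocol would also need acknowledgements to know when an update is safe to apply.

```python
class LamportClock:
    """Logical clock for one replica."""

    def __init__(self, replica_id):
        self.replica_id = replica_id
        self.time = 0

    def tick(self):
        """Advance the clock for a local update and return its stamp."""
        self.time += 1
        return (self.time, self.replica_id)

    def observe(self, stamp):
        """On receiving a stamped message, move past the sender's time."""
        self.time = max(self.time, stamp[0]) + 1


def apply_in_global_order(updates):
    """Apply (stamp, key, value) updates sorted by stamp.

    Ties on time are broken by replica id, so all replicas agree on one order.
    """
    store = {}
    for _stamp, key, value in sorted(updates, key=lambda u: u[0]):
        store[key] = value
    return store


a, b = LamportClock(1), LamportClock(2)
u1 = (a.tick(), "x", "from-a")   # concurrent with u2
u2 = (b.tick(), "x", "from-b")
b.observe(u1[0])                 # replica b receives u1
u3 = (b.tick(), "x", "after-a")  # issued after b has seen u1

# The two replicas receive the updates in different orders...
store_a = apply_in_global_order([u1, u2, u3])
store_b = apply_in_global_order([u3, u1, u2])
# ...but end up in the same state because both sort by stamp.
```

The cost noted above shows up in the messages needed to exchange stamps and updates among all replicas before any of them can apply an update.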
Thanks
Quiz 1 (10%)
Take out a piece of paper and write your
name.
1) What are the differences between
data-centric and client-centric consistency
models?
2) Name the following models and explain
the difference between them.
1)
2)