Slides

Uploaded by

ossamasamir.workout

CS476

Parallel and Distributed Computing

Module 8
Consistency & Replication

Contents
1. Consistency: Performance and scalability, Consistency models, serializability,
transactions, Basic architecture, Web-cache consistency.

2. Replication: Replica placement, Content replication, Managing replicated
objects, Replicated-write protocols, Implementing client-centric consistency,
and alternatives for caching and replication.

Weekly Learning Outcomes
1. Learn about Consistency and Replication.

Required Reading
Chapter 7: Consistency and Replication (Distributed Systems, 4th Edition, Version 4.01.
Authors: Maarten van Steen, Andrew S. Tanenbaum. Publisher: CreateSpace; 3.01 edition
(January 2023). ISBN: 978-90-815406-3-6, printed version)

Recommended Reading
https://research.iaun.ac.ir/pd/faramarz_safi/pdfs/UploadFile_9481.pdf

https://www.youtube.com/watch?v=pdxGtahoqlY

Consistency and Replication
An important issue in distributed systems is the replication of data. Data
are generally replicated to enhance reliability or improve performance.
One of the major problems is keeping replicas consistent. Informally, this
means that when one copy is updated, we need to ensure that the other
copies are updated as well; otherwise, the replicas will no longer be the
same. The main questions here are why replication is useful, how it
relates to scalability, and what consistency actually means. We start
by concentrating on managing replicas, which involves not only the
placement of replica servers, but also how content is distributed to these
servers. The second issue is how replicas are kept consistent. In most
cases, applications require a strong form of consistency. Informally, this
means that updates are to be propagated more or less immediately
between replicas.
Replication
Why replicate
Assume a simple model in which we make a copy of a specific part of a system
(meaning code and data).
Increase reliability: if one copy does not live up to specifications, switch over to the
other copy while repairing the failing one.
Performance: simply spread requests between different replicated parts to keep
load balanced, or to ensure quick responses by taking proximity into account.
The problem
Having multiple copies means that when any copy changes, that change should
be made at all copies: replicas need to be kept the same, that is, be kept
consistent.
Performance and scalability
Main issue
To keep replicas consistent, we generally need to ensure that all
conflicting operations are done in the same order everywhere.
Conflicting operations: From the world of transactions
Read–write conflict: a read operation and a write operation act
concurrently
Write–write conflict: two concurrent write operations

Issue
Guaranteeing a global ordering on conflicting operations may be costly,
which hurts scalability. Solution: weaken the consistency requirements
so that, hopefully, global synchronization can be avoided.
Data-centric consistency models
Consistency model
A contract between a (distributed) data store and processes, in which the data
store specifies precisely what the results of read and write operations are in the
presence of concurrency.
Essential
A data store is a distributed collection of storages.
Sequential consistency
Definition
The result of any execution is the same as if the operations of all processes
were executed in some sequential order, and the operations of each individual
process appear in this sequence in the order specified by its program.

A sequentially consistent data store

A data store that is not sequentially consistent
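The definition can be checked by brute force for small histories. The sketch below is illustrative only: the operation encoding and the two example histories in the usage note are assumptions, since the original figures are not reproduced here. It searches for one interleaving that respects each process's program order and in which every read returns the most recently written value.

```python
from typing import Dict, List, Optional, Tuple

# An operation is ("W", variable, value) or ("R", variable, observed_value).
# This encoding is an assumption made for the sketch.
Op = Tuple[str, str, Optional[str]]


def sequentially_consistent(histories: List[List[Op]]) -> bool:
    """Return True if some interleaving that keeps every process's
    program order explains all observed reads (exhaustive search)."""

    def search(positions: Tuple[int, ...], store: Dict[str, Optional[str]]) -> bool:
        if all(p == len(h) for p, h in zip(positions, histories)):
            return True  # every operation has been placed in the sequence
        for i, history in enumerate(histories):
            if positions[i] == len(history):
                continue  # process i has no operations left
            kind, var, value = history[positions[i]]
            advanced = positions[:i] + (positions[i] + 1,) + positions[i + 1:]
            if kind == "W":
                if search(advanced, {**store, var: value}):
                    return True
            elif store.get(var) == value:
                # A read is only allowed if it sees the latest write.
                if search(advanced, store):
                    return True
        return False

    return search(tuple(0 for _ in histories), {})
```

For example, with P1 writing a and P2 writing b to x, two readers that both see b and then a form a sequentially consistent history, whereas one reader seeing b then a and another seeing a then b does not: no single sequence can order the two writes both ways.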

Causal consistency
Definition
Writes that are potentially causally related must be seen by all processes in the
same order. Concurrent writes may be seen in a different order by different
processes.

A violation of a causally-consistent store

A correct sequence of events in a causally-consistent store

Consistency models, serializability, transactions
Sequential Consistency
Overwhelming, but often already known
Again, from the world of transactions: can we order the execution of all operations
in a set of transactions in such a way that the final result matches a serial
execution of those transactions? The keyword is serializability.
Transaction T1       Transaction T2       Transaction T3
BEGIN TRANSACTION    BEGIN TRANSACTION    BEGIN TRANSACTION
x = 0                x = 0                x = 0
x = x + 1            x = x + 2            x = x + 3
END TRANSACTION      END TRANSACTION      END TRANSACTION

A number of schedules
Time →

S1   x = 0; x = x + 1; x = 0; x = x + 2; x = 0; x = x + 3   Legal
S2   x = 0; x = 0; x = x + 1; x = x + 2; x = 0; x = x + 3   Legal
S3   x = 0; x = 0; x = x + 1; x = 0; x = x + 2; x = x + 3   Illegal
S4   x = 0; x = 0; x = x + 3; x = 0; x = x + 1; x = x + 2   Illegal

Eventual consistency (example: WhatsApp)

Definition
Consider a collection of data stores and (concurrent) write operations. The stores are
eventually consistent when, in the absence of updates from a certain moment on, all updates up
to that point are propagated in such a way that replicas will have the same data stored (until
updates are accepted again).
Strong eventual consistency
Basic idea: if there are conflicting updates, have a globally determined resolution mechanism
(for example, using NTP, the Network Time Protocol, simply let the “most recent” update win).
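A minimal sketch of the "most recent update wins" rule follows. The `Stamped` record, its field names, and the replica-id tie-breaker are assumptions introduced for illustration; the slides only state that timestamps (for example, from NTP-synchronized clocks) decide the winner.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stamped:
    """A value tagged with when and where it was written (hypothetical layout)."""
    value: str
    timestamp: float  # assumed to come from NTP-synchronized clocks
    replica_id: str   # assumed tie-breaker for identical timestamps


def merge(a: Stamped, b: Stamped) -> Stamped:
    """Resolve a conflict deterministically: the most recent update wins."""
    return max(a, b, key=lambda s: (s.timestamp, s.replica_id))
```

Because `merge` depends only on the two updates and not on the order in which they arrive, every replica that has seen the same updates ends up with the same value, without any global synchronization.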
Program consistency
P is a monotonic problem if for any input sets S and T with S ⊆ T, P(S) ⊆ P(T). Observation: A program
solving a monotonic problem can start with incomplete information, but is guaranteed not to
have to roll back when missing information becomes available. Example: filling a shopping cart.
Important observation
In all cases, we are avoiding global synchronization.
Consistency for mobile users
Example
Consider a distributed database to which you have access through your notebook. Assume your
notebook acts as a front end to the database.
At location A you access the database doing reads and updates.
At location B you continue your work, but unless you access the same server as the one at location
A, you may detect inconsistencies:
your updates at A may not have yet been propagated to B
you may be reading newer entries than the ones available at A
your updates at B may eventually conflict with those at A

Note
The only thing you really want is that the entries you updated and/or read at A, are in B the way you
left them in A. In that case, the database will appear to be consistent to you.
Basic architecture
The principle of a mobile user accessing different replicas of a
distributed database
Example: ZooKeeper consistency
Yet another model?
ZooKeeper’s consistency model mixes elements of data-centric and
client-centric models
Take a naive example
Replica placement
Essence
Figure out what the best K places are out of N possible locations.
- Select the best location out of N − K for which the average distance to clients
is minimal. Then choose the next best server. (Note: the first chosen location
minimizes the average distance to all clients.) Computationally expensive.
- Select the K-th largest autonomous system and place a server at the
best-connected host. Computationally expensive.
- Position nodes in a d-dimensional geometric space, where distance reflects
latency. Identify the K regions with highest density and place a server in every
one. Computationally cheap.
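The first strategy, picking one location at a time so that the average client distance is minimal, can be sketched as a greedy loop. The distance-matrix input and the function name are assumptions for illustration, and the sketch assumes k does not exceed the number of candidate sites.

```python
from typing import List


def greedy_placement(distance: List[List[float]], k: int) -> List[int]:
    """Greedily choose k sites.

    distance[c][s] is the (assumed known) distance from client c to
    candidate site s. Each round adds the site that minimizes the average
    distance from every client to its nearest chosen site.
    """
    n_sites = len(distance[0])
    chosen: List[int] = []
    for _ in range(k):
        best_site, best_cost = -1, float("inf")
        for s in range(n_sites):
            if s in chosen:
                continue
            candidate = chosen + [s]
            # Average over clients of the distance to the closest candidate.
            cost = sum(min(row[t] for t in candidate) for row in distance) / len(distance)
            if cost < best_cost:  # ties keep the lowest-numbered site
                best_site, best_cost = s, cost
        chosen.append(best_site)
    return chosen
```

Every round re-evaluates all remaining sites against all clients, which is where the "computationally expensive" remark comes from.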
Content replication

Distinguish different processes

A process is capable of hosting a replica of an object or data:
- Permanent replicas: Process/machine always having a replica.
- Server-initiated replica: Process that can dynamically host a replica on
request of another server in the data store.
- Client-initiated replica: Process that can dynamically host a replica on
request of a client (client cache).
Content replication
The logical organization of different kinds of copies of
a data store into three concentric rings.
Server-initiated replicas
Counting access requests from different clients

Keep track of access counts per file, aggregated by considering the
server closest to the requesting clients:
- Number of accesses drops below threshold D ⇒ drop file
- Number of accesses exceeds threshold R ⇒ replicate file
- Number of accesses between D and R ⇒ migrate file
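The three threshold rules can be written as a small decision function. This is a sketch: the function and parameter names are hypothetical, and treating counts exactly equal to D or R as "migrate" is an assumption about the boundaries.

```python
def placement_action(accesses: int, d: int, r: int) -> str:
    """Decide what to do with a file given its access count.

    d is the deletion threshold D and r the replication threshold R;
    the rule only makes sense when D < R.
    """
    if d >= r:
        raise ValueError("deletion threshold D must be below replication threshold R")
    if accesses < d:
        return "drop"
    if accesses > r:
        return "replicate"
    return "migrate"  # between D and R (boundaries included by assumption)
```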
Managing replicated objects
Prevent concurrent execution of multiple invocations on the same object:
access to the internal data of an object has to be serialized. Using local
locking mechanisms is sufficient.
Ensure that all changes to the replicated state of the object are the same:
no two independent method invocations may take place on different replicas at
the same time, so we need deterministic thread scheduling.
Replicated-object invocations
Problem when invoking a replicated object
Replicated-object invocations

Figure panels: forwarding a request; returning the reply.

Primary-based protocols
Primary-backup protocol

Example primary-backup protocol

Traditionally applied in distributed databases and file systems that require a high degree
of fault tolerance. Replicas are often placed on the same LAN.
Replicated-write protocols
Quorum-based protocols
Assume N replicas. Ensure that each operation is carried out in such a way that a
majority vote is established: distinguish read quorum NR and write quorum NW. Ensure:
1. NR + NW > N (prevent read-write conflicts)
2. NW > N/2 (prevent write-write conflicts)

Figure panels: a correct choice; a choice that may lead to write-write conflicts;
a correct choice known as ROWA (read one, write all).
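The two quorum constraints are easy to check mechanically. The sketch below uses a hypothetical helper name, and the numbers in the usage note are illustrative values chosen to fall into the three cases named above.

```python
from typing import Dict


def check_quorum(n: int, n_r: int, n_w: int) -> Dict[str, bool]:
    """Evaluate both quorum constraints for N replicas."""
    return {
        # NR + NW > N: every read quorum overlaps every write quorum.
        "read_write_safe": n_r + n_w > n,
        # NW > N/2: any two write quorums overlap (integer form avoids floats).
        "write_write_safe": 2 * n_w > n,
    }
```

With N = 12, the choice NR = 3, NW = 10 satisfies both constraints; NR = 7, NW = 6 satisfies the first but not the second, so two writes could proceed on disjoint sets of six replicas; and NR = 1, NW = 12 is the read-one, write-all case.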
Continuous consistency: Numerical errors

Principal operation
Every server $S_i$ has a log, denoted as $L_i$.
Consider a data item $x$ and let $val(W)$ denote the
numerical change in its value after a write operation $W$.
Assume that
$$\forall W : val(W) > 0$$

$W$ is initially forwarded to one of the $N$ replicas,
denoted as $origin(W)$. $TW[i, j]$ are the writes executed
by server $S_i$ that originated from $S_j$:

$$TW[i, j] = \sum \{\, val(W) \mid origin(W) = S_j \wedge W \in L_i \,\}$$
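As a small worked sketch, the matrix TW can be computed directly from the logs. The log representation, a list of (origin index, val) pairs per server, is an assumption made for illustration.

```python
from typing import List, Tuple

# A logged write: (index j of its origin server S_j, numerical change val(W)).
Write = Tuple[int, float]


def tw_matrix(logs: List[List[Write]]) -> List[List[float]]:
    """Return TW where TW[i][j] sums val(W) over the writes in log L_i
    that originated at server S_j."""
    n = len(logs)
    tw = [[0.0] * n for _ in range(n)]
    for i, log in enumerate(logs):
        for origin, val in log:
            if val <= 0:
                raise ValueError("the model assumes val(W) > 0 for every write")
            tw[i][origin] += val
    return tw
```

If server 0 has logged a write of 2 that it originated and a write of 3 from server 1, while server 1 has only logged its own write of 3, then TW is [[2, 3], [0, 3]]: the entry TW[1][0] = 0 shows that server 1 has not yet seen the write that originated at server 0.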

Implementing client-centric consistency
Keeping it simple
Each write operation W is assigned a globally unique identifier by its origin server. For each
client, we keep track of two sets of writes:
Read set: the (identifiers of the) writes relevant for that client’s read operations
Write set: the (identifiers of the) client’s write operations.

Monotonic-read consistency
When client C wants to read at server S, C passes its read set. S can pull in any updates before
executing the read operation, after which the read set is updated.
Monotonic-write consistency
When client C wants to write at server S, C passes its write set. S pulls in any missing updates,
executes them in the correct order, and then executes the write operation, after which the write set
is updated.
Implementing client-centric consistency

Read-your-writes consistency
When client C wants to read at server S, C passes its write set. S can
pull in any updates before executing the read operation, after which the
read set is updated.
Writes-follow-reads consistency
When client C wants to write at server S, C passes its read set. S
pulls in any missing updates, executes them in the correct order, and then
executes the write operation, after which the write set is updated.
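The read-set and write-set bookkeeping can be sketched in a few lines. Everything named here (`WriteCatalogue`, `Client`, `Server`) is hypothetical, and two simplifications are assumed: write identifiers come from one shared counter, so applying writes in identifier order stands in for "the correct order", and the client passes both sets on every operation, which enforces all four guarantees at once.

```python
import itertools
from dataclasses import dataclass, field


class WriteCatalogue:
    """Stand-in for fetching missing writes from other servers."""

    def __init__(self):
        self._next_id = itertools.count(1)
        self.writes = {}  # write id -> (key, value)

    def record(self, key, value):
        wid = next(self._next_id)
        self.writes[wid] = (key, value)
        return wid


@dataclass
class Client:
    read_set: set = field(default_factory=set)   # ids of writes the client's reads depend on
    write_set: set = field(default_factory=set)  # ids of the client's own writes


class Server:
    def __init__(self, catalogue):
        self.catalogue = catalogue
        self.applied = set()  # ids of writes applied at this server
        self.data = {}        # key -> current value
        self.version = {}     # key -> id of the write that produced the value

    def _apply(self, wid):
        key, value = self.catalogue.writes[wid]
        if wid > self.version.get(key, 0):  # never overwrite a newer value
            self.data[key] = value
            self.version[key] = wid
        self.applied.add(wid)

    def _pull(self, wanted):
        for wid in sorted(wanted - self.applied):  # apply missing writes in order
            self._apply(wid)

    def read(self, client, key):
        self._pull(client.read_set | client.write_set)
        if key in self.version:
            client.read_set.add(self.version[key])  # update the read set
        return self.data.get(key)

    def write(self, client, key, value):
        self._pull(client.read_set | client.write_set)
        wid = self.catalogue.record(key, value)
        self._apply(wid)
        client.write_set.add(wid)  # update the write set
        return wid
```

A client that writes at one server and then reads at another sees its own write, because the second server first pulls in everything listed in the client's sets. A different client with empty sets gets no such guarantee at a server that has not yet received the write.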
Example: replication in the Web
Client-side caches
In the browser
At a client’s site, notably through a Web proxy

Caches at ISPs
Internet Service Providers also place caches to (1) reduce
cross-ISP traffic and (2) improve client-side performance. May
get nasty when a request needs to pass many ISPs.
Cooperative caching
Web-cache consistency
How to guarantee freshness?
To prevent stale information from being returned to a client:
Option 1: Let the cache contact the original server to see if content is
still up to date.
Option 2: Assign an expiration time $T_{expire}$ that depends on how long
ago the document was last modified when it is cached. If $T_{last\_modified}$ is
the last modification time of a document (as recorded by its owner),
and $T_{cached}$ is the time it was cached, then

$$T_{expire} = \alpha (T_{cached} - T_{last\_modified}) + T_{cached}$$

with $\alpha = 0.2$. Until $T_{expire}$, the document is considered valid.
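The expiration rule of Option 2 translates directly into code. This is a sketch with hypothetical function names; times are assumed to be plain numbers in the same unit (for example, seconds), and "until" is read as strictly before the expiration time.

```python
def expiration_time(t_cached: float, t_last_modified: float, alpha: float = 0.2) -> float:
    """Expiration time: alpha * (t_cached - t_last_modified) + t_cached."""
    return alpha * (t_cached - t_last_modified) + t_cached


def is_fresh(now: float, t_cached: float, t_last_modified: float, alpha: float = 0.2) -> bool:
    """A cached document is considered valid until its expiration time."""
    return now < expiration_time(t_cached, t_last_modified, alpha)
```

A document last modified 1000 seconds before it was cached stays valid for 0.2 × 1000 = 200 seconds after caching, so documents that have not changed for a long time are trusted for longer.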

Alternatives for caching and replication

Database copy: the edge server holds the same database as the origin server.
Content-aware cache: check if a (normal) query can be answered with cached data. Requires
that the server knows which data is cached at the edge.
Content-blind cache: store a query and its result. When the exact same query is issued
again, return the result from the cache.
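A content-blind cache needs nothing more than a lookup table keyed by the exact query text. In this sketch the class name and the `run_query` callback, which stands in for contacting the origin server, are assumptions.

```python
from typing import Callable, Dict, List


class ContentBlindCache:
    """Cache query results keyed by the exact query string."""

    def __init__(self, run_query: Callable[[str], List[tuple]]):
        self._run_query = run_query  # hypothetical call to the origin server
        self._results: Dict[str, List[tuple]] = {}

    def query(self, text: str) -> List[tuple]:
        if text not in self._results:  # only an identical query is a hit
            self._results[text] = self._run_query(text)
        return self._results[text]
```

The cache never inspects the query, so two queries that differ only in whitespace or letter case are treated as unrelated; recognizing that they ask for the same data is what a content-aware cache would add.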
