DisSys Lec7

Replication involves creating multiple copies of data across distributed nodes to improve availability, reliability, and performance. Managing changes to replicated data, such as concurrent writes, is challenging. Various techniques can be used, including quorums, timestamps, and state machine replication to ensure consistency while maintaining availability during network failures. The CAP theorem states it is impossible for a distributed system to provide consistency, availability, and partition tolerance simultaneously. Replication approaches make different tradeoffs between these properties.
Replication
• Replication is a technique used in distributed systems to improve the availability,
reliability, and performance of the system.

• Replication involves creating multiple copies of data or resources and distributing
them across multiple nodes in the system.

• Each copy of the data is referred to as a replica, and the process of creating and
maintaining replicas is known as replication.
Replication objectives
• Availability: By replicating data across multiple nodes, the system can continue to
function even if some nodes fail. If one replica becomes unavailable, the system
can switch to another replica, ensuring that the data remains available.

• Reliability: Replication can improve the reliability of the system by reducing the
likelihood of data loss due to node failures or other issues. If one replica becomes
corrupted or lost, the system can use another replica to restore the data.

• Performance: Replication can improve the performance of the system by
reducing the load on individual nodes. By distributing the workload across
multiple replicas, the system can handle more requests and process them faster.
Replication and data changes
• Easy if the data doesn’t change … just copy it!
➢ the main problem in replication is managing changes to the data.

• Some examples of data changes in distributed systems:
  • Retrying state updates
  • Adding and then removing data
  • Concurrent writes by different clients
Retrying state updates
Deduplicating requests requires the database to track, in stable storage, which requests it has already seen.
Retry behavior
• Choice of retry behavior:
  • At-most-once semantics: send request, don't retry, update may not happen
  • At-least-once semantics: retry request until acknowledged, may repeat update
  • Exactly-once semantics: retry + idempotence or deduplication

• A function f is idempotent if f(x) = f(f(x))
  • Not idempotent: f(likeCount) = likeCount + 1
  • Idempotent: f(likeSet) = likeSet ∪ {userID}
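
The two examples above, and deduplication by request ID, can be sketched in Python. This is a minimal illustration, not a production design: the class name `DedupCounter`, the request IDs, and the in-memory set standing in for stable storage are all assumptions made for the example.

```python
def increment_likes(like_count: int) -> int:
    """Not idempotent: applying it twice changes the result again."""
    return like_count + 1


def add_like(like_set: frozenset, user_id: str) -> frozenset:
    """Idempotent: adding the same user twice leaves the set unchanged."""
    return like_set | {user_id}


class DedupCounter:
    """Counter that ignores requests whose ID it has already processed.

    The set of seen IDs stands in for stable storage; in this sketch it is
    only an in-memory set (an assumption, it would not survive a crash).
    """

    def __init__(self) -> None:
        self.like_count = 0
        self.seen_request_ids: set = set()

    def handle(self, request_id: str) -> int:
        # Apply the non-idempotent update only for request IDs not seen before.
        if request_id not in self.seen_request_ids:
            self.seen_request_ids.add(request_id)
            self.like_count = increment_likes(self.like_count)
        return self.like_count


# A retried request (same ID) is applied only once.
counter = DedupCounter()
counter.handle("req-1")
counter.handle("req-1")  # retry after a lost acknowledgement
```

With deduplication, the client can retry until acknowledged (at-least-once delivery) while the update itself still takes effect only once.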
Adding and then removing again
Timestamps and tombstones
Reconciling replicas
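
One way timestamps and tombstones can be used to reconcile two replicas is sketched below. It assumes each replica stores a (timestamp, value) pair per key, that a deletion is recorded as a tombstone (value `None`) rather than by erasing the entry, and that the record with the larger timestamp wins. The type names and the `reconcile` function are hypothetical, chosen for this example only.

```python
from typing import Dict, Optional, Tuple

# key -> (timestamp, value); a value of None is a tombstone marking a deletion.
Record = Tuple[int, Optional[str]]
Replica = Dict[str, Record]


def reconcile(a: Replica, b: Replica) -> Replica:
    """Merge two replica states, keeping the record with the larger timestamp."""
    merged: Replica = dict(a)
    for key, record in b.items():
        if key not in merged or record[0] > merged[key][0]:
            merged[key] = record
    return merged


# Replica A saw the add (t=1) and the remove (t=2); replica B only saw the add.
replica_a: Replica = {"x": (2, None)}
replica_b: Replica = {"x": (1, "hello")}
merged = reconcile(replica_a, replica_b)
```

Because the removal is kept as a tombstone with a timestamp, the merge can tell that the entry was deleted after it was added, instead of mistaking it for a value that one replica never received.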
Concurrent writes by different clients
Replication and reliability
• Replication is useful since it allows us to improve the reliability of a system: when
one replica is unavailable, the remaining replicas can continue processing
requests.

• A replica may be unavailable due to network partition or node fault (e.g. crash,
hardware problem).

• The details of how exactly the replication is performed have a big impact on the
reliability of the system.

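• Without fault tolerance, having multiple replicas would make reliability worse: the more replicas there are, the more likely it is that at least one of them is faulty at any given time.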
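
The last point can be made concrete with a small calculation. The sketch assumes, purely for illustration, that each replica is faulty independently with the same probability p; the function names are hypothetical.

```python
def prob_any_faulty(p: float, n: int) -> float:
    """Probability that at least one of n replicas is faulty.

    Relevant when an operation needs every replica to respond.
    """
    return 1 - (1 - p) ** n


def prob_all_faulty(p: float, n: int) -> float:
    """Probability that all n replicas are faulty at once.

    Relevant when an operation needs only one working replica.
    """
    return p ** n


# With p = 0.01 and n = 3 replicas:
any_faulty = prob_any_faulty(0.01, 3)   # about 0.03, worse than a single node
all_faulty = prob_all_faulty(0.01, 3)   # about 0.000001, far better
```

Under these assumptions, a system that stops when any replica fails is roughly three times as likely to be unavailable as a single node, while a system that tolerates faults benefits from the extra replicas.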
Read-after-write consistency
• Writing to one replica, reading from another: client does not read back the value it has written

• Require writing to/reading from both replicas ⟹ cannot write/read if one replica is unavailable
• Many systems require read-after-write consistency, in which we ensure that after a
client writes a value, the same client will be able to read back the value it has just
written.

• Strictly speaking, even with read-after-write consistency a client may not read back the
exact value it wrote, because another client may have concurrently overwritten it.

• Therefore, we say that read-after-write consistency requires reading either the last value
written, or a later value.

• We could guarantee read-after-write consistency by ensuring we always write to both
replicas and/or read from both replicas.

• This would mean that reads and/or writes are no longer fault-tolerant: if one replica is
unavailable, a write or read that requires responses from both replicas would not be able
to complete.
Quorum (2 out of 3)
Write succeeds on B and C; read succeeds on A and B. Choose between (t0, v0) and (t1, v1) based on timestamp.
Quorum
• We can solve this problem by using three replicas. We send every read and write request
to all three replicas, but we consider the request successful if we receive 2 responses.

• In the example, the write succeeds on replicas B and C, while the read succeeds on
replicas A and B. With a (2 out of 3) policy for both reads and writes, it is guaranteed that
at least one of the responses to a read is from a replica that saw the most recent write
(in the example, this is replica B).

• In this example, the set of replicas {B, C} that responded to the write request is a write
quorum, and the set {A, B} that responded to the read is a read quorum.

• A quorum is a minimum set of nodes that must respond to some request for it to be
successful. (The term comes from politics, where a quorum refers to the minimum
number of votes required to make a valid decision.)
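
The 2-out-of-3 read in the example can be sketched as follows. The sketch assumes each replica answers with a (timestamp, value) pair and that the client keeps the response with the highest timestamp; `quorum_read` and the integer timestamps are hypothetical names and values for illustration.

```python
from typing import List, Optional, Tuple

Versioned = Tuple[int, str]  # (timestamp, value)


def quorum_read(responses: List[Optional[Versioned]], r: int) -> Versioned:
    """Return the newest value among the replicas that responded.

    `responses` holds one entry per replica; None means no response arrived.
    """
    received = [resp for resp in responses if resp is not None]
    if len(received) < r:
        raise RuntimeError("read quorum not reached")
    # Choose between the returned versions based on timestamp.
    return max(received, key=lambda resp: resp[0])


# Replicas A, B, C. The write (t1, v1) reached B and C; A still holds (t0, v0).
replica_state = {"A": (0, "v0"), "B": (1, "v1"), "C": (1, "v1")}
# The read gets responses from A and B only; C does not respond.
result = quorum_read([replica_state["A"], replica_state["B"], None], r=2)
```

Here the response from B carries the newer timestamp, so the read returns the most recently written value even though A is stale.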
Read and write quorums
• In a system with n replicas:
  • If a write is acknowledged by w replicas (write quorum),
  • and we subsequently read from r replicas (read quorum),
  • and r + w > n,
  • ... then the read will see the previously written value (or a value that subsequently
    overwrote it)
• Read quorum and write quorum share ≥ 1 replica
• Typical: r = w = (n + 1) / 2, for n = 3, 5, 7, ... (majority)
• Reads can tolerate n − r unavailable replicas, writes n − w
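
The overlap property can be checked by brute force for small n. This is only an illustrative check, with hypothetical function names, that enumerates every possible read quorum and write quorum and tests whether each pair shares a replica.

```python
from itertools import combinations


def quorums_always_overlap(n: int, r: int, w: int) -> bool:
    """Check that every read quorum of size r meets every write quorum of size w."""
    replicas = range(n)
    return all(
        bool(set(read_q) & set(write_q))
        for read_q in combinations(replicas, r)
        for write_q in combinations(replicas, w)
    )


def majority(n: int) -> int:
    """Majority quorum size (n + 1) / 2 for odd n."""
    return (n + 1) // 2


ok_2_of_3 = quorums_always_overlap(3, 2, 2)    # r + w = 4 > 3
bad_1_and_2 = quorums_always_overlap(3, 1, 2)  # r + w = 3, not > 3
```

With r = 1 and w = 2 on three replicas, a read from one replica can miss both replicas that acknowledged the write, which is why the condition is r + w > n rather than r + w ≥ n.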
Read repair
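
Read repair means that a client which notices a stale response during a read writes the newer value back to the stale replica. A minimal sketch, assuming (timestamp, value) pairs per replica and a hypothetical `read_with_repair` helper that models replicas as dictionary entries:

```python
from typing import Dict, List, Tuple

Versioned = Tuple[int, str]  # (timestamp, value)


def read_with_repair(replicas: Dict[str, Versioned], reachable: List[str]) -> Versioned:
    """Read from the reachable replicas and update any stale ones among them."""
    newest = max((replicas[name] for name in reachable), key=lambda v: v[0])
    for name in reachable:
        if replicas[name][0] < newest[0]:
            # Write the newer value back to the replica that returned a stale one.
            replicas[name] = newest
    return newest


replicas = {"A": (0, "v0"), "B": (1, "v1"), "C": (1, "v1")}
value = read_with_repair(replicas, reachable=["A", "B"])
```

After this read, replica A holds (t1, v1) as well, so later reads that reach only A no longer see the old value.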
Replication using broadcast
• So far we have used best-effort broadcast for replication. What about
stronger broadcast models?

• State machine replication (SMR):
  • FIFO-total order broadcast every update to all replicas
  • Replica delivers update message: apply it to own state
  • Applying an update is deterministic
  • Replica is a state machine: it starts in a fixed initial state and goes through the
    same sequence of state transitions in the same order ⟹ all replicas end up in the
    same state
State Machine Replication (SMR)
on request to perform update u do
    send u via FIFO-total order broadcast
end on
on delivering u through FIFO-total order broadcast do
    update state
end on
• We only require that the update logic is deterministic: any two replicas that are in the same state,
and are given the same input, must end up in the same next state.

• Some distributed databases perform replication in this way, with each replica independently
executing the same deterministic transaction code (this is known as active replication).

• This principle also underpins blockchains, cryptocurrencies, and distributed ledgers: the "chain of
blocks" in a blockchain is nothing other than the sequence of messages delivered by a total order
broadcast protocol.
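
The determinism requirement can be illustrated with a small Python sketch. The broadcast layer is not implemented here: `total_order_broadcast` is a hypothetical stand-in that simply hands every replica the same updates in the same order, and the integer state and the three updates are made up for the example.

```python
from typing import Callable, List

Update = Callable[[int], int]  # deterministic function from state to next state


class Replica:
    def __init__(self) -> None:
        self.state = 0  # fixed initial state, identical on every replica

    def deliver(self, update: Update) -> None:
        # On delivering an update: apply it to own state.
        self.state = update(self.state)


def total_order_broadcast(replicas: List[Replica], updates: List[Update]) -> None:
    """Stand-in for FIFO-total order broadcast: same sequence for every replica."""
    for update in updates:
        for replica in replicas:
            replica.deliver(update)


replicas = [Replica() for _ in range(3)]
updates = [lambda s: s + 5, lambda s: s * 2, lambda s: s - 1]
total_order_broadcast(replicas, updates)
```

Since the updates here do not commute (adding 5 and doubling give different results in different orders), the replicas agree only because they all apply the same sequence.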
Total order broadcast algorithms
• Single leader approach:
  • One node is designated as leader (sequencer)
  • To broadcast message, send it to the leader; leader broadcasts it via FIFO broadcast.
  • Problem: leader crashes ⇒ no more messages delivered

• Lamport clocks approach:
  • Attach Lamport timestamp to every message
  • Deliver messages in total order of timestamps
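
The ordering step of the Lamport clocks approach can be sketched as below. The sketch assumes ties between equal timestamps are broken by sender ID, and it sorts a batch of messages that has already been received in full; a real protocol must also make sure no message with an earlier timestamp is still in flight before delivering, which is not modeled here. `Message` and `delivery_order` are hypothetical names.

```python
from typing import List, NamedTuple


class Message(NamedTuple):
    timestamp: int  # Lamport timestamp attached by the sender
    node_id: str    # sender ID, used here to break ties between equal timestamps
    payload: str


def delivery_order(messages: List[Message]) -> List[str]:
    """Order messages by (timestamp, node_id), which is a total order."""
    ordered = sorted(messages, key=lambda m: (m.timestamp, m.node_id))
    return [m.payload for m in ordered]


# Two nodes receive the same messages in different network orders.
received_at_x = [Message(2, "B", "m3"), Message(1, "A", "m1"), Message(1, "B", "m2")]
received_at_y = [Message(1, "B", "m2"), Message(2, "B", "m3"), Message(1, "A", "m1")]
order_x = delivery_order(received_at_x)
order_y = delivery_order(received_at_y)
```

Both nodes arrive at the same delivery order regardless of the order in which the network delivered the messages to them.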
Database leader replica
Leader database replica L ensures total order broadcast

Follower F applies transaction log in commit order


Replication using causal (and weaker) broadcast
• State machine replication uses (FIFO-)total order broadcast.

• If replica state updates are commutative, replicas can process updates in different
orders and still end up in the same state.

• Updates f and g are commutative if f(g(x)) = g(f(x))
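
A short Python sketch of the definition, using made-up updates: adding users to a set commutes, while incrementing and doubling a counter does not. The helper `apply_all` is hypothetical.

```python
from functools import reduce
from typing import Callable, Iterable, TypeVar

S = TypeVar("S")


def apply_all(state: S, updates: Iterable[Callable[[S], S]]) -> S:
    """Apply updates to a state one after another, in the given order."""
    return reduce(lambda s, f: f(s), updates, state)


def add_alice(s: frozenset) -> frozenset:
    return s | {"alice"}


def add_bob(s: frozenset) -> frozenset:
    return s | {"bob"}


def increment(x: int) -> int:
    return x + 1


def double(x: int) -> int:
    return x * 2


# Commutative: both orders give the same set.
set_ab = apply_all(frozenset(), [add_alice, add_bob])
set_ba = apply_all(frozenset(), [add_bob, add_alice])

# Not commutative: the order changes the result.
inc_then_double = apply_all(1, [increment, double])  # (1 + 1) * 2
double_then_inc = apply_all(1, [double, increment])  # (1 * 2) + 1
```

Replicas applying the set updates may receive them in either order and still converge; replicas applying the counter updates need an agreed order.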

Broadcast      Assumptions about state update function
total-order    deterministic (SMR)
causal         concurrent updates commute
reliable       all updates commute
best-effort    commutative, idempotent, tolerates message loss
CAP theorem
• The CAP theorem is a principle that states that it is impossible for a
distributed system to simultaneously provide all three of the following
guarantees:

  • Consistency: Every read from the system returns the most recent write or an
    error.
  • Availability: Every request receives a response, without guarantee that it
    contains the most recent write.
  • Partition tolerance: The system continues to function even when network
    partitions occur.
Partition tolerance vs. consistency and availability
• Partition tolerance is a fundamental requirement of distributed systems, and
cannot be sacrificed.

• However, the choice between consistency and availability depends on how the
system handles network partitions.

• For example, a system that sacrifices consistency during a network partition may
be able to provide high availability, while a system that sacrifices availability
during a network partition may be able to provide strong consistency.
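
The two choices can be sketched with a toy replica. The mode labels "CP" and "AP", the `partitioned` flag, and the class itself are assumptions made for this illustration; they do not describe any particular system.

```python
class Replica:
    def __init__(self, mode: str) -> None:
        self.mode = mode          # "CP" or "AP" (labels used in this sketch only)
        self.value = "v0"         # last value this replica has seen
        self.partitioned = False  # True when cut off from the other replicas

    def read(self) -> str:
        if self.partitioned and self.mode == "CP":
            # Sacrifice availability: refuse rather than risk a stale answer.
            raise ConnectionError("cannot confirm the latest value during a partition")
        # Sacrifice consistency (AP), or no partition: answer from local state.
        return self.value


ap = Replica("AP")
ap.partitioned = True
ap_result = ap.read()  # responds, but the value may be stale

cp = Replica("CP")
cp.partitioned = True
try:
    cp_result = cp.read()
except ConnectionError:
    cp_result = None  # the request fails instead of returning stale data
```

Outside a partition both replicas behave the same way; the difference only appears once the replica cannot reach its peers.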
Latency and performance
• Achieving strong consistency or high availability may come at the cost of increased
latency or reduced performance.

• For example, a strongly consistent system may require additional communication
between nodes to ensure consistency, which can increase latency and reduce
performance.
Replication and CAP theorem
• Consistency-oriented replication: In a consistency-oriented replication scheme,
all replicas are kept in sync with each other, so that a read returns the same value
regardless of which replica serves it. When a network partition occurs, this approach
prioritizes consistency over availability, and is typically used in systems where data
integrity is critical, such as databases or financial systems.

• Availability-oriented replication: In an availability-oriented replication scheme,
replicas may be allowed to diverge temporarily, so that some read operations
return different values from different replicas. When a network partition occurs, this
approach prioritizes availability over consistency, and is typically used in
systems where responsiveness is critical, such as content delivery networks or
social networks.
• It's important to note that the choice between consistency-oriented and
availability-oriented replication schemes depends on the specific requirements
and trade-offs of the system being designed.

• In general, when a network partition occurs, systems that prioritize consistency
sacrifice availability, while systems that prioritize availability sacrifice consistency.