Replication
Leader-based (master–slave) replication
Synchronous vs. Asynchronous
• Configurability: Replication can be configured as synchronous or
asynchronous.
• Communication Flow: The figure illustrates synchronous and
asynchronous communication between the leader and its followers.
• Synchronous Replication: The leader waits for the follower's
confirmation before reporting success, so the follower is guaranteed
to have an up-to-date copy.
• Asynchronous Replication: The leader sends the change but doesn't
wait for the follower’s response, so the follower may briefly fall behind.
• Replication Lag: Asynchronous followers may apply writes some time
after the leader, so reads from them can return stale data.
In the example, the replication to follower
1 is synchronous: the leader waits until
follower 1 has confirmed that it received
the write before reporting success to the
user, and before making the write visible
to other clients. The replication to follower
2 is asynchronous: the leader sends the
message, but doesn’t wait for a response
from the follower.
Leader-based replication with
one synchronous and one
asynchronous follower.
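The difference between the two followers in the example can be sketched in a few lines of Python. This is a minimal in-memory illustration, not a real replication protocol: the `Leader` and `Follower` classes, the `pending` queue, and the `deliver_pending` method are all hypothetical names, and network delivery is simulated by a direct method call.

```python
from collections import deque


class Follower:
    """Hypothetical in-memory follower holding a copy of the data."""

    def __init__(self, name):
        self.name = name
        self.data = {}

    def apply(self, key, value):
        self.data[key] = value
        return True  # acknowledgement sent back to the leader


class Leader:
    """Hypothetical leader with synchronous and asynchronous followers."""

    def __init__(self, sync_followers, async_followers):
        self.data = {}
        self.sync_followers = sync_followers
        self.async_followers = async_followers
        self.pending = deque()  # changes not yet delivered to async followers

    def write(self, key, value):
        self.data[key] = value
        # Synchronous: wait for each acknowledgement before reporting success.
        for follower in self.sync_followers:
            if not follower.apply(key, value):
                raise RuntimeError(f"{follower.name} did not confirm the write")
        # Asynchronous: queue the change and return without waiting.
        for follower in self.async_followers:
            self.pending.append((follower, key, value))
        return "ok"

    def deliver_pending(self):
        # Simulates the queued changes arriving at the async followers later.
        while self.pending:
            follower, key, value = self.pending.popleft()
            follower.apply(key, value)


follower1 = Follower("follower 1")
follower2 = Follower("follower 2")
leader = Leader(sync_followers=[follower1], async_followers=[follower2])
status = leader.write("comment", "hello")
```

After `write` returns, follower 1 already holds the value, while follower 2 only sees it once `deliver_pending()` runs; the gap between the two is the replication lag.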
Setting Up New Followers
• Snapshot Creation: A consistent snapshot of the leader's database is
taken, ideally without locking the entire database.
• Copy and Connect: The snapshot is copied to the new follower, which
then connects to the leader and requests all data changes made since
the snapshot was taken.
• Catch-Up Process: The follower processes this backlog of changes
until it has caught up with the leader.
• Automated or Manual: Setting up followers varies, from automated
processes to manual workflows.
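The snapshot, copy, and catch-up steps above can be sketched as follows. This is a simplified model under stated assumptions: the leader's replication log is a plain Python list, the snapshot is tagged with the log length at the time it was taken, and the names `snapshot`, `changes_since`, and `catch_up` are illustrative rather than any database's API.

```python
class Leader:
    """Hypothetical leader with an append-only replication log."""

    def __init__(self):
        self.data = {}
        self.log = []  # list of (key, value) changes in commit order

    def write(self, key, value):
        self.log.append((key, value))
        self.data[key] = value

    def snapshot(self):
        # Return a copy of the data plus the log position it corresponds to.
        return dict(self.data), len(self.log)

    def changes_since(self, position):
        return self.log[position:]


class Follower:
    """Hypothetical follower initialised from a leader snapshot."""

    def __init__(self, snapshot_data, position):
        self.data = snapshot_data
        self.position = position  # number of log entries already applied

    def catch_up(self, leader):
        # Apply every change made after the snapshot was taken.
        for key, value in leader.changes_since(self.position):
            self.data[key] = value
            self.position += 1


leader = Leader()
leader.write("a", 1)
snapshot_data, position = leader.snapshot()
leader.write("b", 2)  # writes continue while the snapshot is copied
leader.write("a", 3)
follower = Follower(snapshot_data, position)
data_before_catch_up = dict(follower.data)
follower.catch_up(leader)
```

The key design point is that the snapshot carries an exact position in the log, so the follower knows where to resume without the leader pausing writes.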
Handling Node Outages
• Follower Recovery: After a crash or network interruption, a follower
uses its log to find the last change it processed, then requests all
later changes from the leader.
• Leader Failure (Failover): This tricky process involves promoting a
follower to leader, reconfiguring clients to send writes to it, and
pointing the other followers at the new leader.
• Failover Detection: Timeout-based detection; leader assumed dead
if no response for a specified period.
• Failover Challenges: Potential issues include conflicting writes,
split-brain scenarios (two nodes both believing they are the leader),
and choosing the right timeout.
• Manual vs. Automatic Failover: Operations teams may prefer
manual failover for better control.
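Timeout-based detection can be sketched as below. This is a minimal illustration, not a production failure detector: the `FailureDetector` class is hypothetical, timestamps are passed in explicitly so the behaviour is deterministic, and picking the follower with the highest log position as the new leader is an assumption of this sketch rather than something the slides specify.

```python
class FailureDetector:
    """Hypothetical detector: a node is suspected dead after a silent period."""

    def __init__(self, timeout_seconds):
        self.timeout = timeout_seconds
        self.last_heartbeat = {}  # node name -> time of last heartbeat

    def heartbeat(self, node, now):
        self.last_heartbeat[node] = now

    def is_suspected_dead(self, node, now):
        last = self.last_heartbeat.get(node)
        return last is None or (now - last) > self.timeout


def choose_new_leader(follower_positions):
    # Assumption: promote the follower that has applied the most changes.
    return max(follower_positions, key=follower_positions.get)


detector = FailureDetector(timeout_seconds=30.0)
detector.heartbeat("leader", now=100.0)
alive_at_120 = not detector.is_suspected_dead("leader", now=120.0)
dead_at_131 = detector.is_suspected_dead("leader", now=131.0)
new_leader = choose_new_leader({"follower 1": 41, "follower 2": 57})
```

The timeout is the trade-off mentioned above: a short value triggers unnecessary failovers during load spikes or network glitches, while a long value lengthens recovery time.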
Implementation of Replication Logs
• Replication Methods: Overview of statement-based, write-ahead
log shipping, logical (row-based) log replication, and trigger-based
replication.
• Compatibility Challenges: Write-ahead log shipping is closely tied to
the storage engine, which typically prevents the leader and followers
from running different software versions.
• Logical Log Advantages: A logical log is decoupled from the storage
engine, so it can be kept backward compatible and lets the leader and
followers run different software versions.
• Trigger-Based Replication: Application-layer approach for flexibility
but with higher overhead and potential limitations.
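One way to see the difference between statement-based and logical (row-based) replication is to replay a nondeterministic statement. The sketch below is illustrative only: the "database" is a dictionary, the statement is a stand-in for something like `SET token = RAND()`, and the function names and the shape of the `change` record are assumptions, not a real log format.

```python
import random


def apply_statement(db, rng):
    # Statement-based: each replica re-executes the statement itself,
    # so a nondeterministic function is evaluated again on every node.
    db["token"] = rng.random()


def apply_row_change(db, change):
    # Logical (row-based): the replica applies the value the leader produced.
    db[change["key"]] = change["new_value"]


leader_db, follower_db = {}, {}

# Each node has its own source of randomness (different seeds here).
apply_statement(leader_db, random.Random(1))
apply_statement(follower_db, random.Random(2))
statement_based_diverged = leader_db["token"] != follower_db["token"]

# Shipping the resulting row value brings the follower back in line.
change = {"key": "token", "new_value": leader_db["token"]}
apply_row_change(follower_db, change)
```

Because the logical log records what changed rather than how the change was computed, the follower ends up with exactly the leader's value.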
Problems with Replication Lag
• Read-Scaling Architecture: Using asynchronous replication for
read-heavy workloads with many followers.
• Eventual Consistency: Replication lag can lead to temporary
inconsistencies between leader and followers.
• Challenges with Replication Lag: Three highlighted problems:
"Reading Your Own Writes" (including cross-device consistency),
"Monotonic Reads," and "Consistent Prefix Reads."
• Solutions for Read-After-Write Consistency: Various techniques,
including leader reads, time-based decisions, and timestamp
tracking.
Reading Your Own Writes
• User Data Submission: Users can submit data like comments or
records in applications.
• Asynchronous Replication Challenge: Viewing data shortly after
writing may lead to perceived data loss.
• Eventual Consistency: Term coined by Douglas Terry, popularized
by Werner Vogels; a common goal for NoSQL projects.
• Read-After-Write Consistency: Ensures users always see updates
they submitted upon page reload.
• Implementation Techniques: Various methods, e.g., reading from
leader when user might have modified data.
In this situation, we need read-after-write
consistency, also known as read-your-
writes consistency. This guarantees that
users will always see any updates they
submit if they reload the page. It makes
no promises about other users: their
updates may only be visible later.
However, it reassures the user that their
input has been saved correctly.
A user makes a write, followed by a read from a stale replica. To
prevent this anomaly, we need read-after-write consistency.
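One of the time-based techniques for read-after-write consistency can be sketched as a small read router: for a short window after a user writes, that user's reads go to the leader; afterwards they may go to a follower. The `ReadRouter` class, the 60-second default window, and the explicit `now` argument are all assumptions made for this illustration.

```python
class ReadRouter:
    """Hypothetical router: recent writers read from the leader."""

    def __init__(self, window_seconds=60.0):
        self.window = window_seconds
        self.last_write_at = {}  # user ID -> time of that user's last write

    def record_write(self, user_id, now):
        self.last_write_at[user_id] = now

    def choose_replica(self, user_id, now):
        last = self.last_write_at.get(user_id)
        if last is not None and (now - last) < self.window:
            return "leader"  # the user's own write may not have replicated yet
        return "follower"


router = ReadRouter(window_seconds=60.0)
router.record_write(user_id=1234, now=1000.0)
soon_after = router.choose_replica(user_id=1234, now=1010.0)
much_later = router.choose_replica(user_id=1234, now=1100.0)
other_user = router.choose_replica(user_id=2345, now=1010.0)
```

Only the writing user is pinned to the leader, which matches the guarantee above: it makes no promises about when other users see the update. The window should be at least as long as the expected replication lag.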
Monotonic Reads
• Avoiding Time Reversal: Users might observe time moving
backward when reading from different replicas.
• Monotonic Reads Guarantee: Ensures users do not read older data
after previously reading newer data.
• Replica Selection: Consistent replica selection for each user,
possibly based on a hash of the user ID.
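Choosing a replica from a hash of the user ID can be sketched as follows. The function name `replica_for_user` and the replica names are hypothetical, and SHA-256 is just one stable hash that gives the same result across processes (Python's built-in `hash` is randomised per process for strings, so it is avoided here).

```python
import hashlib


def replica_for_user(user_id, replicas):
    """Map a user ID to one replica, the same one on every call."""
    digest = hashlib.sha256(str(user_id).encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(replicas)
    return replicas[index]


replicas = ["follower 1", "follower 2", "follower 3"]
first_choice = replica_for_user(2345, replicas)
second_choice = replica_for_user(2345, replicas)
```

Because the same user always lands on the same replica, that user never moves from a fresher replica to a staler one. If the chosen replica fails, the user's reads have to be rerouted to another replica, and the guarantee can be briefly lost at that moment.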
In the figure, user 2345 makes the same query twice, first to a
follower with little lag, then to a follower with greater lag.
The first query returns a comment that was recently added by user 1234,
but the second query doesn’t return anything because the lagging
follower has not yet picked up that write.
A user first reads from a fresh