Unit IV - Distributed Transaction Processing
What is a Distributed Transaction?
Definition: A distributed transaction spans multiple nodes (or systems) and involves multiple operations across databases or
resources that need to be consistent and reliable.
Goal: Ensure ACID properties (Atomicity, Consistency, Isolation, Durability) even across distributed systems.
Example: An online banking transaction transferring funds from one bank to another, involving two different databases.
https://systemdesignschool.io/fundamentals/distributed-transactions/distributed-transaction-sample.png
Why Distributed Transactions?
Reliability: Protects data integrity even in cases of network failure or partial system failure.
Consistency Across Systems: Ensures all parts of a distributed system reflect the same information.
Real-World Applications: E-commerce, financial services, booking systems, etc., often need reliable distributed transactions.
Core Concepts in Distributed Transactions
Transactions: A series of operations that must all succeed or fail together.
Nested Transactions: A hierarchy of sub-transactions that allow parts of a larger transaction to be isolated or rolled back
independently.
Concurrency Control: Mechanisms to handle multiple transactions accessing the same data simultaneously.
Delayed Commit: A transaction strategy that delays committing for as long as possible, so that business-processing failures are caught before any commit occurs. This strategy relaxes ACID properties across multiple transactional resources, which can lead to system inconsistency in a worst-case scenario.
Limitations: Not suitable for most distributed systems, since it cannot guarantee atomicity if a failure occurs midway (e.g., if one node commits while another fails).
Use Case: Useful when there is minimal risk of node failure, or for simple, non-critical transactions.
Two-Phase Commit (2PC)
Overview: The most common protocol for distributed transactions, ensuring that all participating nodes agree to commit or abort.
Phases:
● Phase 1 (Prepare): The coordinator node sends a “prepare” request to all participants, asking if they are ready to commit.
Each participant responds with a “yes” (prepared) or “no” (not prepared).
● Phase 2 (Commit/Abort): If all participants vote “yes,” the coordinator sends a “commit” command to finalize the transaction.
If any participant votes “no,” the coordinator sends an “abort” command, and the transaction is rolled back on all nodes.
Drawbacks: Vulnerable to blocking; if the coordinator fails during the commit phase, participants may be left waiting indefinitely.
Network delays can also cause a transaction to stay open for longer than desired.
Use Cases: Financial transactions, online payment systems, where strict consistency is necessary.
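The two phases above can be sketched as a small coordinator routine. This is a minimal illustration under simplifying assumptions (no network, no crashes); all class and function names are made up for this sketch, not from any real library.

```python
# Minimal 2PC sketch. Participant and two_phase_commit are illustrative names.

class Participant:
    """A resource manager that votes on, then applies or undoes, a transaction."""
    def __init__(self, name, vote_yes=True):
        self.name = name
        self.vote_yes = vote_yes  # whether local prepare work will succeed
        self.state = "init"

    def prepare(self, txn):
        # Phase 1: vote "yes" only if the local work can be made durable.
        if self.vote_yes:
            self.state = "prepared"
        return self.vote_yes

    def commit(self, txn):
        self.state = "committed"

    def abort(self, txn):
        self.state = "aborted"


def two_phase_commit(coordinator_log, participants, txn):
    # Phase 1 (Prepare): ask every participant for its vote.
    votes = [p.prepare(txn) for p in participants]

    # Phase 2 (Commit/Abort): commit only on a unanimous "yes".
    # The decision is logged durably before it is announced.
    decision = "commit" if all(votes) else "abort"
    coordinator_log.append((decision, txn))
    for p in participants:
        if decision == "commit":
            p.commit(txn)
        else:
            p.abort(txn)
    return decision
```

A single "no" vote (e.g., `Participant("B", vote_yes=False)`) forces the coordinator to send "abort" to every participant, which is exactly the unanimity rule described above.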
Types of Failures in Distributed Transactions
1. Coordinator Failure: The coordinator node, which is responsible for managing and directing the transaction, may fail before
completing the transaction.
2. Participant Failure: A participant node (any node involved in the transaction) may fail during any phase of the transaction.
3. Network Failure: Communication between nodes may be interrupted due to network issues, causing messages to be delayed
or lost.
4. Timeouts: Delays in response from a participant or coordinator may cause timeouts, potentially leaving transactions in an
unknown state.
5. Data Inconsistency: Due to network issues or system errors, data across nodes can become inconsistent if a transaction is
not fully completed on all nodes.
Fault Tolerance of Two-Phase Commit
The Two-Phase Commit (2PC) protocol is a commonly used method for ensuring atomicity in distributed transactions across multiple
nodes. However, distributed systems are prone to various types of failures that can interrupt the transaction process.
A. Coordinator Failure Handling
● During the Prepare Phase: If the coordinator fails after sending the “prepare” request but before receiving responses,
participants remain in a waiting state. They don’t proceed to commit until they receive instructions from the coordinator or
timeout.
● During the Commit Phase: If the coordinator fails after sending the “commit” message, participants who received the commit
message complete the transaction. However, participants who didn’t receive it might remain undecided, which can lead to a
blocking problem (they wait indefinitely). To handle this:
○ Timeouts: Participants may implement a timeout and initiate recovery protocols to either retry or abort.
○ Logging: The coordinator typically logs transaction status updates persistently. In case of a coordinator restart, it can
use these logs to determine the transaction’s last known state and decide whether to complete or abort it.
B. Participant Failure Handling
● During the Prepare Phase: If a participant fails before responding to the “prepare” request, the coordinator will wait for a
response. In case of a timeout, the coordinator will assume a failure and send an “abort” message to all participants who
responded “yes” to prevent a partial commit.
● During the Commit Phase: If a participant fails after receiving the “commit” message, it is expected to complete the
transaction once it recovers. This is because the transaction outcome is recorded in the participant’s logs, so it can “replay”
and complete the operation after recovery.
● Logging and Recovery: Each participant logs its transaction state (e.g., prepared, committed). In case of a crash, participants
use these logs to resume from their last state upon recovery, ensuring they complete or abort the transaction as instructed.
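The log-based recovery described above can be sketched as follows. The log format (a list of `(state, txn)` records appended in order) and the return values are assumptions made for this illustration, not a standard API.

```python
# Sketch of participant crash recovery from a persistent transaction log.

def recover(log):
    """Decide what a restarted participant must do for its last transaction."""
    if not log:
        return "nothing to do"        # no transaction was in progress
    state, txn = log[-1]              # last durable state before the crash
    if state == "prepared":
        # Voted "yes" but never learned the outcome: only the coordinator knows.
        return f"ask coordinator about {txn}"
    if state == "committed":
        return f"redo {txn}"          # make sure the commit is fully applied
    return f"undo {txn}"              # aborted: roll back tentative changes
```

The "prepared" case is the important one: a prepared participant cannot decide unilaterally, which is the root of 2PC's blocking problem.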
C. Network Failure Handling
● Message Loss: If a network issue causes the “prepare” or “commit” message to be lost, the coordinator and participants will
rely on timeouts.
○ If the coordinator doesn’t receive responses from participants, it aborts the transaction.
○ If participants don’t receive the final commit or abort message, they may enter a blocking state, waiting indefinitely
until the coordinator re-sends the message or they timeout.
● Partition Tolerance: 2PC has limited partition tolerance since it does not have mechanisms to allow nodes to continue
independently when isolated from others. This can lead to deadlocks or blocking in cases of network partitioning.
D. Timeout Handling
Coordinator Timeouts: If participants don’t hear back from the coordinator within a predefined time limit, they can assume that the
transaction is aborted and roll back any tentative changes.
Participant Timeouts: If the coordinator does not receive responses from some participants, it can time out and send an “abort”
command to all nodes, assuming that at least one participant was not ready.
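Participant-side timeout handling can be sketched as a polling loop with a deadline. The `receive()` callback and its return values are assumptions for this sketch; a real system would use the messaging layer's own timeout facilities.

```python
import time

# Sketch: poll for the coordinator's decision; presume abort on timeout.

def await_decision(receive, timeout_s=5.0, poll_s=0.05):
    """receive() returns "commit", "abort", or None if no message has arrived."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        decision = receive()
        if decision is not None:
            return decision
        time.sleep(poll_s)
    return "abort"   # timed out: assume the transaction was aborted
```

Note that presuming abort on timeout is only safe for a participant that has not yet voted "yes"; a prepared participant must instead run a recovery protocol, as described above.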
E. Data Inconsistency Handling
● The 2PC protocol does not inherently solve data inconsistency if nodes fail after partial completion of the commit phase.
○ Atomic Operations: The protocol relies on participants committing operations atomically, so that even if a failure occurs, either all changes are made or none are.
Limitations of Two-Phase Commit in Handling Failures
While 2PC provides a framework for handling certain failures, it has limitations:
● Blocking Problem: If the coordinator fails after sending the “prepare” message, participants remain in a blocked state, unable
to decide without the coordinator’s input. This blocking issue is particularly problematic if the coordinator fails permanently.
● High Latency: Due to the requirement of confirmation from all participants, 2PC can be slow, especially in high-latency or
unreliable networks.
● Lack of Partition Tolerance: 2PC is not well-suited for network partitions (e.g., in distributed systems with frequent
disconnections), as it depends on all participants being reachable.
For these reasons, more advanced protocols, like Three-Phase Commit (3PC) or Consensus-Based Protocols (e.g., Paxos, Raft),
are sometimes used to provide stronger fault tolerance and reduce blocking.
Three-Phase Commit (3PC)
Overview: A protocol that enhances the 2PC by reducing the likelihood of blocking if the coordinator fails. It introduces an extra phase
to improve fault tolerance.
Phases:
● Phase 1 (CanCommit): Similar to the prepare phase in 2PC, the coordinator asks participants if they are ready to commit.
● Phase 2 (PreCommit): If all participants are ready, the coordinator sends a “pre-commit” command to all participants,
signaling them to prepare for commit but not yet finalize.
● Phase 3 (Commit/Abort): After receiving acknowledgments from all participants, the coordinator sends the final “commit”
command.
Strengths: Reduces the chance of a distributed transaction hanging if the coordinator fails. Allows participants to reach a safe state
and makes it easier to recover from failures.
Drawbacks: More complex than 2PC, with additional communication overhead due to the extra phase.
Use Cases: Suitable for distributed systems with frequent network issues or higher failure risks.
Three-Phase Commit Phases
The 3PC protocol consists of three phases, designed to reduce the risk of blocking by ensuring that nodes reach a common state
before the final commit:
1. CanCommit / Vote Collection Phase: The coordinator sends a “prepare” or “can-commit?” request to each participant, asking
if they are ready to commit.
○ Response: Each participant responds with a “Yes” (prepared) or “No” (not prepared).
2. PreCommit Phase / Dissemination Phase: If all participants respond affirmatively in the first phase, the coordinator sends a
“pre-commit” message to signal that the transaction is in a pending state.
○ Action: Upon receiving the pre-commit message, participants enter a prepared-to-commit state and save this
information in their logs to ensure they can commit if the coordinator fails.
3. Commit Phase / Decision Phase: Once all participants acknowledge the pre-commit message, the coordinator sends a final
“commit” command.
○ Action: Participants commit the transaction and confirm completion.
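The three phases above can be sketched from the coordinator's point of view. As with the 2PC sketch, this assumes no crashes or message loss, and all class and method names are illustrative, not a standard API.

```python
# Minimal 3PC sketch: CanCommit, PreCommit, Commit.

class Node:
    """A 3PC participant."""
    def __init__(self, ready=True):
        self.ready = ready
        self.state = "init"

    def can_commit(self, txn):
        return self.ready                    # Phase 1: vote

    def pre_commit(self, txn):
        self.state = "prepared-to-commit"    # Phase 2: logged safe state
        return True                          # acknowledgment

    def do_commit(self, txn):
        self.state = "committed"             # Phase 3: finalize

    def abort(self, txn):
        self.state = "aborted"


def three_phase_commit(participants, txn):
    # Phase 1 (CanCommit): abort unless every participant votes "yes".
    if not all(p.can_commit(txn) for p in participants):
        for p in participants:
            p.abort(txn)
        return "aborted"

    # Phase 2 (PreCommit): move everyone into the prepared-to-commit state.
    if not all(p.pre_commit(txn) for p in participants):
        for p in participants:
            p.abort(txn)
        return "aborted"

    # Phase 3 (Commit): finalize only after all acknowledgments arrive.
    for p in participants:
        p.do_commit(txn)
    return "committed"
```

The extra PreCommit round is what lets a participant that reached "prepared-to-commit" infer, after a coordinator failure, that everyone voted "yes" and it is safe to proceed.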
Fault Tolerance of Three-Phase Commit
The 3PC protocol handles failures more effectively than 2PC by allowing nodes to
reach a safe state before the commit. Here’s how it addresses each failure type:
A. Coordinator Failure Handling
1. During the CanCommit Phase:
○ If the coordinator fails after sending the “can-commit?” request, participants remain in their current state (either ready or
not ready) and take no further action.
○ Timeouts: Participants may use timeouts. If they don’t hear back from the coordinator, they assume the transaction will
be aborted.
2. Partition Tolerance:
○ In network partitions, 3PC minimizes the risk of blocking by ensuring that nodes can independently reach a decision.
Participants in the pre-commit phase can decide to abort if they detect prolonged network failure or continue once
communication is restored.
D. Timeout Handling
● Coordinator Timeout: If participants don’t receive the final decision (commit or abort) from the coordinator, they can
independently determine the transaction’s outcome based on the pre-commit state.
● Participant Timeout: If a participant doesn’t respond to a “can-commit?” or “pre-commit” message, the coordinator will abort
the transaction after a timeout, ensuring no inconsistent state.
E. Data Inconsistency Handling
● Pre-Commit Phase as a Safe State: The addition of the pre-commit phase ensures that all participants reach a prepared-to-
commit state before the final commit. This minimizes the chance of data inconsistencies, as participants only finalize the
transaction when they are sure of the commit instruction.
● Persistent Logging: Both the coordinator and participants maintain logs of their transaction states. If any node fails, it uses
the log to determine the last known state and resume the transaction without inconsistency.
Advantages of 3PC Over 2PC
● Reduced Blocking: Unlike 2PC, 3PC largely avoids the blocking problem by ensuring all participants enter a safe state (pre-commit) from which they can decide to complete or abort without waiting indefinitely.
● Improved Fault Tolerance: The pre-commit phase allows nodes to determine the final state without needing continuous
coordination, reducing the risk of hanging transactions.
● Higher Reliability: By separating the prepare and commit phases with an intermediate step, 3PC can handle partial failures
and network partitions more gracefully than 2PC.
Sagas
Overview: A long-running transaction model that breaks down a global transaction into a series of smaller, local transactions, each
with a compensating transaction that can be called to undo it if needed.
Process: Each step in the saga is executed independently, with each participant confirming its success. If a step fails, the saga
executes compensating actions to revert the entire sequence.
Strengths: Ideal for systems requiring eventual consistency rather than strict ACID compliance. Offers flexibility and better
performance in distributed systems where strict locking is infeasible.
Drawbacks: Requires careful planning for compensating actions, which may not always be possible in complex workflows.
Use Cases: Common in microservices architectures and cloud-native applications (e.g., travel booking systems where a hotel, flight,
and car rental may all need to be coordinated).
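The saga process described above can be sketched as a loop over (action, compensation) pairs: on failure, the compensations of the already-completed steps run in reverse order. The step and compensation callables are illustrative stand-ins.

```python
# Minimal saga sketch with compensating transactions.

def run_saga(steps):
    """steps: list of (action, compensation) pairs of zero-argument callables."""
    completed = []
    for action, compensation in steps:
        try:
            action()
            completed.append(compensation)
        except Exception:
            for undo in reversed(completed):   # compensate in reverse order
                undo()
            return "compensated"
    return "completed"
```

In the travel-booking example, if the flight step fails after the hotel step succeeded, the hotel's compensating transaction (cancellation) runs, leaving the system eventually consistent rather than locked.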
Advantages of Sagas
1. Scalability: Sagas enable scalability by breaking down a large transaction into a series of smaller, independent transactions,
each handled by separate services.
2. Failure Isolation: With compensating transactions, Sagas handle failures gracefully by rolling back only the affected service’s
state, avoiding a full transaction abort.
3. Asynchronous Execution: Each step in a Saga can be processed asynchronously, allowing services to operate
independently and reducing system bottlenecks.
4. Better Performance: Since Sagas don’t lock resources across services, they reduce the likelihood of blocking, leading to
faster execution and improved overall performance.
5. Fault Tolerance: Sagas can handle intermittent failures, as each service transaction can retry independently until successful or
compensated.
Disadvantages of Sagas
1. Complexity in Design: Implementing Sagas requires careful orchestration of each step and compensating actions, which
adds to system complexity.
2. Data Consistency Challenges: Sagas only guarantee eventual consistency, which may not meet the strict consistency needs
of some applications.
3. Error Handling Complexity: Designing and implementing compensating actions for each service can be complex and may not
be feasible for every operation.
4. Risk of Partial Success: If compensating transactions fail, some changes may persist, leading to partially completed
transactions and potential data inconsistencies.
5. Coordination Overhead: Either a central orchestrator or a robust choreography pattern is required to manage and monitor the
transaction flow, adding operational overhead.
Consensus Based Protocols
Overview: Not a transaction protocol per se, but consensus algorithms are used to ensure a consistent state across distributed
systems, particularly for replicated databases or fault-tolerant distributed systems.
Process: Nodes follow a leader election and voting process to agree on a single source of truth (e.g., which transaction is committed).
Strengths: High fault tolerance, ensures all or none commit across a large number of nodes without relying on a single coordinator.
Drawbacks: Typically slower due to the complex consensus mechanism; may not suit high-speed, real-time transaction systems.
Use Cases: Distributed database replication, blockchain systems, and leader election processes in distributed services.
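At the heart of the voting process described above is a quorum rule: a value is chosen only once a strict majority of nodes has accepted it. The snippet below illustrates just that rule; it is not a full consensus implementation (Paxos and Raft add proposal numbering, leader election, and log replication on top of it).

```python
# The majority-quorum rule underlying consensus protocols.

def majority_accepts(votes):
    """True if a strict majority of the nodes voted to accept."""
    return sum(1 for v in votes if v) > len(votes) // 2
```

With five nodes, a quorum of three means the cluster keeps making progress even if two nodes fail or are partitioned away, which is the source of the fault tolerance listed below.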
Advantages of Consensus-Based Protocols
● Fault Tolerance: Achieves reliability in distributed systems, tolerating node failures and network issues.
● Strong Consistency: Ensures all nodes agree on the same data, critical for consistency in distributed databases and services.
● Resilience to Single Points of Failure: Avoids dependency on a single node, enhancing system resilience.
● High Integrity for Critical Applications: Suitable for environments requiring high integrity, like financial services and
blockchain.
Disadvantages of Consensus-Based Protocols
● High Latency and Resource Usage: The coordination required for consensus increases latency and requires significant
network and compute resources.
● Scalability Limitations: Consensus becomes more complex and slower as the network size grows.
● Complex Implementation: Protocols like Paxos and Raft are challenging to implement correctly and maintain.
● Vulnerability to Network Partitions: Communication failures or network partitions can prevent nodes from reaching
consensus.