Module 4

The document discusses distributed transactions, which are operations executed across multiple systems to maintain data integrity and consistency. It covers the ACID properties essential for transactions, the differences between flat and nested transactions, and the importance of transaction recovery techniques in distributed systems. Additionally, it addresses challenges like deadlocks and various recovery strategies, including logging and consensus algorithms.

Distributed Transactions, Mutual Exclusion, Deadlocks

By, Prof. Ankita Mandore
CSE (Data Science), DSCE
Distributed Transactions
 A transaction is a group of operations, executed as an independent unit, that
retrieves or updates data.
 A distributed transaction spans multiple systems and ensures that all of its
operations either succeed or fail together. This is crucial for maintaining data
integrity and consistency across diverse, geographically separated resources.
Introduction
 A distributed transaction is a logical unit of work on a database that accesses
objects managed by multiple servers.
 A distributed transaction is composed of several sub-transactions, each running
at a different site. Each database manager can decide to abort.
 When the transaction is submitted, the transaction managers at the different
sites coordinate their activity.
ACID Properties
 Consistency:
 Ensuring that every transaction takes the data from one valid state to another.
 All defined constraints are satisfied.
 Data integrity is maintained.
 Isolation:
 Guaranteeing that concurrent transactions do not interfere with each other,
preserving data integrity and preventing conflicts.
 Durability:
 Confirming that committed transactions persist even in the event of system
failures, ensuring reliability.
 Atomicity:
 Ensuring that either all operations within a transaction are completed successfully
or none of them are, avoiding partial updates that could lead to inconsistencies.
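As a small illustration of atomicity and consistency, the sketch below runs a transfer against a single in-memory SQLite database. It is not a distributed transaction; the table name `accounts` and the helpers `make_bank`, `transfer`, and `balances` are assumptions made for this example only.

```python
import sqlite3

def make_bank():
    """Create a hypothetical in-memory bank with two accounts."""
    conn = sqlite3.connect(":memory:")
    with conn:
        conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
        conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("A", 100), ("B", 50)])
    return conn

def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one all-or-nothing unit."""
    # `with conn:` commits if the block succeeds and rolls back if it raises.
    with conn:
        conn.execute("UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src))
        conn.execute("UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst))
        (balance,) = conn.execute(
            "SELECT balance FROM accounts WHERE name = ?", (src,)
        ).fetchone()
        if balance < 0:
            # Consistency constraint violated: abort, undoing both updates.
            raise ValueError("insufficient funds")

def balances(conn):
    """Return the current balances as a dict."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))
```

A failed transfer leaves both balances exactly as they were, which is the "no partial updates" guarantee described above.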
Working of Distributed Transactions
 Distributed transactions work the same way as simple transactions, but the
challenge is to implement them across multiple databases.
 Using multiple nodes or database systems introduces problems such as network
failures and the need to keep additional hardware servers and database servers
available.
 For a distributed transaction to succeed, the available resources are
coordinated by transaction managers.
Flat & Nested Distributed Transactions
Flat Distributed Transactions
 A flat transaction has a single initiating point (Begin) and a single end point
(Commit or Abort).
 Flat transactions are usually very simple and are generally used for short
activities rather than larger ones.
 A client makes requests to multiple servers in a flat transaction. Transaction T,
for example, is a flat transaction that performs operations on objects in servers
X, Y, and Z.
 A flat client transaction completes each of its requests before going on to the
next one. As a result, each transaction accesses the servers' objects sequentially.
 When servers use locking, a transaction can only wait for one object at a time.
Limitations of a Flat Transaction
 All work is lost in the event of a crash.
 Only one DBMS may be used at a time.
 No partial rollback is possible.
Nested Distributed Transactions
 A transaction that includes other transactions between its initiating point and
its end point is known as a nested transaction. The transactions nested inside it
are called sub-transactions.
 The top-level transaction in a nested transaction can open sub-transactions, and
each sub-transaction can open further sub-transactions down to any depth of
nesting.
 A client's transaction T opens two sub-transactions, T1 and T2, which access
objects on servers X and Y, as shown in the accompanying diagram.
 The sub-transactions T1 and T2 in turn open T1.1, T1.2, T2.1, and T2.2, which
access objects on servers M, N, and P.
 Sub-transactions at the same level can run concurrently. T1 and T2 invoke objects
on different servers, so they can run in parallel.
 The four sub-transactions T1.1, T1.2, T2.1, and T2.2 can also run in parallel.
Advantages of Nested Distributed Transactions
 Performance is higher than for a single transaction in which four operations
are invoked one after the other in sequence.
 A sub-transaction at one level may run concurrently with other sub-transactions
at the same level.
 Sub-transactions can commit or abort independently.
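The two advantages above, concurrency among siblings and independent commit or abort, can be sketched with a thread pool. This is only an illustration: `run_subtransaction` and `run_nested` are hypothetical names, and each sub-transaction is modelled as a plain Python callable rather than a call to a real server.

```python
from concurrent.futures import ThreadPoolExecutor

def run_subtransaction(name, work):
    """Run one sub-transaction; it decides on its own to commit provisionally or abort."""
    try:
        work()
        return name, "provisional"
    except Exception:
        return name, "aborted"

def run_nested(subtransactions):
    """Run sibling sub-transactions concurrently and collect their outcomes.

    `subtransactions` maps a name such as "T1" to a callable doing its work.
    """
    workers = max(1, len(subtransactions))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [
            pool.submit(run_subtransaction, name, work)
            for name, work in subtransactions.items()
        ]
        return dict(f.result() for f in futures)
```

One sub-transaction aborting does not stop its siblings; the parent sees every outcome and can then decide what to do.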
[Figure: Nested banking transaction]
[Figure: The coordinator of a flat distributed transaction]
[Figure: A flat banking transaction]
Two-Phase Commit Protocol for Nested Transactions
 Consider a top-level transaction T with sub-transactions T1, T2, T11, T12, T21,
and T22.
 A sub-transaction starts after its parent and finishes before it.
 When a sub-transaction completes, it makes an independent decision either to
commit provisionally or to abort.
 A provisional commit is not the same as being prepared: it is a local decision
and is not backed up on permanent storage.
 If the server crashes subsequently, its replacement will not be able to carry out a
provisional commit.
 A two-phase commit protocol is therefore needed for nested transactions.
Operations in Coordinator for Nested Transactions
 openSubTransaction(trans) -> subTrans
 Opens a new sub-transaction whose parent is trans and returns a unique
sub-transaction identifier.
 getStatus(trans) -> committed, aborted, provisional
 Asks the coordinator to report on the status of the transaction trans.
 Returns a value representing one of the following: committed, aborted, provisional.
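A minimal in-memory sketch of these two operations might look as follows. The class layout, the integer identifiers, the extra `open_transaction` and `set_status` helpers, and the initial "active" status are assumptions added for the example; only the two operations and the three status values come from the slide.

```python
import itertools

class Coordinator:
    """Toy coordinator that tracks nested transactions (hypothetical structure)."""

    def __init__(self):
        self._next_id = itertools.count(1)
        self.parent = {}    # transaction id -> parent id (None for top level)
        self.children = {}  # transaction id -> list of sub-transaction ids
        self.status = {}    # transaction id -> status string

    def _new(self, parent):
        trans = next(self._next_id)
        self.parent[trans] = parent
        self.children[trans] = []
        # Assumption: "active" marks a transaction that has not decided yet.
        self.status[trans] = "active"
        return trans

    def open_transaction(self):
        """Open a top-level transaction (helper, not listed on the slide)."""
        return self._new(None)

    def open_sub_transaction(self, trans):
        """openSubTransaction(trans): return a unique id whose parent is trans."""
        sub = self._new(trans)
        self.children[trans].append(sub)
        return sub

    def set_status(self, trans, status):
        """Record a decision (helper, not listed on the slide)."""
        if status not in ("committed", "aborted", "provisional"):
            raise ValueError(status)
        self.status[trans] = status

    def get_status(self, trans):
        """getStatus(trans): report the recorded status of trans."""
        return self.status[trans]
```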
[Figure: Transaction T decides whether to commit]
Information Held by Coordinators of Nested Transactions
 When a top-level transaction commits, it carries out a two-phase commit (2PC).
 Each coordinator has a list of its sub-transactions.
 At provisional commit, a sub-transaction reports its status and the status of
its descendants to its parent.
 If a sub-transaction aborts, it tells its parent.
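Given the statuses reported up the tree, the top-level transaction can work out which sub-transactions are eligible to commit: those that committed provisionally and have no aborted ancestor. The sketch below shows one way to compute that; the function name `committable` and the dictionary layout are assumptions, and the statuses in the usage note are made-up sample data.

```python
def committable(status, parent):
    """Return the sub-transactions that may take part in the final commit.

    status: sub-transaction name -> "provisional" or "aborted"
    parent: sub-transaction name -> parent name (the top level has no entry)
    """
    eligible = []
    for trans, state in status.items():
        if state != "provisional":
            continue
        # Walk up the tree; an aborted ancestor makes this sub-transaction an orphan.
        ancestor = parent.get(trans)
        orphan = False
        while ancestor is not None:
            if status.get(ancestor) == "aborted":
                orphan = True
                break
            ancestor = parent.get(ancestor)
        if not orphan:
            eligible.append(trans)
    return sorted(eligible)
```

For example, if T2 aborted while its child T21 committed provisionally, T21 is excluded because its parent's abort overrides it.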
Transaction Recovery in Distributed Systems
 In distributed systems, ensuring the reliable recovery of transactions after
failures is crucial.
 This section covers essential recovery techniques, including checkpointing,
logging, and commit protocols, and the challenges of maintaining ACID properties
and consistency across nodes to ensure system resilience and data integrity.
Importance of Transaction Recovery in Distributed Systems
 Data Consistency: Distributed systems often involve multiple nodes handling
transactions simultaneously. Transaction recovery ensures that, despite failures or
disruptions, all nodes maintain a consistent state, preserving the integrity of the
data.
 Fault Tolerance: Failures—whether due to network issues, hardware malfunctions,
or software bugs—are inevitable. Effective transaction recovery mechanisms help
the system recover gracefully from these failures, preventing data loss or
corruption.
 System Reliability: Reliable transaction recovery enhances overall system
robustness, ensuring that applications and services remain operational even when
individual components fail. This is critical for maintaining user trust and system
uptime.
 Atomicity of Transactions: Transactions must be atomic, meaning they either
complete entirely or not at all. Recovery mechanisms ensure that partial or failed
transactions are rolled back, avoiding inconsistencies and incomplete operations.
 Durability: Once a transaction is committed, its effects must persist even in the
face of failures. Transaction recovery mechanisms ensure that committed changes
are not lost, supporting the durability property of transactions.
Challenges in Distributed Transaction Recovery
 Network Failures
 Issue: Network partitions or failures can disrupt communication between nodes involved
in a distributed transaction.
 Challenge: Ensuring that transactions remain consistent and recoverable despite
communication breakdowns. This often requires sophisticated protocols to handle retries
and eventual reconnection.
 Partial Failures
 Issue: Some nodes may fail while others continue to operate. This can lead to
inconsistencies if a transaction is partially completed.
 Challenge: Coordinating recovery so that all nodes reach a consistent state, either by
completing or rolling back the transaction. This involves complex recovery protocols to
handle node-specific failures and rollbacks.
 Commit Protocols
 Issue: Distributed commit protocols like Two-Phase Commit (2PC) and Three-Phase
Commit (3PC) are designed to ensure that all nodes agree on a transaction outcome, but
they can be complex and prone to issues such as blocking.
 Challenge: Implementing these protocols requires careful handling of coordinator and
participant states to prevent issues like blocking in 2PC or increased overhead in 3PC.
 Consistency Models
 Issue: Different distributed systems may use different consistency models (e.g.,
strong, eventual, causal).
 Challenge: Designing recovery mechanisms that align with the consistency model
of the system, ensuring that all nodes eventually converge to the same state.
 Concurrency Control
 Issue: Multiple transactions may be occurring simultaneously, leading to potential
conflicts and inconsistencies.
 Challenge: Implementing effective concurrency control mechanisms that handle
conflicting transactions and ensure that all operations comply with the ACID
properties, particularly isolation.
Recovery Techniques in Distributed Transactions
1. Checkpointing
 Checkpointing involves periodically saving the state of a system to stable storage to
facilitate recovery in the event of a failure.
 What is Checkpointing?: The process of recording the state of a system at specific
points in time.
 Types:
 Global Checkpointing: Captures the state of all nodes in a distributed system to ensure a
consistent recovery point.
 Local Checkpointing: Captures the state of individual nodes.
 Benefits: Reduces the amount of log data that needs to be processed during
recovery, speeding up the recovery process.
 Challenges: Requires coordination across nodes to ensure that the checkpoint is
consistent across the entire system.
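A local checkpoint can be sketched as writing a node's state to stable storage in a way that never leaves a half-written file behind. The example below is a minimal sketch for a single node, assuming the state is JSON-serializable; the function names are hypothetical, and coordinating a consistent global checkpoint across nodes is not shown.

```python
import json
import os
import tempfile

def save_checkpoint(state, path):
    """Write `state` to `path` so that a crash leaves either the old or the new file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
        f.flush()
        os.fsync(f.fileno())      # force the data to disk before the rename
    os.replace(tmp_path, path)    # atomic swap of the checkpoint file

def load_checkpoint(path, default=None):
    """Return the last saved state, or `default` if no checkpoint exists."""
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return default
```

Writing to a temporary file and renaming it is a common design choice here: recovery never has to deal with a partially written checkpoint.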
2. Logging
 Logging involves recording changes made by transactions to support recovery in
case of failures. Logs are used to reconstruct the state of the system.
 Write-Ahead Logging (WAL): Records a transaction's changes in the log before
applying them to the database. This ensures that if a failure occurs, the changes
can be replayed or rolled back.
 Types of Logs:
 Redo Logs: Used to reapply changes that were committed but not yet reflected in the
system.
 Undo Logs: Used to roll back changes made by a transaction that failed or was aborted.
 Benefits: Provides a way to redo committed transactions and undo uncommitted
ones after a failure.
 Challenges: Managing log size and ensuring that logs are not lost or corrupted.
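The redo and undo roles of the log can be illustrated with a small recovery routine over an in-memory log. The record layout (`("write", txn, key, old, new)` and `("commit", txn)`) is an assumption for this sketch, and it also assumes that an uncommitted write is never overwritten by another transaction, as strict locking would guarantee.

```python
def recover(log):
    """Rebuild the database state from a write-ahead log after a crash.

    Hypothetical record formats:
      ("write", txn, key, old_value, new_value)
      ("commit", txn)
    """
    committed = {rec[1] for rec in log if rec[0] == "commit"}
    db = {}

    # Redo pass: reapply every logged write in log order.
    for rec in log:
        if rec[0] == "write":
            _, _txn, key, _old, new = rec
            db[key] = new

    # Undo pass: roll back writes of uncommitted transactions, newest first.
    for rec in reversed(log):
        if rec[0] == "write" and rec[1] not in committed:
            _, _txn, key, old, _new = rec
            if old is None:
                db.pop(key, None)   # the key did not exist before this write
            else:
                db[key] = old
    return db
```

In the sample log used to check this sketch, T1 committed and T2 did not, so only T1's write survives recovery.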
3. Two-Phase Commit (2PC)
 2PC is a protocol used to ensure that all nodes in a distributed transaction
agree on whether to commit or abort.
 Protocol Overview:
 Prepare Phase: The coordinator sends a prepare request to all participants, who
respond with a vote (commit or abort).
 Commit Phase: If all participants vote to commit, the coordinator sends a commit
message. If any participant votes to abort, the coordinator sends an abort
message.
 Benefits: Ensures atomicity across distributed transactions.
 Challenges: Susceptible to blocking: if the coordinator fails after participants
have voted to commit, they must wait for it to recover before learning the
outcome. Recovery can be complex.
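The two phases can be sketched with in-process objects standing in for remote participants. The class and method names are hypothetical, and the sketch leaves out logging, timeouts, and coordinator failure, which are where the real difficulty of 2PC lies.

```python
class Participant:
    """Stand-in for a remote participant (hypothetical interface)."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "active"

    def prepare(self):
        """Phase 1: vote. True means 'commit', False means 'abort'."""
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def abort(self):
        self.state = "aborted"

def two_phase_commit(participants):
    """Run both phases and return the coordinator's decision."""
    # Prepare phase: collect a vote from every participant.
    votes = []
    for p in participants:
        try:
            votes.append(p.prepare())
        except Exception:
            votes.append(False)  # an unreachable participant counts as an abort vote

    # Commit phase: commit only if every vote was 'commit'.
    decision = "commit" if all(votes) else "abort"
    for p in participants:
        if decision == "commit":
            p.commit()
        else:
            p.abort()
    return decision
```

A single abort vote is enough to abort the transaction everywhere, which is how the protocol keeps the outcome atomic across nodes.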
4. Three-Phase Commit (3PC)
 3PC extends 2PC to reduce the risk of blocking by adding an additional phase.
 Protocol Overview:
 Prepare Phase: Similar to 2PC, participants respond with a readiness vote.
 Pre-Commit Phase: The coordinator asks participants to prepare for commit.
Participants respond with a readiness confirmation.
 Commit Phase: The coordinator sends a commit request if all participants confirm
readiness.
 Benefits: Reduces the likelihood of blocking compared to 2PC.
 Challenges: More complex than 2PC and introduces additional communication
overhead.
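The extra phase can be sketched in the same in-process style. The names `can_commit`, `pre_commit`, and `do_commit` are assumptions for this example, and the timeouts that participants use to make progress when the coordinator fails are omitted, so the sketch shows only the message order.

```python
class Participant3PC:
    """Stand-in for a remote participant in 3PC (hypothetical interface)."""

    def __init__(self, name, ready=True):
        self.name = name
        self.ready = ready
        self.state = "initial"

    def can_commit(self):
        """Prepare phase: readiness vote."""
        self.state = "ready" if self.ready else "aborted"
        return self.ready

    def pre_commit(self):
        """Pre-commit phase: acknowledge that commit is about to happen."""
        self.state = "precommitted"
        return True

    def do_commit(self):
        self.state = "committed"

    def abort(self):
        self.state = "aborted"

def three_phase_commit(participants):
    """Run the three phases in order and return the decision."""
    def abort_all():
        for p in participants:
            p.abort()
        return "abort"

    # Phase 1: ask every participant whether it is ready.
    if not all([p.can_commit() for p in participants]):
        return abort_all()
    # Phase 2: tell everyone the outcome will be commit and collect confirmations.
    if not all([p.pre_commit() for p in participants]):
        return abort_all()
    # Phase 3: commit everywhere.
    for p in participants:
        p.do_commit()
    return "commit"
```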
5. Recovery in Replicated Systems
 In systems with replication, maintaining consistency across replicas is crucial.
 Types of Replication:
 Master-Slave Replication: A single master node handles writes and propagates changes to slave
nodes.
 Multi-Master Replication: Multiple nodes handle writes, and changes are synchronized across
all nodes.
 Recovery Strategies:
 Conflict Resolution: Ensuring that conflicting changes are resolved consistently across replicas.
 Consistency Protocols: Using protocols to ensure that replicas converge to a consistent state
after a failure.
 Benefits: Provides fault tolerance and improves system availability.
 Challenges: Managing consistency and handling conflicts can be complex, especially in multi-
master setups.
6. Distributed Consensus Algorithms
 Consensus algorithms help nodes agree on a single value or decision, such as
the outcome of a transaction.
 Examples:
 Paxos: A protocol for achieving consensus among a group of nodes.
 Raft: A consensus algorithm that is designed to be easier to understand and
implement than Paxos.
 Benefits: Ensures agreement on transaction outcomes and system state.
 Challenges: Achieving consensus in the presence of node failures and network
partitions can be challenging.
Distributed Deadlocks
 Deadlock is a situation where a set of processes are blocked because each
process is holding a resource and waiting for another resource acquired by
some other process.
Types of Deadlock
 Resource Deadlocks: These occur when processes are unable to proceed
because each is holding a resource and waiting for another resource acquired
by some other process.
 Communication Deadlocks: In distributed systems, these occur when each process
in a set is blocked waiting for a message from another process in the set, so
none of them can ever send one.
 Livelocks: A livelock is closely related to deadlock: two or more processes keep
responding to each other's actions and changing state without making progress.
Conditions for Deadlock
 1. Mutual Exclusion: At least one resource in the system can be used by only
one process at a time.
 2. Hold and Wait: There must be a process that holds a resource while
waiting for another resource acquired by some other process.
 3. Non-Preemption: Resources cannot be taken forcibly from one process and
handed over to another. A resource is released only by the process holding it.
 4. Circular Wait: The processes wait for resources in a cyclic fashion, with the
last process waiting for a resource held by the first.
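The circular-wait condition can be checked by looking for a cycle in a wait-for graph, where an edge from P to Q means "P is waiting for a resource held by Q". The wait-for graph is a standard representation that the slides do not introduce, so treat it as an assumption of this sketch; `find_deadlock` is a hypothetical name, and the sketch assumes one node can see the whole graph, which a distributed system would first have to assemble.

```python
def find_deadlock(wait_for):
    """Return a list of processes forming a cycle in the wait-for graph, or None.

    wait_for: process -> iterable of processes it is waiting for
    """
    WHITE, GREY, BLACK = 0, 1, 2   # unvisited, on the current path, finished
    color = {}

    def visit(process, path):
        color[process] = GREY
        path.append(process)
        for other in wait_for.get(process, ()):
            if color.get(other, WHITE) == GREY:
                # `other` is already on the current path: a circular wait.
                return path[path.index(other):]
            if color.get(other, WHITE) == WHITE:
                cycle = visit(other, path)
                if cycle:
                    return cycle
        path.pop()
        color[process] = BLACK
        return None

    for process in list(wait_for):
        if color.get(process, WHITE) == WHITE:
            cycle = visit(process, [])
            if cycle:
                return cycle
    return None
```

If P1 waits for P2, P2 for P3, and P3 for P1, the function reports those three processes; removing any one edge breaks the cycle and it returns None.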
Distributed Deadlock Handling
