DSC4

This document provides an overview of transactions and concurrency control in distributed systems, emphasizing the ACID properties that ensure data integrity. It discusses various concurrency control techniques, including locking protocols, optimistic concurrency control, and timestamp ordering, as well as the challenges and methods for managing distributed transactions and deadlocks. Additionally, it covers transaction recovery mechanisms to maintain consistency across multiple nodes in the event of failures.

Uploaded by

satlalahari2

UNIT 4

Transactions and Concurrency Control – Introduction


In distributed systems, a transaction is a sequence of operations performed as a single
logical unit of work. It must follow the ACID properties:

●​ Atomicity – All or nothing​

●​ Consistency – Preserves data integrity​

●​ Isolation – No interference from other transactions​

●​ Durability – Changes survive failures​

🔁 Why Concurrency Control?


When multiple transactions run at the same time (concurrently), they may conflict and
cause problems like:

●​ Dirty reads​

●​ Lost updates​

●​ Inconsistent data​

Concurrency control ensures that, even with many concurrent users, transactions execute safely
and produce the same result as if they had run one after another (serially).

⚙️ Common Techniques:
●​ Locking protocols (e.g., Two-phase locking)​

●​ Timestamp ordering​

●​ Optimistic concurrency control​

🧠 Example:
In a banking app:

●​ Transaction 1 transfers money​

●​ Transaction 2 reads balance​


Without proper concurrency control, the read might see the balance after the debit but before the credit, and so report a wrong amount.
Transactions
A transaction is a group of operations executed as a single unit. It follows the ACID
properties:

●​ Atomicity: All steps succeed or none do​

●​ Consistency: System remains valid before and after​

●​ Isolation: Transactions don't interfere​

●​ Durability: Changes are permanent after commit​

Example:​
Transferring ₹100 from A to B:

1.​ Deduct from A​

2.​ Add to B​
Both steps must succeed together.​
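
The all-or-nothing behaviour of this transfer can be sketched in a few lines of Python. This is only an illustration under simple assumptions: balances live in an in-memory dictionary, and the names `transfer` and `InsufficientFunds` are hypothetical, not taken from any particular database.

```python
class InsufficientFunds(Exception):
    pass

def transfer(accounts, src, dst, amount):
    """Move `amount` from src to dst; on any error restore the old balances."""
    snapshot = dict(accounts)          # saved state used for rollback
    try:
        if accounts[src] < amount:
            raise InsufficientFunds(src)
        accounts[src] -= amount        # step 1: deduct from A
        accounts[dst] += amount        # step 2: add to B (KeyError if dst is missing)
    except Exception:
        accounts.clear()
        accounts.update(snapshot)      # undo partial work: all or nothing
        raise

accounts = {"A": 500, "B": 200}
transfer(accounts, "A", "B", 100)
```

If step 2 fails after step 1 has already run, the snapshot is restored, so no money disappears from A.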

Nested Transactions
A nested transaction is a transaction within a transaction.

●​ The main (parent) transaction may spawn multiple sub-transactions.​

●​ Sub-transactions can run independently and even in parallel.​

Key Points:

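● If a sub-transaction fails, the parent may retry it or abort.

● A sub-transaction's commit is provisional: its effects become permanent only when the top-level transaction commits.

● In the simplest policy, the parent commits only if all sub-transactions succeed.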

●​ Increases modularity, error recovery, and concurrency.​

Example:​
Booking a holiday package:

●​ Flight booking → Sub-transaction​

●​ Hotel booking → Sub-transaction​

●​ Cab booking → Sub-transaction​


All must succeed; if one fails, the whole booking is rolled back.
Locks in Distributed Systems
Locks are mechanisms used to control access to shared resources and ensure mutual
exclusion during concurrent transactions.

They help prevent problems like:

●​ Lost updates​

●​ Dirty reads​

●​ Inconsistent states​

🔐 Types of Locks:
1.​ Binary Lock (Mutex Lock):​

○​ Only one transaction can hold the lock.​

○​ Either locked (1) or unlocked (0).​

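2. Shared (Read) Lock:

○ Multiple transactions can hold it at the same time to read the data; no transaction can write while it is held.

3. Exclusive (Write) Lock:

○ Only one transaction can hold it; the holder may read and write, and all other readers and writers must wait.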
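
The difference between shared and exclusive locks is often summarised as a compatibility table. A minimal sketch, assuming the mode names "S" (shared) and "X" (exclusive) and a hypothetical helper `can_grant`:

```python
# (mode already held by another transaction, mode requested) -> compatible?
COMPATIBLE = {
    ("S", "S"): True,    # many readers may share the item
    ("S", "X"): False,   # a writer must wait for readers
    ("X", "S"): False,   # readers must wait for the writer
    ("X", "X"): False,   # only one writer at a time
}

def can_grant(requested, held_modes):
    """Grant a lock only if it is compatible with every lock held by others."""
    return all(COMPATIBLE[(held, requested)] for held in held_modes)

readers_share = can_grant("S", ["S", "S"])
writer_blocked = can_grant("X", ["S"])
```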

🔄 Two-Phase Locking (2PL):


Ensures serializability using two phases:

1.​ Growing Phase: Locks are acquired.​

2.​ Shrinking Phase: Locks are released; no new locks can be acquired.​

⚠️ Drawbacks:
●​ Can cause deadlocks (circular wait).​

●​ May reduce concurrency.​


Example:​
If Transaction A locks a bank account to transfer money, no other transaction can modify
that account until A finishes.
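
The two phases can be sketched as follows. This is a simplified illustration, not a production lock manager: it assumes exclusive locks only, a single process, and a hypothetical class name `TwoPhaseTxn`. A refused lock simply returns False instead of blocking.

```python
class TwoPhaseTxn:
    """Tracks one transaction's locks and enforces the two phases."""

    def __init__(self, name, lock_table):
        self.name = name
        self.lock_table = lock_table   # shared dict: item -> owning transaction name
        self.held = set()
        self.shrinking = False         # becomes True at the first release

    def lock(self, item):
        if self.shrinking:
            raise RuntimeError("2PL violation: lock requested after a release")
        owner = self.lock_table.get(item)
        if owner not in (None, self.name):
            return False               # held by another transaction: caller must wait
        self.lock_table[item] = self.name
        self.held.add(item)
        return True

    def unlock(self, item):
        self.shrinking = True          # growing phase is over
        self.held.discard(item)
        if self.lock_table.get(item) == self.name:
            del self.lock_table[item]

table = {}
a, b = TwoPhaseTxn("A", table), TwoPhaseTxn("B", table)
got_a = a.lock("account-1")        # A acquires the lock
got_b = b.lock("account-1")        # B is refused while A holds it
a.unlock("account-1")              # A enters its shrinking phase
got_b_later = b.lock("account-1")  # now B can proceed
```

Once A has released any lock, a further `a.lock(...)` call raises an error, which is exactly the rule that no new locks may be acquired in the shrinking phase.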

Optimistic Concurrency Control (OCC)


Optimistic Concurrency Control assumes that conflicts are rare, so transactions proceed
without locking. Instead of preventing conflicts, OCC detects and resolves them at the end
of the transaction.

⚙️ How It Works:
OCC works in three phases:

1.​ Read Phase:​

○​ Transaction reads data and performs operations locally (without locking).​

2.​ Validation Phase:​

○​ Before committing, the system checks if this transaction conflicts with others.​

3.​ Write Phase:​

○​ If validation passes, changes are written to the database.​

○​ If not, the transaction is rolled back and retried.​

✅ Advantages:
●​ No locks → High concurrency​

●​ Good for read-heavy or low-conflict systems​

⚠️ Disadvantages:
●​ Costly rollbacks if conflicts are frequent​

🧠 Example:
Two users editing the same profile:

●​ Both read the original data.​


●​ Both make changes.​

● At commit, the system validates each transaction: the first commit succeeds, and the
second is rejected because the data it read has since been modified.
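
One common way to implement the validation phase is a version number per data item. The sketch below assumes that approach; the class `VersionedStore` and its methods are hypothetical names used only for illustration.

```python
class VersionedStore:
    """Each key maps to (value, version); the version increments on every commit."""

    def __init__(self):
        self.data = {}

    def read(self, key):
        return self.data.get(key, (None, 0))

    def commit(self, key, new_value, read_version):
        """Validation + write phase: succeed only if nobody committed since our read."""
        _, current_version = self.read(key)
        if current_version != read_version:
            return False                       # validation failed: caller retries
        self.data[key] = (new_value, current_version + 1)
        return True

store = VersionedStore()
store.commit("profile", {"name": "Asha"}, 0)

# Read phase: both users read the same version.
value1, v1 = store.read("profile")
value2, v2 = store.read("profile")

first = store.commit("profile", {"name": "Asha K."}, v1)    # validated and written
second = store.commit("profile", {"name": "A. Kumar"}, v2)  # rejected: stale version
```

The second user would then re-read the profile and retry, which is the rollback cost mentioned under Disadvantages.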

Timestamp Ordering in Concurrency Control


Timestamp Ordering (TO) is a concurrency control method that uses timestamps to
ensure transactions execute in a serial order based on their start time.

Each transaction gets a unique timestamp (TS) when it starts.

⚙️ Basic Idea:
●​ The system enforces a serial order: older transactions (with smaller timestamps)
appear to execute before newer ones.​

●​ Every read and write is allowed only if it doesn't violate this order.​

📘 Key Rules:
For a data item X:

● read_TS(X) = largest timestamp of any transaction that has read X

● write_TS(X) = largest timestamp of any transaction that has written X

When a transaction T wants to:

●​ Read X: Allowed only if T_TS ≥ write_TS(X)​

●​ Write X: Allowed only if T_TS ≥ read_TS(X) and T_TS ≥ write_TS(X)​

If not, the transaction is aborted and restarted with a new timestamp.

✅ Advantages:
●​ Ensures serializability​

●​ No locks → no deadlocks​

⚠️ Disadvantages:
●​ More aborts if conflicts are common​

●​ Not ideal for high-contention environments​

🧠 Example:
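● T1 (TS=5) reads X, so read_TS(X) becomes 5

● T2 (TS=10) tries to write X: allowed, since 10 ≥ read_TS(X) and 10 ≥ write_TS(X); write_TS(X) becomes 10

● T0 (TS=2) tries to write X: aborted, since 2 < read_TS(X) = 5 (T0 is older, but arrives too late)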
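
The read and write rules can be checked with a small sketch of basic timestamp ordering. The names `Item`, `to_read`, `to_write` and `Aborted` are assumptions made for this illustration, and timestamps start at 0 for an item nobody has touched.

```python
class Aborted(Exception):
    pass

class Item:
    def __init__(self, value=None):
        self.value = value
        self.read_ts = 0    # largest timestamp that has read the item
        self.write_ts = 0   # largest timestamp that has written the item

def to_read(item, ts):
    if ts < item.write_ts:                 # a newer transaction already wrote X
        raise Aborted(f"read by TS={ts} rejected")
    item.read_ts = max(item.read_ts, ts)
    return item.value

def to_write(item, ts, value):
    if ts < item.read_ts or ts < item.write_ts:   # a newer transaction already used X
        raise Aborted(f"write by TS={ts} rejected")
    item.value = value
    item.write_ts = ts

x = Item(value=100)
to_read(x, 5)            # T1 reads: read_ts becomes 5
to_write(x, 10, 150)     # T2 writes: allowed, write_ts becomes 10
try:
    to_write(x, 2, 999)  # T0 arrives late: rejected
    t0_aborted = False
except Aborted:
    t0_aborted = True
```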

Distributed Transactions – Introduction


A distributed transaction is a transaction that accesses data stored on multiple
networked systems or databases. It ensures that the transaction is executed consistently
across all involved nodes.

⚙️ Why It’s Needed:


In distributed systems, different parts of data may reside on:

●​ Different servers​

●​ Different locations​

●​ Different databases​

A distributed transaction ensures that all parts either commit or roll back together —
maintaining global consistency.

🧱 Example:
Transferring money:

●​ Debit from Bank A (Server 1)​

●​ Credit to Bank B (Server 2)​


If one fails, both must roll back to maintain correctness.
🔐 Challenges:
●​ Network delays or failures​

●​ Node crashes​

●​ Ensuring atomicity and consistency across systems​

✅ Handled By:
●​ Two-Phase Commit Protocol (2PC)​

●​ Three-Phase Commit (3PC)​

●​ Consensus-based methods (e.g., Paxos, Raft)

Flat Distributed Transactions


A flat transaction is a single, unified transaction that spans multiple distributed systems or
databases.

●​ It starts, performs all operations, and then commits or aborts as one unit.​

●​ All involved systems participate together.​

Limitation: If one part fails, the entire transaction is rolled back.

Example:​
Booking a flight and hotel together from different servers in a single transaction.

Nested Distributed Transactions


A nested transaction is a hierarchical structure of one main (parent) transaction and
multiple sub-transactions.

● Each sub-transaction can commit provisionally on its own server; the commit becomes final
only when the parent commits, which typically requires all children to succeed.

●​ Supports modularity, partial recovery, and parallelism.​

Example:​
Online order system:

●​ Parent transaction: Place order​

○​ Sub 1: Deduct inventory (Warehouse server)​


○​ Sub 2: Process payment (Bank server)​

○​ Sub 3: Update user history (User DB server)

Atomic Commit Protocols


In distributed transactions, atomic commit protocols ensure that all participating
systems either commit or abort a transaction together, maintaining consistency across all
nodes.

They are used to coordinate agreement among multiple databases in a distributed system.

✅ 1. Two-Phase Commit (2PC)


The most commonly used atomic commit protocol. It works in two phases:

🔸 Phase 1: Prepare
●​ Coordinator sends "prepare to commit" to all participants.​

●​ Each participant replies with "Yes" (vote to commit) or "No" (abort).​

🔸 Phase 2: Commit/Abort
●​ If all reply "Yes", the coordinator sends commit.​

●​ If any reply "No", it sends abort to all.​

Drawback: If the coordinator crashes after participants have voted "Yes" but before they
receive the decision, they must hold their locks and wait, so 2PC can block.
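
The message flow of the two phases can be sketched as below. This is a single-process illustration with no real networking, logging, or timeouts; the class `Participant` and the function `two_phase_commit` are hypothetical names.

```python
class Participant:
    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "INIT"

    def prepare(self):
        # Phase 1: vote. A real participant would force a log record first.
        self.state = "READY" if self.can_commit else "ABORTED"
        return self.can_commit

    def finish(self, decision):
        # Phase 2: apply the coordinator's decision.
        self.state = "COMMITTED" if decision == "COMMIT" else "ABORTED"

def two_phase_commit(participants):
    votes = [p.prepare() for p in participants]          # phase 1: collect votes
    decision = "COMMIT" if all(votes) else "ABORT"
    for p in participants:                               # phase 2: broadcast decision
        p.finish(decision)
    return decision

group_ok = [Participant("db1"), Participant("db2")]
group_bad = [Participant("db1"), Participant("db2", can_commit=False)]
ok = two_phase_commit(group_ok)
bad = two_phase_commit(group_bad)
```

A single "No" vote is enough to abort every participant, including those that voted "Yes".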

✅ 2. Three-Phase Commit (3PC)


Improves 2PC by adding an extra phase to avoid blocking.

Phases:

1.​ CanCommit: Coordinator asks if participants can commit.​

2.​ PreCommit: If all say yes, coordinator sends "prepare to commit".​

3.​ DoCommit: Then sends final commit.​

Advantage: Non-blocking when sites fail by crashing and the network does not partition.


Trade-off: More complex and involves more messages.
Concurrency Control in Distributed Transactions
In distributed systems, concurrency control ensures that multiple transactions executing
across different nodes do not interfere with each other and maintain consistency, just like
in a single centralized system.

✅ Goal: Ensure serializability, i.e., the outcome should be as if transactions ran one after
another, even if they run in parallel across nodes.

⚙️ Challenges:
●​ Transactions span multiple systems​

●​ Delays, network failures​

●​ Need to synchronize locks or versions across nodes​

🔐 Common Techniques:
🔸 1. Distributed Two-Phase Locking (2PL)
● A transaction acquires a lock at the node holding each data item before accessing it

●​ Two phases: growing (acquire locks) and shrinking (release locks)​

●​ Ensures serializability but can cause deadlocks​

🔸 2. Timestamp Ordering
●​ Each transaction gets a global timestamp​

●​ Transactions execute in timestamp order​

●​ Helps avoid locks but can lead to more aborts​

🔸 3. Optimistic Concurrency Control (OCC)


●​ Transactions run without restrictions​

●​ Validation occurs at commit time​

●​ Good for low-conflict environments​


🧠 Example:
Two users updating the same bank account across servers:

●​ Without proper control, both may overwrite each other's updates​

●​ With concurrency control, only one update succeeds or they are ordered correctly

Distributed Deadlocks
✅ What is a Deadlock?
A deadlock occurs when two or more transactions are waiting indefinitely for each other to
release resources, and none of them can proceed.

In a distributed system, resources and transactions are spread across multiple nodes,
making deadlock more complex and harder to detect.

🧱 Conditions for Deadlock (Coffman’s Conditions)


A deadlock occurs when all four of these hold:

1.​ Mutual Exclusion: Only one process can use a resource at a time.​

2.​ Hold and Wait: A process holding one resource is waiting for another.​

3.​ No Preemption: Resources cannot be forcibly taken away.​

4.​ Circular Wait: A closed chain of processes exists where each holds a resource the
next wants.​

🧠 Example in a Distributed System:


●​ Transaction T1 (on Node A) locks resource X and requests resource Y​

●​ Transaction T2 (on Node B) locks resource Y and requests resource X​

●​ Both transactions are waiting on each other → Deadlock across nodes.​

This cycle is not visible from a single node, so special techniques are needed.

🔍 Distributed Deadlock Detection


Since deadlocks are harder to observe in a distributed environment, we use:

🔸 1. Centralized Detection
●​ A central coordinator collects the wait-for graphs from all nodes.​

●​ It merges them into a global wait-for graph.​

●​ If it detects a cycle, it identifies a deadlock.​

Pros: Simple and fast​


Cons: Single point of failure, communication overhead
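
Centralized detection amounts to a graph union followed by a cycle search. The sketch below assumes each node reports its wait-for graph as a dictionary mapping a transaction to the set of transactions it waits for; `merge_graphs` and `has_cycle` are hypothetical helper names.

```python
def merge_graphs(local_graphs):
    """Union per-node wait-for edges into one global graph."""
    merged = {}
    for graph in local_graphs:
        for txn, waits_for in graph.items():
            merged.setdefault(txn, set()).update(waits_for)
    return merged

def has_cycle(graph):
    """Depth-first search; reaching a node already on the current path is a cycle."""
    visiting, done = set(), set()

    def visit(node):
        if node in done:
            return False
        if node in visiting:
            return True
        visiting.add(node)
        found = any(visit(nxt) for nxt in graph.get(node, ()))
        visiting.discard(node)
        done.add(node)
        return found

    return any(visit(node) for node in list(graph))

graph_1 = {"T1": {"T2"}}   # one node only knows that T1 waits for T2
graph_2 = {"T2": {"T1"}}   # the other only knows that T2 waits for T1

local_deadlock = has_cycle(graph_1) or has_cycle(graph_2)
global_deadlock = has_cycle(merge_graphs([graph_1, graph_2]))
```

Neither local graph contains a cycle on its own; the deadlock only shows up in the merged graph.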

🔸 2. Distributed Detection
●​ Each node monitors local waits.​

●​ Nodes communicate to share wait dependencies.​

●​ When a cycle is detected across messages, a deadlock is declared.​

Pros: No central point; more fault-tolerant​


Cons: Complex, more messages required

🔸 3. Timeout-Based Detection
●​ If a transaction waits too long, it is assumed to be deadlocked.​

●​ The system automatically aborts the transaction.​

Pros: Simple​
Cons: May abort non-deadlocked transactions (false positives)

🔁 Deadlock Resolution Techniques


Once a deadlock is detected, the system must break the cycle by aborting one or more
transactions.

🔸 Strategies to Choose Victim:


●​ Youngest transaction (by timestamp)​

●​ Transaction with least progress​

●​ Transaction with smallest cost​

●​ Random selection​

🛡️ Deadlock Prevention Protocols


To avoid deadlocks proactively, systems can use timestamp-based schemes:

🔹 1. Wait-Die Scheme
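● An older transaction that requests a lock held by a younger one is allowed to wait.

● A younger transaction that requests a lock held by an older one is aborted (dies) and
restarts later with its original timestamp.

🔹 2. Wound-Wait Scheme
● An older transaction that requests a lock held by a younger one wounds (aborts) the younger holder.

● A younger transaction that requests a lock held by an older one waits.

Both schemes prevent circular wait, because waiting is only ever allowed in one direction of the timestamp order.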
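
Both schemes reduce to a comparison of two timestamps. A minimal sketch, assuming a smaller timestamp means an older transaction; the function names and return strings are illustrative only.

```python
def wait_die(requester_ts, holder_ts):
    """Returns what happens to the requester under wait-die."""
    return "wait" if requester_ts < holder_ts else "die"

def wound_wait(requester_ts, holder_ts):
    """Returns 'wound holder' when the requester is older, else 'wait'."""
    return "wound holder" if requester_ts < holder_ts else "wait"

# Requester and holder timestamps for the two possible cases.
older_requests = (5, 10)     # old transaction asks for a lock held by a young one
younger_requests = (10, 5)   # young transaction asks for a lock held by an old one

outcomes = {
    "wait-die, older requests": wait_die(*older_requests),
    "wait-die, younger requests": wait_die(*younger_requests),
    "wound-wait, older requests": wound_wait(*older_requests),
    "wound-wait, younger requests": wound_wait(*younger_requests),
}
```

In both schemes it is always the younger transaction that gets aborted; they differ only in whether the older one waits or preempts.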

Transaction Recovery
✅ What is Transaction Recovery?
Transaction recovery is the process of restoring a system to a consistent state after a
failure occurs (e.g., system crash, network failure, transaction abort).

In distributed systems, it ensures that all the nodes involved in a transaction:

●​ Either commit together​

●​ Or abort together​

This preserves Atomicity and Consistency, the A and C in ACID.


🧱 Types of Failures That Need Recovery
1.​ Transaction Failures – Logical errors or user aborts​

2.​ System Failures – Crashes or power loss​

3.​ Media Failures – Disk corruption or data loss​

4.​ Communication Failures – Message lost or delayed in the network​

5.​ Coordinator/Participant Crash – In protocols like 2PC​

🔁 Recovery Mechanisms
🔸 1. Logging (Write-Ahead Log - WAL)
●​ Every transaction's changes and status (BEGIN, COMMIT, ABORT) are written to a
log file before being applied to the database.​

● Helps in redoing committed transactions and undoing uncommitted ones during recovery.

Log Records Format:

[Transaction_ID, Action, Data_Item, Old_Value, New_Value]

🔸 2. Checkpoints
●​ A snapshot of the current system state is saved periodically.​

●​ During recovery, the system restores the last checkpoint and applies log records
after that.​

🔸 3. Undo and Redo Operations


●​ Undo: Applied to uncommitted transactions (rollback).​
●​ Redo: Applied to committed transactions (reapply changes).​
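
Undo and redo can be sketched over log records shaped like [Transaction_ID, Action, Data_Item, Old_Value, New_Value]. This is a deliberately simplified illustration: the action name "WRITE", the function `recover`, and the in-memory dictionary standing in for the database are assumptions, and it presumes that committed and uncommitted transactions did not write the same item.

```python
def recover(log, database):
    """Redo committed transactions in log order, then undo the rest in reverse."""
    committed = {txn for txn, action, *_ in log if action == "COMMIT"}

    for txn, action, *rest in log:                 # redo pass (forward)
        if action == "WRITE" and txn in committed:
            item, _old, new = rest
            database[item] = new

    for txn, action, *rest in reversed(log):       # undo pass (backward)
        if action == "WRITE" and txn not in committed:
            item, old, _new = rest
            database[item] = old
    return database

log = [
    ("T1", "BEGIN"),
    ("T1", "WRITE", "A", 500, 400),
    ("T2", "BEGIN"),
    ("T2", "WRITE", "B", 200, 250),
    ("T1", "COMMIT"),
    # crash here: T2 never committed
]
# On-disk state at the crash: T1's write was lost, T2's write had reached disk.
state = recover(log, {"A": 500, "B": 250})
```

T1's committed change is reapplied and T2's uncommitted change is rolled back, which is why both the old and the new value are kept in each log record.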

🔐 Recovery in Distributed Transactions


When a distributed transaction involves multiple nodes, coordinated recovery is essential.

🔸 1. Two-Phase Commit (2PC) Recovery


●​ If a participant crashes after voting YES in 2PC but before receiving commit:​

○​ It checks its log on recovery:​

■ If log has “VOTE-YES” but no COMMIT/ABORT → it contacts the coordinator to learn the final decision.

■​ If it has ABORT → it rolls back.​

■​ If it has COMMIT → it completes the commit.​
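
The three cases above map onto a simple decision function. In this sketch the log is reduced to a list of record names, and `ask_coordinator` is a hypothetical callback standing in for the network request; the final branch (a participant that never voted may abort on its own) is an addition not spelled out in the list above.

```python
def recover_participant(log_records, ask_coordinator):
    """Decide what a restarted 2PC participant does, based on its own log."""
    if "COMMIT" in log_records:
        return "complete commit"
    if "ABORT" in log_records:
        return "roll back"
    if "VOTE-YES" in log_records:
        # In doubt: only the coordinator (or a peer) knows the outcome.
        decision = ask_coordinator()
        return "complete commit" if decision == "COMMIT" else "roll back"
    return "roll back"   # never voted, so aborting unilaterally is safe

in_doubt = recover_participant(["BEGIN", "VOTE-YES"], lambda: "COMMIT")
knows_abort = recover_participant(["BEGIN", "VOTE-YES", "ABORT"], lambda: "COMMIT")
```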

🔸 2. Coordinator Crash Recovery


●​ When the coordinator crashes before sending the final decision:​

○ Participants that voted YES cannot decide on their own and may be blocked until the coordinator recovers.

○​ This is called the blocking problem in 2PC.​

🧠 Example:
Suppose a transaction T1 involves 3 servers:

1.​ All vote YES in phase 1.​

2.​ Coordinator sends COMMIT to 2 servers, but crashes before sending to the 3rd.​

3.​ On recovery:​

○​ The 3rd server checks its logs.​

○ It may contact the other servers to learn the outcome (here, COMMIT) and act accordingly.