MIT 6.824 - Lecture 12 - Distributed Transactions
Distributed databases typically divide their tables into partitions spread across
different servers, which many clients access concurrently. In these databases, a client
transaction often spans several of those servers, as it may need to read from or write
to multiple partitions. A distributed transaction is a database transaction that spans
multiple servers.
A transaction with the correct behaviour must exhibit the following, also known as the
ACID properties:
Atomicity: Either all writes in the transaction succeed or none, even in the presence
of failures.
Consistency: The transaction moves the database from one valid state to another.
Isolation: Concurrent transactions do not interfere with each other; each one appears
to execute alone.
Durability: Once a transaction commits, its writes survive subsequent failures.
These properties are more difficult to guarantee when a transaction involves multiple
servers. For example, the transaction may succeed on some servers and fail on others.
There needs to be a protocol to ensure that the database maintains atomicity even in
that scenario. Also, if several clients are executing transactions concurrently, we must
take extra care to control access to the shared data for those transactions.
This post will focus on how distributed databases provide atomicity through an atomic
commit protocol known as Two-phase commit, and how concurrency control methods
like Two-phase locking help to guarantee serializability.
Note: I've written about some of these topics in other posts on this site, so I'll be
posting links to them if you want more detail.
Table of Contents
Concurrency Control
Pessimistic Concurrency Control
Simple locking
Two-phase locking
Atomic Commit
Two-phase commit
The coordinator is a bottleneck
Two-phase commit and Raft
Further Reading
Concurrency Control #
Concurrency control ensures that concurrent transactions execute correctly, i.e., that
they are serializable. There are two classes of concurrency control for transactions:
Pessimistic: Here, a transaction must place locks on the shared data objects that it
wants to access before doing any actual reading or writing. When another
transaction wants to access any of those records, it must wait for the original
transaction to release those locks.
Optimistic: In this class, transactions read or modify records without placing any
locks on them. However, when it's time to commit the transaction, the system checks
if the reads/writes were serializable, i.e. if the transaction's results are consistent
with a serial order of execution. If not, the database aborts the transaction and
retries it.
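To make the optimistic approach concrete, here is a minimal sketch in Go using
per-key version numbers; the Store and Tx types and their fields are my own
illustrative assumptions, not a real database's API:

```go
package occ

import "sync"

// A sketch of optimistic concurrency control with per-key versions.
// All type and field names here are illustrative assumptions.
type Store struct {
	mu      sync.Mutex
	data    map[string]int
	version map[string]int
}

type Tx struct {
	s      *Store
	reads  map[string]int // key -> version observed at read time
	writes map[string]int // writes buffered until commit
}

// Read records the version of each key it observes so that the
// commit step can later validate the transaction.
func (t *Tx) Read(key string) int {
	t.s.mu.Lock()
	defer t.s.mu.Unlock()
	t.reads[key] = t.s.version[key]
	return t.s.data[key]
}

// Write only buffers the change; nothing is locked.
func (t *Tx) Write(key string, val int) { t.writes[key] = val }

// Commit validates that nothing the transaction read has changed.
// If validation fails, the caller must abort and retry.
func (t *Tx) Commit() bool {
	t.s.mu.Lock()
	defer t.s.mu.Unlock()
	for key, v := range t.reads {
		if t.s.version[key] != v {
			return false // a concurrent commit invalidated a read
		}
	}
	for key, val := range t.writes {
		t.s.data[key] = val
		t.s.version[key]++
	}
	return true
}
```

A transaction whose Commit returns false is simply retried from the start, which is
why optimistic control works best when conflicts are rare.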
Pessimistic Concurrency Control #
There are two pessimistic concurrency control mechanisms highlighted in the lecture
material for ensuring serializable transactions:
Simple locking
Two-phase locking
Simple locking #
In simple locking, each transaction must first acquire a lock for every shared data
object that it intends to read or write before it does any actual reading or writing. It
then releases its locks only after the transaction has committed or aborted.
One downside of this method is that a transaction that only discovers which objects it
needs to read by reading other shared data must lock every object that it might need.
Thus, a transaction may end up locking more data objects than necessary.
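Here is a minimal sketch of simple locking in Go; the Store type, RunTx, and the
sorted lock order are illustrative choices on my part, not the lecture's API. Note how
the transaction must declare every key it might touch before it starts:

```go
package simplelock

import (
	"sort"
	"sync"
)

// A toy key-value store with one lock per key; the names here
// are illustrative, not from the lecture material.
type Store struct {
	locks map[string]*sync.Mutex
	data  map[string]int
}

// RunTx executes fn under simple locking: every key the
// transaction might touch is locked up front, and all locks are
// released only after the transaction has finished.
func (s *Store) RunTx(keys []string, fn func(data map[string]int)) {
	// Lock keys in a fixed order so two transactions that want
	// overlapping sets of keys cannot deadlock.
	sorted := append([]string(nil), keys...)
	sort.Strings(sorted)
	for _, k := range sorted {
		s.locks[k].Lock()
	}
	defer func() {
		for _, k := range sorted {
			s.locks[k].Unlock()
		}
	}()
	fn(s.data) // all reads and writes happen while every lock is held
}
```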
Two-phase locking #
Two-phase locking (or 2PL) differs from simple locking in that a transaction acquires
locks incrementally, as needed. It works as follows: before a transaction reads or
writes a shared data object, it must first acquire that object's lock, and it releases
all of its locks only after it has committed or aborted. Because locks are acquired
incrementally, two transactions can end up waiting on each other's locks, i.e. a
deadlock. Consider two transactions that access objects x and y in opposite orders:

T1        T2
get(x)    get(y)
get(y)    get(x)

Here, T1 holds x's lock and waits for y's, while T2 holds y's lock and waits for x's,
so neither can make progress. The system must be able to detect such cycles, or
specify a lock timeout after which it aborts a blocked transaction. This is an issue
even for single-node databases, as long as multiple clients can access the database at
the same time. This post I wrote
earlier goes into more detail about 2PL and transaction isolation levels.
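Below is a minimal sketch of 2PL in Go; the LockManager and Tx types are illustrative
assumptions. Locks are acquired lazily on first use (the "growing" phase) and released
only when the transaction finishes (the "shrinking" phase):

```go
package twopl

import "sync"

// A minimal sketch of two-phase locking; the LockManager and Tx
// types are illustrative assumptions, not the lecture's API.
type LockManager struct {
	mu    sync.Mutex
	locks map[string]*sync.Mutex
}

type Tx struct {
	lm   *LockManager
	held map[string]bool // locks this transaction currently holds
}

// lock acquires a key's lock the first time the transaction
// touches that key: the "growing" phase.
func (t *Tx) lock(key string) {
	if t.held[key] {
		return
	}
	t.lm.mu.Lock()
	m, ok := t.lm.locks[key]
	if !ok {
		m = &sync.Mutex{}
		t.lm.locks[key] = m
	}
	t.lm.mu.Unlock()
	m.Lock() // may block forever on deadlock; a real system detects this or times out
	t.held[key] = true
}

// Get reads a key, taking its lock on first use.
func (t *Tx) Get(key string, data map[string]int) int {
	t.lock(key)
	return data[key]
}

// Finish (on commit or abort) releases every lock at once: the
// "shrinking" phase. No lock is released before the transaction ends.
func (t *Tx) Finish() {
	for key := range t.held {
		t.lm.locks[key].Unlock()
	}
	t.held = nil
}
```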
Atomic Commit #
So far, we have discussed how concurrency control methods ensure that transactions
are serializable. The next challenge, however, is peculiar to distributed
transactions. As stated earlier, the outcome on the individual servers involved in a
distributed transaction may vary if one or more servers fail. To guarantee the atomicity
property of transactions, we must take extra care to ensure that all the servers
involved come to the same decision on the transaction outcome.
Two-phase commit #
In two-phase commit (2PC), one of the servers acts as the transaction coordinator.
All the other servers involved in the transaction are called participants.
The transaction coordinator first delegates the writes in the transaction to the
participants. Each participant creates a nested transaction from the original one,
executes the operations which may require holding locks, and sends an
acknowledgement to the coordinator.
When the coordinator receives the acknowledgement messages, it begins the first
phase of the protocol. In this phase, the coordinator sends PREPARE messages to
the participants. Each participant then responds to the coordinator by telling it
whether it is PREPARED to commit or abort the transaction, based on the outcome of
the nested transaction.
If any of the participants responds with an abort message, the coordinator decides
to abort the whole transaction. The coordinator commits a transaction only if all the
participants are ready to commit. The second phase starts when the coordinator
creates a COMMITTED or ABORTED record for the overall transaction based on
these conditions, and stores that outcome in its durable log. It then broadcasts that
decision to the participant nodes as the outcome of the overall transaction.
Note that once a participant promises that it can commit the transaction, it must fulfil
that promise regardless of failures. To do this, the participant stores its PREPARED
state in a durable log before responding to the coordinator, so that it can read from
that log and restore its state on recovery.
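The coordinator's side of the protocol can be sketched in Go as follows; the
Participant interface, the log callback, and the vote names are my own assumptions
rather than a prescribed API:

```go
package twopc

// A sketch of the coordinator side of two-phase commit. The
// Participant interface, the log callback, and the record names
// are illustrative assumptions, not the lecture's exact API.

type Vote int

const (
	VotePrepared Vote = iota // the participant promises it can commit
	VoteAbort
)

type Participant interface {
	Prepare(txID int) Vote        // phase one: PREPARE
	Decide(txID int, commit bool) // phase two: COMMIT or ABORT
}

type Coordinator struct {
	log func(record string) // must persist the record durably
}

// RunTx drives one distributed transaction through both phases
// and reports whether it committed.
func (c *Coordinator) RunTx(txID int, parts []Participant) bool {
	// Phase one: ask every participant whether it can commit.
	commit := true
	for _, p := range parts {
		if p.Prepare(txID) != VotePrepared {
			commit = false // a single abort vote aborts the whole transaction
			break
		}
	}

	// Durably record the outcome BEFORE telling anyone, so a
	// recovering coordinator can re-broadcast the same decision.
	if commit {
		c.log("COMMITTED")
	} else {
		c.log("ABORTED")
	}

	// Phase two: broadcast the decision. Participants that voted
	// PREPARED must obey it, even across their own crashes.
	for _, p := range parts {
		p.Decide(txID, commit)
	}
	return commit
}
```

In a real system, the Prepare and Decide calls would be RPCs with timeouts and
retries, and each participant would write its own durable log record before voting
PREPARED.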
The coordinator is a bottleneck #
The major downside of the two-phase commit protocol is that if the coordinator fails
before it can broadcast the outcome to the participants, the participants may get stuck in a
waiting state. A participant that has indicated that it's prepared to commit cannot
decide the outcome of the transaction on its own, as another participant may be
prepared to abort. Also, a stuck participant cannot decide on its own to abort the
transaction, because the coordinator might have sent a COMMIT message to another
participant before it crashed.
This is not ideal because the participants may hold locks on shared objects while they
are stuck in the waiting state, and thus may prevent other transactions from
progressing.
We can improve the fault tolerance of 2PC by integrating it with a consensus algorithm,
as discussed next.
Two-phase commit and Raft #
Consensus algorithms like Raft solve a different problem from atomic commit
protocols. We use Raft to get high availability by replicating the data on multiple
servers, where all servers do the same thing. This differs from two-phase commit in
that 2PC does not help with availability, and all the participant servers here perform
different operations. 2PC also requires that all the servers do their part, unlike
Raft, which only needs a majority.
However, we can combine the two-phase commit protocol with a consensus algorithm
as shown below.
In Figure 2, the transaction coordinator (Tc) and the participants (A and B) each form a
Raft group with three replicas. We can then perform 2PC among the leaders of each
Raft group. This way, we can tolerate failures and still make progress with the system,
as Raft will automatically elect a new leader. The next lecture will be on Google
Spanner, which combines 2PC with the Paxos algorithm.
Further Reading #
Chapter 9 of Principles of Computer System Design: An Introduction, Part I. by
Jerome H. Saltzer and M. Frans Kaashoek
Chapters 7 and 9 of Designing Data-Intensive Applications by Martin Kleppmann.
Lecture 12: Distributed Transactions - MIT 6.824 Lecture Notes.
I've gone into more detail about 2PC in another post.