Lecture 9 Distributed Transactions

The document discusses distributed transactions and how to ensure they are atomic and isolated despite failures. It covers concepts like serializability, two-phase locking, and two-phase commit protocols to provide ACID properties for distributed transactions operating on data partitioned across multiple servers.

Uploaded by

MR. SHREYASH THAKRE

Motivation

• Lots of data records, sharded on multiple servers, lots of clients

• Client application actions often involve multiple reads and writes
o Bank transfer: debit and credit
• We'd like to hide interleaving and failure from application writers
• This is a traditional database concern
o But the ideas are used in many distributed systems
Example
• x and y are bank balances
o Records in database tables
• x and y are on different servers (maybe at different banks)
• x and y start out as $10
• T1 and T2 are transactions

T1: transfer $1 from y to x

T2: audit, to check that no money is lost
T1:                  T2:
begin_transaction    begin_transaction
add(x, 1)            tmp1 = get(x)
add(y, -1)           tmp2 = get(y)
end_transaction      print tmp1, tmp2
                     end_transaction
Correct behavior for a transaction
• Usually called "ACID"
o Atomic
§ All writes or none, despite failures (abortability)
o Consistent
§ Obeys application-specific invariants
o Isolated
§ No interference between transactions (serializable)
o Durable
§ Committed writes are permanent
• We're interested in ACID for distributed transactions
o With data sharded over multiple servers

Serializability
• Execution of some concurrent transactions yields results
o "Results" means both the changes to the DB (T1) and the output (T2)
• The results are serializable if:
o There exists a serial execution order of the transactions
o That yields the same results as the actual execution
§ Serial means one at a time
• No parallel execution
§ This definition should remind you of linearizability
• You can test whether an execution's result is serializable by looking
for an order that yields the same results
o For our example, the possible serial orders are
T1; T2 or T2; T1
• So, the correct (serializable) results are:
T1; T2: x=11 y=9 "11,9"
T2; T1: x=11 y=9 "10,10"
• The results for the two differ; either is okay
o No other result is okay
• The implementation might have executed T1 and T2 in parallel
o But it must still yield results as if in a serial order

What if T1's operations run entirely between T2's two get()s? Would the
result be serializable?
• T2 would print 10,9
• But 10,9 is not one of the two serializable results!

What if T2 runs entirely between T1's two add()s?

• T2 would print 11,10
• But 11,10 is not one of the two serializable results!
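
The test described above (look for a serial order that yields the same results) can be sketched by brute force. This is a minimal illustration for the two-transaction example only; the helper names `serial_results` and `is_serializable` are hypothetical, and trying every order is only practical for a handful of transactions.

```python
from itertools import permutations

def t1(db, out):
    # T1: transfer $1 from y to x
    db["x"] += 1
    db["y"] -= 1

def t2(db, out):
    # T2: audit, record both balances as its output
    out.append((db["x"], db["y"]))

def serial_results(transactions, initial):
    """Collect (final DB state, output) for every serial order."""
    results = set()
    for order in permutations(transactions):
        db, out = dict(initial), []
        for txn in order:
            txn(db, out)  # one at a time, no interleaving
        results.add((tuple(sorted(db.items())), tuple(out)))
    return results

def is_serializable(final_db, output, transactions, initial):
    """True if some serial order yields the observed state and output."""
    observed = (tuple(sorted(final_db.items())), tuple(output))
    return observed in serial_results(transactions, initial)

INITIAL = {"x": 10, "y": 10}
FINAL = {"x": 11, "y": 9}
```

With these definitions, an audit output of (11, 9) or (10, 10) is accepted, while the interleaved outputs (10, 9) and (11, 10) are rejected.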

Why is serializability popular?

• An easy model for programmers
o They can write complex transactions while ignoring concurrency
• It allows parallel execution of transactions on different records

A transaction can "abort" if something goes wrong

• An abort un-does any record modifications
1. The transaction might voluntarily abort,
§ E.g., if the account doesn't exist, or y's balance is <= 0
2. The system may force an abort, e.g., to break a deadlock
3. Some server failures result in an abort
• The application might (or might not) try the transaction again

Components of distributed transactions

1. Concurrency control
o To provide isolation/serializability
2. Atomic commit
o To provide atomicity despite failure
Concurrency control
• Correct execution of concurrent transactions
Classes of concurrency control
1. Pessimistic concurrency control
o Lock records before use
o Conflicts cause delays (waiting for locks)
o Faster if conflicts are frequent
§ E.g., 2PL
2. Optimistic concurrency control
o Use records without locking
o Check before commit if reads/writes were serializable
o Conflict causes abort+retry
o Faster if conflicts are rare
§ E.g., FaRM
Pessimistic Concurrency Control
Two-phase locking (2PL)
• Used to implement serializability
Definition:
• A transaction must acquire a record's lock before using it
• A transaction must hold its locks until *after* commit or abort
2PL for our example
• Suppose T1 and T2 start at the same time
• The transaction system automatically acquires locks as needed
• So whichever of T1/T2 uses x first will get its lock
o And will then get the lock for y as well
• The other waits until the first completely finishes
• This prohibits the non-serializable interleaving
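
A minimal single-process sketch of 2PL for this example, assuming one lock per record and Python threads standing in for concurrent transactions. The classes `LockingStore` and `Transaction` are hypothetical names for illustration, not part of any real transaction system.

```python
import threading

class LockingStore:
    """Records plus one lock per record."""
    def __init__(self, records):
        self.records = dict(records)
        self.locks = {key: threading.Lock() for key in records}

class Transaction:
    def __init__(self, store):
        self.store = store
        self.held = []   # keys whose locks this transaction holds
        self.undo = {}   # original values, restored on abort

    def _lock(self, key):
        # Acquire the record's lock before the first use of the record.
        if key not in self.held:
            self.store.locks[key].acquire()
            self.held.append(key)

    def get(self, key):
        self._lock(key)
        return self.store.records[key]

    def add(self, key, delta):
        self._lock(key)
        self.undo.setdefault(key, self.store.records[key])
        self.store.records[key] += delta

    def commit(self):
        self._release()

    def abort(self):
        for key, old in self.undo.items():
            self.store.records[key] = old
        self._release()

    def _release(self):
        # Locks are dropped only once the outcome is decided.
        for key in self.held:
            self.store.locks[key].release()
        self.held.clear()

def run_example():
    store = LockingStore({"x": 10, "y": 10})
    printed = []

    def t1():
        txn = Transaction(store)
        txn.add("x", 1)
        txn.add("y", -1)
        txn.commit()

    def t2():
        txn = Transaction(store)
        printed.append((txn.get("x"), txn.get("y")))
        txn.commit()

    threads = [threading.Thread(target=t1), threading.Thread(target=t2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return dict(store.records), printed[0]
```

Because both transactions touch x first, whichever locks x first runs to completion before the other can proceed, so `run_example()` should only ever print (11, 9) or (10, 10).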

Why hold locks until after commit/abort? Why not release as soon as done with
the record?
• Example 1 of a resulting problem:
o Suppose T2 releases x's lock after get(x)
o T1 could then execute between T2's get()s
o T2 would print 10,9
o Oops: that is not a serializable execution: neither T1;T2 nor
T2;T1
• Example 2 of a resulting problem:
o Suppose T1 writes x, then releases x's lock
o T2 reads x and prints
o T1 then aborts
§ Maybe because the value of y is already 0
o Oops: T2 used a value that never really existed
o We should have aborted T2, which would be a "cascading abort"
• 2PL can also produce deadlock, e.g.:
T1        T2
get(x)    get(y)
get(y)    get(x)
• The system must detect the deadlock (e.g., by finding cycles among
waiting transactions, or by using lock timeouts) and abort one of the
transactions
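
One way to detect such a cycle is a depth-first search over a "waits-for" graph, where an edge T1 -> T2 means T1 is waiting for a lock that T2 holds. The sketch below assumes the lock manager can supply that graph as a dictionary; `find_deadlock` is a hypothetical helper, not an API of any particular database.

```python
def find_deadlock(waits_for):
    """Return a list of transactions forming a cycle, or None.

    waits_for maps each transaction to the set of transactions
    whose locks it is currently waiting for.
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    color = {}
    path = []

    def visit(txn):
        color[txn] = GREY
        path.append(txn)
        for other in waits_for.get(txn, ()):
            state = color.get(other, WHITE)
            if state == GREY:
                # Reached a transaction already on the path: a cycle.
                return path[path.index(other):]
            if state == WHITE:
                cycle = visit(other)
                if cycle:
                    return cycle
        path.pop()
        color[txn] = BLACK
        return None

    for txn in list(waits_for):
        if color.get(txn, WHITE) == WHITE:
            cycle = visit(txn)
            if cycle:
                return cycle
    return None
```

For the example above, T1 holds x and waits for y while T2 holds y and waits for x, so the graph `{"T1": {"T2"}, "T2": {"T1"}}` contains a cycle and one of the two would be chosen as the victim to abort.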

Optimistic Concurrency Control

• Increases concurrency more than pessimistic concurrency control
o I.e., increases transactions per second by lowering latency
• Used in Dropbox, Wikipedia, key-value stores like Cassandra, and
Amazon’s Dynamo
• Preferable to pessimistic concurrency control when conflicts are
expected to be rare
o But still need to ensure conflicts are caught

First-cut approach:

• Write and read objects at will

• Check for serial equivalence at commit time
• On abort, roll back the updates made
• An abort may force other transactions that read the dirty data to
abort as well
• Any transactions that read from those transactions must then also be
aborted
o Cascading aborts
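
The sketch below shows a commit-time check in one common optimistic style. It is a variant, not the first-cut approach itself: writes are buffered privately and records carry version numbers, so an aborted transaction never exposes dirty data and no cascading abort is needed. The classes `VersionedStore` and `OptimisticTxn` are hypothetical, and the sketch assumes `commit()` runs atomically (for example, under a short internal lock in a real system).

```python
class VersionedStore:
    def __init__(self, records):
        # key -> (value, version); the version is bumped on every write
        self.data = {k: (v, 0) for k, v in records.items()}

class OptimisticTxn:
    def __init__(self, store):
        self.store = store
        self.read_versions = {}  # key -> version seen at first read
        self.writes = {}         # key -> buffered new value

    def get(self, key):
        if key in self.writes:
            return self.writes[key]  # read our own buffered write
        value, version = self.store.data[key]
        self.read_versions.setdefault(key, version)
        return value

    def put(self, key, value):
        self.writes[key] = value     # no lock taken, nothing published

    def commit(self):
        # Validate: every record we read must be unchanged since we read it.
        for key, seen in self.read_versions.items():
            if self.store.data[key][1] != seen:
                return False         # conflict: caller should retry
        # Publish buffered writes, bumping each record's version.
        for key, value in self.writes.items():
            _, version = self.store.data[key]
            self.store.data[key] = (value, version + 1)
        return True
```

In the bank example, if the audit reads x, the transfer then commits, and the audit reads y afterwards, the audit's `commit()` returns False because x's version changed; a retry then sees 11 and 9.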
Atomic Commit
How can distributed transactions cope with failures?
• Suppose for our example, x and y are on different "worker" servers
• Suppose x's server adds 1, but y's crashes before subtracting
o Or x's server adds 1, but y realizes the account doesn't exist or
y’s balance is 0
• Or x and y both can do their part, but aren't sure if the other will

We want "atomic commit":

• A bunch of computers are cooperating on some task
• Each computer has a different role
• Want to ensure atomicity: all execute, or none execute

Challenges
• Failures, performance

We're going to use a protocol called "two-phase commit"

• Used by distributed databases for multi-server transactions

The setting

• Data is sharded among multiple servers

• Transactions run on "transaction coordinators" (TCs)
• For each read/write, TC sends RPC to relevant shard server
o Each is a "participant"
o Each participant manages locks for its shard of the data
• There may be many concurrent transactions
o TC assigns unique transaction ID (TID) to each transaction
§ Allows TC to resend commit messages
o Every message, every table entry tagged with TID
Two-phase commit (2PC)

• TC sends get(), put() RPCs to A, B

o A and B lock records
o Modifications are tentative, on a copy
• TC gets to the end of the transaction
• TC sends PREPARE messages to A and B
• If A is willing to commit
o A responds YES
o Then A is in "prepared" state
• Otherwise, A responds NO
• Same for B
• If both A and B say YES, TC sends COMMIT messages to A and B
o If either A or B says NO, TC sends ABORT messages
• A/B commit if they get a COMMIT message from the TC
o I.e., they write tentative records to the real DB
o And release the transaction's locks on their records
• A and B acknowledge COMMIT message
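
The message flow above can be sketched in a single process, with method calls standing in for RPCs. This is a simplified illustration under stated assumptions: locks, persistence, and timeouts are left out, and the `will_vote_yes` flag is a hypothetical stand-in for whatever local check makes a participant willing or unwilling to commit.

```python
class Participant:
    def __init__(self, name, records, will_vote_yes=True):
        self.name = name
        self.records = dict(records)
        self.will_vote_yes = will_vote_yes
        self.tentative = {}  # tid -> {key: new value}, kept on a copy
        self.state = {}      # tid -> "prepared" | "committed" | "aborted"

    def put(self, tid, key, value):
        # Modifications are tentative until COMMIT arrives.
        self.tentative.setdefault(tid, {})[key] = value

    def prepare(self, tid):
        if self.will_vote_yes:
            self.state[tid] = "prepared"
            return "YES"
        self.abort(tid)
        return "NO"

    def commit(self, tid):
        # Write the tentative records to the real DB.
        self.records.update(self.tentative.pop(tid, {}))
        self.state[tid] = "committed"
        return "ACK"

    def abort(self, tid):
        self.tentative.pop(tid, None)
        self.state[tid] = "aborted"
        return "ACK"

def two_phase_commit(tid, participants):
    """Run by the TC once it reaches the end of the transaction."""
    # Phase 1: collect votes.
    votes = [p.prepare(tid) for p in participants]
    decision = "COMMIT" if all(v == "YES" for v in votes) else "ABORT"
    # Phase 2: broadcast the single decision.
    for p in participants:
        if decision == "COMMIT":
            p.commit(tid)
        else:
            p.abort(tid)
    return decision
```

A transfer would call `put` on A for x and on B for y under one TID and then call `two_phase_commit`; if B votes NO, neither A's nor B's real records change.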

Correctness
• Neither A nor B can commit unless they both agreed

Fault Tolerance
1. Node Failure
What if B crashes and restarts?
• If B sent YES before crash, B must remember (despite crash)!
• Because TC might have sent commit message to A and A might have already
committed
• So, B must be able to commit (or not) even after a reboot
Thus, participants must write persistent (on-disk) state:
• B must record on disk that it is prepared, including the modified
data, before saying YES
• If B reboots, and disk says YES but no COMMIT
o B must ask TC or wait for TC to re-send
• And meanwhile, B must continue to hold the transaction's locks
• If TC says COMMIT, B copies modified data to real data

What if TC crashes and restarts?

• If TC might have sent COMMIT before crash, TC must remember!
o Since one worker may already have committed
• Thus, TC must write COMMIT to disk before sending COMMIT messages
• And repeat COMMIT if it crashes and reboots,
o Or if a participant asks (i.e., if A/B didn't get COMMIT msg)
• Participants must filter out duplicate COMMITs (using TID)
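
A sketch of the TC side of this rule, assuming an in-memory list stands in for the on-disk log and a `crash_after` parameter simulates the TC dying partway through sending. All class names, the `DONE` log record, and `crash_after` are hypothetical and for illustration only.

```python
class DurableLog:
    """Stands in for an on-disk log that survives a simulated crash."""
    def __init__(self):
        self.entries = []

    def append(self, entry):
        self.entries.append(entry)

class Worker:
    def __init__(self):
        self.committed = set()
        self.apply_count = 0

    def on_commit(self, tid):
        # Filter out duplicate COMMITs using the TID.
        if tid not in self.committed:
            self.committed.add(tid)
            self.apply_count += 1
        return "ACK"

class Coordinator:
    def __init__(self, log, workers):
        self.log = log
        self.workers = workers

    def decide_commit(self, tid, crash_after=None):
        # Write COMMIT to the log BEFORE sending any COMMIT message.
        self.log.append(("COMMIT", tid))
        for i, worker in enumerate(self.workers):
            if crash_after is not None and i >= crash_after:
                return False  # simulated crash: remaining sends are lost
            worker.on_commit(tid)
        self.log.append(("DONE", tid))
        return True

    def recover(self):
        # After a reboot, repeat COMMIT for every unfinished transaction.
        done = {tid for kind, tid in self.log.entries if kind == "DONE"}
        for kind, tid in list(self.log.entries):
            if kind == "COMMIT" and tid not in done:
                for worker in self.workers:
                    worker.on_commit(tid)
                self.log.append(("DONE", tid))
```

If the TC crashes after telling only A, a new `Coordinator` built on the same log can call `recover()`; B then learns the decision, and A ignores the repeat.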

2. Network Failure
What if TC never gets a YES/NO from B?
• Perhaps B crashed and didn't recover; perhaps network is broken
• TC can time out, and abort (since it has not sent any COMMIT messages)
• Good: allows servers to release locks

What if B never gets a PREPARE from TC?

• B has not yet responded to PREPARE, so TC can't have decided commit
• So, B can unilaterally abort, and release locks
o Shouldn’t hold lock for a long time
• Respond NO to future PREPARE

What if B replied YES to PREPARE, but doesn't receive COMMIT or ABORT?

• Can B unilaterally decide to abort?
o No! TC might have gotten YES from both, and sent out COMMIT to A,
but crashed before sending to B
o So, then A would commit, and B would abort: incorrect
• B can't unilaterally commit, either:
o A might have voted NO

So: if B voted YES, it must "block": wait for TC decision
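
The three timeout cases for a participant can be summarized as a small decision function. The state names here ("working", "prepared", and so on) are assumptions chosen for this sketch, not terms from a specific implementation.

```python
def on_timeout(state):
    """What a participant may do on its own when the TC goes quiet."""
    if state == "working":
        # Has not voted YES yet, so the TC cannot have decided COMMIT.
        return "abort"
    if state == "prepared":
        # Voted YES: the TC may have committed or aborted; must wait.
        return "block"
    if state in ("committed", "aborted"):
        # Outcome already known; nothing left to decide.
        return "none"
    raise ValueError(f"unknown state: {state}")
```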

Note:
• The commit/abort decision is made by a single entity => the TC
• This makes two-phase commit relatively straightforward
• The penalty is that A/B, after voting YES, must wait for the TC

When can TC completely forget about a committed transaction?

• If it sees an acknowledgement from every participant for the COMMIT
• Then no participant will ever need to ask again

When can a participant completely forget about a committed transaction?

• After it acknowledges the TC's COMMIT message
• If it gets another COMMIT, and has no record of the transaction,
o It must have already committed and forgotten and can acknowledge
(again)
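
Both forgetting rules can be sketched together, assuming simple in-memory tables on each side. The names `Participant`, `CoordinatorTable`, `waiting`, and `prepared` are hypothetical.

```python
class Participant:
    def __init__(self):
        self.records = {}
        self.prepared = {}  # tid -> tentative writes, kept until COMMIT

    def prepare(self, tid, writes):
        self.prepared[tid] = dict(writes)
        return "YES"

    def on_commit(self, tid):
        # Apply and forget in one step.
        writes = self.prepared.pop(tid, None)
        if writes is not None:
            self.records.update(writes)
        # No record of tid: it must already have been committed and
        # forgotten, so acknowledging again is safe.
        return "ACK"

class CoordinatorTable:
    def __init__(self):
        self.waiting = {}  # tid -> participant names still owing an ACK

    def start_commit(self, tid, names):
        self.waiting[tid] = set(names)

    def on_ack(self, tid, name):
        pending = self.waiting.get(tid)
        if pending is None:
            return  # transaction already forgotten
        pending.discard(name)
        if not pending:
            # Every participant has acknowledged: forget the transaction.
            del self.waiting[tid]
```

A repeated COMMIT for a forgotten TID returns ACK without touching the records, and the TC drops its table entry only when the last acknowledgement arrives.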
Two-phase commit perspective
• Used in sharded DBs when a transaction uses data on multiple shards
• But it has a bad reputation:
o Slow: multiple rounds of messages
o Slow: disk writes
o Locks are held over the prepare/commit exchanges
§ Blocks other transactions
o TC crash can cause indefinite blocking, with locks held
• Thus, usually used only in a single small domain
o E.g., not between banks, not between airlines, not over wide area
networks
• Faster distributed transactions are an active research area

Raft and two-phase commit solve different problems!

• Use Raft to get high availability by replicating
o I.e., to be able to operate when some servers have crashed
o The servers all do the *same* thing
• Use 2PC when each participant does something different
o And *all* of them must do their part
• 2PC does not help availability
o Since all servers must be up to get anything done
• Raft does not ensure that all servers do something
o Since only a majority have to be alive

What if you want high availability and atomic commit?

• Here's one plan

• The TC and servers should each be replicated with Raft

• Run two-phase commit among the replicated services
• Then you can tolerate failures and still make progress
• Google Spanner uses this arrangement
