0% found this document useful (0 votes)
17 views

Lecture 9 Distributed Transactions

The document discusses distributed transactions and how to ensure they are atomic and isolated despite failures. It covers concepts like serializability, two-phase locking, and two-phase commit protocols to provide ACID properties for distributed transactions operating on data partitioned across multiple servers.
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
17 views

Lecture 9 Distributed Transactions

The document discusses distributed transactions and how to ensure they are atomic and isolated despite failures. It covers concepts like serializability, two-phase locking, and two-phase commit protocols to provide ACID properties for distributed transactions operating on data partitioned across multiple servers.
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 7

Motivation

• Lots of data records, sharded on multiple servers, lots of clients


• Client application actions often involve multiple reads and writes
o Bank transfer: debit and credit
• We'd like to hide interleaving and failure from application writers
• This is a traditional database concern
o But the ideas are used in many distributed systems
Example
• x and y are bank balances
o Records in database tables
• x and y are on different servers (maybe at different banks)
• x and y start out as $10
• T1 and T2 are transactions

T1: transfer $1 from y to x


T2: audit, to check that no money is lost
T1: T2:
begin_transaction begin_transaction
add(x, 1) tmp1 = get(x)
add(y, -1) tmp2 = get(y)
end_transaction print tmp1, tmp2
end_transaction
Correct behavior for a transaction
• Usually called "ACID"
o Atomic
§ All writes or none, despite failures (abortability)
o Consistent
§ Obeys application-specific invariants
o Isolated
§ No interference between transactions (serializable)
o Durable
§ Committed writes are permanent
• We're interested in ACID for distributed transactions
o With data sharded over multiple servers

Serializability
• Execution of some concurrent transactions yield results
o "results" means both changes (T1) and output (T2) in the DB
• The results are serializable if:
o There exists a serial execution order of the transactions
o Yields the same results as the actual execution
§ Serial means one at a time
• No parallel execution
§ This definition should remind you of linearizability
• You can test whether an execution's result is serializable by looking
for an order that yields the same results
o For our example, the possible serial orders are
T1; T2 or T2; T1
• So, the correct (serializable) results are:
T1; T2: x=11 y=9 "11,9"
T2; T1: x=11 y=9 "10,10"
• The results for the two differ; either is okay
o No other result is okay
• The implementation might have executed T1 and T2 in parallel
o But it must still yield results as if in a serial order

What if T1's operations run entirely between T2's two get()s? would the
result be serializable?
• T2 would print 10,9
• But 10,9 is not one of the two serializable results!

What if T2 runs entirely between T1's two adds()s?


• T2 would print 11,10
• But 11,10 is not one of the two serializable results!

Why serializability is popular?


• An easy model for programmers
o They can write complex transactions while ignoring concurrency
• It allows parallel execution of transactions on different records

A transaction can "abort" if something goes wrong


• An abort un-does any record modifications
1. The transaction might voluntarily abort,
§ E.g., if the account doesn't exist, or y's balance is <= 0
2. The system may force an abort, e.g., to break a deadlock
3. Some server’s failures result in abort
• The application might (or might not) try the transaction again

Components of distributed transactions


1. Concurrency control
o To provide isolation/serializability
2. Atomic commit
o To provide atomicity despite failure
Concurrency control
• Correct execution of concurrent transactions
Classes of concurrency control
1. Pessimistic concurrency control
o Lock records before use
o Conflicts cause delays (waiting for locks)
o Faster if conflicts are frequent
§ E.g., 2PL
2. Optimistic concurrency control
o Use records without locking
o Check before commit if reads/writes were serializable
o Conflict causes abort+retry
o Faster if conflicts are rare
§ E.g., FaRM
Pessimistic Concurrency Control
Two-phase locking (2PL)
• Used to implement serializability
Definition:
• A transaction must acquire a record's lock before using it
• A transaction must hold its locks until *after* commit or abort
2PL for our example
• Suppose T1 and T2 start at the same time
• The transaction system automatically acquires locks as needed
• So first of T1/T2 to use x will get the lock
o Get another lock for y as well
• The other waits until the first completely finishes
• This prohibits the non-serializable interleaving

Why hold locks until after commit/abort? why not release as soon as done with
the record?
• Example 1 of a resulting problem:
o Suppose T2 releases x's lock after get(x)
o T1 could then execute between T2's get()s
o T2 would print 10,9
o Oops: that is not a serializable execution: neither T1;T2 nor
T2;T1
• Example 2 of a resulting problem:
o Suppose T1 writes x, then releases x's lock
o T2 reads x and prints
o T1 then aborts
§ Maybe because the value of y is already 0
o Oops: T2 used a value that never really existed
o We should have aborted T2, which would be a "cascading abort"
• Deadlock!
T1 T2
get(x) get(y)
get(y) get(x)
• The system must detect cycles and abort a transaction (e.g., by using
lock timeout)

Optimistic Concurrency Control

• Increases concurrency more than pessimistic concurrency control


o i.e., Increases transactions per second by lowering latency
• Used in Dropbox, Wikipedia, key-value stores like Cassandra, and
Amazon’s Dynamo
• Preferable than pessimistic when conflicts are expected to be rare
o But still need to ensure conflicts are caught

First-cut approach:

• Write and read objects at will


• Check for serial equivalence at commit time
• If abort, roll back updates made
• An abort may result in other transactions that read dirty data, also
being aborted
• Any transactions that read from those transactions also now need to be
aborted
o Cascading aborts
Atomic Commit
How can distributed transactions cope with failures?
• Suppose for our example, x and y are on different "worker" servers
• Suppose x's server adds 1, but y's crashes before subtracting
o Or x's server adds 1, but y realizes the account doesn't exist or
y’s balance is 0
• Or x and y both can do their part, but aren't sure if the other will

We want "atomic commit":


• A bunch of computers are cooperating on some task
• Each computer has a different role
• Want to ensure atomicity: all execute, or none execute

Challenges
• Failures, performance

We're going to use a protocol called "two-phase commit"


• Used by distributed databases for multi-server transactions

The setting

• Data is sharded among multiple servers


• Transactions run on "transaction coordinators" (TCs)
• For each read/write, TC sends RPC to relevant shard server
o Each is a "participant"
o Each participant manages locks for its shard of the data
• There may be many concurrent transactions
o TC assigns unique transaction ID (TID) to each transaction
§ Allows TC to resend commit messages
o Every message, every table entry tagged with TID
Two-phase commit (2PC)

• TC sends get(), put() RPCs to A, B


o A and B lock records
o Modifications are tentative, on a copy
• TC gets to the end of the transaction
• TC sends PREPARE messages to A and B
• If A is willing to commit
o A respond YES
o Then A is in "prepared" state
• Otherwise, A responds NO
• Same for B
• If both A and B say YES, TC sends COMMIT messages to A and B
o If either A or B says NO, TC sends ABORT messages
• A/B commit if they get a COMMIT message from the TC
o I.e., they write tentative records to the real DB
o And release the transaction's locks on their records
• A and B acknowledge COMMIT message

Correctness
• Neither A nor B can commit unless they both agreed

Fault Tolerance
1. Node Failure
What if B crashes and restarts?
• If B sent YES before crash, B must remember (despite crash)!
• Because TC might have sent commit message to A and A might have already
committed
• So, B must be able to commit (or not) even after a reboot
Thus, participants must write persistent (on-disk) state:
• B must remember on disk before saying YES, including modified data
• If B reboots, and disk says YES but no COMMIT
o B must ask TC or wait for TC to re-send
• And meanwhile, B must continue to hold the transaction's locks
• If TC says COMMIT, B copies modified data to real data

What if TC crashes and restarts?


• If TC might have sent COMMIT before crash, TC must remember!
o Since one worker may already have committed
• Thus, TC must write COMMIT to disk before sending COMMIT messages
• And repeat COMMIT if it crashes and reboots,
o Or if a participant asks (i.e., if A/B didn't get COMMIT msg)
• Participants must filter out duplicate COMMITs (using TID)

2. Network Failure
What if TC never gets a YES/NO from B?
• Perhaps B crashed and didn't recover; perhaps network is broken
• TC can time out, and abort (since has not sent any COMMIT messages)
• Good: allows servers to release locks

What if B never gets a PREPARE from TC?


• B has not yet responded to PREPARE, so TC can't have decided commit
• So, B can unilaterally abort, and release locks
o Shouldn’t hold lock for a long time
• Respond NO to future PREPARE

What if B replied YES to PREPARE, but doesn't receive COMMIT or ABORT?


• Can B unilaterally decide to abort?
o No! TC might have gotten YES from both, and sent out COMMIT to A,
but crashed before sending to B
o So, then A would commit, and B would abort: incorrect
• B can't unilaterally commit, either:
o A might have voted NO

So: if B voted YES, it must "block": wait for TC decision

Note:
• The commit/abort decision is made by a single entity => the TC
• This makes two-phase commit relatively straightforward
• The penalty is that A/B, after voting YES, must wait for the TC

When can TC completely forget about a committed transaction?


• If it sees an acknowledgement from every participant for the COMMIT
• Then no participant will ever need to ask again

When can participant completely forgets about a committed transaction?


• After it acknowledges the TC's COMMIT message
• If it gets another COMMIT, and has no record of the transaction,
o It must have already committed and forgotten and can acknowledge
(again)
Two-phase commit perspective
• Used in sharded DBs when a transaction uses data on multiple shards
• But it has a bad reputation:
o Slow: multiple rounds of messages
o Slow: disk writes
o Locks are held over the prepare/commit exchanges
§ Blocks other transactions
o TC crash can cause indefinite blocking, with locks held
• Thus, usually used only in a single small domain
o E.g., not between banks, not between airlines, not over wide area
networks
• Faster distributed transactions are active research area

Raft and two-phase commit solve different problems!


• Use Raft to get high availability by replicating
o I.e., to be able to operate when some servers are crashed
o The servers all do the *same* thing
• Use 2PC when each participant does something different
o And *all* of them must do their part
• 2PC does not help availability
o Since all servers must be up to get anything done
• Raft does not ensure that all servers do something
o Since only a majority have to be alive

What if you want high availability *and* atomic commit?


• Here's one plan

• The TC and servers should each be replicated with Raft


• Run two-phase commit among the replicated services
• Then you can tolerate failures and still make progress
• Google Spanner uses this arrangement

You might also like