Distributed Systems Unit 4

Distributed transactions may access data at multiple sites. Each site has a local transaction manager that maintains a log and coordinates transaction execution. A transaction coordinator at each site is responsible for starting transactions, distributing subtransactions to other sites, and coordinating commit or abort across all sites. The two phase commit protocol is commonly used to ensure atomicity. It involves a prepare phase where sites agree to commit, and a commit phase where the coordinator instructs all sites to commit or abort. Failure handling protocols address site failures, coordinator failures, and network partitions to resolve transactions' fates.

Uploaded by

David Farmer

Distributed Transactions

 A transaction may access data at several sites.
 Each site has a local transaction manager responsible for:
– Maintaining a log for recovery purposes
– Participating in coordinating the concurrent execution of the transactions
executing at that site.
 Each site has a transaction coordinator, which is responsible for:
– Starting the execution of transactions that originate at the site.
– Distributing subtransactions to appropriate sites for execution.
– Coordinating the termination of each transaction that originates at the site,
which may result in the transaction being committed at all sites or aborted at
all sites.

INDIAN INSTITUTE OF TECHNOLOGY KHARAGPUR
[Figure: Transaction System Architecture]

System Failure Modes
 Failures unique to distributed systems:
– Failure of a site.
– Loss of messages
• Handled by network transmission control protocols such as TCP/IP
– Failure of a communication link
• Handled by network protocols, by routing messages via alternative links
– Network partition
• A network is said to be partitioned when it has been split into two or more
subsystems that lack any connection between them
– Note: a subsystem may consist of a single node
 Network partitioning and site failures are generally indistinguishable.

Commit Protocols
 Commit protocols are used to ensure atomicity across sites
– a transaction which executes at multiple sites must either be committed at all the
sites, or aborted at all the sites.
– not acceptable to have a transaction committed at one site and aborted at another

 The two-phase commit (2PC) protocol is widely used

 The three-phase commit (3PC) protocol is more complicated and more expensive, but
avoids some drawbacks of two-phase commit protocol. This protocol is not used in
practice.

Two Phase Commit Protocol (2PC)
 Assumes fail-stop model – failed sites simply stop working, and do not cause any
other harm, such as sending incorrect messages to other sites.

 Execution of the protocol is initiated by the coordinator after the last step of the
transaction has been reached.

 The protocol involves all the local sites at which the transaction executed

 Let T be a transaction initiated at site Si, and let the transaction coordinator at Si
be Ci

Phase 1: Obtaining a Decision
 Coordinator asks all participants to prepare to commit transaction T.
– Ci adds the record <prepare T> to the log and forces the log to stable storage
– Ci sends prepare T messages to all sites at which T executed
 Upon receiving the message, the transaction manager at each site determines whether it can commit the
transaction
– if not, it adds a record <no T> to the log and sends an abort T message to Ci
– if the transaction can be committed, then it:
• adds the record <ready T> to the log
• forces all records for T to stable storage
• sends a ready T message to Ci

Phase 2: Recording the Decision
 T can be committed if Ci received a ready T message from all the participating sites;
otherwise T must be aborted.

 Coordinator adds a decision record, <commit T> or <abort T>, to the log and forces the
record onto stable storage. Once the record reaches stable storage it is irrevocable (even if
failures occur)

 Coordinator sends a message to each participant informing it of the decision
(commit or abort)

 Participants take appropriate action locally.
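
The two phases can be sketched in a few lines of Python. This is a minimal, failure-free illustration rather than a full implementation: the names Participant, on_prepare, on_decision and two_phase_commit are hypothetical, in-memory lists stand in for logs forced to stable storage, and plain method calls stand in for messages.

```python
class Participant:
    """A participating site; `log` stands in for its stable-storage log (assumption)."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.log = []

    def on_prepare(self, t):
        # Phase 1 at the participant: vote, logging the vote before replying.
        if not self.can_commit:
            self.log.append(f"<no {t}>")
            return "abort"
        self.log.append(f"<ready {t}>")
        return "ready"

    def on_decision(self, t, decision):
        # Phase 2 at the participant: record the coordinator's decision locally.
        self.log.append(f"<{decision} {t}>")


def two_phase_commit(t, coordinator_log, participants):
    # Phase 1: log <prepare T>, then collect one vote per participant.
    coordinator_log.append(f"<prepare {t}>")
    votes = [p.on_prepare(t) for p in participants]

    # Phase 2: commit only if every participant voted ready; log before notifying.
    decision = "commit" if all(v == "ready" for v in votes) else "abort"
    coordinator_log.append(f"<{decision} {t}>")
    for p in participants:
        p.on_decision(t, decision)
    return decision
```

Calling two_phase_commit("T", [], [Participant("S1"), Participant("S2", can_commit=False)]) yields an abort at every site, because a single <no T> vote is enough to abort T.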

Handling of Failures - Site Failure
When site Sk recovers, it examines its log to determine the fate of transactions active at
the time of the failure.
 Log contains <commit T> record: site executes redo (T)
 Log contains <abort T> record: site executes undo (T)
 Log contains <ready T> record: site must consult Ci to determine the fate of T.
– If T committed, redo (T)
– If T aborted, undo (T)
 The log contains no control records concerning T: this implies that Sk failed before
responding to the prepare T message from Ci
– since the failure of Sk precludes the sending of such a response, Ci must abort T
– Sk must execute undo (T)
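
The recovery cases above amount to a small decision function. The sketch below is illustrative only: recover and ask_coordinator are hypothetical names, the log is modelled as a list of record strings, and ask_coordinator stands in for the message exchange with Ci.

```python
def recover(log, t, ask_coordinator):
    """Return "redo" or "undo" for transaction t at a recovering site.

    `log` is the site's list of control records (strings such as "<ready T>").
    `ask_coordinator(t)` is a hypothetical callback returning "commit" or "abort".
    """
    if f"<commit {t}>" in log:
        return "redo"
    if f"<abort {t}>" in log:
        return "undo"
    if f"<ready {t}>" in log:
        # In doubt: only the coordinator knows the outcome.
        return "redo" if ask_coordinator(t) == "commit" else "undo"
    # No control record at all: the site never voted, so T was aborted.
    return "undo"
```

Note that the coordinator is consulted only in the <ready T> case; in every other case the local log alone settles the outcome.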

Handling of Failures - Coordinator Failure
 If coordinator fails while the commit protocol for T is executing then participating sites
must decide on T’s fate:

1. If an active site contains a <commit T> record in its log, then T must be committed.
2. If an active site contains an <abort T> record in its log, then T must be aborted.
3. If some active participating site does not contain a <ready T> record in its log, then
the failed coordinator Ci cannot have decided to commit T. T can therefore be aborted.
4. If none of the above cases holds, then all active sites must have a <ready T> record
in their logs, but no additional control records (such as <abort T> or <commit T>). In
this case active sites must wait for Ci to recover to learn the decision.
 Blocking problem: active sites may have to wait for failed coordinator to recover.
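
The four cases can be written as one function over the logs of the active sites. This is a sketch under simplifying assumptions: resolve_without_coordinator is a hypothetical name, each log is a list of record strings, and the active sites are assumed to be able to read each other's logs.

```python
def resolve_without_coordinator(active_logs, t):
    """Decide the fate of t from the logs of the active participants.

    Returns "commit", "abort", or "wait" (the blocking case).
    """
    if any(f"<commit {t}>" in log for log in active_logs):
        return "commit"                      # case 1
    if any(f"<abort {t}>" in log for log in active_logs):
        return "abort"                       # case 2
    if any(f"<ready {t}>" not in log for log in active_logs):
        return "abort"                       # case 3: coordinator cannot have committed
    return "wait"                            # case 4: every site is in doubt
```

The "wait" result is exactly the blocking problem: every active site holds only <ready T>, so none of them can rule out either outcome.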

Handling of Failures - Network Partition
 If the coordinator and all its participants remain in one partition, the failure has no
effect on the commit protocol.
 If the coordinator and its participants belong to several partitions:
– Sites that are not in the partition containing the coordinator think the coordinator
has failed, and execute the protocol to deal with failure of the coordinator.
• No harm results, but sites may still have to wait for decision from
coordinator.
– The coordinator and the sites that are in the same partition as the coordinator think that the
sites in the other partition have failed, and follow the usual commit protocol.
• Again, no harm results

Recovery and Concurrency Control
 In-doubt transactions have a <ready T> log record, but neither a <commit T> nor an <abort T> log record.
 The recovering site must determine the commit-abort status of such transactions by
contacting other sites; this can slow and potentially block recovery.
 Recovery algorithms can note lock information in the log.
– Instead of <ready T>, write out <ready T, L>, where L is the list of locks held by T when the log record is written
(read locks can be omitted).
– For every in-doubt transaction T, all the locks noted in the <ready T, L> log record are reacquired.
 After lock reacquisition, transaction processing can resume; the commit or rollback of in-
doubt transactions is performed concurrently with the execution of new transactions.
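
Lock reacquisition from <ready T, L> records might look like the sketch below. The tuple encoding of log records and the name reacquire_locks are assumptions made for illustration; a real recovery manager would work on its own log format and lock manager.

```python
def reacquire_locks(log):
    """Rebuild the lock table for in-doubt transactions from a recovered log.

    `log` is a list of tuples: ("ready", T, locks), ("commit", T) or ("abort", T).
    Returns a dict mapping each locked data item to the in-doubt transaction holding it.
    """
    decided = {rec[1] for rec in log if rec[0] in ("commit", "abort")}
    lock_table = {}
    for rec in log:
        # Only transactions with a ready record and no decision record are in doubt.
        if rec[0] == "ready" and rec[1] not in decided:
            for item in rec[2]:
                lock_table[item] = rec[1]
    return lock_table
```

Once the table is rebuilt, new transactions can run against every item not listed in it while the in-doubt transactions are resolved in the background.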

Three Phase Commit (3PC)
 Assumptions:
– No network partitioning
– At any point, at least one site must be up.
– At most K sites (participants as well as coordinator) can fail

 Phase 1: Obtaining Preliminary Decision: Identical to 2PC Phase 1.
– Every site is ready to commit if instructed to do so

Three Phase Commit (3PC)
 Phase 2 of 2PC is split into 2 phases, Phase 2 and Phase 3 of 3PC
– In phase 2 coordinator makes a decision as in 2PC (called the pre-commit
decision) and records it in multiple (at least K) sites
– In phase 3, coordinator sends commit/abort message to all participating sites.

 Under 3PC, knowledge of pre-commit decision can be used to commit despite
coordinator failure
– Avoids blocking problem as long as fewer than K sites fail

 Drawbacks:
– higher overheads
– assumptions may not be satisfied in practice

Definition of self-stabilization
A system S is self-stabilizing with respect to predicate P
if it satisfies the following two properties:

 Closure: P is closed under the execution of S. That is, once P is established in S, it
cannot be falsified.

 Convergence: Starting from an arbitrary global state, S is guaranteed to reach a
global state satisfying P within a finite number of state transitions.
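
For a system with finitely many global states, both properties can be checked mechanically. The sketch below is only an illustration: is_closed, converges and the toy halving counter are assumptions, and successors(s) is a hypothetical function returning the states reachable from s in one transition.

```python
def is_closed(states, successors, p):
    """Closure: every transition out of a state satisfying P lands in a state satisfying P."""
    return all(p(t) for s in states if p(s) for t in successors(s))


def converges(states, successors, p):
    """Convergence: every execution from every state reaches P in finitely many steps."""
    reached = {s for s in states if p(s)}
    changed = True
    while changed:
        changed = False
        for s in states:
            if s in reached:
                continue
            nxt = successors(s)
            # s is safe once it can move and every possible move leads to a safe state.
            if nxt and all(t in reached for t in nxt):
                reached.add(s)
                changed = True
    return reached == set(states)


# Toy system (an assumption for illustration): a counter that is halved on every step.
STATES = range(8)
halve = lambda s: [s // 2]
is_zero = lambda s: s == 0
```

Here is_closed(STATES, halve, is_zero) and converges(STATES, halve, is_zero) both hold, so the toy counter is self-stabilizing with respect to "the counter is 0".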

Definition of stabilization
[Arora and Gouda] We define stabilization for system S with respect to two predicates P
and Q, over its set of global states.
Predicate Q denotes a restricted start condition. S satisfies Q → P (read as Q
stabilizes to P) if it satisfies the following two properties:

– Closure: P is closed under the execution of S. That is, once P is established in S,
it cannot be falsified.

– Convergence: If S starts from any global state that satisfies Q, then S is
guaranteed to reach a global state satisfying P within a finite number of state
transitions.

Randomized self-stabilization
A system is said to be a randomized self-stabilizing system if and only if it is
self-stabilizing and the expected number of rounds needed to reach a correct state
(legal state) is bounded by some constant k.

Probabilistic self-stabilization
A system S is said to be probabilistically self-stabilizing with respect to a predicate P if it
satisfies the following two properties:

 Closure: P is closed under the execution of S. That is, once P is established in S, it
cannot be falsified.

 Convergence: There exists a function f from natural numbers to [0, 1] satisfying
$\lim_{k \to \infty} f(k) = 0$, such that the probability of reaching a state satisfying P, starting from an
arbitrary global state, within k state transitions is 1 – f(k).
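
A small worked example may help. Assume, purely for illustration and not as part of the definition, that every state transition taken outside P independently reaches a state satisfying P with probability p, where 0 < p ≤ 1. By closure the system stays in P once it arrives, so the probability of not having reached P after k transitions is (1 – p)^k, which gives one valid choice of f:

```latex
f(k) = (1 - p)^k, \qquad \lim_{k \to \infty} (1 - p)^k = 0 \quad \text{for } 0 < p \le 1 .
```

With p = 0.5, for instance, f(10) = 2^(-10), so P is reached within ten transitions with probability of about 0.999.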

Issues in the design of self-stabilization algorithms
 Number of states in each of the individual units in a distributed system.

 Uniform and non-uniform algorithms.

 Central and distributed demons.

 Reducing the number of states in a token ring.

 Shared memory models.

 Mutual exclusion.

 Costs of self-stabilization.

Dijkstra’s self-stabilizing token ring
A legitimate state must satisfy the following constraints:

• There must be at least one privilege in the system (liveness or no deadlock).
• Every move from a legal state must again put the system into a legal state (closure).
• During an infinite execution, each machine should enjoy a privilege an infinite
number of times (no starvation).
• Given any two legal states, there is a series of moves that change one legal state to
the other (reachability).

Dijkstra’s self-stabilizing token ring
Dijkstra considered a legitimate (or legal) state as one in which exactly one machine
enjoys the privilege.
– This corresponds to a form of mutual exclusion, because the privileged
process is the only process that is allowed in its critical
section.
– Once the process leaves the critical section, it passes the privilege to one of
its neighbors.

First solution
For any machine, we use the symbols S, L, and R to denote its own state, the state of the left
neighbor, and the state of the right neighbor on the ring, respectively.

The exceptional machine:

If L = S then
    S := (S + 1) mod K
End If;

The other machines:

If L ≠ S then
    S := L
End If;
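
The two rules are easy to simulate. The sketch below is an illustration under stated assumptions: machine 0 plays the exceptional machine, a central daemon (modelled by a seeded random choice) fires one privileged machine at a time, K is chosen at least as large as the number of machines, and the names privileged, move and stabilize are hypothetical.

```python
import random


def privileged(states):
    """Indices of machines whose guard is enabled (machine 0 is the exceptional one)."""
    enabled = []
    for i, s in enumerate(states):
        left = states[i - 1]  # index -1 wraps around, closing the ring
        if (i == 0 and left == s) or (i != 0 and left != s):
            enabled.append(i)
    return enabled


def move(states, i, k):
    """Fire the rule of machine i, assuming its guard is enabled."""
    if i == 0:
        states[0] = (states[0] + 1) % k
    else:
        states[i] = states[i - 1]


def stabilize(states, k, rng, max_moves=10_000):
    """Let a central daemon fire one privileged machine at a time until one privilege remains."""
    moves = 0
    while len(privileged(states)) != 1 and moves < max_moves:
        move(states, rng.choice(privileged(states)), k)
        moves += 1
    return moves


if __name__ == "__main__":
    rng = random.Random(0)
    ring = [rng.randrange(6) for _ in range(5)]   # arbitrary start, K = 6, five machines
    stabilize(ring, 6, rng)
    print(ring, privileged(ring))
```

After stabilization exactly one machine is privileged, and each further move hands that single privilege to the next machine around the ring.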

Second Solution
The second solution uses only three-state machines. The state of each machine is in {0, 1, 2}.

The bottom machine, machine 0:

If (S + 1) mod 3 = R then
    S := (S – 1) mod 3

The top machine, machine n – 1:

If L = R and (L + 1) mod 3 ≠ S then
    S := (L + 1) mod 3

Second Solution Continued
The other machines:

If (S + 1) mod 3 = L then
S := L

If (S + 1) mod 3 = R then
S := R

For the bottom machine, the guard (S + 1) mod 3 = R covers the three possible states; for S = 0, 1, 2, we have
(S + 1) mod 3 = 1, 2, 0. These result in the following three possibilities:

1. If S = 0 and R = 1, then S is changed to 2.

2. If S = 1 and R = 2, then S is changed to 0.

3. If S = 2 and R = 0, then S is changed to 1.

The top machine, machine n – 1, behaves as follows:
If L = R and (L + 1) mod 3 ≠ S then
    S := (L + 1) mod 3

The state of the top machine depends upon both its left and right neighbors (its right neighbor is the bottom
machine). The condition specifies that the left neighbor (L) and the right neighbor (R)
should be in the same state and (L + 1) mod 3 should not be equal to S.
(Note that (L + 1) mod 3 is 1, 2, 0 when L is 0, 1, 2, respectively.) Thus, after a move, the state of the top
machine is as follows:

1. 1, when its left neighbor is 0.
2. 2, when its left neighbor is 1.
3. 0, when its left neighbor is 2.

All other machines behave as follows:
If (S + 1) mod 3 = L then
S := L
If (S + 1) mod 3 = R then
S := R
While finding out the state of the other machines (machines 1 and 2 in the example
below), we first compare the state of a machine with its left neighbor:

1. If S = 0 and L = 1, then S becomes 1.
2. If S = 1 and L = 2, then S becomes 2.
3. If S = 2 and L = 0, then S becomes 0.
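
The three-state rules for the bottom, top and other machines can be simulated together. The sketch below rests on several assumptions: at least three machines, a central daemon modelled by a seeded random choice, a state counted as legitimate when exactly one guard in the whole system is enabled, and hypothetical function names (enabled_moves, stabilize).

```python
import random


def enabled_moves(states):
    """List (machine, new_state) for every enabled guard of the three-state rules."""
    n = len(states)
    moves = []
    for i, s in enumerate(states):
        left, right = states[i - 1], states[(i + 1) % n]
        if i == 0:                                    # bottom machine
            if (s + 1) % 3 == right:
                moves.append((i, (s - 1) % 3))
        elif i == n - 1:                              # top machine; its right neighbor is machine 0
            if left == right and (left + 1) % 3 != s:
                moves.append((i, (left + 1) % 3))
        else:                                         # other machines: two separate guards
            if (s + 1) % 3 == left:
                moves.append((i, left))
            if (s + 1) % 3 == right:
                moves.append((i, right))
    return moves


def stabilize(states, rng, max_moves=10_000):
    """Fire one enabled guard at a time until exactly one guard is enabled."""
    count = 0
    while len(enabled_moves(states)) != 1 and count < max_moves:
        i, new_state = rng.choice(enabled_moves(states))
        states[i] = new_state
        count += 1
    return count


if __name__ == "__main__":
    rng = random.Random(1)
    machines = [rng.randrange(3) for _ in range(5)]   # arbitrary start, five machines
    stabilize(machines, rng)
    print(machines, enabled_moves(machines))
```

Once a single guard remains, the privilege travels from the bottom machine up to the top machine and back down again, one move at a time.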
