Distributed Systems Unit 4
INDIAN INSTITUTE OF TECHNOLOGY KHARAGPUR
Transaction System Architecture
System Failure Modes
Failures unique to distributed systems:
– Failure of a site.
– Loss of messages
• Handled by network transmission control protocols such as TCP/IP
– Failure of a communication link
• Handled by network protocols, by routing messages via alternative links
– Network partition
• A network is said to be partitioned when it has been split into two or more
subsystems that lack any connection between them
– Note: a subsystem may consist of a single node
Network partitioning and site failures are generally indistinguishable.
Commit Protocols
Commit protocols are used to ensure atomicity across sites
– a transaction which executes at multiple sites must either be committed at all the
sites, or aborted at all the sites.
– not acceptable to have a transaction committed at one site and aborted at another
The three-phase commit (3PC) protocol is more complicated and more expensive than the
two-phase commit (2PC) protocol, but avoids some of its drawbacks. 3PC is not used in
practice.
Two Phase Commit Protocol (2PC)
Assumes fail-stop model – failed sites simply stop working, and do not cause any
other harm, such as sending incorrect messages to other sites.
Execution of the protocol is initiated by the coordinator after the last step of the
transaction has been reached.
The protocol involves all the local sites at which the transaction executed
Let T be a transaction initiated at site Si, and let the transaction coordinator at Si
be Ci
Phase 1: Obtaining a Decision
Coordinator asks all participants to prepare to commit transaction T.
– Ci adds the record <prepare T> to the log and forces the log to stable storage
– sends prepare T messages to all sites at which T executed
Upon receiving message, transaction manager at site determines if it can commit the
transaction
– if not, add a record <no T> to the log and send abort T message to Ci
– if the transaction can be committed, then:
• add the record <ready T> to the log
• force all records for T to stable storage
• send ready T message to Ci
Phase 2: Recording the Decision
T can be committed if Ci received a ready T message from all the participating sites;
otherwise T must be aborted.
Coordinator adds a decision record, <commit T> or <abort T>, to the log and forces the
record onto stable storage. Once the record reaches stable storage, the decision is
irrevocable (even if failures occur).
Coordinator then sends the decision to each participant, which records it and acts on it locally.
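The two phases can be sketched in a few lines of Python. This is a minimal, single-process illustration under stated assumptions, not a real implementation: the names Participant, prepare, decide and two_phase_commit are hypothetical, Python lists stand in for logs forced to stable storage, and method calls stand in for network messages. The final loop, which delivers the decision to every participant, is the standard last step of 2PC and is included here as an assumption.

```python
class Participant:
    """A site at which transaction T executed (hypothetical helper class)."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.log = []  # stands in for a log forced to stable storage

    def prepare(self, t):
        # Phase 1: vote on whether T can be committed at this site.
        if self.can_commit:
            self.log.append(f"<ready {t}>")
            return "ready"
        self.log.append(f"<no {t}>")
        return "abort"

    def decide(self, t, decision):
        # Phase 2: record the coordinator's decision locally.
        self.log.append(f"<{decision} {t}>")


def two_phase_commit(t, coordinator_log, participants):
    """Run both phases for transaction t and return 'commit' or 'abort'."""
    # Phase 1: log <prepare T>, then collect a vote from every participant.
    coordinator_log.append(f"<prepare {t}>")
    votes = [p.prepare(t) for p in participants]

    # Phase 2: commit only if every participant voted ready.
    decision = "commit" if all(v == "ready" for v in votes) else "abort"
    coordinator_log.append(f"<{decision} {t}>")  # the decision is now irrevocable
    for p in participants:
        p.decide(t, decision)
    return decision
```

A single participant that cannot commit is enough to abort T everywhere, which is the atomicity guarantee the protocol is meant to provide.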
Handling of Failures - Site Failure
When site Sk recovers, it examines its log to determine the fate of transactions active at
the time of the failure.
Log contains <commit T> record: site executes redo (T)
Log contains <abort T> record: site executes undo (T)
Log contains <ready T> record: site must consult Ci to determine the fate of T.
– If T committed, redo (T)
– If T aborted, undo (T)
The log contains no control records concerning T: this implies that Sk failed before
responding to the prepare T message from Ci
– since the failure of Sk precludes the sending of such a response, Ci must abort T
– Sk must execute undo (T)
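The recovery cases above amount to a small decision function. The sketch below is only an illustration: recovery_action and ask_coordinator are hypothetical names, the log is modelled as a list of strings, and ask_coordinator is assumed to return "commit" or "abort" once the coordinator can be reached.

```python
def recovery_action(log, t, ask_coordinator):
    """Return 'redo' or 'undo' for transaction t at a recovering site."""
    if f"<commit {t}>" in log:
        return "redo"
    if f"<abort {t}>" in log:
        return "undo"
    if f"<ready {t}>" in log:
        # In doubt: the site voted ready, so only the coordinator knows the outcome.
        return "redo" if ask_coordinator(t) == "commit" else "undo"
    # No control records: the site never answered prepare, so T was aborted.
    return "undo"
```

For example, recovery_action(["<ready T>"], "T", lambda t: "commit") yields "redo", while an empty log yields "undo" without contacting the coordinator.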
Handling of Failures- Coordinator Failure
If coordinator fails while the commit protocol for T is executing then participating sites
must decide on T’s fate:
1. If an active site contains a <commit T> record in its log, then T must be committed.
2. If an active site contains an <abort T> record in its log, then T must be aborted.
3. If some active participating site does not contain a <ready T> record in its log, then
the failed coordinator Ci cannot have decided to commit T. The sites can therefore abort T.
4. If none of the above cases holds, then all active sites must have a <ready T> record
in their logs, but no additional control records (such as <abort T> or <commit T>). In
this case active sites must wait for Ci to recover to learn the decision.
Blocking problem : active sites may have to wait for failed coordinator to recover.
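The four cases can be checked in order over the logs of the active sites. This is a minimal sketch under assumptions: resolve_without_coordinator is a hypothetical name, and each log is modelled as a list of strings. The "wait" result is the blocking case.

```python
def resolve_without_coordinator(active_logs, t):
    """Return 'commit', 'abort', or 'wait' given the logs of all active sites."""
    if any(f"<commit {t}>" in log for log in active_logs):
        return "commit"  # case 1
    if any(f"<abort {t}>" in log for log in active_logs):
        return "abort"  # case 2
    if any(f"<ready {t}>" not in log for log in active_logs):
        return "abort"  # case 3: the coordinator cannot have decided to commit
    return "wait"  # case 4: every active site is in doubt, so they block
```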
Handling of Failures - Network Partition
If the coordinator and all its participants remain in one partition, the failure has no
effect on the commit protocol.
If the coordinator and its participants belong to several partitions:
– Sites that are not in the partition containing the coordinator think the coordinator
has failed, and execute the protocol to deal with failure of the coordinator.
• No harm results, but sites may still have to wait for decision from
coordinator.
– The coordinator and the sites that are in the same partition as the coordinator think that
the sites in the other partitions have failed, and follow the usual commit protocol.
• Again, no harm results
Recovery and Concurrency Control
In-doubt transactions have a <ready T>, but neither a
<commit T>, nor an <abort T> log record.
The recovering site must determine the commit-abort status of such transactions by
contacting other sites; this can slow and potentially block recovery.
Recovery algorithms can note lock information in the log.
– Instead of <ready T>, write out <ready T, L>, where L is the list of locks held by T when
the log record is written (read locks can be omitted).
– For every in-doubt transaction T, all the locks noted in the
<ready T, L> log record are reacquired.
After lock reacquisition, transaction processing can resume; the commit or rollback of in-
doubt transactions is performed concurrently with the execution of new transactions.
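Lock reacquisition from the log can be sketched as follows. The representation is an assumption made for illustration: log records are tuples such as ("ready", "T1", ["x", "y"]) or ("commit", "T1"), only write locks are recorded, and reacquire_locks is a hypothetical name.

```python
def reacquire_locks(log):
    """Rebuild a lock table (item -> transaction) for all in-doubt transactions."""
    ready_locks = {}  # transaction -> locks listed in its <ready T, L> record
    resolved = set()  # transactions with a <commit T> or <abort T> record
    for record in log:
        kind, t = record[0], record[1]
        if kind == "ready":
            ready_locks[t] = record[2]
        elif kind in ("commit", "abort"):
            resolved.add(t)

    lock_table = {}
    for t, locks in ready_locks.items():
        if t not in resolved:  # in doubt: ready, but no decision logged
            for item in locks:
                lock_table[item] = t
    return lock_table
```

Once this table is rebuilt, new transactions can run against every item not listed in it while the in-doubt transactions are resolved in the background.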
Three Phase Commit (3PC)
Assumptions:
– No network partitioning
– At any point, at least one site must be up.
– At most K sites (participants as well as coordinator) can fail
Three Phase Commit (3PC)
Phase 2 of 2PC is split into 2 phases, Phase 2 and Phase 3 of 3PC
– In phase 2 coordinator makes a decision as in 2PC (called the pre-commit
decision) and records it in multiple (at least K) sites
– In phase 3, coordinator sends commit/abort message to all participating sites.
Drawbacks:
– higher overheads
– assumptions may not be satisfied in practice
Definition of self-stabilization
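A system S is self-stabilizing with respect to predicate P
if it satisfies the following two properties:
Closure: P is closed under the execution of S; once P holds, it cannot be falsified.
Convergence: Starting from an arbitrary global state, S is guaranteed to reach a global
state satisfying P within a finite number of state transitions.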
Definition of stabilization
[Arora and Gouda] We define stabilization for system S with respect to two predicates P
and Q, over its set of global states.
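Predicate Q denotes a restricted start condition. S satisfies "Q stabilizes to P"
if it satisfies the following two properties:
Closure: P is closed under the execution of S.
Convergence: If S starts from any global state that satisfies Q, then S is guaranteed to
reach a global state satisfying P within a finite number of state transitions.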
Randomized self-stabilization
A system is said to be a randomized self-stabilizing system if and only if it is
self-stabilizing and the expected number of rounds needed to reach a correct state
(legal state) is bounded by some constant k.
Probabilistic self-stabilization
A system S is said to be probabilistically self-stabilizing with respect to a predicate P if it
satisfies the following two properties:
Closure: P is closed under the execution of S.
Convergence: There exists a function f from natural numbers to [0, 1] satisfying
lim_{k→∞} f(k) = 0, such that the probability of reaching a state satisfying P, starting from an
arbitrary global state, within k state transitions is 1 – f(k).
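As a hypothetical illustration of the convergence condition (not taken from the source), suppose the system starts outside P, every state transition from a state outside P reaches P with the same probability p > 0 independently of earlier transitions, and P continues to hold once reached. Then a geometric bound gives a valid choice of f:

```latex
\Pr[\text{P reached within } k \text{ transitions}] = 1 - (1 - p)^k,
\qquad f(k) = (1 - p)^k,
\qquad \lim_{k \to \infty} f(k) = 0 .
```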
Issues in design of self-stabilization algorithms
Number of states in each of the individual units in a distributed system.
Mutual exclusion.
Costs of self-stabilization.
Dijkstra’s self-stabilizing token ring
A legitimate state must satisfy the following constraints:
Dijkstra’s self-stabilizing token ring
Dijkstra considered a legitimate (or legal) state as one in which exactly one machine
enjoys the privilege.
– This corresponds to a form of mutual exclusion, because the privileged
process is the only process that is allowed in its critical section.
– Once the process leaves the critical section, it passes the privilege to one of
its neighbors.
First solution
For any machine, we use the symbols S, L, and R to denote its own state, the state of the
left neighbor, and the state of the right neighbor on the ring, respectively.
Second Solution
The second solution uses only three-state machines. The state of each machine is in {0, 1, 2}.
Second Solution Continued
The other machines:
If (S + 1) mod 3 = L then
S := L
If (S + 1) mod 3 = R then
S := R
The condition (s + 1) mod 3 covers the three possible states; for s = 0, 1, 2, we have
(s + 1) mod 3 = 1, 2, 0. These result in the following three possibilities:
The top machine, machine n – 1, behaves as follows:
If L = R and (L + 1) mod 3 ≠ S then
S := (L + 1) mod 3
The state of the top machine depends upon both its left and right neighbors (the bottom
machine). The condition specifies that the left neighbor (L) and the right neighbor (R)
should be in the same state and (L + 1) mod 3 should not be equal to S.
(Note that (L + 1) mod 3 is 1, 2, 0 when L is 0, 1, 2, respectively.) Thus the state of the top
machine is as follows:
All other machines behave as follows:
If (S + 1) mod 3 = L then
S := L
If (S + 1) mod 3 = R then
S := R
While finding out the state of the other machines (machines 1 and 2 in the example
below), we first compare the state of a machine with its left neighbor:
1. If s = 0 and L = 1, then s = 1.
2. If s = 1 and L = 2, then s = 2.
3. If s = 2 and L = 0, then s = 0.
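The three-state rules can be tried out with a small simulation. This is a sketch under stated assumptions: the rule for the bottom machine (machine 0) does not appear in the text above and is taken from Dijkstra's original three-state solution (if (S + 1) mod 3 = R then S := (S – 1) mod 3); a central scheduler picks one privileged machine at random per step; and the function names are hypothetical.

```python
import random


def enabled_move(state, i):
    """Return the new state of machine i if it holds a privilege, else None."""
    n = len(state)
    s, left, right = state[i], state[i - 1], state[(i + 1) % n]
    if i == 0:
        # Bottom machine: rule assumed from Dijkstra's paper, not from the text above.
        return (s - 1) % 3 if (s + 1) % 3 == right else None
    if i == n - 1:
        # Top machine: its right neighbor is the bottom machine.
        if left == right and (left + 1) % 3 != s:
            return (left + 1) % 3
        return None
    # All other machines: copy the left or right neighbor when it is one ahead.
    if (s + 1) % 3 == left:
        return left
    if (s + 1) % 3 == right:
        return right
    return None


def privileged(state):
    """List the machines that currently hold a privilege."""
    return [i for i in range(len(state)) if enabled_move(state, i) is not None]


def run(state, steps, rng):
    """Let one randomly chosen privileged machine move per step."""
    state = list(state)
    for _ in range(steps):
        candidates = privileged(state)
        if not candidates:
            break
        i = rng.choice(candidates)
        state[i] = enabled_move(state, i)
    return state
```

Calling run([2, 0, 1, 0], 200, random.Random(0)) and then privileged on the result lets one observe how many machines still hold a privilege after the system has been given time to stabilize.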