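Distributed Transaction
Figure: (a) flat and (b) nested distributed transactions. In the flat case the client's transaction T accesses objects at servers X, Y and Z. In the nested case T opens subtransactions T1 and T2, which open further subtransactions (T12, T21, T22, ...) at other servers such as N and P.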
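Client code for the nested banking transaction:
T = openTransaction
    openSubTransaction
        a.withdraw(10);
    openSubTransaction
        b.withdraw(20);
    openSubTransaction
        c.deposit(10);
    openSubTransaction
        d.deposit(20);
closeTransaction
Subtransactions and servers:
T1 at server X: object A, a.withdraw(10)
T2 at server Y: object B, b.withdraw(20)
T3 at server Z: object C, c.deposit(10)
T4 at server Z: object D, d.deposit(20)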
Coordinator of a distributed transaction
T = openTransaction
    a.withdraw(4);
    c.deposit(4);
    b.withdraw(3);
    d.deposit(3);
closeTransaction
The participant at BranchY joins the coordinator (join). The participant at BranchZ holds C (c.deposit(4)) and D (d.deposit(3)).
Note: when the client invokes an operation such as b.withdraw(), object B informs the participant at BranchY, which then joins the coordinator.
canCommit?(trans) -> Yes / No
Call from coordinator to participant to ask whether
it can commit a transaction. Participant replies
with its vote.
doCommit(trans)
Call from coordinator to participant to tell
participant to commit its part of a transaction.
doAbort(trans)
Call from coordinator to participant to tell
participant to abort its part of a transaction.
haveCommitted(trans, participant)
Call from participant to coordinator to confirm that
it has committed the transaction.
getDecision(trans) -> Yes / No
Call from participant to coordinator to ask for the
decision on a transaction after it has voted Yes but
has still had no reply after some delay. Used to
recover from server crash or delayed messages.
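The operations above can be sketched as a small in-memory model. This is only an illustration under assumptions: the class names, the will_vote_yes flag and the state strings are hypothetical, there is no networking or failure handling, and haveCommitted is recorded directly by the coordinator rather than sent as a message.

```python
from dataclasses import dataclass, field


@dataclass
class Participant:
    """Hypothetical participant; will_vote_yes stands in for its local checks."""
    name: str
    will_vote_yes: bool = True
    state: str = "active"

    def can_commit(self, trans: str) -> bool:
        # Phase 1: vote. After voting Yes the participant is uncertain.
        self.state = "uncertain" if self.will_vote_yes else "aborted"
        return self.will_vote_yes

    def do_commit(self, trans: str) -> None:
        self.state = "committed"

    def do_abort(self, trans: str) -> None:
        self.state = "aborted"


@dataclass
class Coordinator:
    participants: list
    confirmed: set = field(default_factory=set)
    decision: dict = field(default_factory=dict)

    def have_committed(self, trans: str, participant: Participant) -> None:
        self.confirmed.add((trans, participant.name))

    def get_decision(self, trans: str) -> bool:
        return self.decision[trans]  # True = commit, False = abort

    def close_transaction(self, trans: str) -> bool:
        # Phase 1 (voting): collect a vote from every participant.
        votes = [p.can_commit(trans) for p in self.participants]
        commit = all(votes)
        self.decision[trans] = commit
        # Phase 2 (completion): tell participants the outcome.
        for p in self.participants:
            if commit:
                p.do_commit(trans)
                self.have_committed(trans, p)
            elif p.state != "aborted":  # a No voter has already aborted
                p.do_abort(trans)
        return commit


branches = [Participant("BranchX"), Participant("BranchY"), Participant("BranchZ")]
coordinator = Coordinator(branches)
outcome = coordinator.close_transaction("T")
```

A single No vote is enough to abort: the decision is the conjunction of all votes, and only participants that voted Yes need to be told to abort.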
The two-phase commit protocol
Consider a participant that has voted Yes and is waiting for the coordinator to report the outcome of the vote by telling it to commit or abort.
Such a participant is uncertain and cannot proceed any further. The objects used by its transaction cannot be released for use by other transactions.
The participant makes a getDecision request to the coordinator to determine the outcome. If the coordinator has failed, the participant cannot get the decision until the coordinator is replaced, which can leave the participant in the uncertain state for a long time.
Timeouts are used because the exchange of information can fail when one of the servers crashes or when messages are lost. With timeouts, a process does not block forever.
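The behaviour of an uncertain participant can be sketched as a retry loop around getDecision. The function name await_outcome, the max_attempts parameter and the use of TimeoutError to model an unreachable coordinator are assumptions made for this illustration only.

```python
def await_outcome(ask_coordinator, max_attempts=3):
    """Return 'committed' or 'aborted', or 'uncertain' if no reply ever arrives.

    ask_coordinator stands in for getDecision(trans); it returns True (commit)
    or False (abort), or raises TimeoutError when the coordinator is unreachable.
    """
    for _ in range(max_attempts):
        try:
            decision = ask_coordinator()
        except TimeoutError:
            continue  # no reply within the timeout; ask again
        return "committed" if decision else "aborted"
    # The participant may not decide on its own: it stays uncertain and
    # keeps its objects locked until the coordinator is available again.
    return "uncertain"
```

The timeout stops the call itself from blocking forever, but it does not let the participant leave the uncertain state without the coordinator's decision.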
Performance of two-phase commit protocol
Transaction T                   Transaction U
Write(A) at X, locks A
                                Write(B) at Y, locks B
Read(B) at Y, waits for U
                                Read(A) at X, waits for T

T is before U at server X, and U is before T at server Y. These different orderings can lead to cyclic dependencies between transactions, and a distributed deadlock arises.
Distributed Deadlock
U                              V                              W
d.deposit(10)  lock D at Z
                               b.deposit(10)  lock B at Y
a.deposit(20)  lock A at X
                                                              c.deposit(30)  lock C at Z
b.withdraw(30) wait at Y
                               c.withdraw(20) wait at Z
                                                              a.withdraw(20) wait at X
U, V and W are transactions.
Objects A and B are managed by servers X and Y respectively.
Objects C and D are managed by server Z.
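Figure 14.14
Distributed deadlock
(a) Wait-for graph with objects and servers: W waits for A (at X), held by U; U waits for B (at Y), held by V; V waits for C (at Z), held by W; D (at Z) is held by U.
(b) The resulting cycle: U -> V -> W -> U.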
Figure 14.14
Local and global wait-for graphs
Local wait-for graphs: server X holds W -> U, server Y holds U -> V, server Z holds V -> W. Together they form the global cycle U -> V -> W -> U.
The global wait-for graph is held in part by each of the several servers involved. Communication between these servers is required to find cycles in the graph.
Simple solution: one server takes on the role of global deadlock detector. From time to time, each server sends it the latest copy of its local wait-for graph.
Disadvantages: poor availability, lack of fault tolerance and no ability to scale. The cost of frequently transmitting local wait-for graphs is high.
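A centralized detector of this kind can be sketched as follows. The function names and the dictionary layout are assumptions for illustration; the edges are the local wait-for graphs of the U, V, W example, with each edge written as (waiting transaction, holding transaction).

```python
def merge_local_graphs(local_graphs):
    """Union the wait-for edges reported by each server into one global graph."""
    global_graph = {}
    for edges in local_graphs.values():
        for waiter, holder in edges:
            global_graph.setdefault(waiter, set()).add(holder)
    return global_graph


def find_cycle(graph):
    """Return one cycle as a list of transactions, or None (depth-first search)."""
    visiting, done = [], set()

    def visit(node):
        if node in visiting:  # reached a node already on the current path
            return visiting[visiting.index(node):] + [node]
        if node in done:
            return None
        visiting.append(node)
        for nxt in sorted(graph.get(node, ())):
            cycle = visit(nxt)
            if cycle:
                return cycle
        visiting.pop()
        done.add(node)
        return None

    for start in sorted(graph):
        cycle = visit(start)
        if cycle:
            return cycle
    return None


# Latest local wait-for graphs as sent by servers X, Y and Z.
local_graphs = {"X": [("W", "U")], "Y": [("U", "V")], "Z": [("V", "W")]}
cycle = find_cycle(merge_local_graphs(local_graphs))
```

No single server sees a cycle in its own graph; the cycle only appears once the detector has merged all three.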
Phantom deadlock
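Local wait-for graph at X: T -> U. Local wait-for graph at Y: V -> T. The global deadlock detector holds both edges.
Suppose U then releases an object at X and requests an object held by V (U -> V). If the detector receives the new edge U -> V before it learns of the release at X, it sees the cycle T -> U -> V -> T and reports a deadlock. However, the edge from T to U no longer exists, so the deadlock is a phantom.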
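The effect of update ordering can be reproduced with a small sketch. The names are hypothetical, and two simplifications are assumed: each transaction waits for at most one other, and the new edge U -> V is reported by server Y.

```python
def has_cycle(edges):
    """True if the wait-for edges contain a cycle.

    Assumes each transaction waits for at most one other transaction.
    """
    waits_for = dict(edges)
    for start in waits_for:
        seen, node = set(), start
        while node in waits_for and node not in seen:
            seen.add(node)
            node = waits_for[node]
        if node in seen:
            return True
    return False


def all_edges(view):
    return [edge for edges in view.values() for edge in edges]


# The detector's copy of each server's local wait-for graph.
detector_view = {"X": [("T", "U")], "Y": [("V", "T")]}

# U releases its object at X and requests one held by V.
# Y's update reaches the detector first; X's copy is now stale.
detector_view["Y"] = [("V", "T"), ("U", "V")]
phantom = has_cycle(all_edges(detector_view))

# Once X's update arrives, the edge T -> U disappears and so does the cycle.
detector_view["X"] = []
after_update = has_cycle(all_edges(detector_view))
```

Here phantom is True and after_update is False: the detector reported a deadlock that never existed at any single moment.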
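Figure: probes transmitted to detect deadlock. Initiation at server X with the probe W -> U; the probe is forwarded as W -> U -> V and then W -> U -> V -> W, at which point the deadlock is detected at server Z.
Three steps: initiation, detection and resolution.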
At about the same time, T starts waiting for U (T -> U) and W starts waiting for V (W -> V). Two probes are initiated, and the deadlock may be detected twice, by different servers.
Priorities can also reduce the number of probes. For example, a probe is initiated only when a higher-priority transaction starts to wait for a lower-priority transaction.
If the priority order from high to low is T, U, V, W, then only the probe for T -> U is sent, not the probe for W -> V.
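Probe forwarding and the priority rule can be sketched together. Everything here is illustrative: the function names are hypothetical, each transaction is assumed to wait for at most one other, and the pre-existing edges U -> W and V -> T are an assumption added so that the two new edges T -> U and W -> V close a cycle.

```python
def chase_probe(waits_for, waiter, holder):
    """Forward a probe along wait-for edges, starting from waiter -> holder.

    Returns the path if the probe comes back to the initiating transaction
    (deadlock detected), otherwise None.
    """
    path = [waiter, holder]
    while path[-1] in waits_for:
        nxt = waits_for[path[-1]]
        path.append(nxt)
        if nxt == path[0]:
            return path  # the probe returned to its origin
        if nxt in path[:-1]:
            return None  # a cycle that does not include the initiator
    return None


def should_initiate(waiter, holder, priority):
    """Priority rule: probe only when a higher-priority transaction
    starts to wait for a lower-priority one."""
    return priority.index(waiter) < priority.index(holder)


# Assumed existing edges U -> W and V -> T, plus the new edges T -> U and W -> V.
waits_for = {"U": "W", "V": "T", "T": "U", "W": "V"}
priority = ["T", "U", "V", "W"]  # highest priority first

new_edges = [("T", "U"), ("W", "V")]
without_priority = [chase_probe(waits_for, w, h) for w, h in new_edges]
with_priority = [chase_probe(waits_for, w, h)
                 for w, h in new_edges if should_initiate(w, h, priority)]
```

Without priorities both probes travel round the cycle and both detect it; with the priority rule only the probe started by T -> U is sent.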
Transaction recovery
Type of entry         Description of contents of entry
Object                A value of an object.
Transaction status    Transaction identifier, transaction status (prepared, committed, aborted) and other status values used for the two-phase commit protocol.
Intentions list       Transaction identifier and a sequence of intentions, each of which consists of <identifier of object>, <position in recovery file of value of object>.
The server keeps an intentions list for each of its currently active transactions. The intentions list of a particular transaction contains the references and the values of all the objects that the transaction has altered. When the transaction commits, the committed version of each object is replaced by the tentative version made by that transaction. When a transaction aborts, the server uses the intentions list to delete all the tentative versions of objects made by that transaction.
When a participant says it is prepared to commit, its recovery manager must have saved both its intentions list for that transaction and the objects in that intentions list in its recovery file, so that it can carry out the commitment later on, even if it crashes in the interim.
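How a server might use intentions lists at commit and abort can be sketched in memory. The Server class and its attribute names are hypothetical, and nothing is written to a recovery file here; the account values are borrowed from the banking log example.

```python
class Server:
    """Hypothetical server holding committed and tentative object versions."""

    def __init__(self, committed):
        self.committed = dict(committed)  # object id -> committed value
        self.tentative = {}               # (trans, object id) -> tentative value
        self.intentions = {}              # trans -> ids of objects it altered

    def write(self, trans, obj, value):
        self.tentative[(trans, obj)] = value
        ids = self.intentions.setdefault(trans, [])
        if obj not in ids:
            ids.append(obj)

    def commit(self, trans):
        # Replace each committed version by the transaction's tentative version.
        for obj in self.intentions.pop(trans, []):
            self.committed[obj] = self.tentative.pop((trans, obj))

    def abort(self, trans):
        # Delete every tentative version made by the transaction.
        for obj in self.intentions.pop(trans, []):
            del self.tentative[(trans, obj)]


server = Server({"A": 100, "B": 200, "C": 300})
server.write("T", "A", 80)
server.write("T", "B", 220)
server.write("U", "C", 278)
server.write("U", "B", 242)
server.commit("T")
server.abort("U")
```

After these calls the committed values are A = 80, B = 220 and C = 300, and no tentative versions remain.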
Figure 14.19
Log for banking service
Position  Entry
P0        Checkpoint: Object:A 100, Object:B 200, Object:C 300
P1        Object:A 80
P2        Object:B 220
P3        Trans:T prepared, intentions <A, P1> <B, P2>, previous status entry at P0
P4        Trans:T committed, previous status entry at P3
P5        Object:C 278
P6        Object:B 242
P7        Trans:U prepared, intentions <C, P5> <B, P6>, previous status entry at P4
End of log
With the logging technique, the recovery file contains a history of all the transactions performed by a server. The recovery manager is called whenever a transaction prepares, commits or aborts. When a transaction is prepared, it appends all the objects in the transaction's intentions list to the log, followed by the transaction's current status; a later commit or abort is recorded by appending the new status. After a crash, any transaction that does not have a committed status in the log is aborted.
Each transaction status entry contains a pointer to the position in the recovery file of the previous transaction status entry, so that the recovery manager can follow the transaction status entries in reverse order. The last pointer in the chain points to the checkpoint.
Recovery of objects
When a server is replaced after a crash, it first sets default initial values for its objects and then hands over to its recovery manager, which is responsible for restoring the server’s objects so that they include all the effects of all committed transactions, in the correct order, and none of the effects of aborted transactions. There are two approaches:
Forwards: starting from the beginning of the most recent checkpoint, the recovery manager reads in the values of each of the objects and, for committed transactions, replaces the values of the objects.
Backwards: the recovery manager reads the recovery file backwards, using transactions with committed status to restore those objects that have not yet been restored. It continues until it has restored all of the server’s objects. The advantage is that each object is restored once only.
In the log of Figure 14.19, U has no committed status and is treated as aborted, so its entries for C and B are ignored; A and B are then restored as 80 and 220 from T, and C as 300 from the checkpoint.
Reorganizing the log file: checkpointing writes the current committed values of all objects to a new recovery file, since the committed values are all that is needed.
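The backward approach can be sketched on the log of Figure 14.19. The tuple layout of the log entries and the recover function are assumptions for illustration, and the sketch assumes the last entry in the log is a transaction status entry.

```python
# Each position Pn is a list index. Status entries carry the intentions
# (object id -> position of its value) and a pointer to the previous status entry.
log = [
    ("checkpoint", {"A": 100, "B": 200, "C": 300}),    # P0
    ("object", "A", 80),                               # P1
    ("object", "B", 220),                              # P2
    ("status", "T", "prepared", {"A": 1, "B": 2}, 0),  # P3
    ("status", "T", "committed", None, 3),             # P4
    ("object", "C", 278),                              # P5
    ("object", "B", 242),                              # P6
    ("status", "U", "prepared", {"C": 5, "B": 6}, 4),  # P7
]


def recover(log):
    """Read status entries backwards and restore each object once."""
    restored, committed = {}, set()
    pos = len(log) - 1  # assumed to be the most recent status entry
    while log[pos][0] == "status":
        _, trans, status, intentions, previous = log[pos]
        if status == "committed":
            committed.add(trans)
        elif status == "prepared" and trans in committed:
            for obj, where in intentions.items():
                # setdefault keeps the value restored first, i.e. the most recent.
                restored.setdefault(obj, log[where][2])
        # A prepared entry with no committed status is treated as aborted.
        pos = previous
    # pos now refers to the checkpoint: fill in objects not yet restored.
    for obj, value in log[pos][1].items():
        restored.setdefault(obj, value)
    return restored


objects = recover(log)
```

For this log the result is A = 80, B = 220 and C = 300, which matches the walkthrough in the text: U is ignored, T supplies A and B, and the checkpoint supplies C.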
Figure 14.21
Log with entries relating to two-phase commit protocol
Entries in log order:
Trans:T prepared, intentions list
Coord’r:T, part’pant list: . . .
Trans:T committed
Trans:U prepared, intentions list
Part’pant:U, Coord’r: . . .
Trans:U uncertain
Trans:U committed
The coordinator uses prepared before the vote, committed or aborted to record that the outcome of the vote is Yes or No, and done to indicate that the two-phase commit protocol is complete.
A participant uses prepared to indicate that it has not yet voted and can still abort the transaction, uncertain to indicate that it has voted Yes but does not yet know the outcome, and committed to indicate that it has finished.
In the example above, this server plays the coordinator role for transaction T and the participant role for transaction U.