Quorum

Quorum based protocols

Uploaded by

Eugene Gitonga

A QUORUM-BASED COMMIT PROTOCOL

Dale Skeen

TR 82-483
February 1982

Department of Computer Science
Cornell University
Ithaca, New York 14853

Abstract

Herein, we propose a commit protocol and an associated recovery protocol that are resilient to site failures, lost messages, and network partitioning. The protocols do not require that a failure be correctly identified or even detected. The only potential effect of undetected failures is a degradation in performance. The protocols use a weighted voting scheme that supports an arbitrary degree of data replication (including none) and allows unilateral aborts by any site. This last property facilitates the integration of these protocols with concurrency control protocols. Both protocols are centralized protocols with low message overhead.

1. Introduction

A transaction is, by definition, an atomic operation on a distributed database system. Either all changes by the transaction are permanently installed in the database, in which case the transaction is said to be committed, or no changes persist, in which case the transaction is said to be aborted. It is the task of a commit protocol to ensure that a transaction is atomically executed.

In this paper we propose a commit protocol that is resilient to multiple occurrences of the following classes of benevolent failures: arbitrary site failures, lost messages, and network partitioning. It does not require that the type of failure be correctly determined; in fact, resiliency is guaranteed even if failures go undetected.

The protocol uses a weighted voting scheme to resolve conflicts during failures. When failures occur, a transaction is committed only if a minimum number of votes, called a commit quorum and denoted $V_c$, are cast for committing.
Similarly, in the presence of failures, a transaction will be aborted only if a minimum number of votes, called an abort quorum and denoted $V_a$, are cast for aborting. A commit quorum does not have to equal an abort quorum, but their sum must exceed the total number of votes.

Voting schemes have been proposed previously for transaction management. Thomas introduced a majority voting scheme to ensure consistency in a fully replicated database ([THOM79]). Gifford extended the scheme by assigning weights to sites and using quorums rather than a simple majority ([GIFF79]). The proposed protocol differs from the previous work in several important ways:

(1) It is a commit protocol, not a concurrency control scheme. It provides atomicity on a per-transaction basis. Nonetheless, it is straightforward to integrate any type of concurrency control protocol into this protocol.

(2) It allows unilateral aborts during the first phase of the transaction. A site may decide to abort for several reasons, for example, because a deadlock is detected locally.

(3) It is primarily intended for partially replicated distributed databases where a transaction can read from any copy but must update all copies.

In addition, the protocol exhibits the following properties:

(1) It is a centralized protocol and, thus, benefits from the economy of centralized protocols.

(2) In the absence of failures it is no more expensive than previously proposed protocols that are resilient only to coordinator failures (and not to a partitioning of the network).

(3) If all failures are eventually repaired, then the protocol will eventually terminate.

(4) It is a blocking protocol: operational sites must occasionally wait until a failure is repaired. This is an undesirable but necessary property exhibited by any protocol that is resilient to network partitioning ([SKEE81a]). However, the protocol can be tuned so that the frequency of blocking is low.

This paper is divided into six sections.
The second section states our assumptions and defines the terminology used in the remainder of the paper. The third section develops a resilient quorum-based commit protocol, and the fourth section develops a resilient quorum-based recovery protocol. The recovery protocol is invoked whenever a group of sites can no longer communicate with the original coordinator (either it has failed or the network has partitioned). Like the commit protocol, it is a centralized protocol. The fifth section discusses performance, and the sixth section concludes the paper.

Although the protocols proposed are resilient to many classes of failures, this paper will focus on the problem of network partitioning. This class of failures is generally agreed to be the most difficult class to handle. The other two classes, site failures and lost messages, can be cast as special cases of a partitioned network. In a site failure, a single site is isolated (partitioned) from the remainder of the network. A lost message can be viewed as a very short-lived partitioning. In all cases, the protocols work without modification.

2. Background

We assume that an underlying communications network provides point-to-point communication between any pair of sites. We also assume that it generates no spontaneous messages, and that garbled messages are detected and deleted. We do not assume that messages arrive in order, nor that the network detects lost messages.

A partitioned network occurs when there are two or more disjoint groups of sites such that no communication is possible between the groups. Each of the disjoint groups is called a partition.

A distributed transaction $T$ is decomposed into subtransactions $T_1, T_2, \ldots, T_N$, where a subtransaction is executed at one of the $N$ participating sites. Any subtransaction can be unilaterally aborted, which results in the abortion of the entire transaction. Hence, for transaction $T$ to be committed, all sites must agree to commit their subtransaction.
We assume that a subtransaction can be atomically executed by a local transaction management system ([GRAY79, LIND79]). It is the responsibility of a commit protocol to ensure that all subtransactions are consistently committed or aborted.

One of the simplest commit protocols is the two-phase protocol ([GRAY79, LAMP76]) depicted in Figure 1. The protocol uses a central site, the coordinator, to direct the execution of the transaction at the other sites. Each slave has a chance to abort the transaction by replying with a "no" in the first round.

    COORDINATOR                          SLAVE

    (1) Transaction is received.
        Subtransactions are sent
        to each slave.
                                         Subtransaction is received.
                                         A reply is sent: yes to commit,
                                         no to abort.
    (2) If all sites respond yes,
        then commit is sent;
        else, abort is sent.
                                         Either commit or abort is
                                         received and processed.

    Figure 1. The two-phase commit protocol.

    Figure 2. The state diagram for the two-phase commit protocol.

A commit protocol can be conveniently described by a set of state diagrams, one for each participating site ([SKEE81a]). The diagram for Site $i$ describes the processing of subtransaction $T_i$. A state in the diagram is called a local transaction state. In the two-phase commit protocol, a single state diagram (illustrated in Figure 2) suffices to describe processing at all sites.

For both the coordinator and the slaves, there are four distinct and easily identified local transaction states: the initial state (state q in the diagram), the wait state (w), the abort state (a), and the commit state (c). A site occupies the initial state until it decides whether to unilaterally abort the transaction. If the site decides against an abort, then the wait state is entered. This state represents a period of uncertainty for the site, where it has agreed to proceed with the transaction but does not yet know its outcome (i.e., committed or aborted). The commit and abort states are self-explanatory.
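The two rounds of Figure 1 and the four local states (q, w, a, c) can be illustrated with a minimal, failure-free sketch. The names `State`, `Slave`, `will_abort`, and `two_phase_commit` are hypothetical and chosen only for this illustration; the sketch models no message loss, site failure, or partitioning.

```python
from enum import Enum


class State(Enum):
    INITIAL = "q"
    WAIT = "w"
    ABORT = "a"
    COMMIT = "c"


class Slave:
    def __init__(self, name, will_abort=False):
        self.name = name
        # Hypothetical stand-in for a local decision, e.g. a detected deadlock.
        self.will_abort = will_abort
        self.state = State.INITIAL

    def vote(self):
        # Round 1: reply "yes" to commit or "no" to abort unilaterally.
        if self.will_abort:
            self.state = State.ABORT
            return "no"
        self.state = State.WAIT
        return "yes"

    def finish(self, decision):
        # Round 2: only a slave in the wait state still needs the outcome.
        if self.state is State.WAIT:
            self.state = decision


def two_phase_commit(slaves):
    # The coordinator collects every vote, then broadcasts one decision.
    votes = [s.vote() for s in slaves]
    decision = State.COMMIT if all(v == "yes" for v in votes) else State.ABORT
    for s in slaves:
        s.finish(decision)
    return decision
```

For example, `two_phase_commit([Slave("s1"), Slave("s2", will_abort=True)])` returns `State.ABORT` and leaves both slaves in the abort state, since a single "no" vote aborts the whole transaction.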
The local transaction states of any protocol form two disjoint subsets: the committable states and the noncommittable states. A site occupies a committable state only if all sites have agreed to proceed with the transaction. For example, the only committable state in the two-phase commit protocol is the commit state. A state that is not a committable state is a noncommittable state.

3. A Resilient Commit Protocol

The two-phase commit protocol is not a very robust protocol. Whenever the coordinator fails or becomes partitioned from the slaves, the slaves must block until the failure can be repaired. In this section we develop a very resilient commit protocol that allows recovery from both of these types of failures. The section develops the commit protocol in detail; the next section discusses the associated recovery protocols for handling coordinator failures and partitioning.

Each site is assigned a nonnegative integral number of votes. (The number can be 0, in which case the site is a passive participant.) The basic idea is that whenever a group of communicating sites establishes a quorum, they are allowed to proceed. There are two distinct types of quorums: a commit quorum and an abort quorum. Let $V$ be the total number of votes, $V_c$ the number required for a commit quorum, and $V_a$ the number required for an abort quorum. A resilient quorum-based protocol must obey the following properties ([SKEE81c]):

(1) $V_a + V_c > V$, where 0 [...]

[...] $V_a$. One argument concerns protocols allowing unilateral aborts: if a significant number of transactions are unilaterally aborted, then clearly $V_a$ should be smaller. A stronger argument is that most site failures are expected to occur during Phase 1 of the commit protocol, since most of the transaction execution time is spent in Phase 1. This phase is time consuming because the majority of the data processing takes place during it, whereas Phase 2 and Phase 3 synchronize state information among the sites and require very little local processing.
If sites fail during Phase 1, then the transaction must be aborted; hence, it should be easy to abort.

An interesting heuristic for choosing $V_a$ is based on a rough estimate of the failure distribution of the sites. This heuristic is useful in environments where site failures, rather than network partitions, predominate. Let $P(V_a)$ be the probability that at least an abort quorum is operational. $P(V_a)$ is a decreasing function in $V_a$. The point is to choose the maximum $V_a$ such that $V_a \le V_c$ and $P(V_a)$ exceeds a minimum level of desired availability.

As mentioned before, the weight of a site can be zero, in which case the site contributes nothing toward forming a quorum. (However, such a site can still unilaterally abort the transaction.) When designing a protocol, a zero-weighted site can be eliminated from all phases requiring the formation of a quorum. In the extreme case, where only a single site has a nonzero weight, a quorum-based commit protocol degenerates into the standard two-phase protocol with all of its disadvantages. Specifically, all sites must block on the failure of the only nonzero-weighted site (which is normally the coordinator).

6. Conclusion

The use of quorums is a standard recovery technique for handling network partitioning (even primary site schemes, e.g. [STON79], are a degenerate case of using quorums). We have presented a very general quorum-based commit protocol that can be used with both replicated and nonreplicated data. Unlike previous schemes, it allows a single site to unilaterally abort the transaction.

Quorum-based protocols are resilient because a site is allowed to participate in only one type of quorum. Quorum sizes are carefully chosen such that the formation of both a commit and an abort quorum requires the participation of a common site. In this way mutual exclusion is assured: only one type of quorum can be formed during the execution of a transaction.
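The common-site argument can be checked by brute force on a small example. The sketch below is an illustration only: the function names and the site weights are hypothetical, and it tests just the vote-counting condition (the sum of the two quorums exceeding the total number of votes), not the protocol itself.

```python
from itertools import combinations


def satisfies_quorum_rule(weights, v_c, v_a):
    # The two quorum sizes together must exceed the total number of votes.
    return v_c + v_a > sum(weights.values())


def disjoint_quorums_exist(weights, v_c, v_a):
    # Try every group of sites as a candidate commit quorum. The sites
    # outside the group are the largest set disjoint from it, so checking
    # the complement covers every possible disjoint abort quorum.
    total = sum(weights.values())
    sites = list(weights)
    for size in range(len(sites) + 1):
        for group in combinations(sites, size):
            inside = sum(weights[s] for s in group)
            if inside >= v_c and total - inside >= v_a:
                return True
    return False


# Hypothetical weights: five votes spread over four sites.
weights = {"A": 2, "B": 1, "C": 1, "D": 1}
```

With these weights, `v_c = 3` and `v_a = 3` satisfy the rule and no disjoint pair of quorums exists, whereas `v_c = 3` and `v_a = 2` violate it: sites A and B could form a commit quorum while C and D form an abort quorum in another partition.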
(However, it is possible for multiple occurrences of a single type of quorum to be formed. For example, since abort quorums are usually small, more than one can be formed concurrently.) In such a scheme the concurrent execution of several coordinators, even if they are within the same partition, does not destroy consistency.

When a new coordinator is elected in the proposed recovery protocol, it polls all sites about their current local state. In making a commit decision, only the replies from the latest poll are used; information obtained in earlier polls is ignored. Less conservative approaches, which use previous information, can be found in [SKEE81c].

REFERENCES

[ALSB76] Alsberg, P. and Day, J., "A Principle for Resilient Sharing of Distributed Resources," Proc. 2nd International Conference on Software Engineering, San Francisco, Ca., October 1976.

[GARC79] Garcia-Molina, Hector, Ph.D. Thesis, Stanford University, 1979.

[GARC81] Garcia-Molina, Hector, "Elections in a Distributed Computing System," TR No. 280, Princeton University, December 1980.

[GIFF79] Gifford, David, "Weighted Voting for Replicated Data," Operating Systems Review, 13, 5, Dec. 1979, pp. 150-9.

[GRAY79] Gray, J. N., "Notes on Database Operating Systems," in Operating Systems: An Advanced Course, Springer-Verlag, 1979.

[HAMM79] Hammer, M. and Shipman, D., "Reliability Mechanisms for SDD-1: A System for Distributed Databases," Computer Corporation of America, Cambridge, Mass., July 1979.

[LAMP76] Lampson, B. and Sturgis, H., "Crash Recovery in a Distributed Storage System," Tech. Report, Computer Science Laboratory, Xerox PARC, Palo Alto, California, 1976.

[LIND79] Lindsay, B. G. et al., "Notes on Distributed Databases," IBM Research Report, no. RJ2571 (July 1979).

[SKEE81a] Skeen, D. and M. Stonebraker, "A Formal Model of Crash Recovery in a Distributed System," IEEE Transactions on Software Engineering, (to appear).

[SKEE81b] Skeen, D., "Nonblocking Commit Protocols," SIGMOD International Conf. on Management of Data, Ann Arbor, Michigan, 1981.

[SKEE81c] Skeen, D., "Crash Recovery in a Distributed Database System," Ph.D. Thesis, University of California, Berkeley (in preparation).

[STON79] Stonebraker, M., "Concurrency Control and Consistency of Multiple Copies in Distributed INGRES," IEEE Transactions on Software Engineering, May 1979.

[THOM79] Thomas, Robert, "A Majority Consensus Approach to Concurrency Control," Transactions on Database Systems, 4, 2, June 1979.
