5-Transaction Processing

The document outlines the principles of distributed database systems, focusing on distributed transaction processing, concurrency control, and reliability. It discusses transaction characteristics, including atomicity, consistency, isolation, and durability, as well as various concurrency control algorithms such as locking-based and timestamp-based methods. Additionally, it covers deadlock detection techniques and the importance of ensuring serializability in distributed environments.

Principles of Distributed Database Systems
M. Tamer Özsu
Patrick Valduriez

© 2020, M.T. Özsu & P. Valduriez 1


Outline
◼ Distributed Transaction Processing
❑ Distributed Concurrency Control
❑ Distributed Reliability

© 2020, M.T. Özsu & P. Valduriez 2


Transaction

A transaction is a collection of actions that make consistent transformations of system states while preserving system consistency. It provides:
❑ concurrency transparency
❑ failure transparency

© 2020, M.T. Özsu & P. Valduriez 3


Transaction Characterization

Begin_transaction

Read
Read

Write
Read

Commit
◼ Read set (RS)
❑ The set of data items that are read by a transaction
◼ Write set (WS)
❑ The set of data items whose values are changed by this transaction
◼ Base set (BS)
❑ RS ∪ WS (the union of the read set and the write set)

© 2020, M.T. Özsu & P. Valduriez 4


Principles of Transactions

ATOMICITY
❑ all or nothing

CONSISTENCY
❑ no violation of integrity constraints

ISOLATION
❑ concurrent changes invisible ⇒ serializable

DURABILITY
❑ committed updates persist

© 2020, M.T. Özsu & P. Valduriez 5


Transactions Provide…

◼ Atomic and reliable execution in the presence of failures
◼ Correct and fast execution in the presence of multiple user accesses
◼ Correct management of replicas (if replication is supported)

© 2020, M.T. Özsu & P. Valduriez 6


Distributed TM Architecture

© 2020, M.T. Özsu & P. Valduriez 7


Outline
◼ Distributed Transaction Processing
❑ Distributed Concurrency Control

© 2020, M.T. Özsu & P. Valduriez 8


Concurrency Control

◼ The problem of synchronizing concurrent transactions such that the consistency of the database is maintained while, at the same time, the maximum degree of concurrency is achieved.
◼ Enforce isolation property
◼ Anomalies:
❑ Lost updates
◼ The effects of some transactions are not reflected on the database.
❑ Inconsistent retrievals
◼ A transaction, if it reads the same data item more than once, should
always read the same value.

© 2020, M.T. Özsu & P. Valduriez 9


Conflict Operations

◼ Two actions are said to be in conflict (conflicting pair) if:


1. The actions belong to different transactions.

2. At least one of the actions is a write operation.

3. The actions access the same object (read or write).

◼ The following set of actions is conflicting:


❑ R1(X), W2(X), W3(X) (3 conflicting pairs)

◼ While the following sets of actions are not:


❑ R1(X), R2(X), R3(X)

❑ R1(X), W2(Y), R3(X)

© 2020, M.T. Özsu & P. Valduriez 10
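To make the three conditions concrete, here is a minimal sketch (not from the slides; the tuple encoding of operations and the function name are assumptions) that tests whether two operations conflict and counts the conflicting pairs in the first example above.

```python
# Sketch of the conflict test described on the previous slide (assumed encoding).
def conflicts(op1, op2):
    """op = (transaction_id, action, item), e.g. (1, 'W', 'x') for W1(x)."""
    t1, a1, x1 = op1
    t2, a2, x2 = op2
    return (t1 != t2                 # different transactions
            and x1 == x2             # same data item
            and 'W' in (a1, a2))     # at least one of them is a write

# R1(X), W2(X), W3(X): every pair involving a write on the same item conflicts.
ops = [(1, 'R', 'x'), (2, 'W', 'x'), (3, 'W', 'x')]
pairs = [(a, b) for i, a in enumerate(ops) for b in ops[i+1:] if conflicts(a, b)]
print(len(pairs))  # 3 conflicting pairs
```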


Example of Conflict Equivalence

S1: r1(x) r2(x) w2(x) r1(y) w1(y)
S2: r1(x) r2(x) r1(y) w2(x) w1(y)
S3: r1(x) r1(y) r2(x) w2(x) w1(y)
S4: r1(x) r1(y) r2(x) w1(y) w2(x)
S5: r1(x) r1(y) w1(y) r2(x) w2(x)

All five schedules have the same set of conflicting operations. In S1 and S5 the conflicting operations are ordered in the same way, so:
  S1 is equivalent to S5
  S5 is the serial schedule T1, T2
  S1 is serializable
  S1 is not equivalent to the serial schedule T2, T1
11
Serializable Schedules
A schedule is said to be serializable when the schedule is conflict-equivalent to one or more serial schedules.

T1: begin transaction; read(x, X); X = X + 4; write(x, X); commit
T2: begin transaction; read(x, Y); write(y, Y); commit

Starting from the initial state x=1, y=3, each of the following schedules produces the final state x=5, y=1:
  r2(x) w2(y) r1(x) w1(x)   (the serial schedule T2, T1)
  r1(x) r2(x) w2(y) w1(x)
  r2(x) r1(x) w2(y) w1(x)
  r2(x) r1(x) w1(x) w2(y)
  r1(x) r2(x) w1(x) w2(y)
12
Serialization Graph of a Schedule, S

◼ Nodes represent transactions


◼ There is a directed edge from node Ti to node Tj if Ti
has an operation pi,k that conflicts with an operation pj,r
of Tj and pi,k precedes pj,r in S
◼ Theorem - A schedule is serializable if and only if its
serialization graph has no cycles

13
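The theorem suggests a direct serializability check: build the serialization graph from a schedule and test it for cycles. The sketch below is illustrative only; the schedule encoding and function names are assumptions, not the book's notation.

```python
from collections import defaultdict

def serialization_graph(schedule):
    """schedule: list of (transaction_id, action, item) in execution order."""
    edges = defaultdict(set)
    for i, (ti, ai, xi) in enumerate(schedule):
        for tj, aj, xj in schedule[i+1:]:
            if ti != tj and xi == xj and 'W' in (ai, aj):
                edges[ti].add(tj)   # Ti's operation precedes a conflicting operation of Tj
    return edges

def has_cycle(edges):
    WHITE, GREY, BLACK = 0, 1, 2
    color = defaultdict(int)
    def visit(n):
        color[n] = GREY
        for m in edges[n]:
            if color[m] == GREY or (color[m] == WHITE and visit(m)):
                return True
        color[n] = BLACK
        return False
    return any(color[n] == WHITE and visit(n) for n in list(edges))

# S1: r1(x) r2(x) w2(x) r1(y) w1(y)  -> acyclic graph, hence serializable
S1 = [(1, 'R', 'x'), (2, 'R', 'x'), (2, 'W', 'x'), (1, 'R', 'y'), (1, 'W', 'y')]
print(has_cycle(serialization_graph(S1)))  # False
```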
Example of Serialization Graph

S: ..., p1,i, ..., p2,j, ...  (an edge Ti → Tj is drawn when an operation of Ti precedes a conflicting operation of Tj in S)

Graph 1 (nodes T1–T7, acyclic): S is serializable, e.g. in the order T1, T2, T3, T4, T5, T6, T7.
Graph 2 (nodes T1–T7, containing the cycle T2 → T6 → T7 → T2): S is not serializable.
14
Serializability in Distributed DBMS

◼ Two histories have to be considered:


❑ local histories
❑ global history

◼ For global transactions (i.e., global history) to be


serializable, two conditions are necessary:
❑ Each local history should be serializable → local serializability
❑ Two conflicting operations should be in the same relative order
in all of the local histories where they appear together →
global serializability

© 2020, M.T. Özsu & P. Valduriez 15


Global Serializability

T1: Read(x); x ← x - 100; Write(x); Read(y); y ← y + 100; Write(y); Commit
T2: Read(x); Read(y); Commit

◼ x stored at Site 1, y stored at Site 2


◼ LH1, LH2 are individually serializable (in fact serial), and the
two transactions are globally serializable.
LH1={R1(x), W1(x), R2(x)}
LH2={R1(y), W1(y), R2(y)}
© 2020, M.T. Özsu & P. Valduriez 16
Global Non-serializability

T1: Read(x); x ← x - 100; Write(x); Read(y); y ← y + 100; Write(y); Commit
T2: Read(x); Read(y); Commit

◼ x stored at Site 1, y stored at Site 2


◼ LH1, LH2 are individually serializable (in fact serial), but the
two transactions are not globally serializable.
LH1={R1(x),W1(x), R2(x)}
LH2={R2(y), R1(y),W1(y)}
© 2020, M.T. Özsu & P. Valduriez 17
Concurrency Control Algorithms

◼ Pessimistic Algorithms
❑ Locking-based Algorithms
◼ Centralized (primary site) 2PL (Two-Phase
Locking)
◼ Distributed 2PL
❑ Timestamp-based Algorithms
◼ Basic TO (Timestamp Ordering)
◼ Conservative TO
❑ Multiversion Concurrency Control

◼ Optimistic Algorithms

© 2020, M.T. Özsu & P. Valduriez 18


Locking-Based Algorithms

◼ Transactions indicate their intentions by requesting locks


from the scheduler (called lock manager).
◼ Locks are either read lock (rl) [also called shared lock] or
write lock (wl) [also called exclusive lock]
◼ Read locks and write locks conflict (because Read and
Write operations are incompatible):

        rl    wl
  rl    yes   no
  wl    no    no

◼ Locking works nicely to allow concurrent processing of
transactions (a small compatibility-check sketch follows below).
© 2020, M.T. Özsu & P. Valduriez 19
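As a concrete illustration of the rl/wl compatibility matrix, here is a minimal sketch of the check a lock manager could perform before granting a request. The table encoding and the function name are assumptions made for illustration, not the book's code.

```python
# Compatibility matrix from the slide: only two read locks are compatible.
COMPATIBLE = {('rl', 'rl'): True,  ('rl', 'wl'): False,
              ('wl', 'rl'): False, ('wl', 'wl'): False}

def can_grant(requested_mode, held_modes):
    """Grant only if the requested lock is compatible with every lock already held on the item."""
    return all(COMPATIBLE[(held, requested_mode)] for held in held_modes)

print(can_grant('rl', ['rl', 'rl']))  # True: multiple readers can share the item
print(can_grant('wl', ['rl']))        # False: a write lock conflicts with an existing read lock
```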
Two-Phase Locking (2PL)
◼ Transaction does not release a lock until it
has all the locks it will ever require.
◼ Transaction has a locking phase followed by
an unlocking phase

Ts first unlock


Number
of locks T commits
held by T

time

◼ Guarantees serializability when locking is


done in this way
20
Centralized 2PL (C2PL)
◼ There is only one Coordinating TM, the lock manager at the central site,
and the data processors (DP) at the other participating sites.
◼ The participating sites are those that store the data items on which the
operation is to be carried out.
◼ Lock requests are issued to the Coordinating TM.

© 2020, M.T. Özsu & P. Valduriez 21


Distributed 2PL (D2PL)
◼ Lock managers are placed at each site. Each scheduler
handles lock requests for data at that site. The
distributed 2PL is similar to the C2PL, with two major
modifications.
◼ The messages that are sent to the central site lock
manager in C2PL are sent to the lock managers at all
participating sites in D2PL.
◼ The second difference is that the operations are not
passed to the data processors by the coordinating
transaction manager, but by the participating lock
managers.
❑ This means that the coordinating transaction manager does not
wait for a “lock request granted” message.
© 2020, M.T. Özsu & P. Valduriez 22
Distributed 2PL Execution

© 2020, M.T. Özsu & P. Valduriez 23


Deadlock
◼ A transaction is deadlocked if it is blocked and will
remain blocked until there is intervention.
◼ Locking-based CC algorithms may cause deadlocks.
◼ TO-based algorithms that involve waiting may cause
deadlocks.
◼ Wait-for graph
❑ If transaction Ti waits for another transaction Tj to release a lock
on an entity, then Ti → Tj in WFG.


© 2020, M.T. Özsu & P. Valduriez 24


Local versus Global WFG
◼ T1 and T2 run at site 1, T3 and T4 run at site 2.
◼ T3 waits for a lock held by T4 which waits for a lock held by T1 which
waits for a lock held by T2 which, in turn, waits for a lock held by T3.

Local WFG

Global WFG

© 2020, M.T. Özsu & P. Valduriez 25


Deadlock Detection

◼ Transactions are allowed to wait freely.


◼ Wait-for graphs and cycles.
◼ Topologies for deadlock detection algorithms
❑ Centralized
❑ Hierarchical
❑ Distributed

© 2020, M.T. Özsu & P. Valduriez 26


Centralized Deadlock Detection
◼ One site is designated as the deadlock detector for the
system.
◼ Each scheduler periodically sends its local WFG to the
central site which merges them to a global WFG to
determine cycles.
◼ How often to transmit?
❑ Too often ⇒ higher communication cost but lower
delays due to undetected deadlocks
❑ Too seldom ⇒ higher delays due to undetected deadlocks, but lower
communication cost
◼ Would be a reasonable choice if the concurrency control
algorithm is also centralized.
◼ Proposed for Distributed INGRES
© 2020, M.T. Özsu & P. Valduriez 27
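A sketch of what the central deadlock detector does: merge the local WFGs it receives into a global WFG and search it for a cycle. The graph representation and function names are assumptions; the example edges reproduce the four-transaction deadlock from the "Local versus Global WFG" slide.

```python
# Merge local wait-for graphs and look for a cycle (a deadlock).
def merge_wfgs(local_wfgs):
    gwfg = {}
    for wfg in local_wfgs:                      # each wfg: {Ti: set of Tj that Ti waits for}
        for t, waits_for in wfg.items():
            gwfg.setdefault(t, set()).update(waits_for)
    return gwfg

def find_deadlock(gwfg):
    """Return a cycle (list of transactions) if one exists, else None."""
    def dfs(node, path, on_path):
        for nxt in gwfg.get(node, ()):
            if nxt in on_path:
                return path[path.index(nxt):] + [nxt]
            found = dfs(nxt, path + [nxt], on_path | {nxt})
            if found:
                return found
        return None
    for start in gwfg:
        cycle = dfs(start, [start], {start})
        if cycle:
            return cycle
    return None

# T1, T2 run at site 1; T3, T4 run at site 2; T1 -> T2 -> T3 -> T4 -> T1.
site1 = {'T1': {'T2'}, 'T2': {'T3'}}
site2 = {'T3': {'T4'}, 'T4': {'T1'}}
print(find_deadlock(merge_wfgs([site1, site2])))  # ['T1', 'T2', 'T3', 'T4', 'T1']
```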
Hierarchical Deadlock Detection
• An alternative to centralized deadlock detection is the building of a
hierarchy of deadlock detectors (see Fig. below).
• Deadlocks that are local to a single site would be detected at that
site using the LWFG.
• Each site also sends its LWFG to the deadlock detector at the next
level.
• For example, a deadlock at site 1
would be detected by the local
deadlock detector (DD) at site 1
(denoted DD21, 2 for level 2, 1 for
site 1).
• If, however, the deadlock
involves sites 1 and 2, then DD11
detects it.
• Finally, if the deadlock involves
sites 1 and 4, DD0x detects it,
where x is one of 1, 2, 3, or 4.
© 2020, M.T. Özsu & P. Valduriez 28
Distributed Deadlock Detection
◼ There are local deadlock detectors at each site that communicate
their LWFGs with one another. The LWFG at each site is formed and
is modified as follows:
1. Since each site receives the potential deadlock cycles from other
sites, these edges are added to the LWFGs.
2. The edges in the LWFG show that local transactions are waiting
for transactions at other sites.

© 2020, M.T. Özsu & P. Valduriez 29


Distributed Deadlock Detection

◼ If there is a cycle that does not include the external edges,


there is a local deadlock that can be handled locally.
◼ If, on the other hand, there is a cycle involving these external
edges, there is a potential distributed deadlock and this cycle
information has to be communicated to other deadlock
detectors.
◼ In the case of Example, the possibility of such a distributed
deadlock is detected by both sites.

© 2020, M.T. Özsu & P. Valduriez 30


Concurrency Control Algorithms

◼ Pessimistic Algorithms
❑ Locking-based Algorithms
◼ Centralized (primary site) 2PL (Two-Phase
Locking)
◼ Distributed 2PL
❑ Timestamp-based Algorithms
◼ Basic TO (Timestamp Ordering)
◼ Conservative TO
❑ Multiversion Concurrency Control

◼ Optimistic Algorithms

© 2020, M.T. Özsu & P. Valduriez 31


Timestamp Ordering

◼ Transaction Ti is assigned a globally unique timestamp ts(Ti) (using the system clock).
◼ The transaction manager attaches the timestamp to all operations issued by the transaction.
◼ Each data item is assigned a write timestamp (wts) and a read timestamp (rts):
❑ rts(x) = largest timestamp of any read on x
❑ wts(x) = largest timestamp of any write on x
◼ Conflicting operations are resolved by timestamp order.

© 2020, M.T. Özsu & P. Valduriez 32


Basic Timestamp Ordering

◼ Two conflicting operations Oij of Ti and Okl of Tk →


❑ Oij executed before Okl iff ts(Ti) < ts(Tk).
❑ Ti is called older transaction
❑ Tk is called younger transaction

for Ri(x):
    if ts(Ti) < wts(x) then reject Ri(x)
    else accept Ri(x) and set rts(x) ← ts(Ti)

for Wi(x):
    if ts(Ti) < rts(x) or ts(Ti) < wts(x) then reject Wi(x)
    else accept Wi(x) and set wts(x) ← ts(Ti)

(A small code sketch of these rules follows below.)

© 2020, M.T. Özsu & P. Valduriez 33
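The accept/reject rules above translate almost directly into code. A minimal sketch follows, with assumed data structures; note that it updates rts(x) with max(...) so the read timestamp only ever grows, whereas the slide shows a plain assignment.

```python
# Basic TO accept/reject rules (illustrative sketch).
rts, wts = {}, {}   # read and write timestamps per data item

def read(ts_ti, x):
    if ts_ti < wts.get(x, -1):
        return 'reject'                       # Ti is too old: x was already overwritten by a younger txn
    rts[x] = max(rts.get(x, -1), ts_ti)
    return 'accept'

def write(ts_ti, x):
    if ts_ti < rts.get(x, -1) or ts_ti < wts.get(x, -1):
        return 'reject'                       # a younger transaction already read or wrote x
    wts[x] = ts_ti
    return 'accept'

print(write(5, 'x'), read(3, 'x'))  # accept reject: T3 tries to read a value written by younger T5
```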


Conservative Timestamp Ordering

◼ Basic timestamp ordering tries to execute an operation


as soon as it is accepted
❑ progressive
❑ too many restarts since there is no delaying
◼ Conservative timestamping delays each operation until
no operation with a smaller timestamp can arrive at that
scheduler.
◼ If this condition can be guaranteed, the scheduler will
never reject an operation.
◼ However, this delay introduces the possibility of
deadlocks.

© 2020, M.T. Özsu & P. Valduriez 34


Concurrency Control Algorithms

◼ Pessimistic Algorithms
❑ Locking-based Algorithms
◼ Centralized (primary site) 2PL (Two-Phase
Locking)
◼ Distributed 2PL
❑ Timestamp-based Algorithms
◼ Basic TO (Timestamp Ordering)
◼ Conservative TO
❑ Multiversion Concurrency Control

◼ Optimistic Algorithms

© 2020, M.T. Özsu & P. Valduriez 35


Multiversion Concurrency Control
(MVCC)
◼ Do not modify the values in the database, create new
values.
◼ Implemented in a number of systems: IBM DB2, Oracle,
SQL Server, SAP HANA, BerkeleyDB, PostgreSQL
◼ MVCC techniques typically use timestamps to maintain
transaction isolation
◼ Each version of a data item that is created is labeled with
the timestamp of the transaction that creates it.
◼ The idea is that each read operation accesses the
version of the data item that is appropriate for its
timestamp, thus reducing transaction aborts and restarts.
© 2020, M.T. Özsu & P. Valduriez 36
MVCC Reads

◼ A Ri(x) is translated into a read on one version of x.


❑ Find a version of x (say xv) such that ts(xv) is the largest
timestamp less than ts(Ti).

© 2020, M.T. Özsu & P. Valduriez 37


MVCC Writes
◼ A Wi(x) is translated into Wi(xw) so that ts(xw) = ts(Ti)
❑ Accepted if and only if no other transaction with a timestamp greater than ts(Ti) has read the value of a version of x (say, xr); in other words, accepted if ts(xr) < ts(xw)
❑ Rejected if the scheduler has already processed some Rj(xr) such that ts(xw) < ts(xr)
◼ The reason is that, had Wi(x) been accepted, it would create a version (xc) that Rj should have read, but did not, since the version was not available when Rj was executed
(A version-selection sketch follows below.)
© 2020, M.T. Özsu & P. Valduriez 38
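A small sketch of the version-selection logic described on the last two slides, using an assumed in-memory layout of versions and per-version read timestamps. It is illustrative only, not the book's algorithm verbatim.

```python
# Assumed layout: item -> {version timestamp: value}, and the largest reader timestamp per version.
versions = {'x': {1: 'v1', 4: 'v4', 9: 'v9'}}
read_ts  = {'x': {1: 1, 4: 6}}                 # version 4 of x was last read by a txn with ts 6

def mvcc_read(ts_ti, x):
    """Read the version of x with the largest timestamp not exceeding ts(Ti)."""
    older = [v for v in versions[x] if v <= ts_ti]
    return versions[x][max(older)] if older else None

def mvcc_write_ok(ts_ti, x):
    """Reject Wi(x) if a transaction younger than Ti already read the version Ti would overwrite."""
    older = [v for v in versions[x] if v <= ts_ti]
    if not older:
        return True
    latest = max(older)
    return read_ts.get(x, {}).get(latest, -1) <= ts_ti

print(mvcc_read(6, 'x'))       # 'v4': version 4 is the largest version timestamp <= 6
print(mvcc_write_ok(5, 'x'))   # False: version 4 was already read at ts 6 > 5
```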
Concurrency Control Algorithms

◼ Pessimistic Algorithms
❑ Locking-based Algorithms
◼ Centralized (primary site) 2PL (Two-Phase
Locking)
◼ Distributed 2PL
❑ Timestamp-based Algorithms
◼ Basic TO (Timestamp Ordering)
◼ Conservative TO
❑ Multiversion Concurrency Control

◼ Optimistic Algorithms

© 2020, M.T. Özsu & P. Valduriez 39


Optimistic Concurrency Control
Algorithms

Pessimistic execution:  Validate → Read → Compute → Write

Optimistic execution:   Read → Compute → Validate → Write

© 2020, M.T. Özsu & P. Valduriez 40


Optimistic Concurrency Control
Algorithms
◼ Transaction execution model: divide into subtransactions
each of which execute at a site
❑ Tks: transaction Tk that executes at site s
◼ Transactions run independently at each site until they
reach the end of their read phases
◼ All subtransactions are assigned a timestamp at the end
of their read phase
◼ A validation test is performed during the validation phase. If
one subtransaction fails validation, all of them are rejected.

© 2020, M.T. Özsu & P. Valduriez 41


Optimistic CC Validation Test

◼ If all transactions Tks where ts(Tks) < ts(Tis) have completed their write phase before Tis has started its read phase, then validation succeeds
❑ Transaction executions are in serial order

© 2020, M.T. Özsu & P. Valduriez 42


Optimistic CC Validation Test

◼ If there is any transaction Tks such that ts(Tks) < ts(Tis) and which completes its write phase while Tis is in its read phase, then validation succeeds if WS(Tks) ∩ RS(Tis) = Ø
❑ Read and write phases overlap, but Tis does not read data items written by Tks

© 2020, M.T. Özsu & P. Valduriez 43


Optimistic CC Validation Test

◼ If there is any transaction Tks such that ts(Tks) < ts(Tis) and which completes its read phase before Tis completes its read phase, then validation succeeds if WS(Tks) ∩ RS(Tis) = Ø and WS(Tks) ∩ WS(Tis) = Ø
❑ They overlap, but do not access any common data items.

© 2020, M.T. Özsu & P. Valduriez 44
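The three validation tests can be combined into a single check of Tis against every older transaction. The sketch below uses assumed field names for the read/write sets and phase boundaries; it is an illustration of the rules, not the book's implementation.

```python
# Combined optimistic validation check (illustrative; field names are assumptions).
def validate(Ti, older):
    """Ti and each Tk in 'older' (ts(Tk) < ts(Ti)) carry read/write sets and phase boundaries."""
    for Tk in older:
        if Tk['write_end'] <= Ti['read_start']:
            continue                                          # rule 1: Tk finished writing before Ti began reading
        if Tk['write_end'] <= Ti['read_end']:
            if Tk['WS'] & Ti['RS']:                           # rule 2: Ti must not read anything Tk wrote
                return False
            continue
        if (Tk['WS'] & Ti['RS']) or (Tk['WS'] & Ti['WS']):    # rule 3: no common data items at all
            return False
    return True

Ti = {'read_start': 10, 'read_end': 20, 'RS': {'x'}, 'WS': {'y'}}
Tk = {'write_end': 15, 'WS': {'z'}}
print(validate(Ti, [Tk]))   # True: Tk wrote only z, which Ti neither reads nor writes
```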


Assignment #3

© 2020, M.T. Özsu & P. Valduriez 45


Outline
◼ Distributed Transaction Processing

❑ Distributed Reliability

© 2020, M.T. Özsu & P. Valduriez 46


Reliability

Problem: how to maintain the
❑ atomicity
❑ durability
properties of transactions

© 2020, M.T. Özsu & P. Valduriez 47


Types of Failures
◼ Transaction failures
❑ Transaction aborts (unilaterally or due to deadlock)
◼ System (site) failures
❑ Failure of processor, main memory, power supply, …
❑ Main memory contents are lost, but secondary storage
contents are safe
❑ Partial vs. total failure
◼ Media failures
❑ Failure of secondary storage devices → stored data is lost
❑ Head crash/controller failure
◼ Communication failures
❑ Lost/undeliverable messages
❑ Network partitioning

© 2020, M.T. Özsu & P. Valduriez 48


Distributed Reliability Protocols
◼ Distributed reliability protocols aim to maintain the atomicity and
durability of distributed transactions.
◼ Commit protocols
❑ How to execute commit command for distributed transactions.
❑ Issue: how to ensure atomicity and durability?
◼ Termination protocols
❑ If a failure occurs, how can the remaining operational sites deal with it?
❑ Non-blocking: the occurrence of failures should not force the sites to
wait until the failure is repaired to terminate the transaction.
◼ Recovery protocols
❑ When a failure occurs, how do the sites where the failure occurred deal
with it?
❑ Independent: a failed site can determine the outcome of a transaction
without having to obtain remote information.
◼ Independent recovery ⇒ non-blocking termination

© 2020, M.T. Özsu & P. Valduriez 49


Two-Phase Commit (2PC)
◼ It is a very simple and elegant protocol that ensures the
atomic commitment of distributed transactions.
◼ Coordinator: the process at the site where the transaction originates and which controls the execution
◼ Participant: the process at the other sites that participate in executing the transaction
Phase 1: The coordinator gets the participants ready to write the results into the database
Phase 2: Everybody writes the results into the database
Global Commit Rule:
❑ The coordinator aborts a transaction if and only if at least one participant votes to abort it.
❑ The coordinator commits a transaction if and only if all of the participants vote to commit it.
© 2020, M.T. Özsu & P. Valduriez 50
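The global commit rule itself is a one-line decision. The following sketch (message names are assumptions) shows the choice the coordinator makes at the start of the second phase from the votes collected in the first phase.

```python
# Global commit rule applied by the 2PC coordinator (illustrative sketch).
def global_decision(votes):
    """votes: one 'vote-commit' or 'vote-abort' per participant."""
    if all(v == 'vote-commit' for v in votes):
        return 'global-commit'     # commit only if every participant voted to commit
    return 'global-abort'          # abort if at least one participant voted to abort

print(global_decision(['vote-commit', 'vote-commit']))  # global-commit
print(global_decision(['vote-commit', 'vote-abort']))   # global-abort
```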
State Transitions in 2PC

Coordinator Participant


© 2020, M.T. Özsu & P. Valduriez 51


2PC Protocol Actions

© 2020, M.T. Özsu & P. Valduriez 52


Centralized 2PC

© 2020, M.T. Özsu & P. Valduriez 53


Linear 2PC

V-C: Vote-Commit, V-A: Vote-Abort, G-C: Global-commit, G-A: Global-abort

© 2020, M.T. Özsu & P. Valduriez 54


Distributed 2PC
◼ The coordinator sends the
prepare message to all
participants.
◼ Each participant then
sends its decision to all
the other participants (and
to the coordinator) by
means of either a “vote-
commit” or a “vote-abort”
message.

© 2020, M.T. Özsu & P. Valduriez 55


Distributed 2PC
◼ Each participant waits for
messages from all the
other participants and
makes its termination
decision according to the
global-commit rule.
◼ Obviously, there is no
need for the second
phase of the protocol,
since each participant has
independently reached
that decision at the end of
the first phase.

© 2020, M.T. Özsu & P. Valduriez 56


Dealing with Site Failures
Our aim is to develop nonblocking termination and
independent recovery protocols.
◼ Termination Protocol for 2PC
❑ It handles timeouts for both the coordinator and the participant
processes.
❑ A timeout occurs at a destination site when it cannot get an
expected message from a source site within the expected time
period.
❑ In this section, we consider that this is due to the failure of the
source site.
◼ Recovery Protocol for 2PC
❑ It handles failures of both the coordinator and the participant
processes.
◼ 3PC Protocol
© 2020, M.T. Özsu & P. Valduriez 57
Site Failures - 2PC Termination
◼ Timeout in WAIT
❑ Coordinator is waiting for the local
decisions of the participants.
❑ Cannot unilaterally commit since the
global-commit rule has not been
satisfied.
❑ Can unilaterally abort

◼ Timeout in ABORT or COMMIT


❑ Not certain that the commit or abort
procedures have been completed by
the participant sites.
❑ Thus the coordinator repeatedly
sends the “global-commit” or “global-
abort” commands to the sites that
have not yet responded, and waits
for their acknowledgement.

© 2020, M.T. Özsu & P. Valduriez 58


Site Failures - 2PC Termination
◼ Timeout in INITIAL
❑ Participant is waiting for a “prepare”
message.
❑ Coordinator must have failed in
INITIAL state
❑ Unilaterally abort

◼ Timeout in READY
❑ Participant has voted to commit the
transaction but does not know the
global decision of the coordinator.
❑ The participant cannot unilaterally
reach a decision.
❑ Stay blocked until it can learn from
someone (either the coordinator or
some other participant) the ultimate
fate of the transaction
© 2020, M.T. Özsu & P. Valduriez 59
Site Failures - 2PC Recovery

◼ Failure in INITIAL
❑ Start the commit process upon
recovery
◼ Failure in WAIT
❑ Restart the commit process
upon recovery
◼ Failure in ABORT or COMMIT
❑ Nothing special if all the acks
have been received
❑ Otherwise the termination
protocol is involved

© 2020, M.T. Özsu & P. Valduriez 60


Site Failures - 2PC Recovery

◼ Failure in INITIAL
❑ Unilaterally abort upon recovery
◼ Failure in READY
❑ The coordinator has been
informed about the local decision
❑ Treat as timeout in READY state
and invoke the termination
protocol
◼ Failure in ABORT or COMMIT
❑ These states represent the
termination conditions, so, upon
recovery, the participant does not
need to take any special action.

© 2020, M.T. Özsu & P. Valduriez 61


2PC Recovery Protocols –
Additional Cases
A site failure may occur after the
coordinator or a participant has
written a log record but before it
can send a message
◼ Coordinator site fails after
writing “begin_commit” log and
before sending “prepare”
command
❑ treat it as a failure in WAIT
state; send “prepare”
command upon recovery

© 2020, M.T. Özsu & P. Valduriez 62


2PC Recovery Protocols –
Additional Cases
A site failure may occur after the
coordinator or a participant has
written a log record but before it can
send a message
◼ Participant site fails after writing
“ready” record in log but before
“vote-commit” is sent
❑ treat it as failure in READY
state
❑ alternatively, can send “vote-
commit” upon recovery

© 2020, M.T. Özsu & P. Valduriez 63


2PC Recovery Protocols –
Additional Cases
A site failure may occur after the
coordinator or a participant has
written a log record but before it can
send a message
◼ Participant site fails after writing
“abort” record in log but before
“vote-abort” is sent
❑ no need to do anything upon
recovery

© 2020, M.T. Özsu & P. Valduriez 64


2PC Recovery Protocols –
Additional Case
◼ Coordinator site fails
after logging its final
decision record but
before sending its
decision to the
participants
❑ coordinator treats it as
a failure in COMMIT
or ABORT state
❑ participants treat it as
timeout in the READY
state
© 2020, M.T. Özsu & P. Valduriez 65
2PC Recovery Protocols –
Additional Case
◼ Participant site fails after
writing “abort” or
“commit” record in log
but before
acknowledgement is sent
❑ participant treats it as
failure in COMMIT or
ABORT state
❑ coordinator will handle
it by timeout in
COMMIT or ABORT
state
© 2020, M.T. Özsu & P. Valduriez 66
Problem With 2PC
◼ Blocking
❑ Ready implies that the
participant waits for the
coordinator
❑ If coordinator fails, site is
blocked until recovery
❑ Blocking reduces availability

◼ Independent recovery is not possible


◼ However, it is known that:
❑ Independent recovery protocols
exist only for single site failures;
❑ no independent recovery protocol
exists which is resilient to
multiple-site failures.
◼ So we search for these protocols –
3PC
© 2020, M.T. Özsu & P. Valduriez 67
Three-Phase Commit

◼ 3PC is non-blocking.
◼ A commit protocol is non-blocking iff
❑ it is synchronous within one state transition, and

❑ its state transition diagram contains

◼ no state which is “adjacent” to both a commit and an


abort state, and
◼ no non-committable state which is “adjacent” to a
commit state
◼ Adjacent: possible to go from one state to another with a single
state transition
◼ Committable: all sites have voted to commit a transaction
❑ e.g.: COMMIT state

© 2020, M.T. Özsu & P. Valduriez 68


State Transitions in 3PC
Coordinator                              Participant

◼ Add another state between the WAIT (and READY) and COMMIT states which serves as a buffer state where the process is ready to commit (if that is the final decision) but has not yet committed.
2PC Protocol Actions

© 2020, M.T. Özsu & P. Valduriez 70


3PC Protocol Actions

© 2020, M.T. Özsu & P. Valduriez 71


Network Partitioning

◼ Simple partitioning
❑ Only two partitions
◼ Multiple partitioning
❑ More than two partitions
◼ Formal bounds:
❑ There exists no non-blocking protocol that is resilient to a
network partition if messages are lost when partition occurs.
❑ There exist non-blocking protocols which are resilient to a single
network partition if all undeliverable messages are returned to
sender.
❑ There exists no non-blocking protocol that is resilient to multiple
partitions.

© 2020, M.T. Özsu & P. Valduriez 72


Independent Recovery Protocols for
Network Partitioning

◼ No general solution possible


❑ allow one group to terminate while the other is blocked
❑ improve availability
◼ How to determine which group may proceed?
❑ The group with a majority
◼ How does a group know if it has a majority?
❑ Centralized
◼ Whichever partition contains the central site should terminate the
transaction
❑ Voting-based (quorum)

© 2020, M.T. Özsu & P. Valduriez 73


Quorum Protocols
◼ The network partitioning problem is handled by the commit
protocol.
◼ Every site is assigned a vote Vi.
◼ Total number of votes in the system V
◼ Abort quorum Va, commit quorum Vc
1. Va + Vc > V where 0 ≤ Va , Vc ≤ V
2. Before a transaction commits, it must obtain a commit quorum
Vc
3. Before a transaction aborts, it must obtain an abort quorum Va

◼ The first rule ensures that a transaction cannot be committed and


aborted at the same time.
◼ The next two rules indicate the votes that a transaction has to
obtain before it can terminate one way or the other.
© 2020, M.T. Özsu & P. Valduriez 74
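A tiny numeric illustration of the quorum rules (the vote assignment is an assumption): with V = 4, Vc = 3, and Va = 2, the rule Va + Vc > V holds, and no two partitions can reach opposite decisions.

```python
# Quorum rule check with a concrete vote assignment (illustrative).
V, Vc, Va = 4, 3, 2          # 4 sites with one vote each; Va + Vc = 5 > V = 4

def can_commit(votes_collected): return votes_collected >= Vc
def can_abort(votes_collected):  return votes_collected >= Va

# A partition holding 3 of the 4 votes can commit; the 1-vote partition can neither
# commit (1 < 3) nor abort (1 < 2), so conflicting terminations are impossible.
print(can_commit(3), can_abort(1), can_commit(1))  # True False False
```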
Paxos Consensus Protocol

◼ 2PC has blocking, and to overcome it, we have 3PC


which is expensive and not resilient to network
partitioning
◼ General problem: how to reach an agreement
(consensus) among TMs about the fate of a transaction
◼ General idea: If a majority reaches a decision, the global
decision is reached (like voting)
◼ Paxos is a family of protocols for solving consensus in a
network of unreliable or fallible processors.
◼ Consensus is the process of agreeing on one result
among a group of participants.

© 2020, M.T. Özsu & P. Valduriez 75


Paxos
◼ Roles:
❑ Proposer: recommends a decision

❑ Acceptor: decides whether to accept the proposed decision

❑ Learner: discovers the agreed-upon decision, either by asking for it or by having it pushed to it
◼ Naïve Paxos: one proposer
❑ Operates like a 2PC

◼ In the first round, the proposer suggests a value for the variable and
acceptors send their responses (accept/not accept).
◼ If the proposer gets accepts from a majority of the acceptors, then it
determines that particular value to be the value of the variable and
notifies the acceptors, who now record that value as the final one.
◼ A learner can, at any point, ask an acceptor what the value of the
variable is and learn the latest value.

© 2020, M.T. Özsu & P. Valduriez 76
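A sketch of the single-proposer ("naïve Paxos") round described above, with acceptors modeled as simple accept/reject functions. This illustrates only the majority rule; it is not a full Paxos implementation (no ballot numbers, no failure handling).

```python
# Single-proposer round: a value is chosen once a majority of acceptors accept it.
def propose(value, acceptors):
    accepts = sum(1 for acceptor in acceptors if acceptor(value))
    if accepts > len(acceptors) // 2:          # majority reached
        return value                           # value is chosen; learners can now read it
    return None                                # no decision in this round

acceptors = [lambda v: True, lambda v: True, lambda v: False]
print(propose('commit T1', acceptors))         # 'commit T1': 2 of 3 acceptors accepted
```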


Paxos & Complications
◼ Multiple proposers can put forward a value for the same
variable. Therefore, an acceptor needs to pick one of the
proposed values.
❑ using a ballot number so that acceptors can
differentiate different proposals
◼ Given multiple proposals, it is possible to get split votes
on multiple proposals with no proposed value receiving a
majority.
❑ running multiple consensus rounds—if no proposal
achieves a majority, then another round is run and
this is repeated until one value achieves majority

© 2020, M.T. Özsu & P. Valduriez 77


Paxos & Complications
◼ It is possible that some of the acceptors fail after they
accept a value. If the remaining acceptors who accepted
that value do not constitute a majority, this causes a
problem.
❑ this could be treated as the second issue and a new
round can be started.
❑ However, the complication is that some learners may
have learned the accepted value from acceptors in the
previous round, and if a different value is chosen in the
new round, we have inconsistency.
❑ Paxos deals with this again by using ballot numbers.

© 2020, M.T. Özsu & P. Valduriez 78


Basic Paxos with Failures

◼ Some acceptors fail but there is quorum


❑ Not a problem

◼ Enough acceptors fail to eliminate quorum


❑ Run a new ballot

◼ Proposer/leader fails
❑ Choose a new leader and start a new ballot

© 2020, M.T. Özsu & P. Valduriez 79
