
Transactions and Concurrency Notes

The document discusses transactions and concurrency control in database systems. A transaction is a unit of work that reads and writes data. For integrity, transactions must follow ACID properties - Atomicity, Consistency, Isolation, and Durability. Concurrency control schemes allow concurrent transactions by controlling their interactions to prevent inconsistencies. A schedule is the order of transaction instructions. A schedule is serializable, and thus valid, if it is equivalent to a serial schedule through swapping non-conflicting instructions.

Uploaded by

VAIBHAV SWAMI
Copyright
© © All Rights Reserved
Transactions & Concurrency Control

Transaction
Transaction Concept
 A transaction is a unit of program execution that accesses and
possibly updates various data items
 A transaction is the DBMS’s abstract view of a user program: a
sequence of reads and writes
 A transaction must see a consistent database
 During transaction execution the database may be
temporarily inconsistent
 A sequence of many actions which are considered to be one atomic
unit of work
 When the transaction completes successfully (is
committed), the database must be consistent
 After a transaction commits, the changes it has made to the
database persist, even if there are system failures
 Multiple transactions can execute in parallel
 Two main issues to deal with:
 Failures of various kinds, such as hardware failures and system
crashes
 Concurrent execution of multiple transactions
ACID Properties
 To preserve the integrity of data the database system transaction
mechanism must ensure:
 Atomicity. Either all operations of the transaction are properly
reflected in the database or none are
 Consistency. Execution of a transaction in isolation preserves the
consistency of the database
 Isolation. Although multiple transactions may execute
concurrently, each transaction must be unaware of other
concurrently executing transactions. Intermediate transaction
results must be hidden from other concurrently executed
transactions
 That is, for every pair of transactions Ti and Tj, it appears to
Ti that either Tj finished execution before Ti started, or Tj
started execution after Ti finished
 Durability. After a transaction completes successfully, the
changes it has made to the database persist, even if there are
system failures
Example of Fund Transfer
 Transaction to transfer $50 from account A to account B:
1. read(A)
2. A:=A–50
3. write(A)
4. read(B)
5. B:=B+50
6. write(B)
 Atomicity requirement – if the transaction fails after step 3
and before step 6, the system should ensure that its updates are
not reflected in the database, else an inconsistency will result.
 Consistency requirement – the sum of A and B is unchanged by
the execution of the transaction.
 Isolation requirement – if between steps 3 and 6, another
transaction is allowed to access the partially updated database, it
will see an inconsistent database (the sum A + B will be less than it
should be)
 Isolation can be ensured trivially by running transactions
serially, that is one after the other.
 However, executing multiple transactions concurrently has
significant benefits in DBMS throughput
 Durability requirement – once the user has been notified that the
transaction has completed (i.e., the transfer of the $50 has taken
place), the updates to the database by the transaction must persist
despite failures.
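The atomicity requirement can be illustrated with a minimal sketch. It assumes an in-memory dict stands in for the database and that a before-image snapshot is enough to undo a failed transfer; `transfer`, `TransferFailed`, and `fail_after_step` are hypothetical names, and a real DBMS would rely on a recovery log rather than a full copy.

```python
import copy


class TransferFailed(Exception):
    """Simulated failure in the middle of the transaction."""


def transfer(db, src, dst, amount, fail_after_step=None):
    """Run the six-step transfer; undo every update if a step fails."""
    snapshot = copy.deepcopy(db)       # before-image used for rollback (assumption)
    try:
        a = db[src]                    # 1. read(A)
        a -= amount                    # 2. A := A - 50
        db[src] = a                    # 3. write(A)
        if fail_after_step == 3:
            raise TransferFailed("simulated crash after step 3")
        b = db[dst]                    # 4. read(B)
        b += amount                    # 5. B := B + 50
        db[dst] = b                    # 6. write(B)
    except TransferFailed:
        db.clear()
        db.update(snapshot)            # atomicity: no partial update survives
        return False
    return True
```

Calling `transfer(db, "A", "B", 50, fail_after_step=3)` leaves `db` exactly as it was, so the sum A + B is unchanged whether the transfer succeeds or fails.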
Transaction States
 Active
 the initial state; the transaction stays in this state while it
is executing
 Partially committed
 after the final statement has been executed
 Failed
 after the discovery that normal execution can no longer proceed
 Aborted
 after the transaction has been rolled back and the database
restored to its state prior to the start of the transaction
 Two options after it has been aborted:
 Restart the transaction; can be done only
if no internal logical error occurred
 Kill the transaction
 Committed
 after successful completion
[Figure: transaction state diagram with the states Active, Partially Committed, Committed, Failed, Aborted]
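The five states can be written down as a small state machine. This is only a sketch: the transition set below is the usual textbook diagram and is stated here as an assumption, and `State`, `ALLOWED`, and `step` are illustrative names.

```python
from enum import Enum


class State(Enum):
    ACTIVE = "active"
    PARTIALLY_COMMITTED = "partially committed"
    COMMITTED = "committed"
    FAILED = "failed"
    ABORTED = "aborted"


# Assumed transitions: a transaction can fail while active or after its
# final statement; a failed transaction is rolled back and becomes aborted.
ALLOWED = {
    State.ACTIVE: {State.PARTIALLY_COMMITTED, State.FAILED},
    State.PARTIALLY_COMMITTED: {State.COMMITTED, State.FAILED},
    State.FAILED: {State.ABORTED},
    State.COMMITTED: set(),
    State.ABORTED: set(),
}


def step(current, target):
    """Return the new state, or raise if the transition is not allowed."""
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition {current.value} -> {target.value}")
    return target
```

Restarting an aborted transaction would be modeled as creating a new transaction in the `ACTIVE` state rather than as a transition out of `ABORTED`.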
Concurrent Executions
 Multiple transactions are allowed to run concurrently in the system.
Advantages are:
 increased processor and disk utilization, leading to better
transaction throughput: one transaction can be using the CPU while
another is reading from or writing to the disk
 reduced average response time for transactions: short
transactions need not wait behind long ones.
 Concurrency control schemes – mechanisms to achieve isolation;
that is, to control the interaction among the concurrent transactions in
order to prevent them from destroying the consistency of the database
 Will study later in this lesson
Schedules
 Schedule – a sequence of instructions that specifies the
chronological order in which instructions of concurrent
transactions are executed
 a schedule for a set of transactions must consist of all instructions of
those transactions
 must preserve the order in which the instructions appear in each
individual transaction.
 A transaction that successfully completes its execution will have a
commit instruction as the last statement (will be omitted if it is
obvious)
 A transaction that fails to successfully complete its execution
will have an abort instruction as the last statement (will be
omitted if it is obvious)
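The two conditions on a schedule (all instructions present, per-transaction order preserved) amount to saying that the schedule, projected onto each transaction, reproduces that transaction exactly. A minimal sketch, assuming instructions are plain strings and `is_valid_schedule` is a hypothetical helper:

```python
def is_valid_schedule(schedule, transactions):
    """schedule: list of (txn, instruction); transactions: {txn: [instruction, ...]}.

    Valid when projecting the schedule onto each transaction gives back
    that transaction's own instruction list, in order and in full.
    """
    seen = {txn: [] for txn in transactions}
    for txn, instruction in schedule:
        if txn not in seen:
            return False               # instruction from an unknown transaction
        seen[txn].append(instruction)
    return seen == transactions
```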
Correct Schedule Examples
 Let T1 transfer $50 from A to B, and T2 transfer 10% of the
balance from A to B.
 S1 and S2 are serial schedules: in S1, T1 precedes T2 (T1 ≺ T2); in
S2, T2 precedes T1 (T2 ≺ T1)
 Schedule S3 is not serial, but it is equivalent to Schedule S1
 All schedules preserve (A+B)

Schedule S1
T1               T2
read(A)
A := A – 50
write(A)
read(B)
B := B + 50
write(B)
                 read(A)
                 tmp := A*0.1
                 A := A – tmp
                 write(A)
                 read(B)
                 B := B + tmp
                 write(B)

Schedule S2
T1               T2
                 read(A)
                 tmp := A*0.1
                 A := A – tmp
                 write(A)
                 read(B)
                 B := B + tmp
                 write(B)
read(A)
A := A – 50
write(A)
read(B)
B := B + 50
write(B)

Schedule S3
T1               T2
read(A)
A := A – 50
write(A)
                 read(A)
                 tmp := A*0.1
                 A := A – tmp
                 write(A)
read(B)
B := B + 50
write(B)
                 read(B)
                 B := B + tmp
                 write(B)
Bad Schedule
 The following concurrent schedule does not preserve the value of (A
+ B) and violates the consistency requirement

Schedule S4
T1               T2
read(A)
A := A – 50
                 read(A)
                 tmp := A*0.1
                 A := A – tmp
                 write(A)
                 read(B)
write(A)
read(B)
B := B + 50
write(B)
                 B := B + tmp
                 write(B)
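The difference between the good schedules and Schedule S4 can be checked by simulating them. The sketch below is an illustration under stated assumptions: the starting balances (A = 1000, B = 2000) are made up, integer division by 10 stands in for the 10% computation, each transaction keeps its reads in a private workspace, and all function names are hypothetical.

```python
def run(schedule, db):
    """Execute (txn, op, arg) steps in order; return the final database."""
    db = dict(db)
    workspaces = {}                          # txn -> private local variables
    for txn, op, arg in schedule:
        ws = workspaces.setdefault(txn, {})
        if op == "read":
            ws[arg] = db[arg]
        elif op == "write":
            db[arg] = ws[arg]
        else:                                # "calc": arg updates the workspace
            arg(ws)
    return db


def debit50(ws):
    ws["A"] -= 50


def credit50(ws):
    ws["B"] += 50


def take_tenth(ws):
    ws["tmp"] = ws["A"] // 10                # integer stand-in for A*0.1
    ws["A"] -= ws["tmp"]


def give_tmp(ws):
    ws["B"] += ws["tmp"]


T1 = [("T1", "read", "A"), ("T1", "calc", debit50), ("T1", "write", "A"),
      ("T1", "read", "B"), ("T1", "calc", credit50), ("T1", "write", "B")]
T2 = [("T2", "read", "A"), ("T2", "calc", take_tenth), ("T2", "write", "A"),
      ("T2", "read", "B"), ("T2", "calc", give_tmp), ("T2", "write", "B")]

S1 = T1 + T2                                 # serial, T1 then T2
S2 = T2 + T1                                 # serial, T2 then T1
S3 = T1[:3] + T2[:3] + T1[3:] + T2[3:]       # interleaved, equivalent to S1
S4 = T1[:2] + T2[:4] + T1[2:] + T2[4:]       # the bad schedule

START = {"A": 1000, "B": 2000}
for name, s in [("S1", S1), ("S2", S2), ("S3", S3), ("S4", S4)]:
    final = run(s, START)
    print(name, final, "sum =", final["A"] + final["B"])
```

With these assumed balances, S1, S2, and S3 all end with a sum of 3000, while S4 ends with A = 950 and B = 2100: T1's write of A overwrites T2's, and T2's write of B overwrites T1's.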
Serializability
 Basic Assumption – Each transaction preserves database consistency
 Thus serial execution of a set of transactions preserves database
consistency
 A (possibly concurrent) schedule is serializable if it is
equivalent to a serial schedule.
 We ignore operations other than read and write operations (OS-level
instructions), and we assume that transactions may perform arbitrary
computations on data in local buffers in between reads and writes.
 Our simplified schedules consist of only read and write instructions.
Conflicting Instructions
 Instructions Ii and Ij of transactions Ti and Tj respectively, conflict if
and only if there exists some item Q accessed by both Ii and Ij, and at
least one of these instructions wrote Q.
1. Ii = read(Q), Ij = read(Q). Ii and Ij don’t conflict.
2. Ii = read(Q), Ij = write(Q). They conflict.
3. Ii = write(Q), Ij = read(Q). They conflict.
4. Ii = write(Q), Ij = write(Q). They conflict.
 Intuitively, a conflict between Ii and Ij forces a (logical)
temporal order between them.
 If Ii and Ij are consecutive in a schedule and they do not
conflict, their results would remain the same even if they had
been interchanged in the schedule.
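The conflict rule is a one-line predicate. In this sketch an instruction is a hypothetical `(transaction, operation, item)` tuple, and the requirement that the two instructions belong to different transactions is made explicit.

```python
def conflicts(ii, ij):
    """True when two instructions of different transactions access the
    same item and at least one of them is a write."""
    (ti, op_i, q_i), (tj, op_j, q_j) = ii, ij
    return ti != tj and q_i == q_j and "write" in (op_i, op_j)
```

Only the read/read pair on the same item returns `False`; instructions on different items never conflict.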
Serializability
 If a schedule S can be transformed into a schedule S´ by a series of
swaps of non-conflicting instructions, we say that S and S´ are conflict
equivalent.
 We say that a schedule S is serializable if it is conflict
equivalent to a serial schedule
 Schedule S3 can be transformed into S6, a serial schedule where T2
follows T1, by a series of swaps of non-conflicting instructions
 Therefore Schedule S3 is serializable

Schedule S3
T1               T2
read(A)
write(A)
                 read(A)
                 write(A)
read(B)
write(B)
                 read(B)
                 write(B)

Schedule S6
T1               T2
read(A)
write(A)
read(B)
write(B)
                 read(A)
                 write(A)
                 read(B)
                 write(B)

 Schedule S5 is not serializable:
 We are unable to swap instructions in the schedule to obtain either
the serial schedule <T3, T4>, or the serial schedule <T4, T3>.

Schedule S5
T3               T4
read(Q)
                 write(Q)
write(Q)
Serializability Example
 Swapping non-conflicting actions
 Example:
 r1, w1 – transaction 1 actions, r2, w2 – transaction 2 actions
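S = r1(A), w1(A), r2(A), w2(A), r1(B), w1(B), r2(B), w2(B)
Swap r1(B) with w2(A), then with r2(A); swap w1(B) with w2(A), then with r2(A):
S’ = r1(A), w1(A), r1(B), w1(B); r2(A), w2(A), r2(B), w2(B)
(in S’ the first four actions belong to T1 and the last four to T2)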
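The swapping argument can be mechanized. The sketch below assumes actions are `(transaction, op, item)` tuples with `op` in `{"r", "w"}` and tries to reach a given serial order using only swaps of adjacent, non-conflicting actions; `serialize_by_swaps` is a hypothetical name. If an adjacent pair is in the wrong order and conflicts, no sequence of legal swaps can ever reorder it, so the function gives up.

```python
def conflicts(a, b):
    """Different transactions, same item, at least one write."""
    return a[0] != b[0] and a[2] == b[2] and "w" in (a[1], b[1])


def serialize_by_swaps(schedule, order):
    """Return the serial schedule for `order`, or None if a swap is blocked."""
    s = list(schedule)
    rank = {txn: k for k, txn in enumerate(order)}
    swapped = True
    while swapped:
        swapped = False
        for i in range(len(s) - 1):
            a, b = s[i], s[i + 1]
            if rank[a[0]] > rank[b[0]]:      # b's transaction should come first
                if conflicts(a, b):
                    return None              # this pair can never be reordered
                s[i], s[i + 1] = b, a
                swapped = True
    return s


S = [(1, "r", "A"), (1, "w", "A"), (2, "r", "A"), (2, "w", "A"),
     (1, "r", "B"), (1, "w", "B"), (2, "r", "B"), (2, "w", "B")]
S_PRIME = [(1, "r", "A"), (1, "w", "A"), (1, "r", "B"), (1, "w", "B"),
           (2, "r", "A"), (2, "w", "A"), (2, "r", "B"), (2, "w", "B")]
```

`serialize_by_swaps(S, [1, 2])` returns `S_PRIME`, while the order `[2, 1]` is blocked at once because w1(A) and r2(A) conflict.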
Testing for Serializability
 Consider some schedule of a set of transactions T1, T2, ..., Tn
 Precedence graph – a directed graph
 The vertices are the transactions (names).
 An arc from Ti to Tj if the two transactions conflict,
and Ti accessed the data item on which the conflict arose earlier.
 We may label the arc by the item that was accessed.
 Example
[Figure: example schedule over transactions T1 to T5 (operations r(X), r(Y), r(Z), r(V), r(W), r(Y), w(Y), w(Z), r(U)) and its precedence graph, with arcs labeled by the items Y and Z]
Test for Serializability
 A schedule is serializable if and only if its
precedence graph is acyclic.
 Cycle-detection algorithms exist which
take order n² time, where n is the number
of vertices in the graph.
 Better algorithms take order n + e
where e is the number of edges
 If the precedence graph is acyclic, the
serializability order can be obtained by a
topological sorting of the graph.
 A linear ordering of nodes in which
each node precedes all nodes to which it
has outbound edges. There are one or
more topological sorts.
[Figure: a precedence graph on Ti, Tj, Tk, Tm and two of its topological orderings]
 For example, a serializability order for
the schedule from the previous slide
would be
T5 → T1 → T3 → T2 → T4
 Are there others?
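The test can be sketched with the standard library's `graphlib` module (Python 3.9+), which performs the topological sort and raises `CycleError` when the graph has a cycle. The schedule encoding, `(transaction, op, item)` with `op` in `{"r", "w"}`, and the function names are assumptions made for this example.

```python
from graphlib import CycleError, TopologicalSorter


def precedence_graph(schedule):
    """Map each transaction to the set of transactions that must precede it."""
    preds = {txn: set() for txn, _, _ in schedule}
    for i, (ti, op_i, q_i) in enumerate(schedule):
        for tj, op_j, q_j in schedule[i + 1:]:
            if ti != tj and q_i == q_j and "w" in (op_i, op_j):
                preds[tj].add(ti)            # arc Ti -> Tj: Ti accessed the item first
    return preds


def serial_order(schedule):
    """Return one serializability order, or None if the graph is cyclic."""
    try:
        return list(TopologicalSorter(precedence_graph(schedule)).static_order())
    except CycleError:
        return None


S3 = [("T1", "r", "A"), ("T1", "w", "A"), ("T2", "r", "A"), ("T2", "w", "A"),
      ("T1", "r", "B"), ("T1", "w", "B"), ("T2", "r", "B"), ("T2", "w", "B")]
S5 = [("T3", "r", "Q"), ("T4", "w", "Q"), ("T3", "w", "Q")]
```

For Schedule S3 the only arc is T1 → T2, so the order is T1 then T2; for Schedule S5 the arcs T3 → T4 and T4 → T3 form a cycle and `serial_order` returns `None`. When a graph admits several topological sorts, `static_order` yields just one of them.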
Recoverable Schedules
 Need to address the effect of transaction failures on
concurrently running transactions
 Recoverable schedule
 if a transaction Tj reads a data item previously written by a
transaction Ti , then the commit operation of Ti must appear before
the commit operation of Tj.
Schedule S11
T8               T9
read(A)
write(A)
                 read(A)
read(B)
 The schedule S11 is not recoverable if T9 commits
immediately after the read
 If T8 should abort, T9 would have read (and possibly shown to the
user) an inconsistent database state.
 DBMS must ensure that schedules are recoverable
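The recoverability condition can be checked in one pass over a schedule. This sketch assumes actions are `(transaction, op, item)` tuples with `op` in `{"r", "w", "c"}` (`"c"` is commit, with `item` set to `None`) and ignores aborts entirely; `is_recoverable` is a hypothetical helper.

```python
def is_recoverable(schedule):
    """True if every transaction commits only after all the transactions
    it has read from have committed."""
    last_writer = {}                  # item -> transaction that wrote it last
    reads_from = {}                   # txn -> transactions whose writes it read
    committed = set()
    for txn, op, item in schedule:
        if op == "w":
            last_writer[item] = txn
        elif op == "r":
            writer = last_writer.get(item)
            if writer is not None and writer != txn:
                reads_from.setdefault(txn, set()).add(writer)
        elif op == "c":
            if not reads_from.get(txn, set()) <= committed:
                return False          # commits before a transaction it read from
            committed.add(txn)
    return True
```

Applied to Schedule S11, the check fails when T9 commits right after its read and passes when T9's commit is placed after T8's.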
Cascading Rollbacks

 Cascading rollback – a single transaction failure can lead to a
series of transaction rollbacks
 Consider the following schedule where none of the transactions
has yet committed (so the schedule is recoverable)

T10              T11              T12
read(A)
read(B)
write(A)
                 read(A)
                 write(A)
                                  read(A)

If T10 fails, T11 and T12 must also be rolled back.


 This can lead to the undoing of a significant amount of work
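Which transactions get dragged into a cascading rollback can be computed by following reads-from dependencies forward through the schedule. A minimal sketch, assuming `(transaction, op, item)` tuples with `op` in `{"r", "w"}` and no commits yet, as in the example; `rollback_set` is a hypothetical name.

```python
def rollback_set(schedule, failed):
    """Transactions that must be rolled back when `failed` aborts."""
    doomed = {failed}
    last_writer = {}                          # item -> most recent writer
    for txn, op, item in schedule:
        if op == "w":
            last_writer[item] = txn
        elif op == "r" and last_writer.get(item) in doomed:
            doomed.add(txn)                   # read a value that will be undone
    return doomed


SCHEDULE = [("T10", "r", "A"), ("T10", "r", "B"), ("T10", "w", "A"),
            ("T11", "r", "A"), ("T11", "w", "A"), ("T12", "r", "A")]
```

A failure of T10 takes T11 and T12 with it; a failure of T11 takes only T12; a failure of T12 affects no one else.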
Cascadeless Schedules
 Cascadeless schedules – cascading rollbacks do not occur
 For each pair of transactions Ti and Tj such that Tj reads a data
item previously written by Ti, the commit operation of Ti
appears before the read operation of Tj.
 Every cascadeless schedule is also recoverable
 It is desirable to restrict the schedules to those that are
cascadeless
Concurrency Control
 A database must provide a mechanism that will ensure that all possible
schedules are
 serializable, and
 are recoverable and preferably cascadeless
 A policy in which only one transaction can execute at a time generates
serial schedules, but provides a poor degree of concurrency and low
throughput
 Are serial schedules recoverable/cascadeless?
 Testing a schedule for serializability after it has executed is a little too
late!
 Goal – to develop concurrency control protocols that will assure
serializability
Concurrency Control vs. Serializability Tests
 Concurrency-control protocols allow concurrent schedules, but ensure
that the schedules are serializable, and are recoverable and
cascadeless.
 Concurrency control protocols generally do not examine the
precedence graph as it is being created
 Instead a protocol imposes a discipline that avoids nonserializable
schedules.
 Different concurrency control protocols provide different tradeoffs
between the amount of concurrency they allow and the amount of
overhead that they incur
 Tests for serializability help us understand why a
concurrency control protocol is correct
Concurrency Control Mechanisms and Protocols
Lock-Based Concurrency Control Protocols
 A lock is a mechanism to control concurrent access to a data item.
Data items can be locked in two modes:
1. exclusive (X) mode. Data item can be both read as well as written.
X-lock is requested using lock-X instruction.
2. shared (S) mode. Data item can only be read. S-lock is requested
using lock-S instruction.
 Lock requests are made to concurrency-control manager.
Transaction can proceed only after request is granted

 Lock-compatibility matrix
       S      X
S      true   false
X      false  false
 A transaction may be granted a lock on an item if the requested
lock is compatible with locks already held on the item by other
transactions
 Any number of transactions can hold shared locks on an item,
 but if any transaction holds an exclusive lock on the item
no other transaction may hold any lock on the item.
 If a lock cannot be granted, the requesting transaction is made to wait
till all incompatible locks held by other transactions have been
released. The lock is then granted
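The compatibility matrix translates directly into a lookup table. In this sketch `COMPATIBLE` and `can_grant` are illustrative names, and `held_by_others` is assumed to list the modes of locks that other transactions currently hold on the item.

```python
# (mode already held, mode requested) -> compatible?
COMPATIBLE = {
    ("S", "S"): True,
    ("S", "X"): False,
    ("X", "S"): False,
    ("X", "X"): False,
}


def can_grant(requested, held_by_others):
    """A lock may be granted only if it is compatible with every lock
    already held on the item by other transactions."""
    return all(COMPATIBLE[(held, requested)] for held in held_by_others)
```

An item with no locks accepts any request, any number of `"S"` locks can coexist, and a single `"X"` lock blocks everything else.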
Lock-Based Protocols (Cont.)
 Example of a transaction with locking:
T2: lock-S(A);
read(A);
unlock(A);
lock-S(B);
read(B);
unlock(B);
display(A+B);
 Locking as above is not sufficient to guarantee serializability
 if A and B get updated in-between the read of A and B, the displayed
sum would be wrong.
 A locking protocol is a set of rules followed by all
transactions while requesting and releasing locks.
 Locking protocols restrict the set of possible schedules
 Locking may be dangerous
 Danger of deadlocks
 Cannot be completely solved – transactions have to be killed
and rolled back
 Danger of starvation
 A transaction is repeatedly rolled back due to deadlocks
 Concurrency control manager can be designed to prevent starvation
 Compare these problems with critical sections in OS
The Two-Phase Locking Protocol
 This is a protocol which ensures conflict-serializable
schedules
 Phase 1: Growing Phase
 transaction may obtain locks
 transaction may not release locks
 Phase 2: Shrinking Phase
 transaction may release locks
 transaction may not obtain locks
 The protocol assures serializability. It can be proved that the
transactions can be serialized in the order of their lock points (i.e. the
point where a transaction acquired its final lock)
[Figure: growing phase, lock point, and shrinking phase along a time axis]
AE3B33OSD Silberschatz, Korth,
Sudarshan S. ©2007
The Two-Phase Locking Protocol (Cont.)
 Two-phase locking does not ensure freedom from
deadlocks
 Cascading roll-back is possible under two-phase locking. To avoid this,
follow a modified protocol called strict two-phase locking.
 Here a transaction must hold all its exclusive locks till it commits
or aborts.
 Rigorous two-phase locking is even stricter:
 All locks are held till commit/abort. In this protocol transactions
can be serialized in the order in which they commit.
[Figure: number of locks over time, with the growing phase, the lock point, and the shrinking phase]
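Whether a single transaction obeys the two-phase rule can be checked from its sequence of lock and unlock calls: once anything has been released, nothing more may be acquired. A minimal sketch with a hypothetical `is_two_phase` helper; lock modes are ignored because the rule does not depend on them.

```python
def is_two_phase(ops):
    """ops: list of ("lock", item) or ("unlock", item) for one transaction."""
    shrinking = False
    for action, _item in ops:
        if action == "unlock":
            shrinking = True            # the first unlock ends the growing phase
        elif action == "lock" and shrinking:
            return False                # acquiring a lock after releasing one
    return True
```

The earlier example transaction T2 (lock A, unlock A, lock B, unlock B) is not two-phase, while acquiring both locks before the first unlock is.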
Lock Conversions
 Two-phase locking with lock conversions:
– First Phase:
 can acquire a lock-S on item
 can acquire a lock-X on item
 can convert a lock-S to a lock-X (upgrade)
– Second Phase:
 can release a lock-S
 can release a lock-X
 can convert a lock-X to a lock-S (downgrade)
 This protocol assures serializability, but it still relies on the
programmer to insert the locking instructions.
Automatic Acquisition of Locks
 A transaction Ti issues the standard read/write instruction, without
explicit locking calls (locking is a part of these operations)
 The operation read(D) is processed as:
if Ti has a lock on D then
    read(D)
else begin
    if necessary wait until no other transaction has a lock-X on D;
    grant Ti a lock-S on D;
    read(D)
end
 write(D) is processed as:
if Ti has a lock-X on D then
    write(D)
else begin
    if necessary wait until no other transaction has any lock on D;
    if Ti has a lock-S on D then
        upgrade lock on D to lock-X
    else
        grant Ti a lock-X on D;
    write(D)
end;
 All locks are released after commit or abort
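The read and write rules above can be sketched in a single-threaded form. This is an illustration, not a real lock manager: instead of blocking, each call returns the string `"wait"` when the transaction would have to wait, and `AutoLocks` with its methods is a hypothetical class.

```python
class AutoLocks:
    def __init__(self):
        self.held = {}                        # item -> {txn: "S" or "X"}

    def read(self, txn, item):
        locks = self.held.setdefault(item, {})
        if txn not in locks:                  # no lock yet: need a lock-S
            if any(mode == "X" for t, mode in locks.items() if t != txn):
                return "wait"                 # another transaction holds lock-X
            locks[txn] = "S"
        return "read"

    def write(self, txn, item):
        locks = self.held.setdefault(item, {})
        if locks.get(txn) != "X":             # need a new lock-X or an upgrade
            if any(t != txn for t in locks):
                return "wait"                 # another transaction holds a lock
            locks[txn] = "X"                  # grant lock-X, or upgrade from lock-S
        return "write"

    def release_all(self, txn):
        """Called after commit or abort."""
        for locks in self.held.values():
            locks.pop(txn, None)
```

Two readers can share an item, but a reader that then tries to write has to wait until the other reader's locks are released at commit or abort.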
Implementation of Locking
 A lock manager can be implemented as a separate process to
which transactions send lock and unlock requests
 The lock manager replies to a lock request by sending a lock grant
message
 or a message asking the transaction to roll back, in case a deadlock
is detected
 The requesting transaction waits until its request is answered
 The lock manager maintains a data-structure called a lock table to
record granted locks and pending requests
 The lock table is usually implemented as an in-memory hash table
indexed on the name of the data item being locked
Lock Table

[Figure: lock table, a hash table indexed by data item (e.g. D4, D7, D23), each with a queue of transaction requests marked as granted locks or waiting for lock grant]
 Lock table also records the type of lock granted or requested
 New request is added to the end of the queue of requests for the data
item, and granted if it is compatible with all earlier locks
 Unlock requests result in the request being deleted, and later
requests are checked to see if they can now be granted
 If transaction aborts, all waiting or granted requests of the
transaction are deleted
 lock manager may keep a list of locks held by each transaction, to
implement this efficiently
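The queue discipline of the lock table can be sketched as follows. This is a single-threaded illustration under stated assumptions: a `defaultdict` of deques stands in for the hash table, a request is a mutable `[txn, mode, granted]` list, "compatible with all earlier locks" is read as compatible with every earlier request in the queue, and the per-transaction lock list used to make abort efficient is omitted. `LockTable` and its methods are hypothetical names.

```python
from collections import defaultdict, deque

COMPATIBLE = {("S", "S")}                     # only shared/shared may coexist


class LockTable:
    def __init__(self):
        self.queues = defaultdict(deque)      # item -> queue of [txn, mode, granted]

    def request(self, txn, item, mode):
        """Append the request; grant it if compatible with all earlier ones."""
        queue = self.queues[item]
        granted = all((earlier, mode) in COMPATIBLE for _, earlier, _ in queue)
        queue.append([txn, mode, granted])
        return granted

    def unlock(self, txn, item):
        """Delete txn's request, then re-check the later waiting requests."""
        queue = deque(r for r in self.queues[item] if r[0] != txn)
        self.queues[item] = queue
        for i, req in enumerate(queue):
            if not req[2]:
                req[2] = all((queue[k][1], req[1]) in COMPATIBLE for k in range(i))

    def abort(self, txn):
        """Delete every waiting or granted request of the transaction."""
        for item in list(self.queues):
            self.unlock(txn, item)

    def holders(self, item):
        return [txn for txn, _, granted in self.queues[item] if granted]
```

Checking a new request against waiting requests as well as granted ones keeps the queue first-come, first-served, so a waiting lock-X request is not overtaken by a stream of later lock-S requests.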
Multiple Granularity
 Allow data items to be of various sizes and define a hierarchy
of data granularities, where the small granularities are nested
within larger ones
 Can be represented graphically as a tree (but don't confuse
with tree-locking protocol)
 When a transaction locks a node in the tree explicitly, it implicitly
locks all the node's descendants in the same mode.
 Granularity of locking (level in tree where locking is done):
 fine granularity (lower in tree): high concurrency, high
locking overhead
 coarse granularity (higher in tree): low locking overhead,
low concurrency
Example of Granularity Hierarchy

[Figure: granularity tree with root DB; areas A1, A2; files Fa, Fb, Fc; records ra1 … ran, rb1 … rbn, rc1 … rcn]

 The levels, starting from the coarsest (top) level are


 database
 area
 file
 record
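Implicit locking can be sketched as a walk up the hierarchy: a node is covered by its own explicit lock or by the nearest locked ancestor. The parent links below are an assumption (the edges between areas and files are not given in the text), only one record per file is listed, and `effective_lock` is a hypothetical helper covering the locks of a single transaction.

```python
# Assumed shape of the hierarchy: database -> area -> file -> record.
PARENT = {
    "A1": "DB", "A2": "DB",
    "Fa": "A1", "Fb": "A1", "Fc": "A2",
    "ra1": "Fa", "rb1": "Fb", "rc1": "Fc",
}


def effective_lock(node, explicit, parent=PARENT):
    """Mode covering `node`: its explicit lock, else the nearest locked
    ancestor's mode, else None."""
    while node is not None:
        if node in explicit:
            return explicit[node]
        node = parent.get(node)               # the root has no parent
    return None
```

With an explicit `"X"` lock on file `Fa`, record `ra1` is implicitly locked in `"X"` mode, while `rb1` and the root `DB` are not locked at all.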
