Week 10
Week 10 Lecture 1
Class BSCCS2001
Materials
Module # 46
Type Lecture
Week # 10
Transactions
Transaction Concept
A transaction is a unit of program execution that accesses and possibly updates various data items
1. read(A)
2. A := A - 50
3. write(A)
4. read(B)
5. B := B + 50
6. write(B)
If the transaction fails after step 3 and before step 6, money will be "lost" leading to an inconsistent database state
The system should ensure that updates of a partially executed transaction are not reflected in the database
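The all-or-nothing behaviour described above can be sketched with Python's built-in sqlite3 module. This is only an illustration: the accounts table, the transfer helper and the fail_midway flag are hypothetical names introduced here, and the "crash" is simulated by raising an exception between write(A) and write(B).

```python
import sqlite3

def transfer(conn, src, dst, amount, fail_midway=False):
    # "with conn" commits if the block finishes and rolls back if it raises.
    with conn:
        conn.execute("UPDATE accounts SET balance = balance - ? WHERE name = ?",
                     (amount, src))
        if fail_midway:
            # Simulated failure after write(A) and before write(B).
            raise RuntimeError("simulated crash")
        conn.execute("UPDATE accounts SET balance = balance + ? WHERE name = ?",
                     (amount, dst))

def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts"))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("A", 1000), ("B", 2000)])
conn.commit()

try:
    transfer(conn, "A", "B", 50, fail_midway=True)
except RuntimeError:
    pass
after_failure = balances(conn)   # the partial update of A was rolled back

transfer(conn, "A", "B", 50)
after_success = balances(conn)   # both updates were committed together
```

After the failed attempt the balances are unchanged, so no money is "lost"; only the successful call moves the $50.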
Required properties of a Transaction: ACID: Consistency
Consistency Requirement
sum of balances of all accounts, minus sum of loan amounts must equal value of cash-in-hand
If between steps 3 and 6 (of the fund transfer transaction), another transaction T2 is allowed to access the
partially updated database, it will see an inconsistent database (the sum A + B will be less than it should be)
Once the user has been notified that the transaction has completed (that is, the transfer of $50 has taken place),
the updates to the database by the transaction must persist even if there are software or hardware failures
ACID Properties
A transaction is a unit of program execution that accesses and possibly updates various data items
Atomicity: Atomicity guarantees that each transaction is treated as a single unit, which either succeeds completely or
fails completely
If any of the statements constituting a transaction fails to complete, the entire transaction fails and the database
is left unchanged
Atomicity must be guaranteed in every situation, including power failures, errors and crashes
Consistency: Consistency ensures that a transaction can only bring the database from one valid state to another,
maintaining database invariants
Any data written to the database must be valid according to all defined rules, including constraints, cascades,
triggers and any combination thereof
Isolation: Transactions are often executed concurrently (multiple transactions reading and writing to a table at the
same time)
Isolation ensures that concurrent execution of transactions leaves the database in the same state that would
have been obtained if the transactions were executed sequentially
Durability: Durability guarantees that once a transaction has been committed, it will remain committed even in the
case of a system failure (like power outage or crash)
This usually means that completed transactions (or their effects) are recorded in non-volatile memory
Transaction States
Every transaction can be in one of the following states (like Process States in OS)
Active
The initial state; the transaction stays in this state while it is executing
Partially committed
Failed
Aborted
After the transaction has been rolled back and the database restored to its state prior to the start of the
transaction
Committed
Terminated
Concurrent Executions
Multiple transactions are allowed to run concurrently in the system
Advantages are:
For example, one transaction can be using the CPU while another is reading from or writing to the disk
Reduced average response time for transactions: short transactions need not wait behind long ones
Concurrency control schemes are needed to control the interaction among the concurrent transactions in order to
prevent them from destroying the consistency of the database
Schedules
Schedule: A sequence of instructions that specifies the chronological order in which instructions of concurrent
transactions are executed
A schedule for a set of transactions must consist of all instructions of those transactions
Must preserve the order in which the instructions appear in each individual transaction
A transaction that successfully completes its execution will have a commit instruction as the last statement
A transaction that fails to successfully complete its execution will have an abort instruction as the last statement
Schedule 1
Let T1 transfer $50 from A to B and T2 transfer 10% of the balance from A to B
Schedule 2
A serial schedule in which T2 is followed by T1
Schedule 3
Let T1 and T2 be the transactions defined previously
Schedule 4
The following concurrent schedule does not preserve the sum of "A + B"
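To make the effect of interleaving concrete, the sketch below runs T1 (transfer $50 from A to B) and T2 (transfer 10% of A to B) under three orderings. The schedule figures are not reproduced in these notes, so the bad interleaving used here is an assumption: T1 reads A, then T2 reads and writes A and reads B, then T1 writes A and B, and finally T2 writes B. The helper names (read, write, compute, run) and the starting balances are illustrative.

```python
def read(item):
    # Copy the database value into the transaction's local buffer.
    return lambda db, buf: buf.__setitem__(item, db[item])

def write(item):
    # Copy the local buffer value back into the database.
    return lambda db, buf: db.__setitem__(item, buf[item])

def compute(fn):
    # Pure local computation; does not touch the database.
    return lambda db, buf: fn(buf)

T1 = [read("A"), compute(lambda b: b.update(A=b["A"] - 50)), write("A"),
      read("B"), compute(lambda b: b.update(B=b["B"] + 50)), write("B")]

T2 = [read("A"), compute(lambda b: b.update(temp=b["A"] // 10)),
      compute(lambda b: b.update(A=b["A"] - b["temp"])), write("A"),
      read("B"), compute(lambda b: b.update(B=b["B"] + b["temp"])), write("B")]

def run(order, db):
    """Each name in `order` executes the next step of that transaction."""
    db = dict(db)
    programs = {"T1": iter(T1), "T2": iter(T2)}
    buffers = {"T1": {}, "T2": {}}
    for txn in order:
        next(programs[txn])(db, buffers[txn])
    return db

start = {"A": 1000, "B": 2000}
serial_t1_t2 = run(["T1"] * 6 + ["T2"] * 7, start)
serial_t2_t1 = run(["T2"] * 7 + ["T1"] * 6, start)
# Assumed bad interleaving: T1's write(A) overwrites T2's update of A.
interleaved = run(["T1"] * 2 + ["T2"] * 5 + ["T1"] * 4 + ["T2"] * 2, start)
```

Both serial orders keep A + B at 3000, while the interleaved run ends with A = 950 and B = 2100, so $50 appears from nowhere.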
Week 10 Lecture 2
Class BSCCS2001
Materials
Module # 47
Type Lecture
Week # 10
Transactions: Serializability
Serializability
Assumption: Each transaction preserves database consistency
Conflict Serializability
View Serializability
Recap Schedule 4: Not Serializable
The following concurrent schedule does not preserve the sum of "A + B"
Other operations happen in memory (are temporary in nature) and (mostly) do not affect the state of the database
We assume that transactions may perform arbitrary computations on data in local buffers in between reads and writes
Conflicting Instructions
Let Ii and Ij be 2 instructions from transactions Ti and Tj respectively
Instructions Ii and Ij conflict if and only if there exists some item Q accessed by both Ii and Ij and at least one of
these instructions writes to Q
If Ii and Ij are consecutive in a schedule and they do not conflict, their results would remain the same even if
they had been interchanged in the schedule
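The conflict rule can be written as a one-line predicate. In this sketch an instruction is a hypothetical (transaction, action, item) tuple, with action "r" for read and "w" for write; that encoding is an assumption made for illustration.

```python
def conflicts(op_i, op_j):
    """True if the two instructions come from different transactions,
    access the same item, and at least one of them is a write."""
    (txn_i, action_i, item_i), (txn_j, action_j, item_j) = op_i, op_j
    return txn_i != txn_j and item_i == item_j and "w" in (action_i, action_j)

read_read = conflicts(("T1", "r", "Q"), ("T2", "r", "Q"))      # two reads
read_write = conflicts(("T1", "r", "Q"), ("T2", "w", "Q"))     # read then write
write_write = conflicts(("T1", "w", "Q"), ("T2", "w", "Q"))    # two writes
other_item = conflicts(("T1", "w", "A"), ("T2", "w", "B"))     # different items
```

Only the read-write and write-write pairs on the same item conflict; consecutive instructions for which the predicate is false can be swapped without changing the result.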
Conflict Serializability
If a schedule S can be transformed into a schedule S' by a series of swaps of non-conflicting instructions, we say that
S and S' are conflict equivalent
Schedule 3 can be transformed into Schedule 6, a serial schedule where T2 follows T1, by a series of swaps of
non-conflicting instructions:
These swaps do not conflict as they work with different items (A or B) in different transactions
We are unable to swap instructions in the above schedule to obtain either the serial schedule < T3 , T4 > or the
serial schedule < T4 , T3 >
Transaction 1
UPDATE accounts
Transaction 2
UPDATE accounts
Transaction 1: r1 (A), w1 (A) // A is the balance for acct_id = 31414
Consider schedule S:
We withdrew $100 from account A, but somehow the database has recorded that our account now holds $201
Serial schedule 1:
As an example, consider Schedule T, which has swapped the third and fourth operations from S:
But that's just a peculiarity of the data, as revealed by the second example, where the final value of A can't be the
consequence of either of the possible serial schedules
We could credit interest to A first then withdraw the money, then credit interest to B:
Schedule U is conflict serializable to Schedule 2:
Serializability
Are all serializable schedules conflict-serializable? No
The third and fourth operations are both on B and at least one is a write, so they conflict and cannot be swapped
So this schedule is not conflict-equivalent to anything - and certainly not any serial schedules
However, since nobody ever reads the values written by the w1 (A), w2 (B) and w1 (B) operations, the schedule has
the same outcome as a serial schedule
Precedence Graph
Consider some schedule of a set of transactions T1 , T2 , ..., Tn
Precedence Graph
We draw an arc from Ti to Tj if the two transactions conflict and Ti accessed the data item on which the conflict
arose earlier
Example:
Cycle-detection algorithms exist which take order n^2 time, where n is the number of vertices in the graph
If the precedence graph is acyclic, the serializability order can be obtained by a topological sorting of the graph
That is, a linear order consistent with the partial order of the graph
For example, a serializability order for the schedule (a) would be one of either (b) or (c)
Build a directed graph, with a vertex for each transaction
If the operation is of the form wi (X), find each subsequent operation in the schedule also operating on the same
data element X by a different transaction: that is, anything of the form rj (X) or wj (X)
For each subsequent operation, add a directed edge in the graph from Ti to Tj
If the operation is of the form ri (X), find each subsequent write to the same data element X by a different
transaction: that is, anything of the form wj (X)
For each such subsequent write, add a directed edge in the graph from Ti to Tj
The schedule is conflict-serializable if and only if the resulting directed graph is acyclic
Moreover, we can perform a topological sort on the graph to discover the serial schedule to which the schedule is
conflict-equivalent
w1 (A), r2 (A), w1 (B), w3 (C), r2 (C), r4 (B), w2 (D), w4 (E), r5 (D), w5 (E)
We start with an empty graph with five vertices labeled T1 , T2 , T3 , T4 , T5
We end up with a precedence graph
Moreover, since one way to topologically sort the graph is T3 − T1 − T4 − T2 − T5 , one serial schedule that is
conflict-equivalent is
w3 (C), w1 (A), w1 (B), r4 (B), w4 (E), r2 (A), r2 (C), w2 (D), r5 (D), w5 (E)
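The procedure above (build the precedence graph, test for cycles, topologically sort) can be sketched as follows. The function names are illustrative, and the topological sort is a plain Kahn's algorithm that breaks ties by smallest transaction number, so it may return a different valid order than the one quoted above.

```python
import heapq
import re

def parse(schedule):
    """Turn 'w1 (A), r2 (A)' into [('w', 1, 'A'), ('r', 2, 'A')]."""
    return [(kind, int(txn), item)
            for kind, txn, item in re.findall(r"([rw])(\d+)\s*\((\w+)\)", schedule)]

def precedence_graph(ops):
    """Edge (Ti, Tj) whenever an operation of Ti conflicts with a later one of Tj."""
    edges = set()
    for i, (kind_i, ti, x) in enumerate(ops):
        for kind_j, tj, y in ops[i + 1:]:
            if x == y and ti != tj and "w" in (kind_i, kind_j):
                edges.add((ti, tj))
    return edges

def serial_order(ops):
    """Return one conflict-equivalent serial order, or None if the graph is cyclic."""
    nodes = {t for _, t, _ in ops}
    edges = precedence_graph(ops)
    indegree = {t: 0 for t in nodes}
    for _, dst in edges:
        indegree[dst] += 1
    ready = [t for t in nodes if indegree[t] == 0]
    heapq.heapify(ready)
    order = []
    while ready:
        t = heapq.heappop(ready)
        order.append(t)
        for src, dst in edges:
            if src == t:
                indegree[dst] -= 1
                if indegree[dst] == 0:
                    heapq.heappush(ready, dst)
    # If some vertex was never released, the graph has a cycle.
    return order if len(order) == len(nodes) else None

ops = parse("w1 (A), r2 (A), w1 (B), w3 (C), r2 (C), r4 (B), "
            "w2 (D), w4 (E), r5 (D), w5 (E)")
edges = precedence_graph(ops)
order = serial_order(ops)
cyclic = serial_order(parse("r1 (A), w2 (A), w1 (A)"))
```

For the ten-operation schedule the edges are T1→T2, T1→T4, T3→T2, T2→T5 and T4→T5, and any order that respects them (for instance T3, T1, T4, T2, T5) gives a conflict-equivalent serial schedule. The three-operation schedule produces a cycle between T1 and T2, so it is not conflict-serializable.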
Week 10 Lecture 3
Class BSCCS2001
Materials
Module # 48
Type Lecture
Week # 10
Transactions: Recoverability
What is Recovery?
Serializability helps to ensure Isolation and Consistency of a schedule
Yet, the Atomicity and Consistency may be compromised in the face of system failures
read(A)
A := A - 50
write(A)
read(B)
B := B + 50
write(B)
commit // Make the changes permanent; show the results to the user
Recoverable Schedules
If a transaction Tj reads a data item previously written by a transaction Ti , then the commit operation of Ti must
appear before the commit operation of Tj
The following schedule is not recoverable if T9 commits immediately after the read(A) operation
If T8 should abort, T9 would have read (and possibly shown to the user) an inconsistent database state
Cascading Rollbacks
Cascading rollback: A single transaction failure leads to a series of transaction rollbacks
Consider the following schedule where none of the transactions has yet committed (so the schedule is
recoverable)
Cascadeless Schedules
Cascadeless schedules: For each pair of transactions Ti and Tj such that Tj reads a data item previously written
by Ti , the commit operation of Ti appears before the read operation of Tj
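The two definitions can be checked mechanically. The sketch below is a simplification under stated assumptions: a schedule is a list of hypothetical (transaction, action, item) tuples with actions "r", "w" and "c" (commit), aborts are not modelled, and a read is taken to come from the most recent write to that item by another transaction.

```python
def reads_from(schedule):
    """List (reader, writer, read_position) for reads of another transaction's write."""
    last_writer = {}
    pairs = []
    for pos, (txn, action, item) in enumerate(schedule):
        if action == "w":
            last_writer[item] = txn
        elif action == "r":
            writer = last_writer.get(item)
            if writer is not None and writer != txn:
                pairs.append((txn, writer, pos))
    return pairs

def commit_positions(schedule):
    return {txn: pos for pos, (txn, action, _) in enumerate(schedule) if action == "c"}

def is_recoverable(schedule):
    """If Tj read from Ti and Tj committed, Ti must have committed earlier."""
    commits = commit_positions(schedule)
    for reader, writer, _ in reads_from(schedule):
        if reader in commits:
            if writer not in commits or commits[writer] > commits[reader]:
                return False
    return True

def is_cascadeless(schedule):
    """Every read of Ti's write must come after Ti's commit."""
    commits = commit_positions(schedule)
    for _, writer, read_pos in reads_from(schedule):
        if writer not in commits or commits[writer] > read_pos:
            return False
    return True

# T9 reads A written by T8 and commits while T8 is still running.
early_commit = [("T8", "r", "A"), ("T8", "w", "A"), ("T9", "r", "A"),
                ("T9", "c", None), ("T8", "r", "B")]
# T9 reads uncommitted data but commits only after T8.
dirty_read = [("T8", "w", "A"), ("T9", "r", "A"), ("T8", "c", None), ("T9", "c", None)]
# T9 reads A only after T8 has committed.
clean = [("T8", "w", "A"), ("T8", "c", None), ("T9", "r", "A"), ("T9", "c", None)]
```

The first schedule is not recoverable, the second is recoverable but not cascadeless, and the third is both.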
Rollback is possible only till the end (commit) of T2
Rollback is possible without cascading - wherever failure occurs
Commit work
Rollback work
In almost all database systems, by default, every SQL statement also commits implicitly if it executes successfully
COMMIT
ROLLBACK
SAVEPOINT
SET TRANSACTION
Transactional control commands are only used with the DML commands such as INSERT, UPDATE and DELETE
They cannot be used while creating tables or dropping them because these operations are automatically
committed to the database
COMMIT saves to the database all changes made since the last COMMIT or ROLLBACK command
SQL> DELETE FROM Customers WHERE AGE = 25;
SQL> COMMIT;
ROLLBACK can only be used to undo changes made since the last COMMIT or ROLLBACK command was issued
SQL> ROLLBACK;
SAVEPOINT SAVEPOINT_NAME;
This command serves only in the creation of a SAVEPOINT among all the transactional statements
ROLLBACK TO SAVEPOINT_NAME;
Three records deleted
Rollback complete
Once a SAVEPOINT has been released, you can no longer use the ROLLBACK command to undo transactions
performed since the last SAVEPOINT
This command is used to specify characteristics for the transaction that follows
SET TRANSACTION [READ WRITE | READ ONLY];
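The COMMIT, ROLLBACK and SAVEPOINT commands described above can be tried out with Python's built-in sqlite3 module. This is a sketch, not the session from the lecture: the Customers rows, the savepoint names sp1 and sp2 and the demo function are made up for illustration, and SET TRANSACTION is left out.

```python
import sqlite3

def demo():
    # isolation_level=None puts the connection in autocommit mode, so the
    # transaction boundaries below are exactly the ones we issue ourselves.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    cur = conn.cursor()
    cur.execute("CREATE TABLE Customers (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)")
    cur.executemany("INSERT INTO Customers (id, name, age) VALUES (?, ?, ?)",
                    [(1, "a", 25), (2, "b", 30), (3, "c", 25), (4, "d", 40)])

    cur.execute("BEGIN")
    cur.execute("DELETE FROM Customers WHERE id = 1")
    cur.execute("SAVEPOINT sp1")
    cur.execute("DELETE FROM Customers WHERE id = 2")
    cur.execute("SAVEPOINT sp2")
    cur.execute("DELETE FROM Customers WHERE id = 3")
    # Undo everything done after sp1 (the deletions of ids 2 and 3).
    cur.execute("ROLLBACK TO sp1")
    # Make the remaining change (deletion of id 1) permanent.
    cur.execute("COMMIT")

    remaining = [row[0] for row in cur.execute("SELECT id FROM Customers ORDER BY id")]
    conn.close()
    return remaining

remaining_ids = demo()
```

Only the first deletion survives: rolling back to sp1 restores rows 2 and 3, and the COMMIT then saves the deletion of row 1.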
View Serializability
Let S and S' be two schedules with the same set of transactions
S and S' are view equivalent if the following 3 conditions are met, for each data item Q
Initial Read: If in schedule S, transaction Ti reads the initial value of Q, then in schedule S' also transaction Ti
must read the initial value of Q
Write-Read Pair: If in schedule S transaction Ti executed read(Q) and that value was produced by transaction
Tj (if any), then in schedule S' also transaction Ti must read the value of Q that was produced by the same
write(Q) operation of transaction Tj
Final Write: The transaction (if any) that performs the final write(Q) operation in schedule S must also perform
the final write(Q) operation in schedule S'
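The three conditions can be checked by comparing, for two schedules, where every read gets its value from and who writes each item last. The sketch below assumes a hypothetical encoding of a schedule as (transaction, action, item) tuples with actions "r" and "w"; a read source of None stands for the initial value.

```python
from collections import defaultdict

def view_profile(schedule):
    """Return (source of every read, final writer of every item)."""
    last_write = {}                  # item -> (txn, n-th write of that txn on item)
    write_count = defaultdict(int)
    read_count = defaultdict(int)
    reads = {}
    for txn, action, item in schedule:
        if action == "w":
            write_count[(txn, item)] += 1
            last_write[item] = (txn, write_count[(txn, item)])
        elif action == "r":
            read_count[(txn, item)] += 1
            # None means the read sees the initial value (condition 1);
            # otherwise it records the producing write (condition 2).
            reads[(txn, item, read_count[(txn, item)])] = last_write.get(item)
    return reads, last_write         # last_write covers condition 3

def view_equivalent(s1, s2):
    # Both schedules must contain the same operations to be comparable.
    return sorted(s1) == sorted(s2) and view_profile(s1) == view_profile(s2)

s = [("T1", "r", "Q"), ("T2", "w", "Q"), ("T1", "w", "Q"), ("T3", "w", "Q")]
serial_123 = [("T1", "r", "Q"), ("T1", "w", "Q"), ("T2", "w", "Q"), ("T3", "w", "Q")]
serial_213 = [("T2", "w", "Q"), ("T1", "r", "Q"), ("T1", "w", "Q"), ("T3", "w", "Q")]

same_as_123 = view_equivalent(s, serial_123)
same_as_213 = view_equivalent(s, serial_213)
```

The example schedule s is view equivalent to the serial order T1, T2, T3 because T1 still reads the initial Q and T3 still writes last, even though its precedence graph has a cycle between T1 and T2. It is not view equivalent to T2, T1, T3, where T1 would read T2's write.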
As can be seen, view equivalence is also based purely on reads and writes
Extension to test for view serializability has cost exponential in the size of the precedence graph
The problem of checking if a schedule is view serializable falls in the class of NP-complete problems
However, practical algorithms that just check some sufficient conditions for view serializability can still be used
< T3 T2 T1 >
Solution #2
A :- (No write on A)
< T1 T2 T3 >
< T2 T1 T3 >
Solution #3
A : T2 , T1 , T3 (initial read)
Hence, T2 → T1
So, only one schedule survives:
< T2 T1 T3 >
Write Read Sequence (WR)
T2 → T1 → T3
If we start with A = 1000 and B = 2000, the final result is 960 and 2040
Determining such equivalence requires analysis of operations other than read and write
Week 10 Lecture 4
Class BSCCS2001
Materials
Module # 49
Type Lecture
Week # 10
Concurrency Control
A database must provide a mechanism that will ensure that all possible schedules are both:
Conflict serializable
A policy in which only one transaction can execute at a time generates serial schedules, but provides a poor degree of
concurrency
Concurrency-control schemes trade off between the amount of concurrency they allow and the amount of overhead
that they incur
Testing a schedule for serializability after it has executed is a little too late!
Tests for serializability help us understand why a concurrency control protocol is correct
One way to ensure isolation is to require that data items be accessed in a mutually exclusive manner, that is, while
one transaction is accessing a data item, no other transactions can modify that data item
The most common method used to implement this requirement is to allow a transaction to access a data item only
if it is currently holding a lock on that item
Lock-based Protocols
A lock is a mechanism to control concurrent access to a data item
Data items can be locked in two modes:
exclusive(X) mode:
shared(S) mode:
A transaction may be granted a lock on an item if the requested lock is compatible with locks already held on the
item by other transactions
Sharing a Lock
But if any transaction holds an exclusive lock on the item no other transaction may hold any lock on the item
If a lock cannot be granted, the requesting transaction is made to wait till all incompatible locks held by other
transactions have been released
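The granting rule amounts to a small compatibility matrix. The sketch below uses "S" and "X" for the two modes; the names COMPATIBLE and can_grant are illustrative.

```python
# (requested mode, mode already held by another transaction) -> compatible?
COMPATIBLE = {
    ("S", "S"): True,    # any number of transactions may share an S lock
    ("S", "X"): False,
    ("X", "S"): False,
    ("X", "X"): False,
}

def can_grant(requested, held_by_others):
    """A lock is granted only if it is compatible with every lock held by others."""
    return all(COMPATIBLE[(requested, held)] for held in held_by_others)

share_with_readers = can_grant("S", ["S", "S"])
write_with_reader = can_grant("X", ["S"])
read_with_writer = can_grant("S", ["X"])
first_request = can_grant("X", [])
```

If can_grant returns False, the requesting transaction waits until the incompatible locks have been released.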
Holding a Lock
A transaction must hold a lock on a data item as long as it accesses that item
Transaction Ti may unlock a data item that it had locked at some earlier point
It is not necessarily desirable for a transaction to unlock a data item immediately after its final access of that data
item, since serializability may not be ensured
Let A and B be 2 accounts that are accessed by
transactions T1 and T2
Given T3 and T4 consider Schedule 2 (partial)
Since T3 is holding an exclusive mode lock on B and T4 is requesting a shared-mode lock on B, T4 is waiting for T3
to unlock B
Thus, we have arrived at a state where neither of these transactions can ever proceed with its normal execution
When deadlock occurs, the system must roll back one of the two transactions
Once a transaction has been rolled back, the data items that were locked by that transaction are unlocked
These data items are then available to the other transaction which can continue with its execution
Lock-Based Protocols
If we do not use locking, or if we unlock data items too soon after reading or writing them, we may get inconsistent
states
On the other hand, if we do not unlock a data item before requesting a lock on another data item, deadlocks may
occur
Deadlocks are a necessary evil associated with locking, if we want to avoid inconsistent states
Deadlocks are definitely preferable to inconsistent states, since they can be handled by rolling
back transactions, whereas inconsistent states may lead to real-world problems that cannot be handled by the
database system
A locking protocol is a set of rules followed by all transactions while requesting and releasing
locks
The set of all such schedules is a proper subset of all possible serializable schedules
We present locking protocols that allow only conflict-serializable schedules, and thereby ensure
isolation
Transaction may release locks
It can be proved that the transactions can be serialized in the order of their lock points
That is, the point where a transaction acquires its final lock
However, in the absence of extra information (that is, ordering of access to data),
two-phase locking is needed for conflict serializability in the following sense:
Given a transaction Ti that does not follow two-phase locking, we can find a
transaction Tj that uses two-phase locking, and a schedule for Ti and Tj that is not
conflict serializable
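Whether a single transaction obeys two-phase locking can be checked from its sequence of lock and unlock steps: once it has released any lock, it must not acquire another. The sketch below assumes a hypothetical encoding of a transaction as (action, item) pairs, where lock actions start with "lock" (for example "lock-S" or "lock-X").

```python
def follows_two_phase(ops):
    """True if no lock is acquired after the first unlock."""
    seen_unlock = False
    for action, _item in ops:
        if action == "unlock":
            seen_unlock = True
        elif action.startswith("lock") and seen_unlock:
            return False
    return True

def lock_point(ops):
    """Index of the step where the transaction acquires its final lock, or None."""
    positions = [i for i, (action, _item) in enumerate(ops) if action.startswith("lock")]
    return positions[-1] if positions else None

# Unlocks B before locking A: not two-phase.
early_unlock = [("lock-X", "B"), ("read", "B"), ("write", "B"), ("unlock", "B"),
                ("lock-X", "A"), ("read", "A"), ("write", "A"), ("unlock", "A")]
# Acquires both locks before releasing either: two-phase.
two_phase = [("lock-X", "B"), ("read", "B"), ("write", "B"),
             ("lock-X", "A"), ("read", "A"), ("write", "A"),
             ("unlock", "B"), ("unlock", "A")]

early_ok = follows_two_phase(early_unlock)
two_phase_ok = follows_two_phase(two_phase)
point = lock_point(two_phase)
```

For the second transaction the lock point is step 3 (counting from 0), the acquisition of the lock on A; ordering transactions by these points gives a serialization order.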
Lock Conversions
Two-phase locking with lock conversions
First Phase
Second Phase
But still relies on the programmer to insert the various locking instructions
if Ti has a lock on D
  then
    read(D)
  else begin
    grant Ti a lock-S on D;
    read(D)
  end

if Ti has a lock-X on D
  then
    write(D)
  else begin
    if Ti has a lock-S on D
      then
        upgrade lock on D to lock-X
      else
        grant Ti a lock-X on D
    write(D)
  end;
Deadlocks
Two-phase locking does not ensure freedom from
deadlocks
Starvation
In addition to deadlocks, there is a possibility of starvation
For example:
A transaction may be waiting for an X-lock on an item, while a sequence of other transactions request and are
granted an S-lock on the same item
Cascading Rollback
The potential for deadlock exists in most locking protocols
In the schedule here, each transaction observes the two-phase locking protocol, but the failure of T5 after the read(A)
step of T7 leads to cascading rollback of T6 and T7
More Two Phase Locking Protocols
To avoid Cascading roll-back, follow a modified protocol called strict two-phase locking
In this protocol, transactions can be serialized in the order in which they commit
Note that concurrency goes down as we move to more and more strict locking protocol
Implementation of Locking
A lock manager can be implemented as a separate process to which transactions send lock and unlock requests
The lock manager replies to a lock request by sending a lock grant message (or a message asking the transaction to
roll back, in case of a deadlock)
The lock manager maintains a data-structure called a lock table to record granted locks and pending requests
The lock table is usually implemented as an in-memory hash table indexed on the name of the data item being locked
Lock Table
Dark blue rectangles indicate granted locks; light blue rectangles indicate waiting requests
A new request is added to the end of the queue of requests for the data item, and granted if it is compatible with all
earlier locks
Unlock requests result in the request being deleted, and later requests are checked to see if they can now be granted
If a transaction aborts, all waiting or granted requests of the transaction are deleted
The lock manager may keep a list of locks held by each transaction, to implement this efficiently
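A lock table along these lines can be sketched as a dictionary of request queues. This is a minimal in-process illustration, not a real lock manager: the class and method names are hypothetical, lock upgrades and deadlock handling are omitted, and a new request is compared against all earlier requests in the queue (granted or waiting), which is one reading of "compatible with all earlier locks".

```python
from collections import defaultdict

class LockManager:
    def __init__(self):
        # Lock table: data item name -> queue of [txn, mode, granted] entries.
        self.table = defaultdict(list)

    @staticmethod
    def _compatible(mode_a, mode_b):
        return mode_a == "S" and mode_b == "S"

    def request(self, txn, item, mode):
        """Append the request to the item's queue; return True if granted at once."""
        queue = self.table[item]
        granted = all(self._compatible(mode, earlier_mode)
                      for _, earlier_mode, _ in queue)
        queue.append([txn, mode, granted])
        return granted

    def _regrant(self, item):
        """Grant waiting requests that are now compatible with all earlier ones."""
        queue = self.table[item]
        newly_granted = []
        for i, entry in enumerate(queue):
            if entry[2]:
                continue
            if all(self._compatible(entry[1], e[1]) for e in queue[:i]):
                entry[2] = True
                newly_granted.append(entry[0])
        return newly_granted

    def release(self, txn, item):
        """Delete the transaction's request on item; return transactions woken up."""
        self.table[item] = [e for e in self.table[item] if e[0] != txn]
        return self._regrant(item)

    def abort(self, txn):
        """Delete all waiting or granted requests of an aborted transaction."""
        woken = []
        for item in list(self.table):
            if any(e[0] == txn for e in self.table[item]):
                woken += self.release(txn, item)
        return woken

lm = LockManager()
grants = [lm.request("T1", "A", "S"), lm.request("T2", "A", "S"),
          lm.request("T3", "A", "X"), lm.request("T4", "A", "S")]
after_t1 = lm.release("T1", "A")
after_t2 = lm.release("T2", "A")
after_abort = lm.abort("T3")
```

In the usage above, T4's shared request waits behind T3's exclusive request even though only shared locks are held at that moment, which keeps T3 from being starved by a stream of readers. The abort method scans the whole table; keeping a per-transaction list of locks, as the notes suggest, would avoid that scan.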
Week 10 Lecture 5
Class BSCCS2001
Materials
Module # 50
Type Lecture
Week # 10
Deadlock Prevention protocols ensure that the system will never enter into a deadlock state
Require that each transaction locks all its data items before it begins execution (pre-declaration)
Impose partial ordering of all data items and require that a transaction can lock data items only in the order
specified by the partial order
Deadlock Prevention
Transaction Timestamp: Timestamp is a unique identifier created by the DBMS to identify the relative starting time of
a transaction
Timestamping is a method of concurrency control in which each transaction is assigned a transaction timestamp
Following schemes use transaction timestamps for the sake of deadlock prevention alone
Older transaction may wait for younger one to release data item (here, older means smaller timestamp)
Younger transactions never wait for older ones; they are rolled back instead
A transaction may die several times before acquiring needed data item
Older transaction wounds (forces rollback of) younger transaction instead of waiting for it
When transaction Tn requests a data item currently held by Tk , Tn is allowed to wait only if it has a timestamp
smaller than that of Tk (That is, Tn is older than Tk ), otherwise Tn is killed ("die")
If a transaction requests to lock a resource (data item), which is already held with a conflicting lock by another
transaction, then one of the two possibilities may occur:
Timestamp(Tn ) < Timestamp(Tk ): Tn , which is requesting a conflicting lock, is older than Tk , so Tn is
allowed to "wait" until the data-item is available
Tn is restarted later with a random delay but with the same timestamp(n)
This scheme allows the older transaction to "wait" but kills the younger one ("die")
Example:
If T15 requests a data item held by T10 , then T15 will be killed ("die")
When transaction Tn requests a data item currently held by Tk , Tn is allowed to wait only if it has a timestamp larger
than that of Tk , otherwise Tk is killed (wounded by Tn )
If a transaction requests to lock a resource (data item), which is already held with a conflicting lock by another
transaction, then one of the two possibilities may occur:
Tk is restarted later with a random delay but with the same timestamp(k)
Timestamp(Tn ) > Timestamp(Tk ): Tn "wait"s until the resource is free
This scheme allows the younger transaction requesting a lock to "wait" if the older transaction already holds a lock,
but forces the younger one to be suspended ("wound") if the older transaction requests a lock on an item already held
by the younger one
Example:
If T5 requests a data item held by T10 , then it will be preempted from T10 and T10 will be suspended ("wounded")
If T15 requests a data item held by T10 , then T15 will "wait"
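The two schemes differ only in what happens for each age relation between requester and holder. The sketch below treats a smaller timestamp as older, as in the notes; the function names and the returned strings are illustrative.

```python
def wait_die(requester_ts, holder_ts):
    """Wait-die: an older requester waits, a younger requester is rolled back."""
    return "wait" if requester_ts < holder_ts else "die"

def wound_wait(requester_ts, holder_ts):
    """Wound-wait: an older requester wounds the holder, a younger requester waits."""
    return "wound holder" if requester_ts < holder_ts else "wait"

# Timestamps taken from the transaction subscripts in the examples (T5, T10, T15).
wd_young = wait_die(15, 10)     # T15 requests an item held by T10
wd_old = wait_die(5, 10)        # T5 requests an item held by T10
ww_old = wound_wait(5, 10)      # T5 requests an item held by T10
ww_young = wound_wait(15, 10)   # T15 requests an item held by T10
```

In both schemes it is always the younger transaction that gets rolled back; the schemes differ in whether the younger or the older one is allowed to wait.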
Deadlock prevention
Both in wait-die and in wound-wait schemes, a rolled back transaction is restarted with
its original timestamp
Older transactions thus have precedence over newer ones, and starvation is hence avoided
Timeout-Based Schemes
If the lock has not been granted within that time, the transaction is rolled back and restarted
Deadlock Detection
Deadlocks can be described as a wait-for graph, which consists of a pair G = (V , E)
V is a set of vertices (all the transactions in the system)
When Ti requests a data item currently being held by Tj , then the edge Ti → Tj is inserted in the wait-for graph
This edge is removed only when Tj is no longer holding a data item needed by Ti
The system is in a deadlock state if and only if the wait-for graph has a cycle
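Deadlock detection therefore reduces to cycle detection in a directed graph. The sketch below uses a depth-first search over a wait-for graph given as a dictionary from each transaction to the set of transactions it waits for; the representation and the name has_deadlock are assumptions for illustration.

```python
def has_deadlock(wait_for):
    """Return True if the wait-for graph contains a cycle."""
    WHITE, GRAY, BLACK = 0, 1, 2     # unvisited, on the current path, finished
    color = {}

    def visit(txn):
        color[txn] = GRAY
        for other in wait_for.get(txn, ()):
            state = color.get(other, WHITE)
            if state == GRAY:        # back edge: we returned to the current path
                return True
            if state == WHITE and visit(other):
                return True
        color[txn] = BLACK
        return False

    return any(color.get(txn, WHITE) == WHITE and visit(txn)
               for txn in list(wait_for))

# T1 waits for T2, T2 waits for T3, T3 waits for T1: a cycle.
deadlocked = has_deadlock({"T1": {"T2"}, "T2": {"T3"}, "T3": {"T1"}})
# A chain of waits without a cycle.
no_deadlock = has_deadlock({"T1": {"T2"}, "T2": {"T3"}})
```

A real system would run such a check periodically and then pick a victim on the detected cycle to roll back.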
Deadlock Recovery
When deadlock is detected:
Some transaction will have to be rolled back (made a victim) to break the deadlock
It is more effective to roll back the transaction only as far as necessary to break the deadlock
Timestamp-based Protocols
Each transaction is issued a timestamp when it enters the system
If an old transaction Ti has time-stamp TS(Ti ), a new transaction Tj is assigned time-stamp TS(Tj ) such that
TS(Ti ) < TS(Tj )
The protocol manages concurrent execution such that the time-stamps determine the serializability order
In order to assure such behavior, the protocol maintains for each data Q two timestamp values:
The timestamp ordering protocol ensures that any conflicting read and write operations are executed in timestamp
order
If TS(Ti ) < W-timestamp(Q), then Ti needs to read a value of Q that was already overwritten
Hence, the read operation is rejected, and Ti is rolled back
If TS(Ti ) ≥ W-timestamp(Q), then the read operation is executed, and R-timestamp(Q) is set to max(R-timestamp(Q), TS(Ti ))
If TS(Ti ) < R-timestamp(Q), then the value of Q that Ti is producing was needed previously, and the system
assumed that that value would never be produced
The timestamp protocol ensures freedom from deadlock, as no transaction ever waits
But the schedule may not be cascade-free, may not even be recoverable
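The read and write tests of the timestamp-ordering protocol can be sketched as below. Several points are assumptions rather than content from these notes: timestamps are positive integers, untouched items start with both timestamps at 0, a read is rejected only on the strict comparison TS(Ti) < W-timestamp(Q) so that a transaction can read back its own write, and the second write check (rejecting a write older than W-timestamp(Q)) follows the standard textbook form of the protocol. The class and method names are illustrative.

```python
class TimestampOrdering:
    def __init__(self):
        self.r_ts = {}   # R-timestamp(Q): largest timestamp that read Q
        self.w_ts = {}   # W-timestamp(Q): largest timestamp that wrote Q

    def read(self, ts, q):
        if ts < self.w_ts.get(q, 0):
            # The value this transaction should have seen was already overwritten.
            return "rollback"
        self.r_ts[q] = max(self.r_ts.get(q, 0), ts)
        return "ok"

    def write(self, ts, q):
        if ts < self.r_ts.get(q, 0):
            # A younger transaction already read Q, assuming this value
            # would never be produced.
            return "rollback"
        if ts < self.w_ts.get(q, 0):
            # The write is obsolete: a younger transaction already wrote Q.
            return "rollback"
        self.w_ts[q] = ts
        return "ok"

to = TimestampOrdering()
results = [
    to.write(2, "Q"),   # T2 writes Q
    to.read(1, "Q"),    # T1 is older than the writer: rejected
    to.read(3, "Q"),    # T3 is younger: allowed, R-timestamp(Q) becomes 3
    to.write(2, "Q"),   # T2 writes after T3 has read: rejected
    to.write(3, "Q"),   # T3 may write what it has read
    to.read(3, "Q"),    # T3 reads back its own write
]
```

No call ever blocks: every request is either executed or answered with a rollback, which is why the protocol cannot deadlock.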