Unit 5
This whole set of operations can be called a transaction. Although the above example shows only
read, write and update operations, a transaction can contain operations such as read, write,
insert, update and delete.
1. R(A);
2. A = A - 10000;
3. W(A);
4. R(B);
5. B = B + 10000;
6. W(B);
In the above transaction, R refers to the read operation and W refers to the write operation.
Now that we understand what a transaction is, we should understand the problems associated
with it.
The main problem that can happen during a transaction is that it can fail before finishing all
the operations in the set. This can happen due to a power failure, a system crash, etc.
This is a serious problem that can leave the database in an inconsistent state. Assume that the
transaction fails after the third operation (see the example above): the amount would then be
deducted from your account, but your friend would not receive it.
Rollback: If any of the operations fails, then roll back all the changes done by the previous operations.
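The rollback rule can be illustrated with a minimal sketch using Python's built-in sqlite3 module. The table name, the starting balances and the `fail_midway` flag are assumptions made for this illustration; they are not part of the example above.

```python
import sqlite3

def make_bank():
    # In-memory database with two accounts (balances are illustrative assumptions).
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
    conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("A", 50000), ("B", 20000)])
    conn.commit()
    return conn

def transfer(conn, src, dst, amount, fail_midway=False):
    """Move `amount` from src to dst; undo the debit if anything fails."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute("UPDATE accounts SET balance = balance - ? WHERE name = ?",
                         (amount, src))
            if fail_midway:
                # Hypothetical failure after the debit, i.e. after step 3 (W(A)).
                raise RuntimeError("simulated crash after the debit")
            conn.execute("UPDATE accounts SET balance = balance + ? WHERE name = ?",
                         (amount, dst))
        return True
    except RuntimeError:
        return False

def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts"))
```

Calling `transfer(conn, "A", "B", 10000, fail_midway=True)` returns False and leaves both balances unchanged, because the debit is rolled back together with the failed transaction.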
Even though these operations can help us avoid several issues that may arise during a
transaction, they are not sufficient when two transactions are running concurrently. To
handle those problems we need to understand the database ACID properties.
• Atomicity: This property ensures that either all the operations of a transaction are reflected
in the database or none are. Let’s take the example of a banking system to understand this:
suppose account A has a balance of $400 and B has $700. Account A is transferring $100 to
account B. This is a transaction that has two operations: a) debiting $100 from A’s
balance, and b) crediting $100 to B’s balance. Let’s say the first operation succeeds
while the second fails. In this case A’s balance would be $300 while B would have
$700 instead of $800. This is unacceptable in a banking system. Either the transaction
should fail without executing any of the operations or it should process both
operations. The atomicity property ensures that.
• Consistency: A transaction should take the database from one consistent state to another.
To preserve the consistency of the database, the execution of a transaction should take
place in isolation (in the simplest case, no other transaction runs concurrently while a
transaction is already running). For example, account A has a balance of $400 and is
transferring $100 to each of accounts B and C. So we have two transactions here. Let’s say
these transactions run concurrently and both read the $400 balance; in that case the final
balance of A would be $300 instead of $200. This is wrong. If the transactions were to run
in isolation, the second transaction would have read the correct balance of $300 (before
debiting $100) once the first transaction had succeeded.
• Isolation: For every pair of transactions, the result should be the same as if one
transaction started execution only after the other finished execution. The example of
isolation has already been discussed under the Consistency property above.
• Durability: Once a transaction completes successfully, the changes it has made to the
database should be permanent even if there is a system failure. The recovery-management
component of the database system ensures the durability of transactions.
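The lost update described under the Consistency property can be reproduced with a small simulation. This is only a sketch: the schedule encoding as (transaction, operation) pairs and the function name `run` are assumptions for illustration.

```python
def run(schedule, balance=400):
    """Execute (txn, op) steps against one shared balance; each txn debits 100."""
    local = {}  # each transaction's private copy of the balance it read
    for txn, op in schedule:
        if op == "read":
            local[txn] = balance
        elif op == "write":
            balance = local[txn] - 100  # write back the value computed from the read
    return balance

# Both transactions read 400 before either writes: one debit is lost.
interleaved = [("T1", "read"), ("T2", "read"), ("T1", "write"), ("T2", "write")]

# The second transaction reads only after the first has written.
serial = [("T1", "read"), ("T1", "write"), ("T2", "read"), ("T2", "write")]

print(run(interleaved))  # 300
print(run(serial))       # 200
```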
DBMS Transaction States Diagram
Active State
Failed State
If a transaction is executing and a failure occurs, either a hardware failure or a software failure,
then the transaction goes into the failed state from the active state.
Partially Committed State
As we can see in the above diagram, a transaction goes into the “partially committed” state
from the active state once the read and write operations present in the transaction have been
executed.
A transaction contains a number of read and write operations. Once the whole transaction has
executed successfully, it goes into the partially committed state, where all the read and write
operations have been performed on main memory (local memory) instead of the actual
database.
The reason we have this state is that a transaction can fail during execution: if we made the
changes in the actual database instead of local memory, the database could be left in an
inconsistent state in case of a failure. This state helps us roll back the changes in case of a
failure during execution.
Committed State
If a transaction completes execution successfully, then all the changes made in local
memory during the partially committed state are permanently stored in the database. You can also
see in the above diagram that a transaction goes from the partially committed state to the
committed state when everything is successful.
Aborted State
As we have seen above, if a transaction fails during execution then it goes into the
failed state. The changes made in local memory (or the buffer) are rolled back to the previous
consistent state, and the transaction goes into the aborted state from the failed state.
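The state transitions described in this section can be summarised as a small state machine. This is a minimal sketch that encodes only the transitions mentioned in the text; the names `State`, `ALLOWED` and `move` are hypothetical, and a real DBMS may allow further transitions.

```python
from enum import Enum

class State(Enum):
    ACTIVE = "active"
    PARTIALLY_COMMITTED = "partially committed"
    COMMITTED = "committed"
    FAILED = "failed"
    ABORTED = "aborted"

# Only the transitions described in the text above.
ALLOWED = {
    State.ACTIVE: {State.PARTIALLY_COMMITTED, State.FAILED},
    State.PARTIALLY_COMMITTED: {State.COMMITTED},
    State.FAILED: {State.ABORTED},
    State.COMMITTED: set(),  # terminal state
    State.ABORTED: set(),    # terminal state
}

def move(current, target):
    """Return the new state, or raise if the transition is not allowed."""
    if target not in ALLOWED[current]:
        raise ValueError(f"cannot go from {current.value} to {target.value}")
    return target
```

For example, `move(State.ACTIVE, State.FAILED)` succeeds, while `move(State.COMMITTED, State.ACTIVE)` raises a ValueError because a committed transaction cannot become active again.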
The following sequence of operations is a schedule. Here we have two transactions, T1 and T2,
which are running concurrently.
This schedule determines the exact order in which operations are going to be performed on the
database. In this example, all the instructions of transaction T1 are executed before the
instructions of transaction T2; however, this is not always necessary, and we can have various
types of schedules.
T1          T2
----        ----
R(X)
W(X)
R(Y)
            R(Y)
            R(X)
            W(Y)
Types of Schedules in DBMS
Serial Schedule
In a serial schedule, a transaction is executed completely before the execution of another
transaction starts. In other words, in a serial schedule a transaction does not start
execution until the currently running transaction has finished execution. This type of execution
is also known as non-interleaved execution. The example we have seen above is a
serial schedule.
T1          T2
----        ----
R(A)
R(B)
W(A)
commit
            R(B)
            R(A)
            W(B)
            commit
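Whether a schedule is serial can be checked mechanically: once the schedule moves on from a transaction, that transaction must never appear again. The sketch below assumes a schedule is encoded as a list of (transaction, operation) pairs; the function name `is_serial` is hypothetical.

```python
def is_serial(schedule):
    """True if no transaction resumes after another transaction has started."""
    finished = set()  # transactions the schedule has already moved past
    current = None
    for txn, _op in schedule:
        if txn != current:
            if txn in finished:
                return False  # txn resumes after an interruption: interleaved
            if current is not None:
                finished.add(current)
            current = txn
    return True

serial = [("T1", "R(A)"), ("T1", "R(B)"), ("T1", "W(A)"), ("T1", "commit"),
          ("T2", "R(B)"), ("T2", "R(A)"), ("T2", "W(B)"), ("T2", "commit")]
interleaved = [("T1", "R(A)"), ("T2", "R(B)"), ("T1", "W(A)")]

print(is_serial(serial))       # True
print(is_serial(interleaved))  # False
```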
Strict Schedule
In a strict schedule, if the write operation of a transaction precedes a conflicting operation
(read or write) of another transaction, then the commit or abort operation of the first
transaction should also precede that conflicting operation.
Ta          Tb
-----       -----
R(X)
            R(X)
W(X)
commit
            W(X)
            R(X)
            commit
Here the write operation W(X) of Ta precedes the conflicting operation (read or write)
of Tb, so the conflicting operation of Tb had to wait for the commit operation of Ta.
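The strictness rule can be expressed as a check over a schedule. This is a sketch under assumptions: operations are encoded as (transaction, op, item) triples with op in "R", "W", "C" (commit) or "A" (abort), and `is_strict` is a hypothetical helper name.

```python
def is_strict(schedule):
    """True if no transaction reads or writes an item that another
    transaction has written but not yet committed or aborted."""
    dirty = {}  # item -> set of transactions with an unfinished write on it
    for txn, op, item in schedule:
        if op in ("C", "A"):
            for writers in dirty.values():
                writers.discard(txn)  # txn's writes are no longer pending
        else:
            if dirty.get(item, set()) - {txn}:
                return False  # conflicts with another txn's pending write
            if op == "W":
                dirty.setdefault(item, set()).add(txn)
    return True

strict = [("Ta", "R", "X"), ("Tb", "R", "X"), ("Ta", "W", "X"), ("Ta", "C", None),
          ("Tb", "W", "X"), ("Tb", "R", "X"), ("Tb", "C", None)]
not_strict = [("Ta", "W", "X"), ("Tb", "R", "X"), ("Ta", "C", None), ("Tb", "C", None)]

print(is_strict(strict))      # True
print(is_strict(not_strict))  # False
```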
Cascade Schedule
Ta          Tb
-----       -----
R(X)
W(X)
            R(X)
            W(X)
commit
            commit
Cascadeless Schedule
Recoverable Schedule
In a recoverable schedule, if a transaction reads a value that has been updated by some
other transaction, then it can commit only after the commit of the transaction that
updated the value.
Ta          Tb
-----       -----
R(X)
W(X)
            R(X)
            W(X)
            R(X)
commit
            commit
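The recoverability condition can be checked in the same style. The sketch below assumes operations are encoded as (transaction, op, item) triples with op in "R", "W" or "C" (commit), and it ignores aborts for simplicity; `is_recoverable` is a hypothetical name.

```python
def is_recoverable(schedule):
    """True if every transaction commits only after the transactions
    whose uncommitted writes it has read have committed."""
    last_writer = {}  # item -> transaction that wrote it most recently
    reads_from = {}   # txn -> transactions whose uncommitted values it read
    committed = set()
    for txn, op, item in schedule:
        if op == "W":
            last_writer[item] = txn
        elif op == "R":
            writer = last_writer.get(item)
            if writer is not None and writer != txn and writer not in committed:
                reads_from.setdefault(txn, set()).add(writer)
        elif op == "C":
            if reads_from.get(txn, set()) - committed:
                return False  # txn commits before a transaction it read from
            committed.add(txn)
    return True

ok = [("Ta", "R", "X"), ("Ta", "W", "X"), ("Tb", "R", "X"), ("Tb", "W", "X"),
      ("Tb", "R", "X"), ("Ta", "C", None), ("Tb", "C", None)]
bad = [("Ta", "R", "X"), ("Ta", "W", "X"), ("Tb", "R", "X"),
       ("Tb", "C", None), ("Ta", "C", None)]

print(is_recoverable(ok))   # True
print(is_recoverable(bad))  # False
```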
Deadlock in DBMS
In a database, a deadlock is an unwanted situation in which two or more transactions wait
indefinitely for one another to give up locks. Deadlock is said to be one of the most feared
complications in a DBMS, as it can bring the whole system to a halt.
Now the main problem arises. Transaction T1 will wait for transaction T2 to give up its lock, and
similarly transaction T2 will wait for transaction T1 to give up its lock. As a consequence, all
activity comes to a halt and remains at a standstill forever unless the DBMS detects the deadlock
and aborts one of the transactions.
Deadlock Avoidance –
It is always better to avoid a deadlock than to restart or abort the database once it is stuck
in one. The deadlock avoidance method is suitable for smaller databases,
whereas the deadlock prevention method is suitable for larger databases.
One method of avoiding deadlock is to use application-consistent logic. In the example given
above, transactions that access Students and Grades should always access the tables in the
same order. In this way, in the scenario described above, transaction T1 simply waits for
transaction T2 to release the lock on Grades before it begins. When transaction T2 releases the
lock, transaction T1 can proceed freely.
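The idea of always accessing tables in the same order can be sketched with ordinary thread locks standing in for table locks. This is an illustration only: the `locks` dictionary, `with_tables` and `run_both` are hypothetical names, and alphabetical order is just one possible fixed ordering.

```python
import threading

# One lock per table; threading locks stand in for DBMS table locks.
locks = {"Grades": threading.Lock(), "Students": threading.Lock()}

def with_tables(names, work):
    """Acquire the locks for `names` in one fixed (alphabetical) order."""
    ordered = sorted(names)
    for name in ordered:
        locks[name].acquire()
    try:
        return work()
    finally:
        for name in reversed(ordered):
            locks[name].release()

def run_both():
    done = []
    # The two transactions name the tables in opposite orders, but the
    # locks are still taken in the same order, so neither can block the
    # other while holding a lock that the other needs.
    t1 = threading.Thread(target=with_tables,
                          args=(["Students", "Grades"], lambda: done.append("T1")))
    t2 = threading.Thread(target=with_tables,
                          args=(["Grades", "Students"], lambda: done.append("T2")))
    t1.start(); t2.start()
    t1.join(timeout=5); t2.join(timeout=5)
    return sorted(done)
```

With a fixed acquisition order, whichever transaction gets the first lock also gets the second, so both transactions finish.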
Another method for avoiding deadlock is to apply both a row-level locking mechanism and the
READ COMMITTED isolation level. However, this does not guarantee that deadlocks are removed
completely.
Deadlock Detection –
When a transaction waits indefinitely to obtain a lock, the database management system should
detect whether the transaction is involved in a deadlock or not.
The wait-for graph is one of the methods for detecting a deadlock situation. This method is
suitable for smaller databases. In this method, a graph is drawn based on the transactions and
their locks on resources. If the graph has a closed loop or a cycle, then there is a deadlock.
For the above-mentioned scenario, the wait-for graph is drawn below.
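Detecting a cycle in a wait-for graph can be sketched with a depth-first search. The graph encoding (a dictionary mapping each transaction to the transactions it is waiting for) and the name `has_deadlock` are assumptions for this illustration.

```python
def has_deadlock(wait_for):
    """True if the wait-for graph contains a cycle."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color = {}

    def visit(txn):
        color[txn] = GREY
        for nxt in wait_for.get(txn, ()):
            state = color.get(nxt, WHITE)
            if state == GREY:
                return True  # reached a transaction already on the path: cycle
            if state == WHITE and visit(nxt):
                return True
        color[txn] = BLACK
        return False

    return any(color.get(t, WHITE) == WHITE and visit(t) for t in list(wait_for))

# T1 waits for T2 and T2 waits for T1: a cycle, hence a deadlock.
print(has_deadlock({"T1": ["T2"], "T2": ["T1"]}))  # True
print(has_deadlock({"T1": ["T2"], "T2": []}))      # False
```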
Deadlock prevention –
For large databases, the deadlock prevention method is suitable. A deadlock can be prevented if
the resources are allocated in such a way that a deadlock never occurs. The DBMS analyzes the
operations to determine whether they can create a deadlock situation; if they can, that
transaction is never allowed to execute.
• Wait-Die Scheme –
In this scheme, if a transaction requests a resource that is locked by another transaction,
the DBMS checks the timestamps of both transactions and allows the older
transaction to wait until the resource is available.
Suppose there are two transactions T1 and T2, and let the timestamp of any transaction T be
TS(T). If T2 holds a lock on some resource and T1 requests that resource, the DBMS
performs the following actions:
It checks whether TS(T1) < TS(T2). If so, T1 is the older transaction and T2 holds the
resource, so T1 is allowed to wait until the resource is available. That is, if
a younger transaction has locked some resource and an older transaction requests it,
the older transaction is allowed to wait until it is available.
If instead T1 is the older transaction and holds the resource, and T2 requests it, then T2 is
killed and restarted later with a random delay but with the same timestamp. That is, if the
older transaction holds some resource and a younger transaction requests it, the
younger transaction is killed and restarted after a very small delay with the same timestamp.
This scheme allows the older transaction to wait but kills the younger one.
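The wait-die decision reduces to a single timestamp comparison. The sketch below assumes that a smaller timestamp means an older transaction and that timestamps are distinct; the function name `wait_die` is hypothetical.

```python
def wait_die(ts_requester, ts_holder):
    """Decide what happens to the transaction that requests a locked resource.

    Returns "wait" if the requester is older than the lock holder, and
    "die" (killed, restarted later with the same timestamp) if it is younger.
    """
    return "wait" if ts_requester < ts_holder else "die"

# T1 (timestamp 5) is older than T2 (timestamp 9).
print(wait_die(5, 9))  # T1 requests a resource held by T2: wait
print(wait_die(9, 5))  # T2 requests a resource held by T1: die
```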
The 3 activities taking place in the two phase update algorithm are:
Two-phase locking prevents deadlock in distributed systems by having a process release all the
resources it has acquired if it cannot acquire all the resources it requires without waiting
for another process to finish using a lock. This means that no process is ever in a state where
it holds some shared resources while waiting for another process to release a shared resource
that it requires, so deadlock cannot occur due to resource contention.
A transaction in the Two Phase Locking Protocol can assume one of the 2 phases:
• (i) W-timestamp(X):
This is the latest time at which the data item X was written.
• (ii) R-timestamp(X):
This is the latest time at which the data item X was read. These two timestamps are
updated each time a successful read/write operation is performed on the data item X.
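How the two timestamps are maintained can be sketched as follows. This only shows the bookkeeping described above, not the rules for rejecting operations; the class and field names are assumptions for illustration.

```python
from dataclasses import dataclass

@dataclass
class DataItem:
    value: int
    r_timestamp: int = 0  # latest time the item was read
    w_timestamp: int = 0  # latest time the item was written

    def read(self, ts):
        # A successful read moves R-timestamp(X) forward, never backward.
        self.r_timestamp = max(self.r_timestamp, ts)
        return self.value

    def write(self, ts, value):
        # A successful write moves W-timestamp(X) forward, never backward.
        self.value = value
        self.w_timestamp = max(self.w_timestamp, ts)

x = DataItem(value=10)
x.read(ts=3)
x.write(ts=5, value=20)
x.read(ts=4)
print(x.r_timestamp, x.w_timestamp)  # 4 5
```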
• (i) During the read phase, the transaction reads the database, executes the needed computations
and makes the updates to a private copy of the database values. All update operations of
the transaction are recorded in a temporary update file, which is not accessed by the remaining
transactions.
• (ii) During the validation phase, the transaction is validated to ensure that the changes made
will not affect the integrity and consistency of the database. If the validation test is positive, the
transaction goes to the write phase. If the validation test is negative, the transaction is restarted
and the changes are discarded.
• (iii) During the write phase, the changes are permanently applied to the database.
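The three phases above can be sketched with an in-memory store. The validation test used here (checking that no item read by the transaction has changed version since it was read) is an assumption chosen for illustration, and the names `Store` and `OptimisticTxn` are hypothetical.

```python
class Store:
    def __init__(self, data):
        self.data = dict(data)
        self.version = {key: 0 for key in data}  # bumped on every committed write

class OptimisticTxn:
    def __init__(self, store):
        self.store = store
        self.read_versions = {}  # version of each item when first read
        self.private = {}        # private copy of updates (read phase)

    def read(self, key):
        if key in self.private:
            return self.private[key]
        self.read_versions.setdefault(key, self.store.version[key])
        return self.store.data[key]

    def write(self, key, value):
        self.private[key] = value  # not visible to other transactions yet

    def commit(self):
        # Validation phase: fail if anything we read has changed since.
        for key, seen in self.read_versions.items():
            if self.store.version[key] != seen:
                self.private.clear()  # discard the changes
                return False
        # Write phase: apply the private updates to the database.
        for key, value in self.private.items():
            self.store.data[key] = value
            self.store.version[key] = self.store.version.get(key, 0) + 1
        return True

store = Store({"X": 10})
t1, t2 = OptimisticTxn(store), OptimisticTxn(store)
t1.write("X", t1.read("X") + 1)
t2.write("X", t2.read("X") + 5)
print(t1.commit())  # True
print(t2.commit())  # False: X changed after t2 read it
```

After a failed validation the transaction would be restarted, as described in phase (ii).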