Transaction in DBMS
Operations in Transaction-
1. Read Operation-
Read operation reads the data from the database and then stores it in the buffer in main
memory.
For example- Read(A) instruction will read the value of A from the database and will
store it in the buffer in main memory.
2. Write Operation-
Write operation writes the updated data value back to the database from the buffer.
For example- Write(A) will write the updated value of A from the buffer to the database.
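The two operations above can be sketched in a few lines of Python. This is only an illustrative model, not how a real DBMS is implemented: the dictionaries `database` and `buffer` and the functions `read` and `write` are hypothetical names chosen here, and the starting value A = 100 is an assumption.

```python
# Hypothetical in-memory model: `database` stands in for stable storage,
# `buffer` for the buffer in main memory.
database = {"A": 100}
buffer = {}

def read(item):
    """Read(item): copy the value from the database into the buffer."""
    buffer[item] = database[item]
    return buffer[item]

def write(item):
    """Write(item): copy the buffered value back to the database."""
    database[item] = buffer[item]

value = read("A")                    # buffer now holds A = 100
buffer["A"] = value - 20             # the update happens in the buffer only
in_db_before_write = database["A"]   # the database still holds the old value
write("A")                           # now the database holds the updated value
```

Until `write("A")` runs, the updated value exists only in the buffer, which is why the states below distinguish a partially committed transaction from a committed one.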
A transaction goes through many different states throughout its life cycle.
These states are called transaction states.
Transaction states are as follows-
1. Active state
2. Partially committed state
3. Committed state
4. Failed state
5. Aborted state
6. Terminated state
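1. Active State-
A transaction is in the active state while its instructions are being executed.
2. Partially Committed State-
After the last instruction of the transaction has executed, it enters a partially
committed state.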
After entering this state, the transaction is considered to be partially committed.
It is not considered fully committed because all the changes made by the transaction
are still stored in the buffer in main memory.
3. Committed State-
After all the changes made by the transaction have been successfully stored in the
database, it enters a committed state.
Now, the transaction is considered to be fully committed.
NOTE-
After a transaction has entered the committed state, it is not possible to roll back the
transaction.
In other words, it is not possible to undo the changes that have been made by the
transaction.
This is because the system has been updated to a new consistent state.
The only way to undo the changes is to carry out another transaction, called
a compensating transaction, that performs the reverse operations.
4. Failed State-
When a transaction is executing in the active state or partially committed state
and a failure occurs that makes it impossible to continue the
execution, it enters a failed state.
5. Aborted State-
After the transaction has failed and entered into a failed state, all the changes made by
it have to be undone.
To undo the changes made by the transaction, it becomes necessary to roll back the
transaction.
After the transaction has rolled back completely, it enters into an aborted state.
6. Terminated State-
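The life cycle described by the six states above can be sketched as a small state machine. This is a minimal illustration: the names `State`, `ALLOWED` and `move` are hypothetical, and the transitions into the terminated state (from committed and from aborted) are an assumption based on the usual textbook diagram, since the notes above do not spell them out.

```python
from enum import Enum

class State(Enum):
    ACTIVE = "active"
    PARTIALLY_COMMITTED = "partially committed"
    COMMITTED = "committed"
    FAILED = "failed"
    ABORTED = "aborted"
    TERMINATED = "terminated"

# Allowed transitions. The edges into TERMINATED are assumed, not taken
# from the notes.
ALLOWED = {
    State.ACTIVE: {State.PARTIALLY_COMMITTED, State.FAILED},
    State.PARTIALLY_COMMITTED: {State.COMMITTED, State.FAILED},
    State.COMMITTED: {State.TERMINATED},
    State.FAILED: {State.ABORTED},
    State.ABORTED: {State.TERMINATED},
    State.TERMINATED: set(),
}

def move(current, target):
    """Return `target` if the transition is allowed, else raise ValueError."""
    if target not in ALLOWED[current]:
        raise ValueError(f"cannot go from {current.value} to {target.value}")
    return target

state = State.ACTIVE
state = move(state, State.PARTIALLY_COMMITTED)
state = move(state, State.COMMITTED)
```

Note that there is no edge from `COMMITTED` to `ABORTED`: once committed, a transaction can only be undone by a separate compensating transaction.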
1. Atomicity-
This property ensures that either the transaction occurs completely or it does not occur
at all.
In other words, it ensures that no transaction occurs partially.
That is why it is also referred to as the “All or nothing rule”.
It is the responsibility of Transaction Control Manager to ensure atomicity of the
transactions.
2. Consistency-
3. Isolation-
This property ensures that multiple transactions can occur simultaneously without
causing any inconsistency.
During execution, each transaction behaves as if it were executing alone in the system.
A transaction is not aware that other transactions are also executing
in parallel.
Changes made by a transaction become visible to other transactions only after they
are written to memory.
The resultant state of the system after executing all the transactions is same as the
state that would be achieved if the transactions were executed serially one after the
other.
It is the responsibility of concurrency control manager to ensure isolation for all the
transactions.
4. Durability-
This property ensures that all the changes made by a transaction after its successful
execution are written successfully to the disk.
It also ensures that these changes exist permanently and are never lost even if there
occurs a failure of any kind.
It is the responsibility of recovery manager to ensure durability in the database.
NOTE-
Here,
1. T1 reads the value of A.
2. T1 updates the value of A in the buffer.
3. T2 reads the value of A from the buffer.
4. T2 writes the updated value of A.
5. T2 commits.
6. T1 fails in later stages and rolls back.
In this example,
T2 reads the dirty value of A written by the uncommitted transaction T1.
T1 fails in later stages and rolls back.
Thus, the value that T2 read is now incorrect.
Therefore, the database becomes inconsistent.
This problem occurs when a transaction reads different values of the same variable
in its different read operations, even though it has not updated the value itself.
Example-
Here,
1. T1 reads the value of X (= 10 say).
2. T2 reads the value of X (= 10).
3. T1 updates the value of X (from 10 to 15 say) in the buffer.
4. T2 again reads the value of X (but = 15).
In this example,
T2 gets to read a different value of X in its second reading.
T2 wonders how the value of X got changed because according to it, it is running in isolation.
This problem occurs when multiple transactions execute concurrently and updates from
one or more transactions get lost.
Example-
Here,
1. T1 reads the value of A (= 10 say).
2. T1 updates the value of A (to 15 say) in the buffer.
3. T2 does blind write A = 25 (write without read) in the buffer.
4. T2 commits.
5. When T1 commits, it writes A = 25 in the database.
In this example,
T1 writes the overwritten value of A to the database.
Thus, the update from T1 gets lost.
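The five steps of this example can be replayed in a short Python sketch. It assumes a single buffer shared by both transactions, as the example does; the names `database`, `buffer` and `t1_intended` are hypothetical.

```python
# Shared state: committed database contents and the main-memory buffer.
database = {"A": 10}
buffer = {}

buffer["A"] = database["A"]    # 1. T1 reads A (= 10)
buffer["A"] = 15               # 2. T1 updates A to 15 in the buffer
t1_intended = buffer["A"]      #    the value T1 expects to commit
buffer["A"] = 25               # 3. T2 performs a blind write A = 25
database["A"] = buffer["A"]    # 4. T2 commits
database["A"] = buffer["A"]    # 5. T1 commits, but the buffer now holds 25

# T1's update never reaches the database.
t1_update_lost = database["A"] != t1_intended
```

After step 5 the database holds 25, so the value 15 written by T1 is never stored.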
NOTE-
This problem occurs when a transaction reads some variable from the buffer and, when
it reads the same variable later, finds that the variable no longer exists.
Example-
Here,
1. T1 reads X.
2. T2 reads X.
3. T1 deletes X.
4. T2 tries reading X but does not find it.
In this example,
T2 finds that there does not exist any variable X when it tries reading X again.
T2 wonders who deleted the variable X because according to it, it is running in isolation.
Concurrency Control Protocols help to prevent the occurrence of the above problems and
maintain the consistency of the database.
Schedules in DBMS-
The order in which the operations of multiple transactions appear for execution is called a
schedule.
Types of Schedules-
In serial schedules,
All the transactions execute serially one after the other.
When one transaction executes, no other transaction is allowed to execute.
Characteristics-
Example-01:
In this schedule,
There are two transactions T1 and T2 executing serially one after the other.
Transaction T1 executes first.
After T1 completes its execution, transaction T2 executes.
So, this schedule is an example of a Serial Schedule.
Example-02:
In this schedule,
There are two transactions T1 and T2 executing serially one after the other.
Transaction T2 executes first.
After T2 completes its execution, transaction T1 executes.
So, this schedule is an example of a Serial Schedule.
Non-Serial Schedules-
In non-serial schedules,
Multiple transactions execute concurrently.
Operations of all the transactions are interleaved or mixed with each other.
Characteristics-
Example-01:
In this schedule,
There are two transactions T1 and T2 executing concurrently.
The operations of T1 and T2 are interleaved.
So, this schedule is an example of a Non-Serial Schedule.
Example-02:
In this schedule,
There are two transactions T1 and T2 executing concurrently.
The operations of T1 and T2 are interleaved.
So, this schedule is an example of a Non-Serial Schedule.
Consider n transactions T1, T2, T3, ..., Tn with N1, N2, N3, ..., Nn
operations respectively.
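For this setup, the standard counting argument gives the total number of schedules as (N1 + N2 + ... + Nn)! / (N1! x N2! x ... x Nn!), the number of serial schedules as n!, and the number of non-serial schedules as the difference between the two. The sketch below computes all three; the function name `count_schedules` and the sample operation counts 2, 3 and 4 are assumptions made here for illustration.

```python
from math import factorial

def count_schedules(op_counts):
    """Return (total, serial, non_serial) for transactions with the given
    numbers of operations, e.g. op_counts = [N1, N2, ..., Nn]."""
    total = factorial(sum(op_counts))
    for count in op_counts:
        # Operations inside one transaction must keep their own order.
        total //= factorial(count)
    serial = factorial(len(op_counts))   # one schedule per ordering of transactions
    return total, serial, total - serial

# Hypothetical example: three transactions with 2, 3 and 4 operations.
total, serial, non_serial = count_schedules([2, 3, 4])
```

For the sample counts this gives 1260 schedules in total, of which 6 are serial and 1254 are non-serial.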
Problem-
Solution-
Total Number of Schedules-
Serializable Schedules-
Characteristics-
Serializable schedules behave exactly the same as serial schedules.
Thus, serializable schedules are always-
Consistent
Recoverable
Cascadeless
Strict
Serial Schedules vs. Serializable Schedules-
Serial Schedules:
All the transactions necessarily execute serially one after the other.
Serial schedules lead to less resource utilization and CPU throughput.
Serial schedules are less efficient as compared to serializable schedules.
Serializable Schedules:
Multiple transactions can execute concurrently.
Serializable schedules improve both resource utilization and CPU throughput.
Serializable schedules are always better than serial schedules.
Types of Serializability-
Conflict Serializability-
If a given non-serial schedule can be converted into a serial schedule by swapping its
non-conflicting operations, then it is called a conflict serializable schedule.
Conflicting Operations-
Two operations are called conflicting operations if all the following conditions hold
true for them-
Both the operations belong to different transactions
Both the operations are on the same data item
At least one of the two operations is a write operation
Example-
Use the following steps to check whether a given non-serial schedule is conflict
serializable or not-
Step-01:
Find and list all the conflicting operations.
Step-02:
Start creating a precedence graph by drawing one node for each transaction.
Step-03:
Draw an edge for each conflict pair: if Xi(V) and Yj(V) form a conflict pair, then draw
an edge from Ti to Tj.
This ensures that Ti gets executed before Tj.
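Step-04:
Check whether the precedence graph contains a cycle.
If there is no cycle, the schedule is conflict serializable; otherwise, it is not.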
NOTE-
By performing the Topological Sort of the Directed Acyclic Graph so obtained, the
corresponding serial schedule(s) can be found.
There can be more than one such schedule.
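The precedence-graph test can be sketched in Python as follows. This is one possible implementation under stated assumptions: operations are written as strings such as `R1(A)` or `W2(B)`, the helper names `parse`, `precedence_edges` and `serial_order` are hypothetical, and the cycle check relies on `graphlib.TopologicalSorter` from the Python standard library (Python 3.9 or later). The two sample schedules are made up for illustration.

```python
from graphlib import CycleError, TopologicalSorter

def parse(op):
    """Split an operation such as 'R2(A)' into ('R', 2, 'A')."""
    kind, rest = op[0], op[1:]
    txn, item = rest.split("(")
    return kind, int(txn), item.rstrip(")")

def precedence_edges(schedule):
    """Return the set of edges (Ti, Tj) implied by conflicting operation pairs."""
    ops = [parse(op) for op in schedule]
    edges = set()
    for i, (k1, t1, x1) in enumerate(ops):
        for k2, t2, x2 in ops[i + 1:]:
            # Conflict: different transactions, same item, at least one write.
            if t1 != t2 and x1 == x2 and "W" in (k1, k2):
                edges.add((t1, t2))
    return edges

def serial_order(schedule):
    """Return one equivalent serial order, or None if the graph has a cycle."""
    graph = {parse(op)[1]: set() for op in schedule}
    for before, after in precedence_edges(schedule):
        graph[after].add(before)   # TopologicalSorter maps node -> predecessors
    try:
        return list(TopologicalSorter(graph).static_order())
    except CycleError:
        return None

# Hypothetical schedules, not taken from the problems in these notes.
ok = serial_order(["R1(A)", "W1(A)", "R2(A)", "W2(A)"])
bad = serial_order(["R1(A)", "R2(A)", "W1(A)", "W2(A)"])
```

The first schedule yields the serial order T1, T2; the second produces edges in both directions between T1 and T2, so the function returns `None`.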
Problem-01:
Solution-
Step-01:
List all the conflicting operations and determine the dependency between the
transactions-
R2(A) , W1(A) (T2 → T1)
R1(B) , W2(B) (T1 → T2)
R3(B) , W2(B) (T3 → T2)
Step-02:
Problem-02:
Check whether the given schedule S is conflict serializable and recoverable or not-
Solution-
Step-01:
List all the conflicting operations and determine the dependency between the
transactions-
R2(X) , W3(X) (T2 → T3)
R2(X) , W1(X) (T2 → T1)
W3(X) , W1(X) (T3 → T1)
W3(X) , R4(X) (T3 → T4)
W1(X) , R4(X) (T1 → T4)
W2(Y) , R4(Y) (T2 → T4)
Step-02:
Alternatively,
There exists no dirty read operation.
This is because all the transactions which update the values commit immediately.
Therefore, the given schedule S is recoverable.
Also, S is a Cascadeless Schedule.
Problem-03:
Check whether the given schedule S is conflict serializable or not. If yes, then determine
all the possible serialized schedules-
Solution-
Checking Whether S is Conflict Serializable Or Not-
Step-01:
List all the conflicting operations and determine the dependency between the
transactions-
R4(A) , W2(A) (T4 → T2)
R3(A) , W2(A) (T3 → T2)
W1(B) , R3(B) (T1 → T3)
W1(B) , W2(B) (T1 → T2)
R3(B) , W2(B) (T3 → T2)
Step-02:
All the possible topological orderings of the above precedence graph will be the possible
serialized schedules.
The topological orderings can be found by performing the Topological Sort of the above
precedence graph.
After performing the topological sort, the possible serialized schedules are-
1. T1 → T3 → T4 → T2
2. T1 → T4 → T3 → T2
3. T4 → T1 → T3 → T2
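The three orderings above can be cross-checked by brute force. The sketch below takes the edges from the dependency list in Step-01 and keeps every permutation of the transactions that respects all of them. The names `edges`, `respects` and `orders` are hypothetical, and trying all permutations is only practical for a handful of transactions.

```python
from itertools import permutations

# Edges from Step-01: (i, j) means Ti must come before Tj.
edges = {(4, 2), (3, 2), (1, 3), (1, 2)}
transactions = sorted({t for edge in edges for t in edge})

def respects(order, edges):
    """True if every edge (a, b) has a placed before b in `order`."""
    position = {t: i for i, t in enumerate(order)}
    return all(position[a] < position[b] for a, b in edges)

orders = [order for order in permutations(transactions) if respects(order, edges)]
# orders == [(1, 3, 4, 2), (1, 4, 3, 2), (4, 1, 3, 2)]
```

The result contains exactly the three serialized schedules listed above.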
Problem-04:
Determine all the possible serialized schedules for the given schedule-
Solution-
This is because we are only concerned about the read and write operations taking place
on the database.
Step-01:
List all the conflicting operations and determine the dependency between the
transactions-
R1(A) , W2(A) (T1 → T2)
R2(A) , W1(A) (T2 → T1)
W2(A) , W1(A) (T2 → T1)
R2(B) , W1(B) (T2 → T1)
R1(B) , W2(B) (T1 → T2)
W1(B) , W2(B) (T1 → T2)
Step-02:
View Serializability-
If a given schedule is found to be view equivalent to some serial schedule, then it is called a view
serializable schedule.
Consider two schedules S1 and S2 each consisting of two transactions T1 and T2.
Schedules S1 and S2 are called view equivalent if the following three conditions hold
true for them-
Condition-01:
For each data item X, if transaction Ti reads X from the database initially in schedule S1,
then in schedule S2 also, Ti must perform the initial read of X from the database.
Thumb Rule
“Initial readers must be same for all the data items”.
Condition-02:
If transaction Ti reads a data item that has been updated by the transaction Tj in
schedule S1, then in schedule S2 also, transaction Ti must read the same data item that
has been updated by the transaction Tj.
Thumb Rule
“Write-read sequence must be same”.
Condition-03:
For each data item X, if X has been updated at last by transaction Ti in schedule S1,
then in schedule S2 also, X must be updated at last by transaction Ti.
Thumb Rule
“Final writers must be same for all the data items”.
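The three conditions can be checked mechanically. The sketch below is one way to do it, under the assumption that operations are written as strings such as `R1(A)`; the names `parse`, `view_signature` and `view_equivalent` are hypothetical. For each read it records which transaction's write is being read, with `None` standing for the initial database value (Conditions 01 and 02), and it records the final writer of every data item (Condition 03).

```python
from collections import Counter

def parse(op):
    """Split an operation such as 'W2(A)' into ('W', 2, 'A')."""
    kind, rest = op[0], op[1:]
    txn, item = rest.split("(")
    return kind, int(txn), item.rstrip(")")

def view_signature(schedule):
    """Return (reads_from, final_writers) for a schedule."""
    last_writer = {}        # data item -> transaction that wrote it most recently
    reads_from = Counter()  # (reader, item, writer or None) -> number of such reads
    for kind, txn, item in map(parse, schedule):
        if kind == "R":
            reads_from[(txn, item, last_writer.get(item))] += 1
        else:
            last_writer[item] = txn
    return reads_from, last_writer

def view_equivalent(s1, s2):
    """True if both schedules have the same reads-from relation and final writers."""
    return view_signature(s1) == view_signature(s2)

# Hypothetical schedules used only to exercise the check.
s1 = ["R1(A)", "R2(B)", "W1(A)", "W2(B)"]
s2 = ["R1(A)", "W1(A)", "R2(B)", "W2(B)"]
s3 = ["R1(A)", "W1(A)", "R2(A)", "W2(A)"]
s4 = ["R2(A)", "W2(A)", "R1(A)", "W1(A)"]
```

Here `s1` and `s2` are view equivalent, while `s3` and `s4` are not: the initial reader, the write-read pair and the final writer of A all differ.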
Method-01:
Check whether the given schedule is view serializable or not.
If the given schedule is conflict serializable, then it is surely view serializable.
If the given schedule is not conflict serializable, then it may or may not be view serializable.
Thumb Rules
All conflict serializable schedules are view serializable.
A view serializable schedule may or may not be conflict serializable.
Method-02:
Thumb Rule
For a schedule that is not conflict serializable, no blind write means not a view serializable schedule.
Method-03:
Problem-01:
Check whether the given schedule S is view serializable or not-
Solution-
Step-01:
List all the conflicting operations and determine the dependency between the
transactions-
W1(B) , W2(B) (T1 → T2)
W1(B) , W3(B) (T1 → T3)
W1(B) , W4(B) (T1 → T4)
W2(B) , W3(B) (T2 → T3)
W2(B) , W4(B) (T2 → T4)
W3(B) , W4(B) (T3 → T4)
Step-02:
Problem-02:
Step-01:
List all the conflicting operations and determine the dependency between the
transactions-
R1(A) , W3(A) (T1 → T3)
R2(A) , W3(A) (T2 → T3)
R2(A) , W1(A) (T2 → T1)
W3(A) , W1(A) (T3 → T1)
Step-02:
Now,
Since the given schedule S is not conflict serializable, it may or may not be view
serializable.
To check whether S is view serializable or not, let us use another method.
Let us check for blind writes.
Now,
To check whether S is view serializable or not, let us use another method.
Let us derive the dependencies and then draw a dependency graph.
Problem-03:
Step-01:
List all the conflicting operations and determine the dependency between the
transactions-
R1(A) , W2(A) (T1 → T2)
R2(A) , W1(A) (T2 → T1)
W1(A) , W2(A) (T1 → T2)
R1(B) , W2(B) (T1 → T2)
R2(B) , W1(B) (T2 → T1)
Step-02:
Now,
Since the given schedule S is not conflict serializable, it may or may not be view
serializable.
To check whether S is view serializable or not, let us use another method.
Let us check for blind writes.
Alternatively,
Once S is known to be not conflict serializable, you could directly declare that S is not view serializable.
This is because there exists no blind write in the schedule.
Problem-04:
Check whether the given schedule S is view serializable or not. If yes, then give the
serial schedule.
S : R1(A) , W2(A) , R3(A) , W1(A) , W3(A)
Solution-
For simplicity and better understanding, we can represent the given schedule pictorially
as-
Step-01:
List all the conflicting operations and determine the dependency between the
transactions-
R1(A) , W2(A) (T1 → T2)
R1(A) , W3(A) (T1 → T3)
W2(A) , R3(A) (T2 → T3)
W2(A) , W1(A) (T2 → T1)
W2(A) , W3(A) (T2 → T3)
R3(A) , W1(A) (T3 → T1)
W1(A) , W3(A) (T1 → T3)
Step-02:
Now,
Since the given schedule S is not conflict serializable, it may or may not be view
serializable.
To check whether S is view serializable or not, let us use another method.
Let us check for blind writes.
Now,
To check whether S is view serializable or not, let us use another method.
Let us derive the dependencies and then draw a dependency graph.
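As a cross-check on the dependency-graph method, view serializability of a small schedule can also be decided by brute force: build every serial schedule of the same transactions and test it for view equivalence with S. The sketch below does this for the schedule S of this problem. The names `view_signature` and `view_equivalent_serial_orders` are hypothetical, and the approach tries all orderings, so it only suits very small examples.

```python
from collections import Counter
from itertools import permutations

# Schedule S of this problem as (operation, transaction, data item) triples.
S = [("R", 1, "A"), ("W", 2, "A"), ("R", 3, "A"), ("W", 1, "A"), ("W", 3, "A")]

def view_signature(schedule):
    """Return (reads_from, final_writers); a writer of None means the initial value."""
    last_writer, reads_from = {}, Counter()
    for kind, txn, item in schedule:
        if kind == "R":
            reads_from[(txn, item, last_writer.get(item))] += 1
        else:
            last_writer[item] = txn
    return reads_from, last_writer

def view_equivalent_serial_orders(schedule):
    """Return every transaction order whose serial schedule is view equivalent."""
    txns = sorted({txn for _, txn, _ in schedule})
    target = view_signature(schedule)
    matches = []
    for order in permutations(txns):
        # Serial schedule: all operations of each transaction, in the chosen order.
        serial = [op for t in order for op in schedule if op[1] == t]
        if view_signature(serial) == target:
            matches.append(order)
    return matches

serial_orders = view_equivalent_serial_orders(S)
```

For S the search finds exactly one order, T1 → T2 → T3: T1 performs the initial read of A, T3 reads the value written by T2, and T3 is the final writer.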
Drawing a Dependency Graph-
Non-Serializable Schedules-
Characteristics-
Non-serializable schedules-
may or may not be consistent
may or may not be recoverable
Irrecoverable Schedules-
If in a schedule,
A transaction performs a dirty read operation from an uncommitted transaction
And commits before the transaction from which it has read the value
then such a schedule is known as an Irrecoverable Schedule.
Example-
Here,
T2 performs a dirty read operation.
T2 commits before T1.
T1 fails later and rolls back.
The value that T2 read is now incorrect.
T2 cannot recover since it has already committed.
Recoverable Schedules-
If in a schedule,
A transaction performs a dirty read operation from an uncommitted transaction
And its commit operation is delayed till the uncommitted transaction either commits or rolls back
then such a schedule is known as a Recoverable Schedule.
Here,
The commit operation of the transaction that performs the dirty read is delayed.
This ensures that it still has a chance to recover if the uncommitted transaction fails later.
Example-
Method-01:
Thumb Rules
All conflict serializable schedules are recoverable.
A recoverable schedule may or may not be conflict serializable.
Method-02:
If there exists a dirty read operation, then consider the following cases-
Case-01:
If the commit operation of the transaction performing the dirty read occurs before the
commit or abort operation of the transaction which updated the value, then the schedule
is irrecoverable.
Case-02:
If the commit operation of the transaction performing the dirty read is delayed till the
commit or abort operation of the transaction which updated the value, then the schedule
is recoverable.
Thumb Rule
No dirty read means a recoverable schedule.
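Method-02 can be sketched as a single pass over the schedule. The code below is a simplified illustration, not a complete recovery checker: it assumes operations are given as (operation, transaction, data item) triples with `C` for commit and `A` for abort, it tracks only the most recent uncommitted write per data item, and the name `classify` is hypothetical. Following Case-02 above, it treats the writer's commit or abort as the point after which the reader may commit.

```python
def classify(schedule):
    """Return 'recoverable' or 'irrecoverable' for a list of
    (op, txn, item) triples; item is None for 'C' and 'A'."""
    uncommitted_writer = {}   # data item -> transaction with an uncommitted write
    depends_on = {}           # reader -> writers it has dirty-read from
    finished = set()          # transactions that have committed or aborted
    for op, txn, item in schedule:
        if op == "W":
            uncommitted_writer[item] = txn
        elif op == "R":
            writer = uncommitted_writer.get(item)
            if writer is not None and writer != txn:
                depends_on.setdefault(txn, set()).add(writer)   # dirty read
        else:
            # Case-01: committing before a writer it read from has finished.
            if op == "C" and depends_on.get(txn, set()) - finished:
                return "irrecoverable"
            finished.add(txn)
            uncommitted_writer = {
                x: t for x, t in uncommitted_writer.items() if t != txn
            }
    return "recoverable"

# T2 dirty-reads A from T1 and commits first, then T1 aborts.
early_commit = [("W", 1, "A"), ("R", 2, "A"), ("C", 2, None), ("A", 1, None)]
# T2 dirty-reads A from T1 but delays its commit until T1 has committed.
delayed_commit = [("W", 1, "A"), ("R", 2, "A"), ("C", 1, None), ("C", 2, None)]
```

`classify(early_commit)` reports an irrecoverable schedule and `classify(delayed_commit)` a recoverable one. A schedule with no dirty read never reaches the Case-01 branch, which matches the thumb rule.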
Cascading Rollback | Cascadeless Schedule
Recoverable Schedules-
If in a schedule,
A transaction performs a dirty read operation from an uncommitted transaction
And its commit operation is delayed till the uncommitted transaction either commits or rolls back
then such a schedule is called a Recoverable Schedule.
1. Cascading Schedule
2. Cascadeless Schedule
3. Strict Schedule
Cascading Schedule-
Here,
Transaction T2 depends on transaction T1.
Transaction T3 depends on transaction T2.
Transaction T4 depends on transaction T3.
In this schedule,
The failure of transaction T1 causes the transaction T2 to roll back.
The rollback of transaction T2 causes the transaction T3 to roll back.
The rollback of transaction T3 causes the transaction T4 to roll back.
Such a rollback is called a Cascading Rollback.
NOTE-
If the transactions T2, T3 and T4 had committed before the failure of transaction
T1, then the schedule would have been irrecoverable.
Cascadeless Schedule-
If in a schedule, a transaction is not allowed to read a data item until the last transaction
that has written it is committed or aborted, then such a schedule is called
a Cascadeless Schedule.
In other words,
Cascadeless schedule allows only committed read operations.
Therefore, it avoids cascading rollback and thus saves CPU time.
Example-
NOTE-
Example-
Strict Schedule-
If in a schedule, a transaction is allowed neither to read nor to write a data item until the
last transaction that has written it is committed or aborted, then such a schedule is
called a Strict Schedule.
In other words,
Strict schedule allows only committed read and write operations.
Clearly, strict schedule implements more restrictions than cascadeless schedule.
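The difference between the two definitions can be sketched as a single pass over a schedule. This is a simplified illustration under stated assumptions: operations are (operation, transaction, data item) triples with `C` for commit and `A` for abort, only the last writer of each item is tracked (as in the definitions above), and the function name `strictness` is hypothetical.

```python
def strictness(schedule):
    """Classify a schedule as 'strict', 'cascadeless' or 'neither'."""
    pending = {}   # data item -> last writer, if that writer has not finished yet
    cascadeless = strict = True
    for op, txn, item in schedule:
        if op in ("C", "A"):
            # The transaction has finished: its writes are no longer pending.
            pending = {x: t for x, t in pending.items() if t != txn}
            continue
        other = pending.get(item)
        if other is not None and other != txn:
            strict = False            # read or write on an uncommitted value
            if op == "R":
                cascadeless = False   # specifically a dirty read
        if op == "W":
            pending[item] = txn
    if strict:
        return "strict"
    return "cascadeless" if cascadeless else "neither"

# Hypothetical schedules used only to exercise the three outcomes.
committed_first = [("W", 1, "A"), ("C", 1, None), ("R", 2, "A"), ("C", 2, None)]
overwrite = [("W", 1, "A"), ("W", 2, "A"), ("C", 1, None), ("C", 2, None)]
dirty_read = [("W", 1, "A"), ("R", 2, "A"), ("C", 1, None), ("C", 2, None)]
```

`committed_first` is strict, `overwrite` is cascadeless but not strict because T2 writes A before T1 finishes, and `dirty_read` is neither.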
Example-
Remember-
Equivalence of Schedules-
In DBMS, schedules may have the following three different kinds of equivalence
relations among them-
1. Result Equivalence
2. Conflict Equivalence
3. View Equivalence
1. Result Equivalent Schedules-
If any two schedules generate the same result after their execution, then they are called
result equivalent schedules.
This equivalence relation is considered the least significant.
This is because some schedules might produce the same results for some set of values and
different results for some other set of values.
2. Conflict Equivalent Schedules-
If any two schedules satisfy the following two conditions, then they are called conflict
equivalent schedules-
1. The set of transactions present in both the schedules is same.
2. The order of pairs of conflicting operations of both the schedules is same.
Problem-01:
Let X = 2 and Y = 5.
On substituting these values, the results produced by each schedule are-
Problem-02:
Solution-