DATABASE MANAGEMENT
SYSTEM
BDA202/BDA252
Unit-5
TRANSACTION
A transaction is a set of logically related
operations. It contains a group of tasks.
A transaction is an action or series of actions
performed by a single user to access or update the
contents of the database.
Example: Suppose a bank employee transfers Rs
800 from X's account to Y's account. This small
transaction consists of several low-level tasks:
X's Account
Open_Account(X)
Old_Balance = X.balance
New_Balance = Old_Balance - 800
X.balance = New_Balance
Close_Account(X)
Y's Account
Open_Account(Y)
Old_Balance = Y.balance
New_Balance = Old_Balance + 800
Y.balance = New_Balance
Close_Account(Y)
OPERATIONS OF TRANSACTION:
Following are the main operations of transaction:
Read(X): The read operation reads the value of
X from the database and stores it in a buffer in main
memory.
Write(X): The write operation writes the value
from the buffer back to the database.
Consider a debit transaction on an account, which
consists of the following operations:
1. R(X);
2. X = X - 500;
3. W(X);
Assume the value of X before the transaction starts is
4000.
The first operation reads X's value from the database and
stores it in a buffer.
The second operation decreases the value of X by 500, so
the buffer contains 3500.
The third operation writes the buffer's value to the
database, so X's final value is 3500.
However, because of a hardware, software or power failure,
the transaction may fail before finishing all the operations
in the set.
For example: If the debit transaction above fails after
executing operation 2, X's value remains 4000 in the
database, which is not acceptable to the bank.
To solve this problem, we have two important operations:
Commit: It is used to save the work done permanently.
Rollback: It is used to undo the work done.
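The debit example and the Commit and Rollback operations can be sketched in a few lines of Python. This is only a toy in-memory model, not how a real DBMS is built: the class name `TinyTransaction`, the dictionary standing in for the database, and the choice to hold writes in the buffer until commit are all assumptions made for illustration.

```python
class TinyTransaction:
    """Toy transaction (hypothetical): work happens in a private buffer
    and reaches the database only on commit."""

    def __init__(self, database):
        self.database = database   # dict standing in for disk storage
        self.buffer = {}           # main-memory copies of data items

    def read(self, item):
        # R(X): copy the value from the database into the buffer
        self.buffer[item] = self.database[item]
        return self.buffer[item]

    def write(self, item, value):
        # W(X): in this sketch the new value stays buffered until commit
        self.buffer[item] = value

    def commit(self):
        # Save the work done permanently
        self.database.update(self.buffer)
        self.buffer.clear()

    def rollback(self):
        # Undo the work done by discarding the buffer
        self.buffer.clear()


db = {"X": 4000}

t = TinyTransaction(db)
t.write("X", t.read("X") - 500)
t.rollback()                 # failure before commit: nothing is saved
after_rollback = db["X"]

t = TinyTransaction(db)
t.write("X", t.read("X") - 500)
t.commit()                   # the debit becomes permanent
after_commit = db["X"]
```

After the rollback the stored value of X is unchanged; only the committed run changes it.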
TRANSACTION PROPERTY
A transaction has four properties. These are used to
maintain consistency in a database, before and after the
transaction.
Property of Transaction
Atomicity
Consistency
Isolation
Durability
ATOMICITY
It states that either all operations of the transaction take
place or none do; if any operation fails, the transaction is
aborted.
There is no midway, i.e., the transaction cannot occur partially.
Each transaction is treated as one unit and either runs to
completion or is not executed at all.
Atomicity involves the following two operations:
Abort: If a transaction aborts then all the changes made are
not visible.
Commit: If a transaction commits then all the changes made
are visible.
Example: Assume a transaction T that consists of two parts,
T1 and T2. Account A holds Rs 600 and account B holds Rs 300.
T transfers Rs 100 from account A to account B.
T1              T2
Read(A)         Read(B)
A := A - 100    B := B + 100
Write(A)        Write(B)
After completion of the transaction, A consists of Rs 500
and B consists of Rs 400.
If the transaction T fails after T1 completes but before
T2 completes, the amount is deducted from A but not
added to B.
This leaves the database in an inconsistent state. To
ensure the correctness of the database state, the transaction
must be executed in its entirety.
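A minimal sketch of the A to B transfer as an atomic unit, using Python's standard `sqlite3` module. The table name `account`, the helper names `transfer` and `balances`, and the `fail_midway` flag (which simulates a failure between T1 and T2) are assumptions made for this illustration.

```python
import sqlite3

def transfer(conn, src, dst, amount, fail_midway=False):
    """Move `amount` from src to dst; either both updates happen or neither."""
    try:
        conn.execute("UPDATE account SET balance = balance - ? WHERE name = ?",
                     (amount, src))
        if fail_midway:
            raise RuntimeError("simulated failure between T1 and T2")
        conn.execute("UPDATE account SET balance = balance + ? WHERE name = ?",
                     (amount, dst))
        conn.commit()        # make both updates permanent
        return True
    except RuntimeError:
        conn.rollback()      # undo the partial debit
        return False

def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM account"))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 600), ("B", 300)])
conn.commit()

transfer(conn, "A", "B", 100, fail_midway=True)
after_failure = balances(conn)   # A and B are unchanged

transfer(conn, "A", "B", 100)
after_success = balances(conn)   # A is debited and B is credited
```

In the failed run the rollback restores A to Rs 600, so the total of Rs 900 is preserved in both cases.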
CONSISTENCY
The integrity constraints are maintained so that the database
is consistent before and after the transaction.
The execution of a transaction will leave a database in either
its prior stable state or a new stable state.
The consistency property states that every
transaction sees a consistent database instance.
A transaction transforms the database from one
consistent state to another consistent state.
For example: The total amount must be the same before and
after the transaction.
Total before T occurs = 600+300=900
Total after T occurs= 500+400=900
Therefore, the database is consistent. In the case when T1 is
completed but T2 fails, then inconsistency will occur.
ISOLATION
It states that the data used during the execution of
one transaction cannot be used by a second
transaction until the first one is completed.
Under isolation, if transaction T1 is executing and
using the data item X, then X cannot be accessed by
any other transaction T2 until T1 ends.
The concurrency control subsystem of the DBMS
enforces the isolation property.
DURABILITY
The durability property indicates the permanence of
the database's consistent state. It states that the
changes made by a committed transaction are permanent.
They cannot be lost through the erroneous operation of a
faulty transaction or through a system failure. When a
transaction is completed, the database reaches a
consistent state. That consistent state cannot be lost,
even in the event of a system failure.
The recovery subsystem of the DBMS is responsible
for the durability property.
STATES OF TRANSACTION
In a database, the transaction can be in one of the
following states –
Active state
The active state is the first state of every
transaction. In this state, the transaction is being
executed.
For example: Insertion, deletion or updating of a
record is done here, but the changes are not yet
saved to the database.
Partially committed
In the partially committed state, a transaction
executes its final operation, but the data is still not
saved to the database.
In the total mark calculation example, a final display
of the total marks step is executed in this state.
Committed
A transaction is said to be in a committed state if it
executes all its operations successfully. In this state,
all the effects are now permanently saved on the
database system.
Failed state
If any of the checks made by the database recovery
system fails, then the transaction is said to be in the
failed state.
In the example of total mark calculation, if the
database is not able to fire a query to fetch the
marks, then the transaction will fail to execute.
Aborted
If any of the checks fails and the transaction has
reached the failed state, the database recovery
system makes sure that the database is returned to its
previous consistent state by aborting, or rolling back,
the transaction.
If the transaction fails partway through, all the
operations it has already executed are rolled back so that
the database returns to the consistent state it was in
before the transaction started.
After aborting the transaction, the database recovery
module selects one of two operations:
Re-start the transaction
Kill the transaction
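The five states can be modelled as a small state machine. The sketch below is an assumption-laden illustration: the state diagram is not reproduced in these notes, so the transition table follows the usual textbook diagram, and the names `State`, `TRANSITIONS` and `advance` are made up for this example. A restarted transaction is treated as a new transaction that begins again in the active state.

```python
from enum import Enum

class State(Enum):
    ACTIVE = "active"
    PARTIALLY_COMMITTED = "partially committed"
    COMMITTED = "committed"
    FAILED = "failed"
    ABORTED = "aborted"

# Allowed transitions (assumed from the usual textbook state diagram)
TRANSITIONS = {
    State.ACTIVE: {State.PARTIALLY_COMMITTED, State.FAILED},
    State.PARTIALLY_COMMITTED: {State.COMMITTED, State.FAILED},
    State.FAILED: {State.ABORTED},
    State.COMMITTED: set(),   # terminal state
    State.ABORTED: set(),     # terminal state
}

def advance(current, target):
    """Return the new state, or raise if the move is not allowed."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target

# A successful run: active -> partially committed -> committed
state = State.ACTIVE
state = advance(state, State.PARTIALLY_COMMITTED)
state = advance(state, State.COMMITTED)
```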
SCHEDULE
A schedule is a sequence of operations from a set of
transactions. It preserves the order of the operations
within each individual transaction.
SERIAL SCHEDULE
The serial schedule is a type of schedule where one
transaction is executed completely before starting another
transaction. In the serial schedule, when the first transaction
completes its cycle, then the next transaction is executed.
For example: Suppose there are two transactions T1 and
T2, each with some operations. If there is no interleaving of
operations, there are the following two possible
orders:
Execute all the operations of T1, followed by all
the operations of T2, as shown in fig(a).
Execute all the operations of T2, followed by all
the operations of T1, as shown in fig(b).
NON-SERIAL SCHEDULE
If interleaving of operations is allowed, then there will be
non-serial schedule.
It contains many possible orders in which the system can
execute the individual operations of the transactions.
In the given figures (c) and (d), Schedule C and Schedule
D are non-serial schedules. They have interleaving of
operations.
SERIALIZABLE SCHEDULE
The serializability of schedules is used to find non-serial
schedules that allow the transaction to execute
concurrently without interfering with one another.
It identifies which schedules are correct when
executions of the transaction have interleaving of their
operations.
A non-serial schedule will be serializable if its result is
equal to the result of its transactions executed serially.
TESTING OF SERIALIZABILITY
A serialization graph is used to test the serializability of
a schedule.
Assume a schedule S. For S, we construct a graph
known as the precedence graph. This graph is a pair G
= (V, E), where V is a set of vertices and E is a set
of edges. The set of vertices contains all the
transactions participating in the schedule. The set of
edges contains all edges Ti → Tj for which one of the
three conditions holds:
Create an edge Ti → Tj if Ti executes write(Q)
before Tj executes read(Q).
Create an edge Ti → Tj if Ti executes read(Q)
before Tj executes write(Q).
Create an edge Ti → Tj if Ti executes write(Q)
before Tj executes write(Q).
If a precedence graph for schedule S contains a
cycle, then S is non-serializable. If the precedence
graph has no cycle, then S is known as serializable.
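The test above can be sketched in Python. A schedule is represented here as a list of `(transaction, action, item)` tuples with action `"R"` or `"W"`; this representation, the function names, and the two sample schedules are assumptions made for illustration.

```python
def precedence_graph(schedule):
    """Return the set of edges (Ti, Tj) of the precedence graph."""
    edges = set()
    for i, (ti, act_i, item_i) in enumerate(schedule):
        for tj, act_j, item_j in schedule[i + 1:]:
            # Edge when two transactions touch the same item and at
            # least one of the two operations is a write
            if ti != tj and item_i == item_j and "W" in (act_i, act_j):
                edges.add((ti, tj))
    return edges

def has_cycle(edges):
    """Depth-first search for a cycle in the directed graph."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set())
    visiting, done = set(), set()

    def dfs(node):
        visiting.add(node)
        for nxt in graph[node]:
            if nxt in visiting or (nxt not in done and dfs(nxt)):
                return True
        visiting.discard(node)
        done.add(node)
        return False

    return any(node not in done and dfs(node) for node in graph)

# Hypothetical schedules used only to exercise the functions
cyclic = [("T1", "R", "A"), ("T2", "W", "A"), ("T2", "R", "B"), ("T1", "W", "B")]
acyclic = [("T1", "R", "A"), ("T1", "W", "A"), ("T2", "R", "A"), ("T2", "W", "A")]

cyclic_edges = precedence_graph(cyclic)      # T1 -> T2 and T2 -> T1
acyclic_edges = precedence_graph(acyclic)    # only T1 -> T2
```

`has_cycle(cyclic_edges)` is true, so the first schedule is non-serializable, while the second has no cycle and is serializable.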
PRECEDENCE GRAPH
Read(A): In T1, no subsequent writes to A, so no new edges
Read(B): In T2, no subsequent writes to B, so no new edges
Read(C): In T3, no subsequent writes to C, so no new edges
Write(B): B is subsequently read by T3, so add edge T2 → T3
Write(C): C is subsequently read by T1, so add edge T3 → T1
Write(A): A is subsequently read by T2, so add edge T1 → T2
Write(A): In T2, no subsequent reads to A, so no new edges
Write(C): In T1, no subsequent reads to C, so no new edges
Write(B): In T3, no subsequent reads to B, so no new edges
Precedence graph for schedule S1:
The precedence graph for
schedule S1 contains a cycle
that's why Schedule S1 is
non-serializable.
EXAMPLE-2
Explanation:
Read(A): In T4,no subsequent
writes to A, so no new edges
Read(C): In T4, no subsequent
writes to C, so no new edges
Write(A): A is subsequently
read by T5, so add edge T4 → T5
Read(B): In T5,no subsequent writes to B, so no new edges
Write(C): C is subsequently read by T6, so add edge T4 → T6
Write(B): B is subsequently read by T6, so add edge T5 → T6
Write(C): In T6, no subsequent reads to C, so no new edges
Write(A): In T5, no subsequent reads to A, so no new edges
Write(B): In T6, no subsequent reads to B, so no new edges
The precedence graph for schedule S2 contains no cycle.
Hence, schedule S2 is serializable.
CONFLICT SERIALIZABLE SCHEDULE
A schedule is called conflict serializable if, after
swapping non-conflicting operations, it can be
transformed into a serial schedule.
In other words, a schedule is conflict serializable if it
is conflict equivalent to a serial schedule.
Conflicting Operations
Two operations conflict if all of the following
conditions hold:
They belong to separate transactions.
They operate on the same data item.
At least one of them is a write operation.
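The three conditions translate directly into a small predicate. Operations are written here as `(transaction, action, item)` tuples, which is a representation assumed for this sketch, as is the function name `conflicts`.

```python
def conflicts(op1, op2):
    """True if the two operations conflict under the three conditions."""
    (t1, a1, x1), (t2, a2, x2) = op1, op2
    different_transactions = t1 != t2
    same_data_item = x1 == x2
    at_least_one_write = "W" in (a1, a2)
    return different_transactions and same_data_item and at_least_one_write

read_write = conflicts(("T1", "R", "A"), ("T2", "W", "A"))   # conflict
read_read = conflicts(("T1", "R", "A"), ("T2", "R", "A"))    # no write involved
other_item = conflicts(("T1", "W", "A"), ("T2", "W", "B"))   # different items
same_txn = conflicts(("T1", "R", "A"), ("T1", "W", "A"))     # same transaction
```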
Example:
Swapping is possible only if S1 and S2 are logically
equal.
NON-CONFLICT
Here, S1 = S2. That means it is non-conflict.
CONFLICT: CASE-1
Here, S1 ≠ S2. That means it is conflict.
CONFLICT EQUIVALENT
Two schedules are conflict equivalent if one can be
transformed into the other by swapping non-conflicting
operations. In the given example, S2 is conflict
equivalent to S1 (S1 can be converted to S2 by swapping
non-conflicting operations).
Two schedules are said to be conflict equivalent if and
only if:
They contain the same set of transactions.
Each pair of conflicting operations is ordered in the
same way in both schedules.
Example:
Schedule S2 is a serial schedule because, in this, all
operations of T1 are performed before starting any
operation of T2. Schedule S1 can be transformed
into a serial schedule by swapping non-conflicting
operations of S1.
AFTER SWAPPING OF NON-CONFLICT OPERATIONS, THE SCHEDULE S1 BECOMES:
T1          T2
Read(A)
Write(A)
Read(B)
Write(B)
            Read(A)
            Write(A)
            Read(B)
            Write(B)
Hence, S1 is conflict serializable.
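The two conditions for conflict equivalence can be checked mechanically. The sketch below assumes that each operation tuple `(transaction, action, item)` appears at most once in a schedule. The interleaved schedule is a hypothetical stand-in for S1, whose original table is not reproduced here, and the function names are made up for illustration.

```python
def conflict_pairs(schedule):
    """Ordered pairs (earlier, later) of conflicting operations."""
    pairs = set()
    for i, first in enumerate(schedule):
        for second in schedule[i + 1:]:
            different_txn = first[0] != second[0]
            same_item = first[2] == second[2]
            has_write = "W" in (first[1], second[1])
            if different_txn and same_item and has_write:
                pairs.add((first, second))
    return pairs

def conflict_equivalent(s1, s2):
    """Same operations, and every conflicting pair in the same order."""
    same_operations = sorted(s1) == sorted(s2)
    return same_operations and conflict_pairs(s1) == conflict_pairs(s2)

# Hypothetical interleaved schedule (assumed shape of S1)
interleaved = [("T1", "R", "A"), ("T1", "W", "A"),
               ("T2", "R", "A"), ("T2", "W", "A"),
               ("T1", "R", "B"), ("T1", "W", "B"),
               ("T2", "R", "B"), ("T2", "W", "B")]
# Serial schedule: all of T1, then all of T2
serial_t1_t2 = [op for op in interleaved if op[0] == "T1"] + \
               [op for op in interleaved if op[0] == "T2"]
# The other serial order: all of T2, then all of T1
serial_t2_t1 = [op for op in interleaved if op[0] == "T2"] + \
               [op for op in interleaved if op[0] == "T1"]

equivalent_to_t1_t2 = conflict_equivalent(interleaved, serial_t1_t2)
equivalent_to_t2_t1 = conflict_equivalent(interleaved, serial_t2_t1)
```

Under these assumptions the interleaved schedule is conflict equivalent to the serial order T1 then T2, but not to T2 then T1.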
VIEW SERIALIZABILITY
A schedule is view serializable if it is view equivalent to
a serial schedule.
If a schedule is conflict serializable, then it is also view
serializable.
A schedule that is view serializable but not conflict
serializable contains blind writes.
View Equivalent
Two schedules S1 and S2 are said to be view equivalent
if they satisfy the following conditions:
1. Initial Read
The initial read of each data item must be the same in both
schedules. Suppose there are two schedules S1 and S2. If
transaction T1 reads the initial value of data item A in S1,
then T1 should also read the initial value of A in S2.
The above two schedules are view equivalent because the
initial read operation in S1 is done by T1 and in S2 it
is also done by T1.
2. Updated Read
In schedule S1, if Ti is reading A which is updated by
Tj then in S2 also, Ti should read A which is updated
by Tj.
The above two schedules are not view equivalent because, in
S1, T3 is reading A updated by T2 and in S2, T3 is
reading A updated by T1.
3. Final Write
The final write must be the same in both
schedules. If transaction T1 performs the last
update of A in S1, then the final write of A in S2
should also be done by T1.
The above two schedules are view equivalent because the final write operation in S1 is done by
T3 and in S2, the final write operation is also done by T3.
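The three conditions can be checked together by recording, for every read, which transaction's write it sees (with `None` standing for the initial value), plus the final writer of each item. The sketch below is an illustration under assumptions: the tuple representation, the function names, and the sample schedules are hypothetical, and it distinguishes writers only by transaction, not by individual write operation.

```python
def view_profile(schedule):
    """Return (reads_from, final_writer) for a list of
    (transaction, action, item) tuples."""
    last_writer = {}      # item -> transaction that wrote it most recently
    read_count = {}       # (transaction, item) -> reads seen so far
    reads_from = set()
    for txn, action, item in schedule:
        if action == "R":
            k = read_count.get((txn, item), 0)
            read_count[(txn, item)] = k + 1
            # A writer of None means the read sees the initial value
            reads_from.add((txn, item, k, last_writer.get(item)))
        elif action == "W":
            last_writer[item] = txn
    # After the loop, last_writer holds the final write of each item
    return reads_from, last_writer

def view_equivalent(s1, s2):
    """Initial reads, updated reads and final writes must all match."""
    return view_profile(s1) == view_profile(s2)

# Hypothetical non-serial schedule with blind writes by T2 and T3
s = [("T1", "R", "A"), ("T2", "W", "A"), ("T1", "W", "A"), ("T3", "W", "A")]
# Serial schedule T1, T2, T3
s1 = [("T1", "R", "A"), ("T1", "W", "A"), ("T2", "W", "A"), ("T3", "W", "A")]
# Serial schedule T2, T1, T3: here T1 reads the value written by T2
s4 = [("T2", "W", "A"), ("T1", "R", "A"), ("T1", "W", "A"), ("T3", "W", "A")]

s_matches_s1 = view_equivalent(s, s1)
s_matches_s4 = view_equivalent(s, s4)
```

In this hypothetical case `s` is view equivalent to the serial order T1, T2, T3 even though its precedence graph has a cycle between T1 and T2, which is the blind-write situation described earlier.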
Example:
Schedule S
With 3 transactions, the total number of possible
serial schedules
= 3! = 6
S1 = <T1 T2 T3>
S2 = <T1 T3 T2>
S3 = <T2 T3 T1>
S4 = <T2 T1 T3>
S5 = <T3 T1 T2>
S6 = <T3 T2 T1>
Taking first schedule S1:
Schedule S1
Step 1: Updated Read
In both schedules S and S1, there is no read other than
the initial read, so we don't need to check this
condition.
Step 2: Initial Read
The initial read operation in S is done by T1 and in S1,
it is also done by T1.
Step 3: Final Write
The final write operation in S is done by T3 and in S1,
it is also done by T3. So, S and S1 are view Equivalent.
The first schedule S1 satisfies all three conditions, so
we don't need to check the remaining schedules.
Hence, the view equivalent serial schedule is:
T1 → T2 → T3
RECOVERABILITY OF SCHEDULE
Sometimes a transaction may not execute
completely due to a software issue, system crash or
hardware failure. In that case, the failed transaction
has to be rolled back. But some other transaction may
also have used a value produced by the failed
transaction, so we also have to roll back those
transactions.
Table 1 above shows a schedule that has two
transactions. T1 reads and writes the value of A, and that value
is then read and written by T2. T2 commits, but later on T1 fails.
Due to the failure, we have to roll back T1. T2 should also be
rolled back because it read the value written by T1, but T2
cannot be rolled back because it has already committed. This type
of schedule is known as an irrecoverable schedule.
Irrecoverable schedule: The schedule is irrecoverable if
Tj reads a value updated by Ti and Tj commits before Ti
commits.
Table 2 above shows a schedule with two transactions.
Transaction T1 reads and writes A, and that value is read and
written by transaction T2. Later on, T1 fails, so we have to
roll back T1. T2 must also be rolled back because it has
read the value written by T1. Since T2 has not committed before
T1 commits, we can roll back T2 as well. So the schedule is
recoverable with cascading rollback.
Recoverable with cascading rollback: The schedule is
recoverable with cascading rollback if Tj reads a value
updated by Ti and the commit of Tj is delayed until the commit of Ti.
Table 3 above shows a schedule with two
transactions. Transaction T1 reads and writes A and
commits, and only then is that value read and written by T2.
So this is a cascadeless recoverable schedule.
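The three kinds of schedule can be told apart by comparing the positions of reads and commits. In this sketch a schedule is a list of `(transaction, action, item)` tuples where action `"C"` marks a commit; a transaction with no `"C"` is treated as never committing (for example because it failed). The three sample schedules are reconstructed from the prose descriptions of Tables 1 to 3, so their exact shape is an assumption, as is the function name `classify`.

```python
def classify(schedule):
    """Classify a schedule by its recoverability."""
    commit_pos = {t: i for i, (t, a, _) in enumerate(schedule) if a == "C"}
    never = float("inf")
    last_writer = {}
    recoverable, cascadeless = True, True
    for i, (txn, action, item) in enumerate(schedule):
        if action == "W":
            last_writer[item] = txn
        elif action == "R":
            writer = last_writer.get(item)
            if writer is None or writer == txn:
                continue                 # initial value or own write
            writer_commit = commit_pos.get(writer, never)
            if writer_commit > i:
                cascadeless = False      # read of uncommitted data
            if commit_pos.get(txn, never) < writer_commit:
                recoverable = False      # reader committed before writer
    if not recoverable:
        return "irrecoverable"
    return "cascadeless" if cascadeless else "recoverable with cascading rollback"

table1 = [("T1", "R", "A"), ("T1", "W", "A"), ("T2", "R", "A"),
          ("T2", "W", "A"), ("T2", "C", None)]           # T1 fails afterwards
table2 = [("T1", "R", "A"), ("T1", "W", "A"), ("T2", "R", "A"),
          ("T2", "W", "A"), ("T1", "C", None), ("T2", "C", None)]
table3 = [("T1", "R", "A"), ("T1", "W", "A"), ("T1", "C", None),
          ("T2", "R", "A"), ("T2", "W", "A"), ("T2", "C", None)]

kinds = [classify(table1), classify(table2), classify(table3)]
```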
CONCURRENCY CONTROL
Concurrency control is the management procedure
required for controlling the concurrent execution of
operations on a database.
Before looking at concurrency control, we
should understand concurrent execution.
Concurrent Execution in DBMS
In a multi-user system, multiple users can access and
use the same database at the same time, which is known
as concurrent execution. It means
that different users run transactions on the same
database simultaneously.
Database transactions often require multiple users to
use the database at the same time for different
operations, and in that case the transactions are
executed concurrently.
The simultaneous execution should be done in an
interleaved manner, and no operation should affect the
other executing operations, so that the consistency of
the database is maintained. Concurrent execution of
transaction operations therefore raises several
challenging problems that need to be solved.
PROBLEMS WITH CONCURRENT EXECUTION
In a database transaction, the two main operations
are READ and WRITE. These two operations need to be
managed during the concurrent execution of
transactions because, if they are interleaved
without control, the data may become inconsistent.
The following problems occur with the concurrent
execution of operations:
PROBLEM 1: LOST UPDATE PROBLEMS
(W - W CONFLICT)
The problem occurs when two different database
transactions perform read/write operations on the
same database items in an interleaved manner (i.e.,
concurrent execution), making the values of the
items incorrect and hence the database
inconsistent.
For example:
Consider the diagram below, where two transactions
TX and TY are performed on the same account A,
whose balance is $300.
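A lost update can be reproduced with plain variables. The starting balance of $300 comes from the example; the amounts (TX debits $50, TY credits $100) and the exact interleaving are assumptions chosen for this sketch.

```python
balance = {"A": 300}

# Each transaction works on its own private copy of A
tx_local = balance["A"]          # TX reads A
tx_local -= 50                   # TX debits 50 in its buffer
ty_local = balance["A"]          # TY reads A before TX has written
ty_local += 100                  # TY credits 100 in its buffer
balance["A"] = tx_local          # TX writes its result
balance["A"] = ty_local          # TY writes, overwriting TX's update

interleaved_result = balance["A"]
serial_result = 300 - 50 + 100   # result of running TX and TY one after the other
```

The interleaved run ends with a balance that ignores TX's debit, so it differs from the result of either serial order.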
TIMESTAMP ORDERING PROTOCOL
The Timestamp Ordering Protocol orders
transactions based on their timestamps. The order of
the transactions is simply the ascending order of
their creation.
The older transaction has higher priority, which is
why it executes first. To determine the timestamp of
a transaction, this protocol uses the system time or
a logical counter.
A lock-based protocol manages the order
between conflicting pairs among transactions at
execution time, whereas timestamp-based protocols start
working as soon as a transaction is created.
Let's assume there are two transactions T1 and T2.
Suppose transaction T1 entered the system at
time 007 and transaction T2 entered the system
at time 009. T1 has the higher priority, so it executes
first because it entered the system first.
The timestamp ordering protocol also maintains the
timestamps of the last 'read' and 'write' operations on each
data item.
The basic timestamp ordering protocol works as
follows:
1. Whenever a transaction Ti issues a Read(X) operation,
check the following conditions:
If W_TS(X) > TS(Ti), the operation is rejected.
If W_TS(X) <= TS(Ti), the operation is executed and the
timestamps of the data item are updated.
2. Whenever a transaction Ti issues a Write(X) operation,
check the following conditions:
If TS(Ti) < R_TS(X), the operation is rejected.
If TS(Ti) < W_TS(X), the operation is rejected and Ti
is rolled back; otherwise, the operation is executed.
Where,
TS(Ti) denotes the timestamp of the transaction Ti.
R_TS(X) denotes the read timestamp of data item
X.
W_TS(X) denotes the write timestamp of data item
X.
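The read and write checks of the basic timestamp ordering protocol can be sketched as two functions. The class name `Item`, the string return values, and the way the timestamps are updated (R_TS(X) becomes the larger of its old value and TS(Ti); W_TS(X) becomes TS(Ti)) are assumptions made for this illustration. A rejected operation would normally also cause Ti to be rolled back, which is not modelled here.

```python
class Item:
    """A data item with its read and write timestamps."""
    def __init__(self):
        self.r_ts = 0   # R_TS(X): largest timestamp that has read X
        self.w_ts = 0   # W_TS(X): largest timestamp that has written X

def read(ts, x):
    """Transaction with timestamp ts issues Read(X)."""
    if x.w_ts > ts:
        return "rejected"            # X was written by a younger transaction
    x.r_ts = max(x.r_ts, ts)         # assumed update rule
    return "executed"

def write(ts, x):
    """Transaction with timestamp ts issues Write(X)."""
    if ts < x.r_ts or ts < x.w_ts:
        return "rejected"            # a younger transaction already used X
    x.w_ts = ts                      # assumed update rule
    return "executed"

# T1 has timestamp 7 and T2 has timestamp 9, as in the example above
x = Item()
log = [
    read(9, x),    # T2 reads X first
    write(7, x),   # T1 then tries to write X: too late
    write(9, x),   # T2 writes X
    read(7, x),    # T1 tries to read X: too late
]
```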
Advantages and Disadvantages of TO protocol:
TO protocol ensures serializability since the
precedence graph is as follows:
The TO protocol ensures freedom from deadlock because
no transaction ever waits.
However, the schedule may not be recoverable and may
not even be cascade-free.
LOCK-BASED PROTOCOL
In this type of protocol, a transaction cannot read
or write data until it acquires an appropriate lock on
it. There are two types of lock:
1. Shared lock:
It is also known as a read-only lock. Under a shared
lock, the data item can only be read by the transaction.
It can be shared between transactions because
a transaction holding a shared lock cannot
update the data item.
2. Exclusive lock:
Under an exclusive lock, the data item can be both
read and written by the transaction.
This lock is exclusive: multiple
transactions cannot modify the same data
simultaneously.
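The compatibility of shared and exclusive locks can be sketched with a small lock table. The class name `LockTable` and its methods are hypothetical. A denied request simply returns `False` here, whereas a real lock manager would make the transaction wait. Letting the sole holder of a lock upgrade it is also an assumption of this sketch.

```python
class LockTable:
    def __init__(self):
        self.locks = {}   # item -> (mode, set of holders)

    def acquire(self, txn, item, mode):
        """mode is 'S' (shared) or 'X' (exclusive). True if granted."""
        held = self.locks.get(item)
        if held is None:
            self.locks[item] = (mode, {txn})
            return True
        held_mode, holders = held
        if mode == "S" and held_mode == "S":
            holders.add(txn)          # shared locks are compatible
            return True
        if holders == {txn}:
            # The sole holder may re-request or upgrade its own lock
            new_mode = "X" if "X" in (mode, held_mode) else "S"
            self.locks[item] = (new_mode, holders)
            return True
        return False                  # conflicting request

    def release(self, txn, item):
        mode, holders = self.locks[item]
        holders.discard(txn)
        if not holders:
            del self.locks[item]

table = LockTable()
results = [
    table.acquire("T1", "A", "S"),   # granted
    table.acquire("T2", "A", "S"),   # granted: shared with T1
    table.acquire("T3", "A", "X"),   # denied: readers hold A
]
table.release("T1", "A")
table.release("T2", "A")
results.append(table.acquire("T3", "A", "X"))   # granted: A is free
results.append(table.acquire("T1", "A", "S"))   # denied: T3 holds X
```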
There are four types of lock protocols available:
1. Simplistic lock protocol
It is the simplest way of locking data during a
transaction. Simplistic lock-based protocols require
every transaction to get a lock on the data before
inserting, deleting or updating it. The data item is
unlocked after the transaction completes.
2. Pre-claiming Lock Protocol
Pre-claiming lock protocols evaluate the transaction
to list all the data items on which it needs locks.
Before initiating execution of the transaction, it
requests the DBMS for locks on all those data
items.
If all the locks are granted, this protocol allows
the transaction to begin. When the transaction is
completed, it releases all the locks.
If all the locks are not granted, this protocol
makes the transaction roll back and wait until
all the locks are granted.
3. Two-phase locking (2PL)
The two-phase locking protocol divides the
execution of a transaction into three parts.
In the first part, when the execution of the
transaction starts, it seeks permission for the locks it
requires.
In the second part, the transaction acquires all the
locks. The third part starts as soon as the
transaction releases its first lock.
In the third part, the transaction cannot demand
any new locks. It only releases the acquired locks.
There are two phases of 2PL:
Growing phase: In the growing phase, a new lock
on the data item may be acquired by the transaction,
but none can be released.
Shrinking phase: In the shrinking phase, existing
locks held by the transaction may be released, but no
new locks can be acquired.
If lock conversion is allowed, the following
can happen:
Upgrading of a lock (from S(a) to X(a)) is allowed in
the growing phase.
Downgrading of a lock (from X(a) to S(a)) must be done
in the shrinking phase.
Example:
The following shows how locking and unlocking work with
2PL.
Transaction T1:
Growing phase: from step 1-3
Shrinking phase: from step 5-7
Lock point: at 3
Transaction T2:
Growing phase: from step 2-6
Shrinking phase: from step 8-9
Lock point: at 6
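Whether a single transaction obeys 2PL, and where its lock point falls, can be checked from its sequence of lock and unlock requests. This is a sketch under assumptions: the function name `two_phase_info` is made up, and positions are counted within one transaction's own requests, not as the schedule-wide step numbers used in the example above.

```python
def two_phase_info(actions):
    """actions: ordered list of 'lock' / 'unlock' requests issued by one
    transaction. Returns (obeys_2pl, lock_point), where lock_point is the
    1-based position of the last lock request, or None."""
    unlocked = False
    lock_point = None
    for position, action in enumerate(actions, start=1):
        if action == "lock":
            if unlocked:
                return False, None    # a lock after an unlock breaks 2PL
            lock_point = position     # growing phase continues
        elif action == "unlock":
            unlocked = True           # shrinking phase has begun
    return True, lock_point

valid = two_phase_info(["lock", "lock", "lock", "unlock", "unlock"])
invalid = two_phase_info(["lock", "unlock", "lock"])
```

For the first sequence the growing phase ends at the third request, which is the lock point; the second sequence acquires a lock after releasing one, so it is not two-phase.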
DATABASE RECOVERY MANAGEMENT
Follow the notes
END OF UNIT-5