
CS3492 – DATABASE MANAGEMENT SYSTEMS

ACADEMIC YEAR 2024-2025 (EVEN)


Department of Computer Science and Engineering

UNIT-III - TRANSACTIONS

Transaction Concepts – ACID Properties – Schedules – Serializability – Transaction support in SQL –
Need for Concurrency – Concurrency control – Two Phase Locking – Timestamp – Multiversion –
Validation and Snapshot isolation – Multiple Granularity locking – Deadlock Handling – Recovery
Concepts – Recovery based on deferred and immediate update – Shadow paging – ARIES Algorithm

3.1 Transaction Concepts


A transaction is an event that occurs on the database. Generally, a transaction reads a value
from the database or writes a value to the database. In operating-system terms, a transaction is
analogous to a process. Although a transaction can both read and write the database, there are some
fundamental differences between these two classes of operations. A read operation does not change the
image of the database in any way, but a write operation, whether performed with the intention of
inserting, updating, or deleting data, changes the image of the database.

3.2 ACID Properties


Every transaction, for whatever purpose it is being used, has the following four properties.
Taking the initial letters of these four properties, we collectively call them the ACID properties. A
transaction is a unit of program execution that accesses and possibly updates various data items.
ACID Properties:
The properties of transactions are:
1) Atomicity. Either all operations of the transaction are reflected properly in the database, or
none are.
2) Consistency. Execution of a transaction in isolation (that is, with no other transaction
executing concurrently) preserves the consistency of the database.
3) Isolation. Even though multiple transactions may execute concurrently, the system
guarantees that, for every pair of transactions Ti and Tj, it appears to Ti that either Tj
finished execution before Ti started or Tj started execution after Ti finished. Thus, each
transaction is unaware of other transactions executing concurrently in the system.


4) Durability. After a transaction completes successfully, the changes it has made to the
database persist, even if there are system failures. These properties are often called the
ACID properties; the acronym is derived from the first letter of each of the four properties.

A Simple Transaction Model:


Consider a simple bank application consisting of several accounts and a set of transactions that
access and update those accounts.
Transactions access data using two operations:
 read(X), which transfers the data item X from the database to a variable, also called X, in a buffer
in main memory belonging to the transaction that executed the read operation.
 write(X), which transfers the value in the variable X in the main-memory buffer of the transaction
that executed the write to the data item X in the database.

It is important to know if a change to a data item appears only in main memory or if it has been
written to the database on disk. In a real database system, the write operation does not necessarily result
in the immediate update of the data on the disk; the write operation may be temporarily stored elsewhere
and executed on the disk later. For now, however, we shall assume that the write operation updates the
database immediately.
Let Ti be a transaction that transfers $50 from account A to account B. This transaction can be
defined as:
Ti : read(A);
A := A − 50;
write(A);
read(B);
B := B + 50;
write(B).
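
The same transfer can be simulated in a few lines of Python. This is a minimal sketch of the read/write model above, not part of the notes: the two dictionaries standing in for the disk-resident database and for Ti's main-memory buffer are illustrative assumptions.

    # Simplified "disk" state and Ti's local main-memory variables.
    database = {"A": 1000, "B": 2000}
    buffer = {}

    def read(x):
        buffer[x] = database[x]     # read(X): copy item into Ti's buffer

    def write(x):
        database[x] = buffer[x]     # write(X): copy buffered value back

    # Ti: transfer $50 from account A to account B
    read("A")
    buffer["A"] -= 50
    write("A")
    read("B")
    buffer["B"] += 50
    write("B")
    print(database)   # {'A': 950, 'B': 2050} - the sum A + B is preserved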

Let us now consider each of the ACID properties.


Consistency:
The consistency requirement here is that the sum of A and B be unchanged by the execution of
the transaction. Without the consistency requirement, money could be created or destroyed by the
transaction. It can be verified easily that, if the database is consistent before an execution of the
transaction, the database remains consistent after the execution of the transaction.
Ensuring consistency for an individual transaction is the responsibility of the application
programmer who codes the transaction. This task may be facilitated by automatic testing of integrity
constraints.


Atomicity:
Suppose that, just before the execution of transaction Ti, the values of accounts A and B are
$1000 and $2000, respectively. Now suppose that, during the execution of transaction Ti , a failure
occurs that prevents Ti from completing its execution successfully. Further, suppose that the failure
happened after the write (A) operation but before the write (B) operation. In this case, the values of
accounts A and B reflected in the database are $950 and $2000. The system destroyed $50 as a result of
this failure. In particular, we note that the sum A + B is no longer preserved. Thus, because of the
failure, the state of the system no longer reflects a real state of the world that the database is supposed to
capture. We term such a state an inconsistent state.

We must ensure that such inconsistencies are not visible in a database system. Note, however,
that the system must at some point be in an inconsistent state. Even if transaction Ti is executed to
completion, there exists a point at which the value of account A is $950 and the value of account B is
$2000, which is clearly an inconsistent state. This state, however, is eventually replaced by the
consistent state where the value of account A is $950 and the value of account B is $2050. Thus, if the
transaction either never started or was guaranteed to complete, such an inconsistent state would not be
visible except during the execution of the transaction. That is the reason for the atomicity requirement:
if the atomicity property is present, either all actions of the transaction are reflected in the database, or
none are.

The basic idea behind ensuring atomicity is this: The database system keeps track (on disk) of
the old values of any data on which a transaction performs a write. This information is written to a file
called the log. If the transaction does not complete its execution, the database system restores the old
values from the log to make it appear as though the transaction never executed. Ensuring atomicity is
the responsibility of the database system; specifically, it is handled by a component of the database
called the recovery system.

Durability:
Once the execution of the transaction completes successfully, and the user who initiated the
transaction has been notified that the transfer of funds has taken place, it must be the case that no system
failure can result in a loss of data corresponding to this transfer of funds. The durability property
guarantees that, once a transaction completes successfully, all the updates that it carried out on the
database persist, even if there is a system failure after the transaction completes execution.

We assume for now that a failure of the computer system may result in loss of data in main
memory, but data written to disk are never lost. We can guarantee durability by ensuring that either:
 The updates carried out by the transaction have been written to disk before the transaction
completes, or
 Information about the updates carried out by the transaction and written to disk is sufficient to
enable the database to reconstruct the updates when the database system is restarted after the failure.


Isolation:
Even if the consistency and atomicity properties are ensured for each transaction, if several
transactions are executed concurrently, their operations may interleave in some undesirable way,
resulting in an inconsistent state. For example, as we saw earlier, the database is temporarily
inconsistent while the transaction to transfer funds from A to B is executing, with the deducted total
written to A and the increased total yet to be written to B. If a second concurrently running transaction
reads A and B at this intermediate point and computes A + B, it will observe an inconsistent value.

Furthermore, if this second transaction then performs updates on A and B based on the
inconsistent values that it read, the database may be left in an inconsistent state even after both
transactions have completed.

3.3 Schedules

A schedule is a sequence in which the operations of one or more transactions are executed. It
preserves the order of the operations within each individual transaction.

Fig. 3.1 Types of Schedule


A. Serial Schedule
The serial schedule is a type of schedule where one transaction is executed completely before
starting another transaction. In the serial schedule, when the first transaction completes its cycle, then
the next transaction is executed.

For example: Suppose there are two transactions T1 and T2 that have some operations. If there is no
interleaving of operations, then there are the following two possible outcomes:
1. Execute all the operations of T1 followed by all the operations of T2.
2. Execute all the operations of T2 followed by all the operations of T1.
o In Fig. 3.2, Schedule A shows the serial schedule where T1 is followed by T2.


Fig. 3.2 Schedule A

o In Fig. 3.3, Schedule B shows the serial schedule where T2 is followed by T1.

Fig. 3.3 Schedule B


B. Non-serial Schedule
If interleaving of operations is allowed, the result is a non-serial schedule. There are many
possible orders in which the system can execute the individual operations of the transactions. In
Fig. 3.4 and Fig. 3.5, Schedule C and Schedule D are non-serial schedules; each has interleaving
of operations.

Fig. 3.4 Schedule C


Fig. 3.5 Schedule D


C. Serializable schedule
Serializability is used to identify non-serial schedules that allow transactions to execute
concurrently without interfering with one another. It identifies which schedules are correct when the
executions of the transactions have interleaving of their operations. A non-serial schedule is
serializable if its result is equal to the result of its transactions executed serially.
Here,
o Schedule A and Schedule B are serial schedules.
o Schedule C and Schedule D are non-serial schedules.

3.4 Serializability

Serial schedules are serializable, but if the steps of multiple transactions are interleaved, it is harder
to determine whether a schedule is serializable.
 Since transactions are programs, it is difficult to determine exactly what operations a
transaction performs and how operations of various transactions interact.
 For this reason, we shall not consider the various types of operations that a transaction can
perform on a data item, but instead consider only two operations: read and write.
 We assume that, between a read(Q) instruction and a write(Q) instruction on a data item Q, a
transaction may perform an arbitrary sequence of operations on the copy of Q that is residing
in the local buffer of the transaction. In this model, the only significant operations of a
transaction, from a scheduling point of view, are its read and write instructions. Commit
operations, though relevant, are not considered. We therefore may show only read and write
instructions in schedules.
The different forms of schedule serializability are:
1) Conflict serializability
2) View serializability


1. Conflict Serializability:
Let us consider a schedule S in which there are two consecutive instructions, and J, of
transactions Ti and Tj , respectively (i = j ). If I and J refer to different data items, then we can swap I
and J without affecting the results of any instruction in the schedule. However, if I and J refer to the
same data item Q, then the order of the two steps may matter. Since we are dealing with only read and
write instructions, there are four cases that we need to considered:
 I = read(Q), J=read(Q). The order of I and J does not matter, since the same value of Q is read by Ti and T j,
regardless of the order.
 I = read(Q), J = write(Q). If I comes before J , then Ti does not read the value of Q that is written
by Tj in instruction J . If J comes before I , then Ti reads the value of Q that is written by Tj. Thus,
the order of I and J matters.
 I = write(Q), J = read(Q). The order of I and J matters for reasons similar to those of the previous
case.
a. I = write(Q), J = write(Q). Since both instructions are write operations, the order of these
instructions does not affect either Ti or Tj . However, the value obtained by the next read(Q)
instruction of S is affected, since the result of only the latter of the two write instructions is
preserved in the database. If there is no other write(Q) instruction after I and J in S, then the order
of I and J directly affects the final value of Q in the database state that results from schedule S.

Fig. 3.6 Schedule 3 — showing only the read and write instructions

Fig. 3.7 Schedule 5 — schedule 3 after swapping of a pair of instructions

Thus, only in the case where both I and J are read instructions does the relative order of their
execution not matter. We say that I and J conflict if they are operations by different transactions on the
same data item, and at least one of these instructions is a write operation.


To illustrate the concept of conflicting instructions, we consider schedule 3. The write(A)
instruction of T1 conflicts with the read(A) instruction of T2. However, the write(A) instruction of T2
does not conflict with the read(B) instruction of T1, because the two instructions access different data
items.

Fig. 3.8 Schedule 6 — a serial schedule that is equivalent to schedule 3

Fig. 3.9 Schedule 7

Let I and J be consecutive instructions of a schedule S. If I and J are instructions of different
transactions and I and J do not conflict, then we can swap the order of I and J to produce a new schedule
S′. S′ is equivalent to S, since all instructions appear in the same order in both schedules except for I and
J, whose order does not matter.
Since the write(A) instruction of T2 in schedule 3 does not conflict with the read(B) instruction
of T1, we can swap these instructions to generate an equivalent schedule, schedule 5. Regardless of the
initial system state, schedules 3 and 5 both produce the same final system state.
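
The swapping argument can be mechanized as a precedence-graph test: draw an edge Ti → Tj whenever an instruction of Ti conflicts with a later instruction of Tj; the schedule is conflict serializable exactly when the graph is acyclic. The following Python sketch is illustrative (the schedule encoding and function names are not from the notes):

    from collections import defaultdict

    def conflict_serializable(schedule):
        # Build the precedence graph: edge Ti -> Tj for each conflicting
        # pair (same item, different transactions, at least one write)
        # where Ti's instruction appears first in the schedule.
        edges = defaultdict(set)
        for i, (ti, ai, qi) in enumerate(schedule):
            for tj, aj, qj in schedule[i + 1:]:
                if ti != tj and qi == qj and "write" in (ai, aj):
                    edges[ti].add(tj)

        # The schedule is conflict serializable iff the graph is acyclic.
        visiting, done = set(), set()

        def cyclic(t):
            if t in visiting:
                return True
            if t in done:
                return False
            visiting.add(t)
            if any(cyclic(u) for u in edges[t]):
                return True
            visiting.discard(t)
            done.add(t)
            return False

        return not any(cyclic(t) for t in list(edges))

    # An interleaving in the style of schedule 3: T1 then T2 on A, T1 then T2 on B.
    schedule_3 = [("T1", "read", "A"), ("T1", "write", "A"),
                  ("T2", "read", "A"), ("T2", "write", "A"),
                  ("T1", "read", "B"), ("T1", "write", "B"),
                  ("T2", "read", "B"), ("T2", "write", "B")]
    print(conflict_serializable(schedule_3))   # True - equivalent to <T1, T2>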

2. View Serializability:
There is another form of equivalence that is less stringent than conflict equivalence but that, like
conflict equivalence, is based on only the read and write operations of transactions.

Consider two schedules S and S′, where the same set of transactions participates in both
schedules. The schedules S and S′ are said to be view equivalent if three conditions are met:
(1) For each data item Q, if transaction Ti reads the initial value of Q in schedule S, then
transaction Ti must, in schedule S′, also read the initial value of Q.
(2) For each data item Q, if transaction Ti executes read(Q) in schedule S, and if that value was
produced by a write(Q) operation executed by transaction Tj, then the read(Q) operation of
transaction Ti must, in schedule S′, also read the value of Q that was produced by the same
write(Q) operation of transaction Tj.
(3) For each data item Q, the transaction (if any) that performs the final write(Q) operation in
schedule S must perform the final write(Q) operation in schedule S′.

Conditions 1 and 2 ensure that each transaction reads the same values in both schedules and,
therefore, performs the same computation. Condition 3, coupled with conditions 1 and 2, ensures
that both schedules result in the same final system state.
The concept of view equivalence leads to the concept of view serializability. We say that a
schedule S is view serializable if it is view equivalent to a serial schedule.
As an illustration, suppose that we take a schedule of transactions T27 and T28 and augment it
with transaction T29, obtaining the following view-serializable schedule (call it schedule 8):

T27          T28          T29
read(Q)
             write(Q)
write(Q)
                          write(Q)

Indeed, schedule 8 is view equivalent to the serial schedule <T27, T28, T29>, since the one
read(Q) instruction reads the initial value of Q in both schedules and T29 performs the final write of Q
in both schedules.
Every conflict-serializable schedule is also view serializable, but there are view-serializable
schedules that are not conflict serializable. Indeed, schedule 8 is not conflict serializable, since every
pair of consecutive instructions conflicts, and, thus, no swapping of instructions is possible.
Observe that, in schedule 8, transactions T28 and T29 perform write(Q) operations without
having performed a read(Q) operation. Writes of this sort are called blind writes. Blind writes appear in
any view-serializable schedule that is not conflict serializable.

3.5 Transaction support in SQL


The COMMIT, ROLLBACK, and SAVEPOINT commands are collectively known as transaction
commands.
(1) COMMIT: The COMMIT command is used to permanently save the changes made by a transaction
to the database. When we perform read or write operations on the database, those changes can still be
undone by a rollback; to make them permanent, we must issue a COMMIT.

(2) ROLLBACK: The ROLLBACK command is used to undo the changes of a transaction that have not
yet been saved (committed) to the database.
For example


Consider the student database table as


RollNo Name
1 AAA
2 BBB
3 CCC
4 DDD
5 EEE

The following command deletes a record from the table, but since we immediately perform a
ROLLBACK, the deletion is undone.
For instance -
DELETE FROM Student
WHERE RollNo = 2;
ROLLBACK;
Then the resultant table will be
RollNo Name
1 AAA
2 BBB
3 CCC
4 DDD
5 EEE

(3) SAVEPOINT: A SAVEPOINT is a point in a transaction to which you can later roll the transaction
back, without rolling back the entire transaction. A savepoint is created with
SAVEPOINT savepoint_name;
We can then roll back to the SAVEPOINT with
ROLLBACK TO savepoint_name;

For example, consider the Student table and the following commands:
SQL> SAVEPOINT S1;
SQL> DELETE FROM Student WHERE RollNo = 2;
SQL> SAVEPOINT S2;
SQL> DELETE FROM Student WHERE RollNo = 3;
SQL> SAVEPOINT S3;
SQL> DELETE FROM Student WHERE RollNo = 4;
SQL> SAVEPOINT S4;
SQL> DELETE FROM Student WHERE RollNo = 5;


SQL> ROLLBACK TO S3;

Then the resultant table will be

RollNo Name
1 AAA
4 DDD
5 EEE

Thus the effect of deleting the records with RollNo 4 and RollNo 5 is undone; the deletions of
RollNo 2 and RollNo 3, made before SAVEPOINT S3, remain in effect.
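
The same savepoint behaviour can be observed with Python's sqlite3 module. This is a hedged sketch, not part of the notes; it assumes an in-memory SQLite database loaded with the Student rows above.

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.isolation_level = None          # manage transactions explicitly
    cur = conn.cursor()
    cur.execute("CREATE TABLE Student (RollNo INTEGER, Name TEXT)")
    cur.executemany("INSERT INTO Student VALUES (?, ?)",
                    [(1, "AAA"), (2, "BBB"), (3, "CCC"), (4, "DDD"), (5, "EEE")])

    cur.execute("BEGIN")
    cur.execute("SAVEPOINT S1")
    cur.execute("DELETE FROM Student WHERE RollNo = 2")
    cur.execute("SAVEPOINT S2")
    cur.execute("DELETE FROM Student WHERE RollNo = 3")
    cur.execute("SAVEPOINT S3")
    cur.execute("DELETE FROM Student WHERE RollNo = 4")
    cur.execute("SAVEPOINT S4")
    cur.execute("DELETE FROM Student WHERE RollNo = 5")

    cur.execute("ROLLBACK TO S3")        # undoes the deletes of RollNo 4 and 5
    cur.execute("COMMIT")

    print(cur.execute("SELECT * FROM Student ORDER BY RollNo").fetchall())
    # [(1, 'AAA'), (4, 'DDD'), (5, 'EEE')]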

3.6 Need for Concurrency


The purposes of concurrency control are:
 To ensure isolation
 To resolve read-write or write-write conflicts
 To preserve the consistency of the database
Concurrent execution of transactions over a shared database creates several data integrity
and consistency problems. They are:
(1) Lost update problem
This problem occurs when two transactions that access the same database items have their
operations interleaved in a way that makes the value of some database item incorrect.
For example, consider the following sequence:
(1) The salary of an employee is read during transaction T1.
(2) The salary of the same employee is read by another transaction T2.
(3) During transaction T1, the salary is incremented by 200.
(4) During transaction T2, the salary is incremented by 500.

Fig. 3.10 Lost update problem


The result of the above sequence is that the update made by transaction T1 is completely lost. Therefore
this problem is called the lost update problem.
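
The interleaving in Fig. 3.10 can be simulated deterministically in Python. This sketch is illustrative; the variable names and the amounts follow the example above.

    salary = 1000
    t1_local = salary        # T1: read(salary)  -> 1000
    t2_local = salary        # T2: read(salary)  -> 1000
    t1_local += 200          # T1: increment by 200 in its own buffer
    t2_local += 500          # T2: increment by 500 in its own buffer
    salary = t1_local        # T1: write(salary) -> 1200
    salary = t2_local        # T2: write(salary) -> 1500, overwriting T1
    print(salary)            # 1500, not the expected 1700: T1's update is lost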


2) Dirty read or Uncommitted read problem

A dirty read is a situation in which one transaction reads data immediately after the write
operation of a previous, still uncommitted transaction.

Fig. 3.11 Dirty read problem

For example, consider the following transactions.

Assume the salary is initially 1000.

Fig. 3.12 Example for dirty read problem

i) At time t1, the transaction T2 updates the salary to 1200.
ii) This salary is read at time t2 by transaction T1; it reads 1200.
iii) But at time t3, the transaction T2 performs a rollback, undoing the update it made at time t1.
iv) Thus the salary again becomes 1000. The read made at time t2 (immediately after the
uncommitted update of another transaction) is a dirty read; this situation is therefore called
the dirty read or uncommitted read problem.

(3) Non-repeatable read problem


This problem is also known as the inconsistent analysis problem. It occurs when a
particular transaction sees two different values for the same row within its lifetime. For example:


Fig. 3.13 Non-repeatable read problem

i) At time t1, the transaction T1 reads the salary as 1000.
ii) At time t2, the transaction T2 reads the same salary as 1000 and updates it to 1200.
iii) Then at time t3, the transaction T2 commits.
iv) Now when the transaction T1 reads the same salary at time t4, it gets a different value from
the one it had read at time t1; transaction T1 cannot repeat its reading operation and obtain
the same result. Thus inconsistent values are obtained.
Hence this problem is called the non-repeatable read or inconsistent analysis problem.

(4) Phantom read problem


The phantom read problem is a special case of the non-repeatable read problem. It occurs
when one transaction makes changes to the database such that another transaction can no longer
read a data item that it read just recently.
For example -

Fig. 3.14 Phantom read problem

i) At time t1, the transaction T1 reads the value of the salary as 1000.
ii) At time t2, the transaction T2 reads the value of the same salary as 1000.
iii) At time t3, the transaction T1 deletes the salary data item.
iv) Now at time t4, when T2 again reads the salary, it gets an error. Transaction T2 cannot
determine why it can no longer find the salary value that it read just a short time earlier.
This problem occurs due to changes in the database and is called the phantom read problem.


3.7 Concurrency control


Definition of concurrency control: A mechanism that ensures that the simultaneous execution of
more than one transaction does not lead to any database inconsistencies is called a concurrency control
mechanism.
Concurrency control can be achieved with the help of various protocols, such as lock-based
protocols, deadlock handling, multiple granularity locking, timestamp-based protocols, and
validation-based protocols.

3.8 Two Phase Locking


A transaction is said to follow the Two-Phase Locking Protocol if all of its lock operations
precede its first unlock operation. This locking protocol divides the execution of a transaction into
three parts. In the first part, when the transaction starts executing, it seeks permission for the locks it
requires. The second part is where the transaction acquires all the locks. As soon as the transaction
releases its first lock, the third part starts; in this part, the transaction cannot demand any new locks,
it only releases the acquired locks.

Fig. 3.15 Two-Phase Locking 2PL


Two-phase locking has two phases: a growing phase, in which the transaction acquires locks,
and a shrinking phase, in which the locks held by the transaction are released. During the growing
phase, a transaction may also acquire a shared (read) lock and later upgrade it to an exclusive
(write) lock.

The two phase locking is a protocol in which there are two phases:
i) Growing phase (Locking phase): It is a phase in which the transaction may obtain locks but does
not release any lock.
ii) Shrinking phase (Unlocking phase): It is a phase in which the transaction may release the locks but
does not obtain any new lock.
There are 3 types of two – phase locking protocol. They are,
1. Strict Two – Phase Locking Protocol
2. Rigorous Two – Phase Locking Protocol
3. Conservative Two – Phase Locking Protocol


Strict Two-Phase Locking Protocol


The first phase of Strict-2PL is the same as in 2PL. But in contrast to basic 2PL, Strict-2PL does
not release an exclusive lock after using it: it holds all its exclusive (write) locks until the transaction
commits, and releases them all at once. As a result, Strict-2PL avoids the cascading aborts that basic
2PL can suffer from.

Fig. 3.16 Strict Two-Phase Locking

Rigorous Two-Phase Locking


 Rigorous Two-Phase Locking Protocol avoids cascading rollbacks.
 This protocol requires that all shared and exclusive locks be held until the transaction
commits.
Conservative Two-Phase Locking Protocol
 Conservative Two-Phase Locking Protocol is also called Static Two-Phase Locking
Protocol.
 This protocol is almost free from deadlocks, as all required data items are declared in advance.
 It requires locking of all data items to be accessed before the transaction starts.

Example:
Consider following transactions
T1 T2
Lock-X(A) Lock-S(B)
Read(A) Read(B)
A=A-50 Unlock-S(B)
Write(A)
Lock-X(B)
Unlock-X(A)
B=B+100 Lock-S(A)
Write(B) Read(A)
Unlock-X(B) Unlock-S(A)

The important rule for two-phase locking is that all lock operations precede all unlock
operations. In the transactions above, T1 is in two-phase locking mode, but transaction T2 is not. In
T2, a shared lock is acquired on data item B, B is read, and the lock is released; then a shared lock is
acquired on data item A, A is read, and that lock is released. This gives a lock-unlock-lock-unlock
sequence, which is clearly not possible in two-phase locking.
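
The lock-point rule can be enforced mechanically. The following Python sketch (a hypothetical helper class, not from the notes) rejects any lock request that arrives after the transaction's first unlock, which is exactly what T2 violates:

    class TwoPhaseTransaction:
        def __init__(self, name):
            self.name = name
            self.locks = set()
            self.shrinking = False   # becomes True at the first unlock

        def lock(self, item):
            if self.shrinking:
                raise RuntimeError(f"{self.name}: 2PL violated - "
                                   f"lock({item}) after an unlock")
            self.locks.add(item)

        def unlock(self, item):
            self.shrinking = True    # the lock point has passed
            self.locks.discard(item)

    # T2 from the example above violates two-phase locking:
    t2 = TwoPhaseTransaction("T2")
    t2.lock("B")
    t2.unlock("B")
    try:
        t2.lock("A")                 # second lock arrives after an unlock
    except RuntimeError as e:
        print(e)                     # T2: 2PL violated - lock(A) after an unlock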

3.9 Timestamp

Timestamp-based Protocols
The most commonly used concurrency protocol is the timestamp-based protocol. This protocol uses
either the system time or a logical counter as a timestamp. Lock-based protocols manage the order between
the conflicting pairs among transactions at the time of execution, whereas timestamp-based protocols
start working as soon as a transaction is created. Every transaction has a timestamp associated with it,
and the ordering is determined by the age of the transaction.

A transaction created at clock time 0002 would be older than all transactions that come after
it. For example, any transaction 'y' entering the system at 0004 is two seconds younger, and priority
is given to the older one. In addition, every data item carries the latest read timestamp and write
timestamp. This lets the system know when the last read and write operations were performed on the
data item.
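
A minimal sketch of the basic timestamp-ordering checks follows. The class and function names are illustrative, not from the notes: each data item keeps its largest read and write timestamps, and an operation that arrives "too late" forces the requesting transaction to roll back.

    class Item:
        def __init__(self):
            self.r_ts = 0    # largest timestamp of any successful read
            self.w_ts = 0    # largest timestamp of any successful write

    def read(item, ts):
        if ts < item.w_ts:
            raise Exception("rollback: item was already overwritten "
                            "by a younger transaction")
        item.r_ts = max(item.r_ts, ts)

    def write(item, ts):
        if ts < item.r_ts or ts < item.w_ts:
            raise Exception("rollback: write arrives too late")
        item.w_ts = ts

    q = Item()
    read(q, ts=2)      # an older transaction reads Q
    write(q, ts=4)     # a younger transaction then writes Q
    try:
        read(q, ts=3)  # TS 3 < W-timestamp 4: too late, must roll back
    except Exception as e:
        print(e)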

3.10 Multiversion
The DBMS maintains multiple physical versions of a single logical object in the database.
 When a transaction writes to an object, the database creates a new version of that object.
 When a transaction reads an object, it reads the newest version that existed when the
transaction started.
 In this technique, writers do not block readers, and readers do not block writers.
 This scheme makes use of:
i) a locking protocol, and
ii) a timestamp protocol.
 Multiversioning is now used in almost all database management systems as the modern
technique of concurrency control.
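
A minimal multiversion sketch in Python, under the assumption that versions are tagged with write timestamps and a reader sees the newest version no later than its start timestamp (the class and names are illustrative, not from the notes):

    import bisect

    class MVObject:
        def __init__(self, initial):
            self.versions = [(0, initial)]      # (timestamp, value), sorted

        def write(self, ts, value):
            bisect.insort(self.versions, (ts, value))  # append a new version

        def read(self, start_ts):
            # Newest version with timestamp <= the reader's start timestamp.
            i = bisect.bisect_right(self.versions, (start_ts, float("inf")))
            return self.versions[i - 1][1]

    a = MVObject(1000)
    a.write(ts=5, value=1200)
    print(a.read(start_ts=3))   # 1000 - reader started before the write
    print(a.read(start_ts=7))   # 1200 - reader sees the newer version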


3.11 Validation and Snapshot isolation


Validation Based Protocol
The optimistic concurrency control algorithm is basically a validation-based protocol. It works in
three phases:
 Read phase:
In this phase, transaction T reads the values of various data items and stores them in
temporary variables. All operations are then performed on these temporary variables without updating
the actual database.
 Validation phase:
In this phase, the temporary variable values are validated against the actual data in the database,
and it is checked whether transaction T preserves serializability.
 Write phase:
Only if transaction T is validated are the temporary results written to the database; otherwise,
the system rolls the transaction back. Each transaction has the following timestamps:
- Start(Ti): the timestamp when Ti starts its execution.
- Validation(Ti): the timestamp when Ti finishes its read phase and starts its validation phase.
- Finish(Ti): the timestamp when Ti finishes its write phase.

With the help of the timestamp of the validation phase, this protocol determines whether the
transaction will commit or roll back; hence TS(Ti) = Validation(Ti). Serializability is determined at
validation time and cannot be determined in advance. While executing transactions, this protocol
provides a greater degree of concurrency when there are fewer conflicts, because the serializability
order is not pre-decided (transactions are validated and then written) and relatively few transactions
have to be rolled back.
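
A minimal sketch of the validation test follows. It assumes we track each transaction's read and write sets; Ti fails validation if a transaction that committed while Ti was running wrote something Ti read. This is a simplification of the full test, and the names are illustrative, not from the notes.

    def validate(ti_read_set, overlapping_commits):
        """overlapping_commits: write sets of transactions that committed
        between Start(Ti) and Validation(Ti)."""
        for tj_write_set in overlapping_commits:
            if ti_read_set & tj_write_set:
                return False          # conflict detected: roll Ti back
        return True                   # serializability preserved: write phase

    # Ti read {A, B}; a concurrent transaction committed a write to B.
    print(validate({"A", "B"}, [{"B"}]))   # False - Ti must be rolled back
    print(validate({"A"}, [{"B"}]))        # True  - Ti may commit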

Snapshot Isolation
Snapshot isolation is a multiversion concurrency control technique. In snapshot isolation,
we can imagine that each transaction is given its own version, or snapshot, of the database when it
begins. It reads data from this private version and is thus isolated from the updates made by other
transactions. If the transaction updates the database, the update appears only in its own version, not in
the actual database itself. Information about these updates is saved so that they can be applied to
the "real" database if the transaction commits. When a transaction T enters the partially committed state,
it proceeds to the committed state only if no other concurrent transaction has modified data that T
intends to update; transactions that consequently cannot commit abort instead. Snapshot isolation
ensures that attempts to read data never need to wait.


3.12 Multiple Granularity locking

Multiple granularity locking is a locking mechanism that provides different levels of locks for
different database objects. It allows for different locks at different levels of granularity. This mechanism
allows multiple transactions to lock different levels of granularity, ensuring that conflicts are minimized,
and concurrency is maximized.
For example:
Consider a tree with four levels of nodes. The first (highest) level represents the entire
database. The second level consists of nodes of type area; the database consists of exactly these areas.
Each area has child nodes called files; no file is present in more than one area. Finally, each file has
child nodes called records; a file consists of exactly the records that are its children, and no record is
present in more than one file. Hence, the levels of the tree, starting from the top, are:
- Database
- Area
- File
- Record

Fig. 3.17 Multiple Granularity Tree Hierarchy

There are three additional lock modes with multiple granularity:
i. Intention-Shared (IS): indicates explicit locking at a lower level of the tree, but only with
shared locks.
ii. Intention-Exclusive (IX): indicates explicit locking at a lower level with exclusive or
shared locks.
iii. Shared & Intention-Exclusive (SIX): the node itself is locked in shared mode, and some
lower-level node(s) will be explicitly locked in exclusive mode by the same transaction.
The compatibility matrix for these lock modes is:

      IS   IX   S    SIX  X
IS    YES  YES  YES  YES  NO
IX    YES  YES  NO   NO   NO
S     YES  NO   YES  NO   NO
SIX   YES  NO   NO   NO   NO
X     NO   NO   NO   NO   NO
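
The matrix can be encoded directly: a requested mode is granted on a node only if it is compatible with every mode already held there by other transactions. A minimal illustrative sketch (not from the notes):

    COMPATIBLE = {
        "IS":  {"IS": True,  "IX": True,  "S": True,  "SIX": True,  "X": False},
        "IX":  {"IS": True,  "IX": True,  "S": False, "SIX": False, "X": False},
        "S":   {"IS": True,  "IX": False, "S": True,  "SIX": False, "X": False},
        "SIX": {"IS": True,  "IX": False, "S": False, "SIX": False, "X": False},
        "X":   {"IS": False, "IX": False, "S": False, "SIX": False, "X": False},
    }

    def can_grant(requested, held_modes):
        # Grant `requested` only if compatible with every lock already held.
        return all(COMPATIBLE[requested][h] for h in held_modes)

    print(can_grant("IX", ["IS"]))   # True  - intention locks coexist
    print(can_grant("X",  ["IS"]))   # False - X conflicts with everything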


Advantages of Multiple Granularity Locking


Multiple granularity locking has several advantages over other locking mechanisms. They are:
i. Increased concurrency: multiple transactions can access different levels of granularity
concurrently, thereby increasing concurrency.
ii. Reduced locking overhead: locks can be set at different levels of granularity, reducing
the locking overhead.
iii. Improved performance: finer-grained control over locks improves performance by
reducing the number of conflicts and deadlocks.

3.13 Deadlock Handling

Deadlock is a situation in computing where two or more processes are unable to proceed because
each is waiting for the other to release resources.
For example, assume we have two processes P1 and P2. Process P1 is holding resource R1 and
is waiting for resource R2. At the same time, process P2 is holding resource R2 and is waiting for
resource R1. So process P1 is waiting for process P2 to release its resource and, at the same time,
process P2 is waiting for process P1 to release its resource, and neither releases anything. Both wait
for each other forever and no work is done. This is called a deadlock.

Fig. 3.18 Dead Lock


Necessary Conditions for Deadlock
There are four conditions that result in deadlock. They are also known as the Coffman
conditions, and they are not mutually exclusive. Deadlock happens only if all four of the following
conditions hold simultaneously.

i. Mutual Exclusion:
A resource can be held by only one process at a time. In other words, if a process P1 is using
some resource R at a particular instant of time, then some other process P2 can't hold or use the same
resource R at that particular instant of time. The process P2 can make a request for that resource R but it
can't use that resource simultaneously with process P1.


Fig. 3.19 Mutual Exclusion


ii. Hold and Wait:
A process can hold a number of resources at a time and at the same time, it can request for other
resources that are being held by some other process. For example, a process P1 can hold two resources
R1 and R2 and at the same time, it can request some resource R3 that is currently held by process P2.

Fig. 3.20 Hold and Wait


iii. No preemption:
A resource cannot be forcefully preempted from a process by another process. For example, if a
process P1 is using some resource R, then another process P2 cannot forcefully take that resource
away; P2 can only request the resource R and wait for it to be freed by P1.

iv. Circular Wait:


Circular wait is a condition in which the first process is waiting for a resource held by the second
process, the second process is waiting for a resource held by the third process, and so on, until the
last process is waiting for a resource held by the first process. Every process is thus waiting for
another to release a resource, and none releases its own; this is called a circular wait.

Fig. 3.21 Circular Wait
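
Circular wait can be detected by looking for a cycle in the wait-for graph. A minimal illustrative sketch (not from the notes) using depth-first search:

    def has_cycle(wait_for):
        """wait_for: dict mapping a process to the processes it waits on."""
        visiting, done = set(), set()

        def dfs(p):
            if p in visiting:
                return True               # found a cycle: deadlock
            if p in done:
                return False
            visiting.add(p)
            if any(dfs(q) for q in wait_for.get(p, [])):
                return True
            visiting.discard(p)
            done.add(p)
            return False

        return any(dfs(p) for p in list(wait_for))

    # P1 waits for P2's resource and P2 waits for P1's: deadlock.
    print(has_cycle({"P1": ["P2"], "P2": ["P1"]}))   # True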


3.14 Recovery Concepts
Recovery
Database recovery is the process of restoring the database to a correct (consistent) state in the
event of a failure. In other words, it is the process of restoring the database to the most recent consistent
state that existed shortly before the time of system failure.


When a system with concurrent transactions crashes and recovers, it behaves in the following
manner:

Fig. 3.22 Recovery

The recovery system reads the logs backwards from the end to the last checkpoint.
 It maintains two lists: an undo-list and a redo-list.
 If the recovery system sees a log with <Tn, Start> and <Tn, Commit>, or just <Tn, Commit>, it
puts the transaction in the redo-list.
 If the recovery system sees a log with <Tn, Start> but no commit or abort record, it puts the
transaction in the undo-list.
 All transactions in the undo-list are then undone and their logs are removed. All transactions
in the redo-list are redone from their log records, and the log is saved.
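
This classification is easy to express in code. A minimal sketch, assuming the log is a simple list of (transaction, action) records:

    log = [
        ("T1", "Start"), ("T1", "Commit"),
        ("T2", "Start"),                     # crashed before commit or abort
    ]

    started, finished = set(), set()
    for txn, action in log:
        if action == "Start":
            started.add(txn)
        elif action in ("Commit", "Abort"):
            finished.add(txn)

    redo_list = started & finished           # transactions to redo
    undo_list = started - finished           # transactions to undo
    print(redo_list, undo_list)              # {'T1'} {'T2'}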

3.15 Recovery based on deferred and immediate update

Crash Recovery
Even in a highly advanced technological era, failures are expected, though they are not always
acceptable.

A DBMS is a highly complex system with hundreds of transactions being executed every second.
Its availability depends on its complex architecture and the underlying hardware and system software.
If it fails or crashes while transactions are being executed, the system is expected to follow some
algorithm or technique to recover from the crash or failure.

Failure Classification
To see where a problem has occurred, we generalize failures into the following categories:
1) Transaction Failure
When a transaction fails to execute, or reaches a point beyond which it cannot complete
successfully, it has to abort. This is called transaction failure. Only a few transactions or processes
are affected.


Reasons for transaction failure include:

 Logical errors: a transaction cannot complete because of a code error or some internal error
condition.
 System errors: the database system itself terminates an active transaction because the DBMS is
not able to execute it, or it has to stop because of some system condition. For example, in case of
deadlock or resource unavailability, the system aborts an active transaction.

2) System Crash
There are problems external to the system that may cause it to stop abruptly and crash, for
example an interruption in the power supply or a failure of the underlying hardware or software.
3) Disk Failure
In the early days of technology evolution, it was a common problem that hard disk drives or
storage drives failed frequently. Disk failures include the formation of bad sectors, the disk becoming
unreachable, a disk head crash, or any other failure that destroys all or part of the disk storage.
Storage Structure
The storage structure can be divided into the following categories:
 Volatile storage: as the name suggests, this storage does not survive system crashes. It is
usually placed very close to the CPU, often embedded on the chipset itself; examples are main
memory and cache memory. It is fast but can store only a small amount of information.
 Nonvolatile storage: these memories are made to survive system crashes. They are huge in data
storage capacity but slower to access. Examples include hard disks, magnetic tapes,
flash memory, and non-volatile (battery-backed) RAM.

Recovery and Atomicity


When a system crashes, it may have several transactions being executed and various files
opened for them to modify data items. Transactions are made of various operations, which are atomic
in nature, but according to the ACID properties of a DBMS, the atomicity of the transaction as a
whole must be maintained: either all of its operations are executed, or none.
When a DBMS recovers from a crash, it should do the following:
 Check the states of all transactions that were being executed.
 A transaction may be in the middle of some operation; the DBMS must ensure the atomicity of
the transaction in this case.
 Check whether each transaction can be completed now or needs to be rolled back.
 Allow no transaction to leave the DBMS in an inconsistent state.


 Maintain the logs of each transaction, and write them onto some stable storage before
actually modifying the database.
 Maintain shadow paging, where changes are made on volatile memory first and the actual
database is updated later.

Log-Based Recovery
A log is a sequence of records that maintains a record of the actions performed by a transaction.
It is important that the logs be written prior to the actual modification and stored on a stable,
failsafe storage medium. Log-based recovery works as follows:
- The log file is kept on stable storage media.
- When a transaction enters the system and starts execution, it writes a log record about it: <Tn, Start>.
- When the transaction modifies an item X, it writes a log record <Tn, X, V1, V2>, which reads:
Tn has changed the value of X from V1 to V2.
- When the transaction finishes, it logs: <Tn, Commit>.

The database can be modified using two approaches:

1) Deferred database modification: all logs are written to stable storage, and the database is
updated only when the transaction commits.
2) Immediate database modification: each log record is followed by the actual database
modification; that is, the database is modified immediately after every operation.
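
A minimal sketch of immediate modification with write-ahead logging follows: the log record <Tn, X, V1, V2> is appended before X is changed, so an uncommitted transaction can later be undone from the log. The dictionary-based "database" is an illustrative assumption, not part of the notes.

    db = {"A": 1000, "B": 2000}
    log = []

    def tx_write(txn, item, new_value):
        log.append((txn, item, db[item], new_value))  # <Tn, X, V1, V2> first
        db[item] = new_value                          # then modify immediately

    def undo(txn):
        for t, item, old, new in reversed(log):       # scan log backwards
            if t == txn:
                db[item] = old                        # restore the old value

    tx_write("T1", "A", 950)    # transfer in progress...
    undo("T1")                  # crash before commit: roll back from the log
    print(db)                   # {'A': 1000, 'B': 2000}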

3.16 Shadow paging


Shadow paging is a recovery scheme in which the database is considered to be made up of a
number of fixed-size disk pages. A directory (or page table) with n entries is constructed, in which
the ith entry points to the ith database page on disk.

Fig. 3.23 Shadow paging


The directory can be kept in main memory. When a transaction begins executing, the current
directory, whose entries point to the most recent or current database pages on disk, is copied into a
directory called the shadow directory. The shadow directory is then saved on disk while the current
directory is used by the transaction. During the execution of the transaction, the shadow directory is
never modified.

When a write operation is performed, a new copy of the modified database page is created, but
the old copy of the page is never overwritten; the newly created page is written somewhere else. The
current directory is updated to point to the newly modified database page, while the shadow directory
still points to the old database pages on disk. When a failure occurs, the modified database pages and
the current directory are discarded. The state of the database before the failure is available through
the shadow directory, and this state can be recovered using the shadow directory's pages. This
technique requires no UNDO/REDO operations.
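
A minimal sketch of the copy-on-write mechanism follows (the page identifiers and dictionaries are illustrative assumptions): writes go to fresh pages referenced only by the current directory, so recovery simply reinstates the shadow directory.

    pages = {0: "A=1000", 1: "B=2000"}      # "disk" pages, keyed by page id
    current = {0: 0, 1: 1}                  # page table: page no -> page id
    shadow = dict(current)                  # frozen copy at transaction start

    def write_page(page_no, data):
        new_id = max(pages) + 1             # never overwrite the old page
        pages[new_id] = data
        current[page_no] = new_id           # only the current directory moves

    write_page(0, "A=950")
    # On commit: atomically make `current` the new shadow directory.
    # On failure: discard `current`; `shadow` still names the old pages.
    current = dict(shadow)                  # simulate recovery after a crash
    print(pages[current[0]])                # A=1000 - no UNDO/REDO needed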

3.17 ARIES Algorithm


The Algorithm for Recovery and Isolation Exploiting Semantics (ARIES) is based on the Write-
Ahead Log (WAL) protocol. The ARIES recovery algorithm is an important method for maintaining
data integrity and consistency in a DBMS, particularly after a system crash.

Key Features of the ARIES Recovery Algorithm

i. Write-Ahead Logging (WAL): ensures that all changes are logged before they are
applied to the database.
ii. Checkpointing: creates a stable point in the database from which recovery can start.
iii. Three phases of recovery: the Analysis, Redo, and Undo phases together ensure
database recovery.

ARIES Recovery Algorithm Phases


The recovery process consists of three phases:
1. Analysis Phase
The Analysis Phase identifies the database's state and determines which log entries need to be
processed for recovery.
Example: Suppose we have a log with the following entries:
1. <T1, Start>
2. <T1, Write(A, 10)>
3. <T1, Commit>
4. <T2, Start>
5. <T2, Write(B, 20)>
6. <T2, Abort>


During the Analysis Phase:


 The system reads the log to identify active transactions (T2) and committed transactions (T1).
 It also records the state of database pages that were modified by active transactions.
Steps:
1. Scan the log to identify transactions and their states.
2. Create a list of dirty pages (pages that have been modified but not yet written to disk).
2. Redo Phase
The Redo Phase reapplies changes recorded in the log so that the database reflects the logged
updates; this is often described as repeating history.
Example: Using the log from the previous example, the Redo Phase will:
 Reapply the change made by T1 to page A; since T1 committed, this change must be present in
the database.
 In full ARIES, even the updates of an aborted transaction such as T2 are redone during this
phase; the Undo Phase is then responsible for reversing them.
Steps:
1. Replay the logged actions to repeat history.
2. Ensure all modifications are applied to the database pages as recorded in the log.

3. Undo Phase
The Undo Phase reverses changes made by aborted transactions, restoring the database to a
consistent state by undoing any modifications from these transactions.
Example: From the log, we need to undo changes made by T2.
Steps:
1. Identify all operations from aborted transactions.
2. Apply undo operations to revert changes made by these transactions.
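
The three phases can be put together in a greatly simplified sketch, with no LSNs, dirty-page table, or compensation log records; the log follows the example above, and the record layout is an illustrative assumption.

    log = [
        ("T1", "Start", None), ("T1", "Write", ("A", 0, 10)), ("T1", "Commit", None),
        ("T2", "Start", None), ("T2", "Write", ("B", 0, 20)), ("T2", "Abort", None),
    ]
    db = {"A": 0, "B": 0}

    # 1. Analysis: find winners (committed) and losers (no commit record).
    committed = {t for t, a, _ in log if a == "Commit"}
    losers = {t for t, a, _ in log if a == "Start"} - committed

    # 2. Redo: repeat history - reapply every logged write, winners and losers.
    for t, a, rec in log:
        if a == "Write":
            item, old, new = rec
            db[item] = new

    # 3. Undo: roll back the losers' writes in reverse log order.
    for t, a, rec in reversed(log):
        if a == "Write" and t in losers:
            item, old, new = rec
            db[item] = old

    print(db)   # {'A': 10, 'B': 0} - T1 redone, T2 undone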
