DBMS Question Bank - Unit 3
UNIT-III
PART-A
4. List the responsibilities a DBMS has whenever a transaction is submitted to the system for
execution. (CO3) (NOV/DEC, 2019)
Whenever a transaction is submitted to the system for execution, the DBMS is responsible for the
following:
Begin the transaction.
Execute a set of data manipulations and/or queries.
If no errors occur then commit the transaction and end it.
If errors occur then roll back the transaction and end it.
5. Briefly state any two violations that may occur if a transaction executes at a lower isolation
level than Serializable. (CO3) (NOV/DEC, 2019)
The violations are,
Lost updates
Dirty read (or uncommitted data)
Unrepeatable read (or inconsistent retrievals)
6. State the difference between a shared lock and an exclusive lock. (CO3) (APR/MAY, 2018)
S.No.  Shared Lock                                  Exclusive Lock
1      Used when the transaction wants to           Used when the transaction wants to
       perform a read operation.                    perform both read and write operations.
2      Multiple shared locks can be set on a        Only one exclusive lock can be placed
       data item simultaneously.                    on a data item at a time.
3      With a shared lock, the data item can        With an exclusive lock, data can be
       only be viewed.                              inserted, updated or deleted.
9. List out the timestamp based deadlock prevention schemes. (CO3) (APR/MAY, 2022)
A timestamp is a unique identifier created by the DBMS to identify a transaction.
Timestamp based deadlock prevention schemes :
Wait_Die schemes: An older transaction is allowed to wait for a younger transaction,
whereas a younger transaction requesting an item held by an older transaction is aborted and
restarted.
Wound_Wait schemes: It is just the opposite of the Wait_Die technique. Here, a younger
transaction is allowed to wait for an older one, whereas if an older transaction requests an
item held by the younger transaction, we preempt the younger transaction by aborting it.
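The two schemes can be sketched as a pair of decision functions. This is a minimal illustration, assuming that a smaller timestamp means an older transaction; the function names and the returned strings are hypothetical.

```python
def wait_die(requester_ts: int, holder_ts: int) -> str:
    """Wait-Die: an older requester waits, a younger requester is aborted."""
    if requester_ts < holder_ts:
        return "wait"             # older transaction waits for the younger holder
    return "abort requester"      # younger transaction dies and is restarted later


def wound_wait(requester_ts: int, holder_ts: int) -> str:
    """Wound-Wait: an older requester preempts the holder, a younger one waits."""
    if requester_ts < holder_ts:
        return "abort holder"     # older transaction wounds the younger holder
    return "wait"                 # younger transaction waits for the older holder


# Timestamp 1 is the older transaction, timestamp 2 the younger one.
print(wait_die(1, 2), "|", wait_die(2, 1))
print(wound_wait(1, 2), "|", wound_wait(2, 1))
```

In both schemes it is always the younger transaction that is aborted, so the oldest transaction in the system can never be starved.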
10. What is the use of save points in recovery? (CO3) (APR/MAY, 2022)
Save points are useful for implementing complex error recovery in database applications. If an
error occurs in the midst of a multiple-statement transaction, the application may be able to recover from
the error by rolling back to a save point without needing to abort the entire transaction.
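A small sketch of this behaviour, assuming SQLite through Python's standard sqlite3 module; the table name, the savepoint name and the function name are made up for the illustration.

```python
import sqlite3


def run_with_savepoint():
    # isolation_level=None selects autocommit mode, so BEGIN and COMMIT are issued by hand.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    cur = conn.cursor()
    cur.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
    cur.execute("BEGIN")
    cur.execute("INSERT INTO account VALUES ('A', 1000)")
    cur.execute("SAVEPOINT before_batch")
    try:
        cur.execute("INSERT INTO account VALUES ('B', 2000)")
        cur.execute("INSERT INTO account VALUES ('A', 5)")  # duplicate key raises IntegrityError
    except sqlite3.IntegrityError:
        # Undo only the work done after the savepoint; the first insert survives.
        cur.execute("ROLLBACK TO SAVEPOINT before_batch")
    cur.execute("COMMIT")
    rows = cur.execute("SELECT name, balance FROM account ORDER BY name").fetchall()
    conn.close()
    return rows


print(run_with_savepoint())
```

Only the statements issued after the save point are rolled back; the transaction itself stays open and is committed normally.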
11. Give the reasons for allowing concurrency. (CO3) (NOV/DEC, 2017)
It improves performance: Concurrency allows multiple transactions or processes to be
executed at once, which raises throughput and keeps resources such as the CPU and disks
busy.
It enhances scalability: Concurrency enables a system to handle increased workloads and
scale easily.
13. What type of locking is needed for insert and delete operations? (CO3) (NOV/DEC, 2017)
When we execute an INSERT, UPDATE, or DELETE statement, the database server uses
exclusive locks. An exclusive lock means that no other users can update or delete the item until the
database server removes the lock.
15. Give an example of two phase commit protocol. (CO3) (NOV/DEC, 2017).
The Two-Phase Commit protocol works in two phases:
Prepare Phase: A coordinator node receives a transaction request, then sends a prepare message to all
participant nodes. Each participant replies with its vote (ready to commit, or abort).
Commit Phase: Depending on the responses from participant nodes, the coordinator sends a commit or
abort message. All nodes then commit or abort the transaction accordingly.
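The two phases can be simulated in a few lines. This is only a sketch under simplifying assumptions (no network, no timeouts, no logging); the Participant class and the state names are hypothetical.

```python
from typing import List


class Participant:
    def __init__(self, name: str, can_commit: bool) -> None:
        self.name = name
        self.can_commit = can_commit
        self.state = "init"

    def prepare(self) -> bool:
        # Reply to the coordinator's prepare message with a yes/no vote.
        self.state = "ready" if self.can_commit else "refused"
        return self.can_commit

    def finish(self, decision: str) -> None:
        # Apply the coordinator's global decision.
        self.state = decision


def two_phase_commit(participants: List[Participant]) -> str:
    # Phase 1 (prepare): collect a vote from every participant.
    votes = [p.prepare() for p in participants]
    # Phase 2 (commit): commit only if every participant voted yes.
    decision = "committed" if all(votes) else "aborted"
    for p in participants:
        p.finish(decision)
    return decision


print(two_phase_commit([Participant("node1", True), Participant("node2", True)]))
print(two_phase_commit([Participant("node1", True), Participant("node2", False)]))
```

A single negative vote is enough to abort the transaction at every node, which is what keeps the distributed outcome atomic.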
16. How will you handle deadlock during two transactions in database?
(CO3) (APR/MAY, 2024)
Transaction ‘A’ waits for a resource which is held by transaction ‘B’. However, if transaction
‘B’ is not in a position to release it because it is waiting on some resource held by ‘A’, both are
deadlocked and the only way of breaking the deadlock is to cancel one of the transactions, thus releasing
its resources.
18. Write the problems of executing two concurrent transactions. (CO3) (NOV/DEC, 2023)
When multiple transactions execute concurrently in an uncontrolled or unrestricted manner,
then it might lead to several problems. These problems are commonly referred to as concurrency
problems in a database environment.
The concurrency problems that can occur in the database are:
Lost updates - Two applications, A and B, might both read the same row and calculate new
values for one of the columns based on the data that these applications read. If A updates
the row and B then updates it as well, A's update is lost.
Access to uncommitted data (dirty reads).
Non-repeatable reads.
Phantom reads.
19. Why might the leaf nodes of a B+ tree file organization lose sequentiality?
(CO3) (APR/MAY, 2024)
In a B+-tree index or file organization, leaf nodes that are adjacent to each other in the tree may
be located at different places on disk. When a file organization is newly created on a set of records, it is
possible to allocate blocks that are mostly contiguous on disk to leaf nodes that are contiguous in the
tree. As insertions and deletions occur on the tree, sequentiality is increasingly lost, and sequential
access has to wait for disk seeks increasingly often.
20. What benefit does strict two-phase locking provide? What disadvantages result?
(CO3) (APR/MAY, 2024)
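The strict two-phase locking mechanism has the advantage of guaranteeing recoverable,
cascadeless schedules. A transaction holds its exclusive locks until it commits or aborts, so no other
transaction can read or overwrite its uncommitted data. For example, if a second transaction relies on a
value written by a first one, it cannot read that value until the first has committed, so a failure of the
first transaction never forces the second to abort as well. The disadvantage is reduced concurrency:
locks are held longer, so other transactions wait longer, and deadlocks are still possible.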
PART-B
1. Explain the two phase locking protocol for concurrency control. (CO3) (APR/MAY 2022)
Two-Phase Locking (2PL) is a concurrency control method which divides the execution phase of
a transaction into two parts. It ensures conflict serializable schedules. If all the locking operations
(read locks and write locks) precede the first unlock operation in the transaction, then the transaction is
said to follow the Two-Phase Locking Protocol.
The two phase locking is a protocol in which there are two phases:
i) Growing phase (Locking phase): It is a phase in which the transaction may obtain locks but does
not release any lock.
ii) Shrinking phase (Unlocking phase): It is a phase in which the transaction may release the locks but
does not obtain any new lock.
There are 3 types of two – phase locking protocol. They are,
1. Strict Two – Phase Locking Protocol
2. Rigorous Two – Phase Locking Protocol
3. Conservative Two – Phase Locking Protocol
The important rule for two phase locking is that all lock operations precede all the unlock
operations. In the above transactions, T1 follows two phase locking but transaction T2 does not,
because in T2 the shared lock is acquired on data item B, then data item B is read and then the
lock is released. Again a lock is acquired on data item A, then the data item A is read and the lock is
then released. Thus we get a lock-unlock-lock-unlock sequence. Clearly this is not allowed in two phase
locking.
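The rule can be checked mechanically. The sketch below assumes a transaction is given as a list of (action, item) pairs; the operation list for T2 follows the description above, while the one for T1 is a hypothetical example of a two-phase transaction.

```python
from typing import List, Tuple


def follows_two_phase_locking(ops: List[Tuple[str, str]]) -> bool:
    """Return True if no lock is acquired after the first unlock."""
    shrinking = False
    for action, _item in ops:
        if action == "unlock":
            shrinking = True        # the shrinking phase has started
        elif action == "lock" and shrinking:
            return False            # a lock after an unlock violates 2PL
    return True


# Hypothetical T1: all locks first, then all unlocks.
t1 = [("lock", "A"), ("read", "A"), ("lock", "B"), ("read", "B"),
      ("unlock", "A"), ("unlock", "B")]
# T2 as described in the text: lock-unlock-lock-unlock.
t2 = [("lock", "B"), ("read", "B"), ("unlock", "B"),
      ("lock", "A"), ("read", "A"), ("unlock", "A")]

print(follows_two_phase_locking(t1), follows_two_phase_locking(t2))
```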
1. Conflict Serializability:
Let us consider a schedule S in which there are two consecutive instructions, I and J, of
transactions Ti and Tj, respectively (i ≠ j). If I and J refer to different data items, then we can swap I
and J without affecting the results of any instruction in the schedule. However, if I and J refer to the
same data item Q, then the order of the two steps may matter. Since we are dealing with only read and
write instructions, there are four cases that we need to consider:
a. I = read(Q), J = read(Q). The order of I and J does not matter, since the same value of Q is read
by Ti and Tj, regardless of the order.
b. I = read(Q), J = write(Q). If I comes before J , then Ti does not read the value of Q that is written
by Tj in instruction J . If J comes before I , then Ti reads the value of Q that is written by Tj. Thus,
the order of I and J matters.
c. I = write(Q), J = read(Q). The order of I and J matters for reasons similar to those of the previous
case.
d. I = write(Q), J = write(Q). Since both instructions are write operations, the order of these
instructions does not affect either Ti or Tj . However, the value obtained by the next read(Q)
instruction of S is affected, since the result of only the latter of the two write instructions is
preserved in the database. If there is no other write(Q) instruction after I and J in S, then the order
of I and J directly affects the final value of Q in the database state that results from schedule S.
Fig. 3.1 Schedule 3 — showing only the read and write instructions
Thus, only in the case where both I and J are read instructions does the relative order of their
execution not matter. We say that I and J conflict if they are operations by different transactions on the
same data item, and at least one of these instructions is a write operation.
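The conflict rule above translates directly into code. The sketch below also adds the usual precedence-graph test for conflict serializability (an edge Ti -> Tj for every conflicting pair in which Ti's operation comes first, with a cycle meaning the schedule is not conflict serializable); that test is a standard technique assumed here, not something taken from the text, and the sample schedules are hypothetical.

```python
from typing import List, Set, Tuple

Op = Tuple[str, str, str]  # (transaction, "read" or "write", data item)


def conflicts(a: Op, b: Op) -> bool:
    """Different transactions, same data item, at least one write."""
    return a[0] != b[0] and a[2] == b[2] and "write" in (a[1], b[1])


def precedence_edges(schedule: List[Op]) -> Set[Tuple[str, str]]:
    edges = set()
    for i, a in enumerate(schedule):
        for b in schedule[i + 1:]:
            if conflicts(a, b):
                edges.add((a[0], b[0]))  # a's transaction must precede b's
    return edges


def is_conflict_serializable(schedule: List[Op]) -> bool:
    edges = precedence_edges(schedule)
    nodes = {op[0] for op in schedule}
    while nodes:
        # Transactions with no incoming edge from a remaining transaction.
        free = {n for n in nodes
                if not any(dst == n and src in nodes for src, dst in edges)}
        if not free:
            return False  # every remaining node has a predecessor: a cycle exists
        nodes -= free
    return True


good = [("T1", "read", "A"), ("T1", "write", "A"), ("T2", "read", "A"),
        ("T2", "write", "A"), ("T1", "read", "B"), ("T1", "write", "B"),
        ("T2", "read", "B"), ("T2", "write", "B")]
bad = [("T1", "read", "A"), ("T2", "write", "A"), ("T1", "write", "A")]

print(is_conflict_serializable(good), is_conflict_serializable(bad))
```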
3. View Serializability:
There is another form of equivalence that is less stringent than conflict equivalence, but that, like
conflict equivalence, is based on only the read and write operations of transactions.
Consider two schedules S and S′, where the same set of transactions participates in both
schedules. The schedules S and S′ are said to be view equivalent if three conditions are met:
(1) For each data item Q, if transaction Ti reads the initial value of Q in schedule S, then
transaction Ti must, in schedule S′, also read the initial value of Q.
(2) For each data item Q, if transaction Ti executes read(Q) in schedule S, and if that value was
produced by a write(Q) operation executed by transaction Tj, then the read(Q) operation of
transaction Ti must, in schedule S′, also read the value of Q that was produced by the same
write(Q) operation of transaction Tj.
(3) For each data item Q, the transaction (if any) that performs the final write(Q) operation in
schedule S must perform the final write(Q) operation in schedule S′.
Conditions 1 and 2 ensure that each transaction reads the same values in both schedules and,
therefore, performs the same computation. Condition 3, coupled with conditions 1 and 2, ensures that
both schedules result in the same final system state.
The concept of view equivalence leads to the concept of view serializability. We say that a
schedule S is view serializable if it is view equivalent to a serial schedule.
As an illustration, suppose that we augment schedule 4 with transaction T29, and obtain the
following view-serializable schedule (schedule 5):
Indeed, schedule 5 is view equivalent to the serial schedule <T27, T28, T29>, since the one
read(Q) instruction reads the initial value of Q in both schedules and T29 performs the final write of Q
in both schedules.
Every conflict-serializable schedule is also view serializable, but there are view-serializable
schedules that are not conflict serializable. Indeed, schedule 5 is not conflict serializable, since every
pair of consecutive instructions conflicts, and, thus, no swapping of instructions is possible.
Observe that, in schedule 5, transactions T28 and T29 perform write(Q) operations without
having performed a read(Q) operation. Writes of this sort are called blind writes. Blind writes appear in
any view-serializable schedule that is not conflict serializable.
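The three conditions for view equivalence can be tested by brute force over all serial orders. This is a sketch for small schedules only; the helper names are assumptions, and the sample schedule (with transactions T1, T2, T3 and two blind writes) is a hypothetical one chosen for the illustration.

```python
from itertools import permutations


def view_profile(schedule):
    """Summarise a schedule by (who each read reads from, who writes each item last)."""
    last_writer = {}
    reads_from = []
    for txn, action, item in schedule:
        if action == "read":
            # None stands for the initial value of the item (conditions 1 and 2).
            reads_from.append((txn, item, last_writer.get(item)))
        else:
            last_writer[item] = txn  # ends up holding the final writer (condition 3)
    return sorted(reads_from, key=repr), last_writer


def is_view_serializable(schedule):
    txns = sorted({op[0] for op in schedule})
    target = view_profile(schedule)
    for order in permutations(txns):
        serial = [op for t in order for op in schedule if op[0] == t]
        if view_profile(serial) == target:
            return True
    return False


# T2 and T3 write Q without reading it first (blind writes).
with_blind_writes = [("T1", "read", "Q"), ("T2", "write", "Q"),
                     ("T1", "write", "Q"), ("T3", "write", "Q")]
without_t3 = with_blind_writes[:3]

print(is_view_serializable(with_blind_writes), is_view_serializable(without_t3))
```

The first schedule matches the serial order T1, T2, T3 even though its conflicting pairs form a cycle between T1 and T2, which is the situation the paragraph above describes.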
read(X), which transfers the data item X from the database to a variable, also called X, in a buffer
in main memory belonging to the transaction that executed the read operation.
write(X), which transfers the value in the variable X in the main-memory buffer of the transaction
that executed the write to the data item X in the database.
It is important to know if a change to a data item appears only in main memory or if it has been
written to the database on disk. In a real database system, the write operation does not necessarily result
in the immediate update of the data on the disk; the write operation may be temporarily stored elsewhere
and executed on the disk later. For now, however, we shall assume that the write operation updates the
database immediately.
Let Ti be a transaction that transfers $50 from account A to account B. This transaction can be
defined as:
Ti : read(A);
A := A − 50;
write(A);
read(B);
B := B + 50;
write(B).
Atomicity:
Suppose that, just before the execution of transaction Ti, the values of accounts A and B are
$1000 and $2000, respectively. Now suppose that, during the execution of transaction Ti , a failure
occurs that prevents Ti from completing its execution successfully. Further, suppose that the failure
happened after the write(A) operation but before the write(B) operation. In this case, the values of
accounts A and B reflected in the database are $950 and $2000. The system destroyed $50 as a result of
this failure. In particular, we note that the sum A + B is no longer preserved. Thus, because of the
failure, the state of the system no longer reflects a real state of the world that the database is supposed to
capture. We term such a state an inconsistent state.
We must ensure that such inconsistencies are not visible in a database system. However, the
system must at some point be in an inconsistent state. Even if transaction Ti is executed to completion,
there exists a point at which the value of account A is $950 and the value of account B is $2000, which
is clearly an inconsistent state. This state, however, is eventually replaced by the consistent state where
the
value of account A is $950, and the value of account B is $2050. Thus, if the transaction never started or
was guaranteed to complete, such an inconsistent state would not be visible except during the execution
of the transaction. That is the reason for the atomicity requirement: If the atomicity property is present,
all actions of the transaction are reflected in the database, or none are.
The basic idea behind ensuring atomicity is this: The database system keeps track (on disk) of
the old values of any data on which a transaction performs a write. This information is written to a file
called the log. If the transaction does not complete its execution, the database system restores the old
values from the log to make it appear as though the transaction never executed. Ensuring atomicity is
the responsibility of the database system; specifically, it is handled by a component of the database
called the recovery system.
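The idea of restoring old values from a log can be sketched as follows. This is a toy model under stated assumptions: the "database" is a Python dictionary, the log is kept in memory rather than on disk, and the fail_after_debit flag is a hypothetical way to simulate a failure between write(A) and write(B).

```python
def transfer(db, amount, fail_after_debit=False):
    """Move amount from A to B; on failure, restore old values from the log."""
    log = []  # (item, old value) recorded before each write
    try:
        log.append(("A", db["A"]))
        db["A"] -= amount                      # write(A)
        if fail_after_debit:
            raise RuntimeError("simulated failure before write(B)")
        log.append(("B", db["B"]))
        db["B"] += amount                      # write(B)
        return True
    except RuntimeError:
        for item, old_value in reversed(log):  # undo in reverse order
            db[item] = old_value
        return False


accounts = {"A": 1000, "B": 2000}
print(transfer(accounts, 50, fail_after_debit=True), accounts)
print(transfer(accounts, 50), accounts)
```

After the failed attempt the sum A + B is unchanged, as if the transaction had never executed.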
Durability:
Once the execution of the transaction completes successfully, and the user who initiated the
transaction has been notified that the transfer of funds has taken place, it must be the case that no system
failure can result in a loss of data corresponding to this transfer of funds. The durability property
guarantees that, once a transaction completes successfully, all the updates that it carried out on the
database persist, even if there is a system failure after the transaction completes execution.
We assume for now that a failure of the computer system may result in loss of data in main
memory, but data written to disk are never lost. We can guarantee durability by ensuring that either:
1. The updates carried out by the transaction have been written to disk before the transaction
completes, or
2. Information about the updates carried out by the transaction and written to disk is sufficient to
enable the database to reconstruct the updates when the database system is restarted after the failure.
Isolation:
Even if the consistency and atomicity properties are ensured for each transaction, if several
transactions are executed concurrently, their operations may interleave in some undesirable way,
resulting in an inconsistent state. For example, as we saw earlier, the database is temporarily
inconsistent while the transaction to transfer funds from A to B is executing, with the deducted total
written to A and the increased total yet to be written to B. If a second concurrently running transaction
reads A and B at this intermediate point and computes A+ B, it will observe an inconsistent value.
Furthermore, if this second transaction then performs updates on A and B based on the
inconsistent values that it read, the database may be left in an inconsistent state even after both
transactions have completed.
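The interleaving described above can be replayed step by step. The sketch is a single-threaded simulation with a dictionary standing in for the database; the function name is hypothetical.

```python
def interleaved_transfer(db, amount):
    """T1 transfers amount from A to B while T2 reads A + B in between."""
    db["A"] -= amount                    # T1: write(A) with the deducted total
    seen_by_t2 = db["A"] + db["B"]       # T2: reads A and B at the intermediate point
    db["B"] += amount                    # T1: write(B) with the increased total
    return seen_by_t2, db["A"] + db["B"]


observed, final = interleaved_transfer({"A": 1000, "B": 2000}, 50)
print(observed, final)
```

T2 observes a sum that never corresponds to a consistent state, even though the final state is consistent.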
Lock-based Protocols
Database systems equipped with lock-based protocols use a mechanism by which no transaction
can read or write data until it acquires an appropriate lock on it. Locks are of two kinds:
Binary Locks − A lock on a data item can be in two states; it is either locked or unlocked.
Shared/exclusive − This type of locking mechanism differentiates the locks based on their
uses. If a lock is acquired on a data item to perform a write operation, it is an exclusive lock.
Allowing more than one transaction to write on the same data item would lead the database
into an inconsistent state. Read locks are shared because no data value is being changed.
operations are over. If all the locks are not granted, the transaction rolls back and waits until all the locks
are granted.
Two-phase locking has two phases, one is growing, where all the locks are being acquired by the
transaction; and the second phase is shrinking, where the locks held by the transaction are being
released. A transaction that holds a shared (read) lock may upgrade it to an exclusive (write) lock, but
only during the growing phase.
Timestamp-based Protocols
A commonly used concurrency protocol is the timestamp-based protocol. This protocol uses
either system time or logical counter as a timestamp. Lock-based protocols manage the order between
the conflicting pairs among transactions at the time of execution, whereas timestamp-based protocols
start working as soon as a transaction is created. Every transaction has a timestamp associated with it,
and the ordering is determined by the age of the transaction.
A transaction created at clock time 0002 would be older than all other transactions that come after
it. For example, any transaction 'y' entering the system at 0004 is two seconds younger and the priority
would be given to the older one. In addition, every data item is given the latest read-timestamp and
write-timestamp. This lets the system know when the last read and write operations were performed on
the data item.
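How the read-timestamp and write-timestamp of a data item are used can be sketched with the basic timestamp-ordering checks. The rules below are the standard textbook ones and are assumed here, since the text does not spell them out; the Item class and the returned strings are hypothetical.

```python
class Item:
    def __init__(self):
        self.read_ts = 0    # timestamp of the youngest transaction that read the item
        self.write_ts = 0   # timestamp of the youngest transaction that wrote the item


def read(item, ts):
    if ts < item.write_ts:
        return "rollback"   # a younger transaction has already overwritten the value
    item.read_ts = max(item.read_ts, ts)
    return "ok"


def write(item, ts):
    if ts < item.read_ts or ts < item.write_ts:
        return "rollback"   # a younger transaction has already read or written the item
    item.write_ts = ts
    return "ok"


q = Item()
print(read(q, 4))    # transaction with timestamp 0004 reads Q
print(write(q, 2))   # the older transaction (0002) now tries to write Q
print(write(q, 4))
print(read(q, 2))
```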
6. Explain deferred and immediate modification versions of the log based recovery scheme.
(CO3) (APR/MAY, 2019)
Crash Recovery
Even in a technologically advanced era, in which hundreds of satellites monitor the earth and
billions of people are connected through information technology every second, failure is expected but
not always acceptable.
A DBMS is a highly complex system with hundreds of transactions being executed every second.
The availability of a DBMS depends on its architecture and on the underlying hardware and system
software. If it fails or crashes while transactions are being executed, the system is expected to follow
some algorithm or technique to recover from the crash or failure.
Failure Classification
To see where the problem has occurred, we classify failures into the following categories:
Transaction Failure
When a transaction fails to execute, or reaches a point after which it cannot be completed
successfully, it has to abort. This is called transaction failure. Only a few transactions or processes are
affected. Reasons for transaction failure could be:
Logical errors: where a transaction cannot complete because it has some code error or an
internal error condition.
System errors: where the database system itself terminates an active transaction because the DBMS is
not able to execute it or it has to stop because of some system condition. For example, in case of
deadlock or resource unavailability, the system aborts an active transaction.
System Crash
Some problems are external to the database system and may cause it to stop abruptly and
crash. Examples are an interruption in the power supply, or a failure of the underlying hardware or
software.
Disk Failure:
In the early days of technology evolution, it was a common problem that hard disk drives or
storage drives failed frequently. Disk failures include formation of bad sectors, the disk becoming
unreachable, a disk head crash or any other failure which destroys all or part of disk storage.
Storage Structure
In brief, the storage structure can be divided into the following categories:
Volatile storage: As the name suggests, this storage does not survive system crashes. It is mostly
placed very close to the CPU, often embedded on the chipset itself. Examples: main memory,
cache memory. It is fast but can store only a small amount of information.
Nonvolatile storage: This storage is made to survive system crashes. It is huge in data
storage capacity but slower in accessibility. Examples include hard disks, magnetic tapes,
flash memory, non-volatile (battery backed up) RAM.
Log-Based Recovery
Log is a sequence of records, which maintains the records of actions performed by a transaction.
It is important that the logs are written prior to actual modification and stored on a stable storage media,
which is failsafe. Log based recovery works as follows:
The log file is kept on stable storage media
When a transaction enters the system and starts execution, it writes a log record:
<Tn, Start>
When the transaction modifies an item X, it writes a log record as follows:
<Tn, X, V1, V2>
This record states that Tn has changed the value of X from V1 to V2.
When the transaction finishes, it logs:
<Tn, commit>
The recovery system reads the logs backwards from the end to the last Checkpoint.
It maintains two lists, undo-list and redo-list.
If the recovery system sees a log with <Tn, Start> and <Tn, Commit> or just <Tn, Commit>, it
puts the transaction in redo-list.
If the recovery system sees a log with <Tn, Start> but no commit or abort log record, it puts the
transaction in undo-list.
All transactions in the undo-list are then undone and their logs are removed. For all transactions in
the redo-list, their previous logs are removed, the transactions are redone and their logs are saved again.
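The steps above can be sketched as a small recovery routine. The sketch makes several simplifying assumptions that are not in the text: log records are Python tuples, the whole log is scanned (checkpoints and abort records are ignored), and the database is a dictionary holding whatever values reached disk before the crash.

```python
def recover(log, db):
    """Classify transactions into redo-list and undo-list, then repair db in place."""
    started, committed = set(), set()
    for rec in log:
        if len(rec) == 2 and rec[1] == "start":
            started.add(rec[0])
        elif len(rec) == 2 and rec[1] == "commit":
            committed.add(rec[0])
    redo_list = committed
    undo_list = started - committed
    # Undo: scan backwards and restore the old value V1 of each update.
    for rec in reversed(log):
        if len(rec) == 4 and rec[0] in undo_list:
            db[rec[1]] = rec[2]
    # Redo: scan forwards and re-apply the new value V2 of each update.
    for rec in log:
        if len(rec) == 4 and rec[0] in redo_list:
            db[rec[1]] = rec[3]
    return sorted(redo_list), sorted(undo_list)


# <Tn, Start>, <Tn, X, V1, V2>, <Tn, commit> written as tuples.
log = [("T1", "start"), ("T1", "A", 1000, 950), ("T1", "commit"),
       ("T2", "start"), ("T2", "B", 2000, 2500)]
# State on disk at the crash: T1's write was lost, T2's uncommitted write got through.
disk = {"A": 1000, "B": 2500}

print(recover(log, disk), disk)
```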
7. What is recovery? Outline the steps in the Algorithm for Recovery and Isolation Exploiting
Semantics (ARIES) algorithm with an example. (CO3) (APR/MAY, 2023)
Recovery
Database recovery is the process of restoring the database to a correct (consistent) state in the
event of a failure. In other words, it is the process of restoring the database to the most recent consistent
state that existed shortly before the time of system failure.
ARIES algorithm
Algorithm for Recovery and Isolation Exploiting Semantics (ARIES) is based on the Write
Ahead Log (WAL) protocol. The ARIES Recovery Algorithm in DBMS is an important method for
maintaining data integrity and consistency, particularly after a system crash.
2. Redo Phase
The Redo Phase repeats history: it re-applies the changes recorded in the log so that the database
is brought back to the state it was in at the time of the crash and all committed transactions are
properly reflected in the database.
Example: Using the log from the previous example, the Redo Phase will:
Reapply the changes made by T1 and T2. Since T1 committed, its changes to page A will be
redone and kept.
T2's changes are not kept, because the transaction was aborted; they are reversed in the Undo Phase.
Steps:
1. Replay every logged action whose effect may be missing from the database pages.
2. Ensure all modifications are applied to the database pages as per the log.
3. Undo Phase
The Undo Phase reverses changes made by aborted transactions, restoring the database to a
consistent state by undoing any modifications from these transactions.
Example: From the log, we need to undo changes made by T2.
Steps:
1. Identify all operations from aborted transactions.
2. Apply undo operations to revert changes made by these transactions.
There are three additional lock modes with multiple granularity. They are,
i. Intention-Shared (IS): Explicit locking is being done at a lower level of the tree, but only
with shared locks.
ii. Intention-Exclusive (IX): Explicit locking is being done at a lower level with exclusive or
shared locks.
iii. Shared and Intention-Exclusive (SIX): The subtree rooted at this node is locked explicitly in
shared mode, and explicit locking is being done at a lower level with exclusive locks by the
same transaction.
The compatibility matrix for these lock modes is given below:
       IS    IX    S     SIX   X
IS     YES   YES   YES   YES   NO
IX     YES   YES   NO    NO    NO
S      YES   NO    YES   NO    NO
SIX    YES   NO    NO    NO    NO
X      NO    NO    NO    NO    NO
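The matrix can be encoded as a lookup table that a lock manager consults before granting a request. This is a minimal sketch; the names COMPATIBLE and can_grant are hypothetical.

```python
# For each held mode, the set of modes another transaction may still be granted.
COMPATIBLE = {
    "IS":  {"IS", "IX", "S", "SIX"},
    "IX":  {"IS", "IX"},
    "S":   {"IS", "S"},
    "SIX": {"IS"},
    "X":   set(),
}


def can_grant(requested, held_modes):
    """A request is granted only if it is compatible with every lock already held."""
    return all(requested in COMPATIBLE[held] for held in held_modes)


print(can_grant("IS", ["IX", "IS"]))   # intention locks coexist
print(can_grant("S", ["IX"]))          # S conflicts with IX
print(can_grant("X", []))              # nothing held, so X is granted
```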
Locks:
Locks are mechanisms used to ensure data integrity. The Oracle engine automatically locks table
data while executing SQL statements like SELECT/INSERT/UPDATE/DELETE. This type of locking is
called implicit locking.
There are two types of Locks. They are,
i. Shared lock
ii. Exclusive lock
Shared lock:
Shared locks are placed on resources whenever a read operation (select) is performed. Multiple
shared locks can be simultaneously set on a resource.
Exclusive lock:
Exclusive locks are placed on resources whenever a write operation (INSERT, UPDATE and
DELETE) is performed. Only one exclusive lock can be placed on a resource at a time,
i.e. the first user who acquires an exclusive lock will continue to have the sole ownership of the
resource, and no other user can acquire an exclusive lock on that resource.
Deadlock:
In a deadlock, two database operations wait for each other to release a lock. A deadlock occurs
when two users have a lock, each on a separate object, and, they want to acquire a lock on each other's
object.
When this happens, the first user has to wait for the second user to release the lock, but the
second user will not release it until the lock on the first user's object is freed. At this point, both the users
are at an impasse and cannot proceed with their business.
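Such an impasse can be detected by looking for a cycle in a wait-for graph, in which an edge from one user to another means the first is waiting for a lock held by the second. The wait-for graph is a common detection technique assumed here rather than taken from the text; the function name and user labels are hypothetical.

```python
def has_deadlock(waits_for):
    """waits_for maps each user to the set of users it is waiting on."""
    visiting, done = set(), set()

    def dfs(user):
        if user in visiting:
            return True          # reached a user already on the current path: cycle
        if user in done:
            return False
        visiting.add(user)
        found = any(dfs(other) for other in waits_for.get(user, ()))
        visiting.discard(user)
        done.add(user)
        return found

    return any(dfs(user) for user in list(waits_for))


# User 1 waits for user 2's object while user 2 waits for user 1's object.
print(has_deadlock({"user1": {"user2"}, "user2": {"user1"}}))
# User 2 is not waiting for anyone, so user 1 will eventually proceed.
print(has_deadlock({"user1": {"user2"}, "user2": set()}))
```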
9. Explain the concepts of serial, non-serial and conflict-serializable schedules with examples.
(CO3) (NOV/DEC, 2023)
Schedule
A schedule is a sequence of operations drawn from a set of transactions. It preserves the order
of the operations within each individual transaction.
For example: Suppose there are two transactions T1 and T2 which have some operations. If there is no
interleaving of operations, then there are the following two possible outcomes:
1. Execute all the operations of T1, followed by all the operations of T2.
2. Execute all the operations of T2, followed by all the operations of T1.
o In the given figure 3.11, Schedule A shows the serial schedule where T1 is followed by T2.
o In the given figure 3.12, Schedule B shows the serial schedule where T2 is followed by T1.
In a non-serial schedule, the executions of the transactions have interleaving of their operations. A
non-serial schedule is serializable if its result is equal to the result of its transactions executed serially.
Here,
o Schedule A and Schedule B are serial schedule.
o Schedule C and Schedule D are Non-serial schedule.
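Whether a schedule is serial can be checked by confirming that no transaction resumes after another one has started. The sketch below assumes a schedule is a list of (transaction, operation) pairs; the two sample schedules are hypothetical and are not the Schedules A to D of the figures.

```python
def is_serial(schedule):
    """Return True if each transaction's operations form one contiguous block."""
    finished = set()
    current = None
    for txn, _operation in schedule:
        if txn != current:
            if txn in finished:
                return False         # txn resumes after another one ran: interleaving
            if current is not None:
                finished.add(current)
            current = txn
    return True


serial_example = [("T1", "read(A)"), ("T1", "write(A)"),
                  ("T2", "read(A)"), ("T2", "write(A)")]
interleaved_example = [("T1", "read(A)"), ("T2", "read(A)"),
                       ("T1", "write(A)"), ("T2", "write(A)")]

print(is_serial(serial_example), is_serial(interleaved_example))
```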
10. What is deadlock? Explain the four conditions for deadlock with an example.
(CO3) (APR/MAY, 2019)
Deadlock
Deadlock is a situation in computing where two or more processes are unable to proceed because
each is waiting for the other to release resources.
For example, let us assume, we have two processes P1 and P2. Now, process P1 is holding the
resource R1 and is waiting for the resource R2. At the same time, the process P2 is having the resource
R2 and is waiting for the resource R1. So, the process P1 is waiting for process P2 to release its resource
and at the same time, the process P2 is waiting for process P1 to release its resource. And no one is
releasing any resource. So, both are waiting for each other to release the resource. This leads to infinite
waiting and no work is done here. This is called Deadlock.
i. Mutual Exclusion:
A resource can be held by only one process at a time. In other words, if a process P1 is using
some resource R at a particular instant of time, then some other process P2 can't hold or use the same
resource R at that particular instant of time. The process P2 can make a request for that resource R but it
can't use that resource simultaneously with process P1.