Dbms-Unit-5 R16
Dbms-Unit-5 R16
Transaction
A transaction can be defined as a group of tasks. A single task is the minimum processing
unit which cannot be divided further.
Let’s take an example of a simple transaction. Suppose a bank employee transfers Rs 500
from A's account to B's account. This very simple and small transaction involves several
low-level tasks.
A’s Account
Open_Account(A)
Old_Balance = A.balance
New_Balance = Old_Balance - 500
A.balance = New_Balance
Close_Account(A)
B’s Account
Open_Account(B)
Old_Balance = B.balance
New_Balance = Old_Balance + 500
B.balance = New_Balance
Close_Account(B)
ACID Properties
A transaction is a very small unit of a program and it may contain several lowlevel tasks. A
transaction in a database system must maintain Atomicity, Consistency, Isolation,
and Durability − commonly known as ACID properties − in order to ensure accuracy,
completeness, and data integrity.
Atomicity − This property states that a transaction must be treated as an atomic
unit, that is, either all of its operations are executed or none. There must be no
state in a database where a transaction is left partially completed. States should be
defined either before the execution of the transaction or after the
execution/abortion/failure of the transaction.
Consistency − The database must remain in a consistent state after any
transaction. No transaction should have any adverse effect on the data residing in
the database. If the database was in a consistent state before the execution of a
transaction, it must remain consistent after the execution of the transaction as well.
Durability − The database should be durable enough to hold all its latest updates
even if the system fails or restarts. If a transaction updates a chunk of data in a
database and commits, then the database will hold the modified data. If a
transaction commits but the system fails before the data could be written on to the
disk, then that data will be updated once the system springs back into action.
Isolation − In a database system where more than one transaction are being
executed simultaneously and in parallel, the property of isolation states that all the
transactions will be carried out and executed as if it is the only transaction in the
system. No transaction will affect the existence of any other transaction.
www.Jntufastupdates.com 1
Transaction Log
In the field of databases in computer science, a transaction log (also transaction
journal, database log, binary log or audit trail) is a history of actions executed
by a database management system used to guarantee ACID properties
over crashes or hardware failures. Physically, a log is a file listing changes to the
database, stored in a stable storage format.
If, after a start, the database is found in an inconsistent state or not been shut down
properly, the database management system reviews the database logs
for uncommittedtransactions and rolls back the changes made by these transactions.
Additionally, all transactions that are already committed but whose changes were not
yet materialized in the database are re-applied. Both are done to
ensure atomicity and durability of transactions.
www.Jntufastupdates.com 2
the log's past. This varies according to checkpoint algorithm. If all dirty pages are
flushed while creating the checkpoint (as in PostgreSQL), it might contain:
redoLSN: This is a reference to the first log record that corresponds to a dirty page.
i.e. the first update that wasn't flushed at checkpoint time. This is where redo must
begin on recovery.
undoLSN: This is a reference to the oldest log record of the oldest in-progress
transaction. This is the oldest log record needed to undo all in-progress transactions.
Completion Record notes that all work has been done for this particular transaction.
(It has been fully committed or aborted)
Transaction Control
The following commands are used to control transactions.
COMMIT − to save the changes.
ROLLBACK − to roll back the changes.
SAVEPOINT − creates points within the groups of transactions in which to
ROLLBACK.
SET TRANSACTION − Places a name on a transaction.
Transactional Control Commands
Transactional control commands are only used with the DML Commands such as -
INSERT, UPDATE and DELETE only. They cannot be used while creating tables or dropping
them because these operations are automatically committed in the database.
The COMMIT Command
The COMMIT command is the transactional command used to save changes invoked by a
transaction to the database.
The COMMIT command is the transactional command used to save changes invoked by a
transaction to the database. The COMMIT command saves all the transactions to the
database since the last COMMIT or ROLLBACK command.
The syntax for the COMMIT command is as follows.
COMMIT;
Example
Consider the CUSTOMERS table having the following records −
+----+----------+-----+-----------+----------+
+----+----------+-----+-----------+----------+
www.Jntufastupdates.com 3
| 6 | Komal | 22 | MP | 4500.00 |
+----+----------+-----+-----------+----------+
Following is an example which would delete those records from the table which have age =
25 and then COMMIT the changes in the database.
SQL> COMMIT;
Thus, two rows from the table would be deleted and the SELECT statement would produce
the following result.
+----+----------+-----+-----------+----------+
| ID | NAME | AGE | ADDRESS | SALARY |
+----+----------+-----+-----------+----------+
| 1 | Ramesh | 32 | Ahmedabad | 2000.00 |
| 3 | kaushik | 23 | Kota | 2000.00 |
| 5 | Hardik | 27 | Bhopal | 8500.00 |
| 6 | Komal | 22 | MP | 4500.00 |
| 7 | Muffy | 24 | Indore | 10000.00 |
+----+----------+-----+-----------+----------+
Example
Consider the CUSTOMERS table having the following records −
+----+----------+-----+-----------+----------+
+----+----------+-----+-----------+----------+
www.Jntufastupdates.com 4
| 4 | Chaitali | 25 | Mumbai | 6500.00 |
| 6 | Komal | 22 | MP | 4500.00 |
+----+----------+-----+-----------+----------+
Following is an example, which would delete those records from the table which have the
age = 25 and then ROLLBACK the changes in the database.
SQL> ROLLBACK;
Thus, the delete operation would not impact the table and the SELECT statement would
produce the following result.
+----+----------+-----+-----------+----------+
| ID | NAME | AGE | ADDRESS | SALARY |
+----+----------+-----+-----------+----------+
| 1 | Ramesh | 32 | Ahmedabad | 2000.00 |
| 2 | Khilan | 25 | Delhi | 1500.00 |
| 3 | kaushik | 23 | Kota | 2000.00 |
| 4 | Chaitali | 25 | Mumbai | 6500.00 |
| 5 | Hardik | 27 | Bhopal | 8500.00 |
| 6 | Komal | 22 | MP | 4500.00 |
| 7 | Muffy | 24 | Indore | 10000.00 |
+----+----------+-----+-----------+----------+
www.Jntufastupdates.com 5
Example
Consider the CUSTOMERS table having the following records.
+----+----------+-----+-----------+----------+
+----+----------+-----+-----------+----------+
| 6 | Komal | 22 | MP | 4500.00 |
+----+----------+-----+-----------+----------+
The following code block contains the series of operations.
Savepoint created.
1 row deleted.
Savepoint created.
1 row deleted.
Savepoint created.
www.Jntufastupdates.com 6
1 row deleted.
Now that the three deletions have taken place, let us assume that you have changed your
mind and decided to ROLLBACK to the SAVEPOINT that you identified as SP2. Because SP2
was created after the first deletion, the last two deletions are undone −
Rollback complete.
Notice that only the first deletion took place since you rolled back to SP2.
+----+----------+-----+-----------+----------+
+----+----------+-----+-----------+----------+
| 6 | Komal | 22 | MP | 4500.00 |
+----+----------+-----+-----------+----------+
6 rows selected.
www.Jntufastupdates.com 7
The syntax for a SET TRANSACTION command is as follows.
SET TRANSACTION [ READ WRITE | READ ONLY ];
Concurrency Control
It may affect the transaction result. It is highly important to maintain the order of execution
of those transactions.
Here,
www.Jntufastupdates.com 8
o A transaction T1 updates a record which is read by T2. If T1 aborts then T2 now has
values which have never formed part of the stable database.
Example:
www.Jntufastupdates.com 9
o Here, transaction-X produces the result of 550 which is incorrect. If we write this
produced result in the database, the database will become an inconsistent state
because the actual sum is 600.
o Here, transaction-X has seen an inconsistent state of the database.
Concurrency Control Protocol
Concurrency control protocols ensure atomicity, isolation, and serializability of concurrent
transactions. The concurrency control protocol can be divided into three categories:
1. Lock based protocol
2. Time-stamp protocol
3. Validation based protocol
1. Lost Updates:
The lost update problem occurs when two concurrent transactions, T1 and T2, are updating
the same data element and one of the updates is lost (overwritten by the other
transaction). Consider the following PRODUCT table example.
One of the PRODUCT table’s attributes is a product’s quantity on hand (PROD_QOH).
Assume that you have a product whose current PROD_QOH value is 35. Also assume that
two concurrent transactions, T1 and T2, occur that update the PROD_QOH value for some
item in the PRODUCT table.
The transactions are as follows.
Transaction Operation
T1: Purchase 100 units PROD_QOH = PROD_QOH + 100
T2: Sell 30 units PROD_QOH = PROD_QOH – 30
The Following table shows the serial execution of those transactions under normal
circumstances, yielding the correct answer PROD_QOH = 105.
www.Jntufastupdates.com 10
But suppose that a transaction is able to read a product’s PROD_QOH value from the table
before a previous transaction (using the same product) has been committed.
The sequence depicted in the following Table shows how the lost update problem can arise.
Note that the first transaction (T1) has not yet been committed when the second
transaction (T2) is executed. Therefore, T2 still operates on the value 35, and its
subtraction yields 5 in memory. In the meantime, T1 writes the value 135 to disk, which is
promptly overwritten by T2. In short, the addition of 100 units is “lost” during the process.
2. Uncommitted Data:
The phenomenon of uncommitted data occurs when two transactions, T1 and T2, are
executed concurrently and the first transaction (T1) is rolled back after the second
transaction (T2) has already accessed the uncommitted data—thus violating the isolation
property of transactions.
To illustrate that possibility, let’s use the same transactions described during the lost
updates discussion. T1 has two atomic parts to it, one of which is the update of the
inventory, the other possibly being the update of the invoice total (not shown). T1 is forced
to roll back due to an error during the updating of the invoice’s total; hence, it rolls back all
the way, undoing the inventory update as well. This time, the T1 transaction is rolled back
to eliminate the addition of the 100 units. Because T2 subtracts 30 from the original 35
units, the correct answer should be 5.
Transaction Operation
T1: Purchase 100 units PROD_QOH = PROD_QOH + 100 (Rolled back)
T2: Sell 30 units PROD_QOH = PROD_QOH – 30
www.Jntufastupdates.com 11
The following Table shows how, under normal circumstances, the serial execution of those
transactions yields the correct answer.
The following Table shows how the uncommitted data problem can arise when the
ROLLBACK is completed after T2 has begun its execution.
3. Inconsistent Retrievals:
Inconsistent retrievals occur when a transaction accesses data before and after another
transaction(s) finish working with such data. For example, an inconsistent retrieval would
occur if transaction T1 calculated some summary (aggregate) function over a set of data
while another transaction (T2) was updating the same data. The problem is that the
transaction might read some data before they are changed and other data after they are
changed, thereby yielding inconsistent results.
1. T1 calculates the total quantity on hand of the products stored in the PRODUCT table.
2. At the same time, T2 updates the quantity on hand (PROD_QOH) for two of the PRODUCT
table’s products.
The two transactions are shown in the following Table:
www.Jntufastupdates.com 12
While T1 calculates the total quantity on hand (PROD_QOH) for all items, T2 represents the
correction of a typing error: the user added 10 units to product 1558-QW1’s PROD_QOH but
meant to add the 10 units to product 1546-QQ2’s PROD_QOH. To correct the problem, the
user adds 10 to product 1546-QQ2’s PROD_QOH and subtracts 10 from product 1558-
QW1’s PROD_QOH. The initial and final PROD_QOH values are reflected in the following
Table
The following table demonstrates that inconsistent retrievals are possible during the
transaction execution, making the result of T1’s execution incorrect. The “After” summation
shown in Table 10.9 reflects the fact that the value of 25 for product 1546-QQ2 was read
after the WRITE statement was completed. Therefore, the “After” total is 40 + 25 = 65. The
“Before” total reflects the fact that the value of 23 for product 1558-QW1 was read before
the next WRITE statement was completed to reflect the corrected update of 13. Therefore,
the “Before” total is 65 + 23 = 88.
www.Jntufastupdates.com 13
The computed answer of 102 is obviously wrong because you know from the previous Table
that the correct answer is 92. Unless the DBMS exercises concurrency control, a multiuser
database environment can create havoc within the information system.
www.Jntufastupdates.com 14
CONCURRENCY CONTROL WITH LOCKING METHODS
A lock guarantees exclusive use of a data item to a current transaction. In other words, transaction T2
does not have access to a data item that is currently being used by transaction T1. A transaction
acquires a lock prior to data access; the lock is released (unlocked) when the transaction is completed so
that another transaction can lock the data item for its exclusive use.
Most multiuser DBMSs automatically initiate and enforce locking procedures. All lock information is
managed by a lock manager.
Lock Granularity
Indicates the level of lock use. Locking can take place at the following levels: database, table, page,
row or even field.
LOCK TYPES
Regardless of the level of locking, the DBMS may use different lock types:
1. Binary Locks
2. Shared/Exclusive Locks
An exclusive lock exists when access is reserved specifically for the transaction that locked the object
.The exclusive lock must be used when the potential for conflict exists. A shared lock exists when
concurrent transactions are granted read access on the basis of common lock. A shared lock produces
no conflict as long as all the concurrent transactions are read only.
DEADLOCKS
A deadlock occurs when two transactions wait indefinitely for each other to unlock data.
www.Jntufastupdates.com 15
The three basic techniques to control deadlocks are:
Deadlock preventation . A transaction requesting a new lock is aborted when there is the
possibility that a deadlock can occur. if the transaction is aborted , all changes made by this
transaction are rolled back and all locks obtained by the transaction are released .The
transaction is then rescheduled for execution.
Deadlock detection. The DBMS periodically tests the database for deadlocks. if a deadlock is
found one of the transactions is aborted (rolled back and restarted) and the other transaction
are continues.
Deadlock avoidance. The transaction must obtain all of the locks it needs before it can be
executed .This technique avoids the rollback of conflicting transactions by requiring that locks be
obtained in succession
Two-Phase Locking (2PL) is a concurrency control method which divides the execution
phase of a transaction into three parts.
It ensures conflict serializable schedules.
If read and write operations introduce the first unlock operation in the transaction, then it
is said to be Two-Phase Locking Protocol.
www.Jntufastupdates.com 16
Conservative Two – Phase Locking Protocol is also called as Static Two – Phase Locking
Protocol.
This protocol is almost free from deadlocks as all required items are listed in advanced.
It requires locking of all data items to access before the transaction starts.
Timestamp-based Protocols
The most commonly used concurrency protocol is the timestamp based protocol. This
protocol uses either system time or logical counter as a timestamp.
Lock-based protocols manage the order between the conflicting pairs among transactions
at the time of execution, whereas timestamp-based protocols start working as soon as a
transaction is created.
Every transaction has a timestamp associated with it, and the ordering is determined by
the age of the transaction. A transaction created at 0002 clock time would be older than all
other transactions that come after it. For example, any transaction 'y' entering the system
at 0004 is two seconds younger and the priority would be given to the older one.
In addition, every data item is given the latest read and write-timestamp. This lets the
system know when the last ‘read and write’ operation was performed on the data item.
www.Jntufastupdates.com 17
In a multi-process system, deadlock is an unwanted situation that arises in a shared
resource environment, where a process indefinitely waits for a resource that is held by
another process.
For example, assume a set of transactions {T0, T1, T2, ...,Tn}. T0 needs a resource X to
complete its task. Resource X is held by T1, and T1 is waiting for a resource Y, which is held
by T2. T2 is waiting for resource Z, which is held by T0. Thus, all the processes wait for each
other to release resources. In this situation, none of the processes can finish their task.
This situation is known as a deadlock.
Deadlocks are not healthy for a system. In case a system is stuck in a deadlock, the
transactions involved in the deadlock are either rolled back or restarted.
Deadlock Prevention
To prevent any deadlock situation in the system, the DBMS aggressively inspects all the
operations, where transactions are about to execute. The DBMS inspects the operations
and analyzes if they can create a deadlock situation. If it finds that a deadlock situation
might occur, then that transaction is never allowed to be executed.
There are deadlock prevention schemes that use timestamp ordering mechanism of
transactions in order to predetermine a deadlock situation.
Wait-Die Scheme
In this scheme, if a transaction requests to lock a resource (data item), which is already
held with a conflicting lock by another transaction, then one of the two possibilities may
occur −
If TS(Ti) < TS(Tj) − that is Ti, which is requesting a conflicting lock, is older than T j − then Ti is
allowed to wait until the data-item is available.
If TS(Ti) > TS(tj) − that is Ti is younger than Tj − then Ti dies. Ti is restarted later with a random
delay but with the same timestamp.
This scheme allows the older transaction to wait but kills the younger one.
Wound-Wait Scheme
In this scheme, if a transaction requests to lock a resource (data item), which is already
held with conflicting lock by some another transaction, one of the two possibilities may
occur −
If TS(Ti) < TS(Tj), then Ti forces Tj to be rolled back − that is Tiwounds Tj. Tj is restarted later with
a random delay but with the same timestamp.
If TS(Ti) > TS(Tj), then Ti is forced to wait until the resource is available.
This scheme, allows the younger transaction to wait; but when an older transaction
requests an item held by a younger one, the older transaction forces the younger one to
abort and release the item.
In both the cases, the transaction that enters the system at a later stage is aborted.
Deadlock Avoidance
Aborting a transaction is not always a practical approach. Instead, deadlock avoidance
mechanisms can be used to detect any deadlock situation in advance. Methods like "wait-
for graph" are available but they are suitable for only those systems where transactions are
lightweight having fewer instances of resource. In a bulky system, deadlock prevention
techniques may work well.
www.Jntufastupdates.com 18
Crash Recovery
DBMS is a highly complex system with hundreds of transactions being executed every
second. The durability and robustness of a DBMS depends on its complex architecture and
its underlying hardware and system software. If it fails or crashes amid transactions, it is
expected that the system would follow some sort of algorithm or techniques to recover lost
data.
Failure Classification
To see where the problem has occurred, we generalize a failure into various categories, as
follows −
Transaction failure
A transaction has to abort when it fails to execute or when it reaches a point from where it
can’t go any further. This is called transaction failure where only a few transactions or
processes are hurt.
Reasons for a transaction failure could be −
Logical errors − Where a transaction cannot complete because it has some code
error or any internal error condition.
System errors − Where the database system itself terminates an active transaction
because the DBMS is not able to execute it, or it has to stop because of some
system condition. For example, in case of deadlock or resource unavailability, the
system aborts an active transaction.
System Crash
There are problems − external to the system − that may cause the system to stop abruptly
and cause the system to crash. For example, interruptions in power supply may cause the
failure of underlying hardware or software failure.
Examples may include operating system errors.
Disk Failure
In early days of technology evolution, it was a common problem where hard-disk drives or
storage drives used to fail frequently.
Disk failures include formation of bad sectors, unreachability to the disk, disk head crash or
any other failure, which destroys all or a part of disk storage.
Storage Structure
We have already described the storage system. In brief, the storage structure can be
divided into two categories −
Volatile storage − As the name suggests, a volatile storage cannot survive system
crashes. Volatile storage devices are placed very close to the CPU; normally they
are embedded onto the chipset itself. For example, main memory and cache
memory are examples of volatile storage. They are fast but can store only a small
amount of information.
Non-volatile storage − These memories are made to survive system crashes.
They are huge in data storage capacity, but slower in accessibility. Examples may
include hard-disks, magnetic tapes, flash memory, and non-volatile (battery backed
up) RAM.
Recovery and Atomicity
When a system crashes, it may have several transactions being executed and various files
opened for them to modify the data items. Transactions are made of various operations,
which are atomic in nature. But according to ACID properties of DBMS, atomicity of
transactions as a whole must be maintained, that is, either all the operations are executed
or none.
When a DBMS recovers from a crash, it should maintain the following −
www.Jntufastupdates.com 19
It should check the states of all the transactions, which were being executed.
A transaction may be in the middle of some operation; the DBMS must ensure the
atomicity of the transaction in this case.
It should check whether the transaction can be completed now or it needs to be
rolled back.
No transactions would be allowed to leave the DBMS in an inconsistent state.
There are two types of techniques, which can help a DBMS in recovering as well as
maintaining the atomicity of a transaction −
Maintaining the logs of each transaction, and writing them onto some stable storage
before actually modifying the database.
Maintaining shadow paging, where the changes are done on a volatile memory, and
later, the actual database is updated.
Log-based Recovery
Log is a sequence of records, which maintains the records of actions performed by a
transaction. It is important that the logs are written prior to the actual modification and
stored on a stable storage media, which is failsafe.
Log-based recovery works as follows −
The log file is kept on a stable storage media.
When a transaction enters the system and starts execution, it writes a log about it.
<Tn, Start>
When the transaction modifies an item X, it write logs as follows −
<Tn, X, V1, V2>
It reads Tn has changed the value of X, from V1 to V2.
When the transaction finishes, it logs −
<Tn, commit>
The database can be modified using two approaches −
Deferred database modification − All logs are written on to the stable storage
and the database is updated when a transaction commits.
Immediate database modification − Each log follows an actual database
modification. That is, the database is modified immediately after every operation.
Recovery with Concurrent Transactions
When more than one transaction are being executed in parallel, the logs are interleaved. At
the time of recovery, it would become hard for the recovery system to backtrack all logs,
and then start recovering. To ease this situation, most modern DBMS use the concept of
'checkpoints'.
Checkpoint
Keeping and maintaining logs in real time and in real environment may fill out all the
memory space available in the system. As time passes, the log file may grow too big to be
handled at all. Checkpoint is a mechanism where all the previous logs are removed from
the system and stored permanently in a storage disk. Checkpoint declares a point before
which the DBMS was in consistent state, and all the transactions were committed.
Recovery
When a system with concurrent transactions crashes and recovers, it behaves in the
following manner −
www.Jntufastupdates.com 20
The recovery system reads the logs backwards from the end to the last checkpoint.
It maintains two lists, an undo-list and a redo-list.
If the recovery system sees a log with <Tn, Start> and <Tn, Commit> or just <Tn,
Commit>, it puts the transaction in the redo-list.
If the recovery system sees a log with <T n, Start> but no commit or abort log
found, it puts the transaction in undo-list.
All the transactions in the undo-list are then undone and their logs are removed. All the
transactions in the redo-list and their previous logs are removed and then redone before
saving their logs.
www.Jntufastupdates.com 21