
UNIT-5

TRANSACTION PROCESSING

Transaction

o A transaction is a set of logically related operations. It contains a group of tasks.


o A transaction is an action, or series of actions, performed by a single user to access or modify the contents of the database.

Example: Suppose an employee of a bank transfers Rs 800 from X's account to Y's account.
This small transaction involves several low-level tasks:

X's Account

1. Open_Account(X)
2. Old_Balance = X.balance
3. New_Balance = Old_Balance - 800
4. X.balance = New_Balance
5. Close_Account(X)

Y's Account

1. Open_Account(Y)
2. Old_Balance = Y.balance
3. New_Balance = Old_Balance + 800
4. Y.balance = New_Balance
5. Close_Account(Y)

Read(X): The read operation is used to read the value of X from the database and store it in a buffer in main memory.

Write(X): The write operation is used to write the value back to the database from the buffer.

Let's take the example of a debit transaction on an account, which consists of the following operations:

1. R(X);
2. X = X - 500;
3. W(X);
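
As a minimal sketch (not from the original text), these operations can be modelled in Python, with one dictionary standing in for the database and another for the main-memory buffer; the names database, buffer, read_item, and write_item are illustrative assumptions.

database = {"X": 2000}   # hypothetical persistent store
buffer = {}              # main-memory buffer

def read_item(name):
    # R(X): copy the value of X from the database into the buffer
    buffer[name] = database[name]

def write_item(name):
    # W(X): copy the value from the buffer back to the database
    database[name] = buffer[name]

# The debit transaction: R(X); X = X - 500; W(X)
read_item("X")
buffer["X"] -= 500
write_item("X")
print(database["X"])     # 1500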

A transaction in DBMS is a set of logically related operations executed as a single unit. These operations are performed while maintaining the integrity and consistency of the data. Transactions are executed in such a way that concurrent actions from different users do not corrupt the database. The transfer of money from one account to another in a bank management system is the classic example of a transaction.

A transaction goes through several states during its lifetime. These states indicate the current status of the transaction and guide how it will proceed. They determine whether the transaction will be successfully completed (committed) or stopped (aborted). The DBMS also uses a transaction log to keep track of this progress.

What is a Transaction State?

A transaction is a set of operations or tasks performed to complete a logical process, which may or may not change the data in a database. To handle different situations, like system failures, a transaction is divided into different states.

A transaction state refers to the current phase or condition of a transaction during its
execution in a database. It represents the progress of the transaction and determines whether
it will successfully complete (commit) or fail (abort).

A transaction involves two main operations:

Read Operation: Reads data from the database, stores it temporarily in memory (buffer), and
uses it as needed.

Write Operation: Updates the database with the changed data using the buffer.

From the start of executing instructions to the end, these operations are treated as a single
transaction. This ensures the database remains consistent and reliable throughout the process.
Different Types of Transaction States in DBMS

These are the different types of transaction states:

1. Active State – This is the first state of any transaction, entered as soon as the transaction begins to execute; the transaction's instructions are executed in this state.

Operations such as insertion, deletion, or update are performed during this state.

During this state, the data records being manipulated are not yet saved to the database; they remain in a buffer in main memory.

2. Partially Committed – The transaction has finished its final operation, but the changes are not yet saved to the database.

After completing all read and write operations, the modifications are first stored in main memory or a local buffer. If the changes are then made permanent in the database, the transaction moves to the "committed" state; in case of failure, it goes to the "failed" state.

3. Failed State – If any transaction-related operation causes an error during the active or partially committed state, further execution of the transaction is stopped and it is brought into the failed state. Here, the database recovery system ensures that the database can be restored to a consistent state.

4. Aborted State – If a transaction has reached the failed state, the recovery system rolls back the transaction, undoing all of its changes so that the database returns to the consistent state it was in before the transaction started.

In the aborted state, the DBMS recovery system performs one of two actions:

Kill the transaction: The system terminates the transaction to prevent it from affecting other
operations.

Restart the transaction: After making necessary adjustments, the system reverts the
transaction to an active state and attempts to continue its execution.

5. Committed – This state is reached when all the transaction-related operations have been executed successfully along with the commit operation, i.e. the data is saved to the database after the required manipulations. This marks the successful completion of a transaction.

6. Terminated State – Once the transaction has either committed or been rolled back, the system is consistent and ready for a new transaction, and the old transaction is terminated.
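
The lifecycle above can be sketched as a small state machine in Python; this is an illustrative model only (the TxState enum and TRANSITIONS table are assumptions, not a DBMS API).

from enum import Enum, auto

class TxState(Enum):
    ACTIVE = auto()
    PARTIALLY_COMMITTED = auto()
    COMMITTED = auto()
    FAILED = auto()
    ABORTED = auto()
    TERMINATED = auto()

# Legal transitions between the states described above
TRANSITIONS = {
    TxState.ACTIVE: {TxState.PARTIALLY_COMMITTED, TxState.FAILED},
    TxState.PARTIALLY_COMMITTED: {TxState.COMMITTED, TxState.FAILED},
    TxState.FAILED: {TxState.ABORTED},
    TxState.ABORTED: {TxState.TERMINATED, TxState.ACTIVE},  # kill or restart
    TxState.COMMITTED: {TxState.TERMINATED},
    TxState.TERMINATED: set(),
}

def move(current, target):
    # Refuse any state change the lifecycle above does not allow
    if target not in TRANSITIONS[current]:
        raise ValueError(f"illegal transition {current.name} -> {target.name}")
    return target

state = TxState.ACTIVE
state = move(state, TxState.PARTIALLY_COMMITTED)
state = move(state, TxState.COMMITTED)
state = move(state, TxState.TERMINATED)
print(state.name)   # TERMINATED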
ACID Properties in DBMS
A transaction is a single logical unit of work that interacts with the database, potentially modifying its content through read and write operations. To maintain database consistency both before and after a transaction, specific properties, known as the ACID properties, must be followed.

This section focuses on the ACID properties in DBMS, which are essential for ensuring data consistency, integrity, and reliability during database transactions.

Atomicity:
By this, we mean that either the entire transaction takes place at once or doesn’t
happen at all. There is no midway i.e. transactions do not occur partially. Each
transaction is considered as one unit and either runs to completion or is not executed
at all. It involves the following two operations.
— Abort : If a transaction aborts, changes made to the database are not visible.
— Commit : If a transaction commits, changes made are visible.
Atomicity is also known as the ‘All or nothing rule’.

Consider the following transaction T consisting of T1 and T2: a transfer of 100 from account X to account Y.

If the transaction fails after completion of T1 but before completion of T2 (say, after write(X) but before write(Y)), then the amount has been deducted from X but not added to Y. This results in an inconsistent database state. Therefore, the transaction must be executed in its entirety in order to ensure the correctness of the database state.
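
The all-or-nothing behaviour can be demonstrated with Python's built-in sqlite3 module; this is a minimal sketch in which the account table, the balances, and the simulated mid-transfer failure are all illustrative assumptions.

import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO account VALUES ('X', 500), ('Y', 200)")
conn.commit()

def transfer(src, dst, amount, fail_midway=False):
    try:
        # T1: deduct the amount from the source account (write(X))
        conn.execute("UPDATE account SET balance = balance - ? WHERE name = ?",
                     (amount, src))
        if fail_midway:
            # Simulated crash after write(X) but before write(Y)
            raise RuntimeError("system failure")
        # T2: add the amount to the destination account (write(Y))
        conn.execute("UPDATE account SET balance = balance + ? WHERE name = ?",
                     (amount, dst))
        conn.commit()    # both updates become permanent together
    except Exception:
        conn.rollback()  # neither update survives: all or nothing

transfer("X", "Y", 100)                    # succeeds: X = 400, Y = 300
transfer("X", "Y", 100, fail_midway=True)  # fails: balances unchanged
print(dict(conn.execute("SELECT name, balance FROM account")))
# {'X': 400, 'Y': 300}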

Consistency:
Consistency ensures that a database remains in a valid state before and after a
transaction. It guarantees that any transaction will take the database from one
consistent state to another, maintaining the rules and constraints defined for the data.
Referring to the example above, the total amount before and after the transaction must be maintained.
Total before T occurs = 500 + 200 = 700.
Total after T occurs = 400 + 300 = 700.
Therefore, the database is consistent. Inconsistency would occur if T1 completed but T2 failed.

Isolation:
This property ensures that multiple transactions can occur concurrently without
leading to the inconsistency of the database state. Transactions occur independently
without interference. Changes occurring in a particular transaction will not be visible
to any other transaction until that particular change in that transaction is written to
memory or has been committed. This property ensures that when multiple transactions
run at the same time, the result will be the same as if they were run one after another
in a specific order.
Let X = 50,000 and Y = 500, and consider two transactions T and T''. T transfers 50 from X to Y; T'' computes the sum X + Y.

Suppose T has been executed up to write(X) (so X is now 49,950) and then T'' starts. As a result of this interleaving of operations, T'' reads the new value of X but the old value of Y, and the sum computed by
T'': (X + Y = 49,950 + 500 = 50,450)
is thus not consistent with the true total of
T: (X + Y = 50,000 + 500 = 50,500).
This results in database inconsistency, due to an apparent loss of 50 units. Hence, transactions must take place in isolation, and changes should be visible to other transactions only after they have been committed.
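
The anomaly can be reproduced deterministically by hand-interleaving the two transactions; in this minimal Python sketch, the db dictionary is a hypothetical stand-in for the database, and the schedule follows the example above.

db = {"X": 50_000, "Y": 500}

# T transfers 50 from X to Y; T'' computes the sum X + Y.
x = db["X"]                  # T:   read(X)
x = x - 50                   # T:   X = X - 50
db["X"] = x                  # T:   write(X)

total = db["X"] + db["Y"]    # T'': reads the new X but the old Y
print(total)                 # 50450 -- 50 units appear to be lost

y = db["Y"]                  # T:   read(Y)
db["Y"] = y + 50             # T:   write(Y)
print(db["X"] + db["Y"])     # 50500 -- consistent once T completes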

Durability:
This property ensures that once the transaction has completed execution, the updates
and modifications to the database are stored in and written to disk and they persist
even if a system failure occurs. These updates now become permanent and are stored
in non-volatile memory. The effects of the transaction, thus, are never lost.
The ACID properties, in totality, provide a mechanism to ensure the correctness and
consistency of a database in a way such that each transaction is a group of operations
that acts as a single unit, produces consistent results, acts in isolation from other
operations, and updates that it makes are durably stored.

ACID properties are the four key characteristics that define the reliability and
consistency of a transaction in a Database Management System (DBMS).
The acronym ACID stands for Atomicity, Consistency, Isolation, and Durability.
Here is a brief description of each of these properties:

Atomicity: Atomicity ensures that a transaction is treated as a single, indivisible unit of work. Either all the operations within the transaction are completed successfully, or none of them are. If any part of the transaction fails, the entire transaction is rolled back to its original state, ensuring data consistency and integrity.
Consistency: Consistency ensures that a transaction takes the database from one
consistent state to another consistent state. The database is in a consistent state both
before and after the transaction is executed. Constraints, such as unique keys and
foreign keys, must be maintained to ensure data consistency.
Isolation: Isolation ensures that multiple transactions can execute concurrently
without interfering with each other. Each transaction must be isolated from other
transactions until it is completed. This isolation prevents dirty reads, non-repeatable
reads, and phantom reads.
Durability: Durability ensures that once a transaction is committed, its changes are
permanent and will survive any subsequent system failures. The transaction’s changes
are saved to the database permanently, and even if the system crashes, the changes
remain intact and can be recovered.
Overall, ACID properties provide a framework for ensuring data consistency,
integrity, and reliability in DBMS. They ensure that transactions are executed in a
reliable and consistent manner, even in the presence of system failures, network
issues, or other problems. These properties make DBMS a reliable and efficient tool
for managing data in modern organizations.

Advantages of ACID Properties in DBMS


Data Consistency: ACID properties ensure that the data remains consistent and
accurate after any transaction execution.
Data Integrity: ACID properties maintain the integrity of the data by ensuring that
any changes to the database are permanent and cannot be lost.
Concurrency Control: ACID properties help to manage multiple transactions
occurring concurrently by preventing interference between them.
Recovery: ACID properties ensure that in case of any failure or crash, the system can
recover the data up to the point of failure or crash.
Disadvantages of ACID Properties in DBMS
Performance: The ACID properties can cause a performance overhead in the system,
as they require additional processing to ensure data consistency and integrity.
Scalability: The ACID properties may cause scalability issues in large distributed
systems where multiple transactions occur concurrently.
Complexity: Implementing the ACID properties can increase the complexity of the
system and require significant expertise and resources.
Overall, the advantages of ACID properties in DBMS outweigh the disadvantages.
They provide a reliable and consistent approach to data management, ensuring data
integrity, accuracy, and reliability. However, in some cases, the overhead of
implementing ACID properties can cause performance and scalability issues.
Therefore, it’s important to balance the benefits of ACID properties against the
specific needs and requirements of the system.

Concurrency Control Techniques

In a database management system (DBMS), allowing transactions to run concurrently has significant advantages, such as better system resource utilization and higher throughput. However, it is crucial that these transactions do not conflict with each other. The ultimate goal is to ensure that the database remains consistent and accurate. For instance, if two users try to book the last available seat on a flight at the same time, the system must ensure that only one booking succeeds. Concurrency control is the critical mechanism in DBMS that ensures the consistency and integrity of data when multiple operations are performed at the same time.

Concurrency control is a concept in Database Management Systems (DBMS) that ensures multiple transactions can simultaneously access or modify data without causing errors or inconsistencies. It provides mechanisms to handle concurrent execution in a way that maintains the ACID properties.

By implementing concurrency control, a DBMS allows transactions to execute concurrently while avoiding issues such as deadlocks, race conditions, and conflicts between operations.

The main goal of concurrency control is to ensure that simultaneous transactions do not lead
to data conflicts or violate the consistency of the database. The concept of serializability is
often used to achieve this goal.

Advantages of Concurrency

In general, concurrency means that more than one transaction can work on a system.
The advantages of a concurrent system are:

Waiting Time: The time a process spends in the ready state before it gets the CPU to execute. Concurrency leads to less waiting time.

Response Time: The time taken to receive the first response from the CPU. Concurrency leads to less response time.

Resource Utilization: The fraction of the system's resources that are actually in use. Since multiple transactions can run in parallel, concurrency leads to greater resource utilization.

Efficiency: The amount of output produced for a given input. Concurrency leads to greater efficiency.

Disadvantages of Concurrency

Overhead: Implementing concurrency control requires additional overhead, such as acquiring and releasing locks on database objects. This overhead can lead to slower performance and increased resource consumption, particularly in systems with high levels of concurrency.

Deadlocks: Deadlocks can occur when two or more transactions are waiting for each other to
release resources, causing a circular dependency that can prevent any of the transactions from
completing. Deadlocks can be difficult to detect and resolve, and can result in reduced
throughput and increased latency.

Reduced concurrency: Concurrency control can limit the number of users or applications
that can access the database simultaneously. This can lead to reduced concurrency and slower
performance in systems with high levels of concurrency.

Complexity: Implementing concurrency control can be complex, particularly in distributed systems or in systems with complex transactional logic. This complexity can lead to increased development and maintenance costs.

Inconsistency: In some cases, concurrency control can lead to inconsistencies in the database. For example, a transaction that is rolled back may leave the database in an inconsistent state, or a long-running transaction may cause other transactions to wait for extended periods, leading to data staleness and reduced accuracy.

Lock-Based Protocol

In this type of protocol, a transaction cannot read or write a data item until it acquires an appropriate lock on it.

What is a Lock?
A lock is associated with each data item and represents the state of the item with respect to the operations that can be performed on it. Its value is used in a locking scheme to control concurrent access to, and manipulation of, the associated data item. Locking a data item that is being used by a transaction prevents other transactions running simultaneously from using the locked item. This is one of the most commonly used techniques to ensure serializability; manipulating the value of a lock is called locking.

Types of Locks

Various types of locks are used to control concurrency. Depending on the type of lock, the lock manager grants or denies access to other operations on the same data item. We first discuss binary locks, which are simple but of limited practical use. We then discuss shared and exclusive locks, also called read/write locks, which have more locking capability and far wider practical use.

Binary locks

A binary lock can have only two values:

o Locked
o Unlocked

These are represented by 1 and 0 for simplicity. Each data item has a separate lock associated with it. If the data item is locked, it cannot be accessed by database operations that request it; if it is unlocked, it can be accessed when requested.

The following two operations are associated with binary locking of a data item A.

o Lock (A)
o Unlock (A)

A transaction requests access to a data item by first locking it using the Lock() operation. If an operation of another concurrent transaction tries to access the same data item while it is locked, that operation is forced to wait until the transaction holding the lock has released it using the Unlock() operation. When the transaction that has locked a data item completes all its operations on that item, it unlocks the item so that other transactions can use it.
The following rules must be followed whenever binary locking schemes are used:

o Lock (): This operation must be issued by a transaction before any read or write operation is performed on the data item.
o Unlock (): This operation must be issued by a transaction after all of its read and write operations on the data item have completed.
o A transaction must not issue a Lock () operation on a data item for which it already holds a lock.
o A transaction must not issue an Unlock () operation on a data item unless it already holds a lock on it.

Consider two transactions T1 and T2 that both update an account balance, by Rs 200 and Rs 300 respectively. If these transactions run concurrently without locking, interleaved schedules can lose one of the updates; with binary locking, each transaction must hold the lock on the balance before reading or writing it, as the sketch below illustrates.
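
The following minimal sketch uses Python's threading.Lock, which is itself a binary lock, as the per-item lock; the balance item and the Rs 200 and Rs 300 updates follow the T1/T2 example, while everything else is an illustrative assumption.

import threading

database = {"balance": 1000}
locks = {"balance": threading.Lock()}   # one binary lock per data item

def update_balance(amount):
    locks["balance"].acquire()          # Lock(A): wait until the item is free
    try:
        old = database["balance"]       # read while holding the lock
        database["balance"] = old + amount
    finally:
        locks["balance"].release()      # Unlock(A): let waiting transactions in

# T1 adds Rs 200 and T2 adds Rs 300 concurrently
t1 = threading.Thread(target=update_balance, args=(200,))
t2 = threading.Thread(target=update_balance, args=(300,))
t1.start(); t2.start()
t1.join(); t2.join()
print(database["balance"])   # always 1500: the lock prevents a lost update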

Handling Deadlocks

Deadlock is a situation where a process or a set of processes is blocked, waiting for some
other resource that is held by some other waiting process. It is an undesirable state of the
system. In other words, Deadlock is a critical situation in computing where a process, or a
group of processes, becomes unable to proceed because each is waiting for a resource that is
held by another process in the same group. This scenario leads to a complete standstill,
rendering the affected processes inactive and the system inefficient.
Necessary Condition for a Deadlock

The following are the four conditions that must hold simultaneously for a deadlock to
occur.

Mutual Exclusion: A resource can be used by only one process at a time. If another process
requests for that resource, then the requesting process must be delayed until the resource has
been released.

Hold and wait: Some processes must be holding some resources in the non-shareable mode
and at the same time must be waiting to acquire some more resources, which are currently
held by other processes in the non-shareable mode.

No pre-emption: Resources granted to a process can be released back to the system only as a
result of voluntary action of that process after the process has completed its task.

Circular wait: Deadlocked processes are involved in a circular chain such that each process
holds one or more resources being requested by the next process in the chain.

Methods of Handling Deadlocks

There are four approaches to dealing with deadlocks.

1. Deadlock Prevention

2. Deadlock avoidance (Banker's Algorithm)

3. Deadlock detection & recovery

4. Deadlock Ignorance (Ostrich Method)

These are explained below.

1. Deadlock Prevention
The strategy of deadlock prevention is to design the system in such a way that the possibility
of deadlock is excluded. The indirect methods prevent the occurrence of one of three
necessary conditions of deadlock i.e., mutual exclusion, no pre-emption, and hold and wait.
The direct method prevents the occurrence of circular wait. Prevention techniques -Mutual
exclusion – are supported by the OS. Hold and Wait – the condition can be prevented by
requiring that a process requests all its required resources at one time and blocking the
process until all of its requests can be granted at the same time simultaneously. But this
prevention does not yield good results because:

• long waiting times are required
• allocated resources are used inefficiently
• a process may not know all of its required resources in advance

No pre-emption – techniques for 'no pre-emption' are:

If a process that is holding some resources requests another resource that cannot be immediately allocated to it, all resources it currently holds are released and, if necessary, requested again together with the additional resource.

If a process requests a resource that is currently held by another process, the OS may pre-empt the second process and require it to release its resources. This works only if the two processes do not have the same priority.

Circular wait – One way to ensure that this condition never holds is to impose a total ordering of all resource types and to require that each process requests resources in increasing order of enumeration; i.e., if a process has been allocated resources of type R, then it may subsequently request only resources of types that follow R in the ordering.
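
A minimal sketch of this resource-ordering rule, assuming hypothetical resources R1, R2, R3 and Python threads in place of processes: because every process acquires locks in the same global order, no circular chain of waits can form.

import threading

ORDER = {"R1": 0, "R2": 1, "R3": 2}                 # total ordering of resources
locks = {name: threading.Lock() for name in ORDER}

def acquire_in_order(needed):
    # Sort requests by the global ordering before acquiring anything
    for name in sorted(needed, key=ORDER.get):
        locks[name].acquire()

def release_all(held):
    for name in held:
        locks[name].release()

def process(needed):
    acquire_in_order(needed)
    # ... use the resources ...
    release_all(needed)

# Both processes want R1 and R3; both acquire R1 first, then R3, so
# neither can end up holding what the other is waiting for.
a = threading.Thread(target=process, args=({"R3", "R1"},))
b = threading.Thread(target=process, args=({"R1", "R3"},))
a.start(); b.start(); a.join(); b.join()
print("finished without deadlock")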

2. Deadlock Avoidance

The deadlock avoidance algorithm works by proactively looking for potential deadlock situations before they occur. It does this by tracking the resource usage of each process and identifying conflicts that could potentially lead to a deadlock. If a potential deadlock is identified, the algorithm takes steps to resolve the conflict, such as rolling back one of the processes or pre-emptively allocating resources to other processes. The deadlock avoidance algorithm is designed to minimize the chances of a deadlock occurring, although it cannot guarantee that a deadlock will never occur. This approach allows the three necessary conditions of deadlock but makes judicious choices to ensure that the deadlock point is never reached; it allows more concurrency than deadlock prevention. A decision is made dynamically about whether the current resource allocation request will, if granted, potentially lead to deadlock. It requires knowledge of future process requests. There are two techniques to avoid deadlock:

• Process initiation denial
• Resource allocation denial

Advantages

• It is not necessary to pre-empt and roll back processes
• Less restrictive than deadlock prevention

Disadvantages

• Future resource requirements must be known in advance
• Processes can be blocked for long periods
• A fixed number of resources must exist for allocation

Banker’s Algorithm

The Banker’s Algorithm can be motivated by resource allocation graphs. A resource allocation graph is a directed graph in which nodes represent processes and edges represent the resources they hold or request. The state of the system is represented by the current allocation of resources among processes. For example, if the system has three processes, each using two resources, processes A, B, and C would be the nodes, and the resources they are using would be the edges connecting them. The Banker’s Algorithm works by analyzing the state of the system and determining whether it is in a safe state or at risk of entering a deadlock.
To determine if a system is in a safe state, the Banker’s Algorithm uses two matrices: the available matrix, which records the amount of each resource currently free, and the need matrix, which records the amount of each resource still required by each process.

The Banker’s Algorithm then checks whether every process can be completed without overloading the system. It looks for a process whose remaining need can be satisfied from the available resources; that process is assumed to run to completion, releasing its allocation back into the available pool, and the check repeats. If all processes can finish in some order, the state is safe and the request is allowed to proceed; otherwise, the requesting process is blocked until more resources become available.
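
The safety check can be sketched as follows. This is a standard formulation with hypothetical figures, not code from the original text; need[i] here is each process's remaining need (maximum demand minus current allocation).

def is_safe(available, allocation, need):
    # available: free units of each resource type
    # allocation[i], need[i]: per-process current allocation and remaining need
    work = list(available)
    finished = [False] * len(allocation)
    while True:
        progressed = False
        for i, done in enumerate(finished):
            # Find a process whose remaining need fits in 'work'
            if not done and all(n <= w for n, w in zip(need[i], work)):
                # Assume it runs to completion and releases its allocation
                work = [w + a for w, a in zip(work, allocation[i])]
                finished[i] = True
                progressed = True
        if not progressed:
            return all(finished)   # safe iff every process can finish

# Three processes, two resource types (hypothetical figures)
available  = [1, 1]
allocation = [[2, 1], [1, 1], [1, 0]]
need       = [[1, 0], [0, 2], [3, 1]]
print(is_safe(available, allocation, need))   # True: the order P0, P1, P2 works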

The Banker’s Algorithm is an effective way to prevent deadlocks in multiprogramming systems, and the underlying idea has been applied beyond operating systems, for example in manufacturing and banking systems.

The Banker’s Algorithm is a powerful tool for resource allocation problems, but it is not foolproof. It can be misled by processes that claim more resources than they actually need, or that consume resources in an unpredictable manner. To prevent such problems, it is important to monitor the system carefully to ensure that it remains in a safe state.

3. Deadlock Detection

Deadlock detection employs an algorithm that tracks circular waits and kills one or more processes so that the deadlock is removed. The system state is examined periodically to determine if a set of processes is deadlocked. A deadlock is resolved by aborting and restarting a process, relinquishing all the resources that the process held.

• This technique does not limit resource access or restrict process action.
• Requested resources are granted to processes whenever possible.
• It never delays the process initiation and facilitates online handling.
• The disadvantage is the inherent pre-emption losses.
4. Deadlock Ignorance

In the deadlock ignorance method, the OS acts as if deadlock never occurs and completely ignores it even when it does. This method applies only where deadlock occurs very rarely. The algorithm is very simple: "if the deadlock occurs, simply reboot the system and act like the deadlock never occurred." That is why the approach is called the Ostrich Algorithm.

Advantages

• The Ostrich Algorithm is relatively easy to implement and is effective in most cases.
• It avoids the overhead of deadlock handling by ignoring the presence of deadlocks.

Disadvantages

• The Ostrich Algorithm does not provide any information about the deadlock situation.
• It can lead to reduced system performance, as the system may remain blocked for a long time.

Timestamp based Concurrency Control

Timestamp-based concurrency control is a method used in database systems to ensure that transactions are executed safely and consistently without conflicts, even when multiple transactions are being processed simultaneously. This approach relies on timestamps to manage and coordinate the execution order of transactions. We refer to the timestamp of a transaction T as TS(T).

What is Timestamp Ordering Protocol?

The Timestamp Ordering Protocol is a method used in database systems to order transactions
based on their timestamps. A timestamp is a unique identifier assigned to each transaction,
typically determined using the system clock or a logical counter. Transactions are executed in
the ascending order of their timestamps, ensuring that older transactions get higher priority.

For example:

If Transaction T1 enters the system first, it gets a timestamp TS(T1) = 007 (assumption).

If Transaction T2 enters after T1, it gets a timestamp TS(T2) = 009 (assumption).

This means T1 is “older” than T2 and T1 should execute before T2 to maintain consistency.

Key Features of Timestamp Ordering Protocol:

Transaction Priority:

Older transactions (those with smaller timestamps) are given higher priority.

For example, if transaction T1 has a timestamp of 007 and transaction T2 has a timestamp of 009, T1 will execute first as it entered the system earlier.

Early Conflict Management:

Unlike lock-based protocols, which manage conflicts during execution, timestamp-based protocols start managing conflicts as soon as a transaction is created.

Ensuring Serializability:

The protocol ensures that the schedule of transactions is serializable. This means the
transactions can be executed in an order that is logically equivalent to their timestamp order.

Basic Timestamp Ordering

The Basic Timestamp Ordering (TO) Protocol is a method in database systems that uses timestamps to manage the order of transactions. Each transaction is assigned a unique timestamp when it enters the system, ensuring that all operations follow a specific order and making the schedule conflict-serializable and deadlock-free.

Suppose an old transaction Ti has timestamp TS(Ti); a new transaction Tj is assigned a timestamp TS(Tj) such that TS(Ti) < TS(Tj).

The protocol manages concurrent execution such that the timestamps determine the
serializability order.

The timestamp ordering protocol ensures that any conflicting read and write operations are
executed in timestamp order.

Whenever some transaction T tries to issue an R_item(X) or a W_item(X) operation, the Basic TO algorithm compares the timestamp of T with R_TS(X) and W_TS(X) to ensure that the timestamp order is not violated.

This describes the Basic TO protocol in the following two cases:

Whenever a transaction T issues a W_item(X) operation, check the following conditions:

If R_TS(X) > TS(T) or W_TS(X) > TS(T), then abort and roll back T and reject the operation; else,
execute the W_item(X) operation of T and set W_TS(X) to TS(T).

Whenever a transaction T issues an R_item(X) operation, check the following conditions:

If W_TS(X) > TS(T), then abort and roll back T and reject the operation; else,
if W_TS(X) <= TS(T), execute the R_item(X) operation of T and set R_TS(X) to the larger of TS(T) and the current R_TS(X).

Whenever the Basic TO algorithm detects two conflicting operations that occur in an
incorrect order, it rejects the latter of the two operations by aborting the Transaction that
issued it.
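
A minimal sketch of the two rules above in Python; the items table, the TOError exception, and the sample timestamps are illustrative assumptions.

class TOError(Exception):
    """Raised when a transaction must be aborted and rolled back."""

# Per-item metadata: current value, read timestamp, write timestamp
items = {"X": {"value": 100, "r_ts": 0, "w_ts": 0}}

def read_item(ts, name):
    item = items[name]
    if item["w_ts"] > ts:                # a younger transaction already wrote X
        raise TOError(f"abort T{ts}: W_TS({name}) > TS(T)")
    item["r_ts"] = max(item["r_ts"], ts)
    return item["value"]

def write_item(ts, name, value):
    item = items[name]
    if item["r_ts"] > ts or item["w_ts"] > ts:   # violates timestamp order
        raise TOError(f"abort T{ts}: {name} already read/written by a younger T")
    item["value"] = value
    item["w_ts"] = ts

write_item(5, "X", 200)        # T5 writes X: allowed
read_item(10, "X")             # T10 reads X: allowed, R_TS(X) becomes 10
try:
    write_item(7, "X", 300)    # T7 writes after T10's read: rejected
except TOError as e:
    print(e)                   # abort T7: X already read/written by a younger T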

Advantages of Basic TO Protocol

Conflict Serializable: Ensures all conflicting operations follow the timestamp order.

Deadlock-Free: Transactions do not wait for resources, preventing deadlocks.

Strict Ordering: Operations are executed in a predefined, conflict-free order based on timestamps.

Drawbacks of Basic Timestamp Ordering (TO) Protocol

Cascading Rollbacks: If a transaction is aborted, all dependent transactions must also be aborted, leading to inefficiency.

Starvation of Newer Transactions: Older transactions are prioritized, which can delay or starve newer transactions.

High Overhead: Maintaining and updating timestamps for every data item adds significant system overhead.

Inefficient for High Concurrency: The strict ordering can reduce throughput in systems with many concurrent transactions.

Strict Timestamp Ordering

The Strict Timestamp Ordering Protocol is an enhanced version of the Basic Timestamp Ordering Protocol. It ensures stricter control over the execution of transactions to avoid cascading rollbacks and maintain a more consistent schedule: a transaction that issues a read or write on an item X is delayed until the transaction that wrote the current value of X has committed or aborted.

Database Recovery Techniques in DBMS

Database systems, like any other computer system, are subject to failures, but the data stored in them must be available as and when required. When a database fails, it must possess the facilities for fast recovery. It must also ensure atomicity, i.e. either transactions are completed successfully and committed (their effect is recorded permanently in the database) or the transaction has no effect at all on the database.

Types of Recovery Techniques in DBMS

Database recovery techniques are used in database management systems (DBMS) to restore a
database to a consistent state after a failure or error has occurred. The main goal of recovery
techniques is to ensure data integrity and consistency and prevent data loss.

There are mainly three types of recovery techniques used in DBMS:

• Rollback/Undo Recovery Technique
• Commit/Redo Recovery Technique
• Checkpoint Recovery Technique



Rollback/Undo Recovery Technique

The rollback/undo recovery technique is based on the principle of backing out or undoing the
effects of a transaction that has not been completed successfully due to a system failure or
error. This technique is accomplished by undoing the changes made by the transaction using
the log records stored in the transaction log. The transaction log contains a record of all the
transactions that have been performed on the database. The system uses the log records to
undo the changes made by the failed transaction and restore the database to its previous state.

Commit/Redo Recovery Technique

The commit/redo recovery technique is based on the principle of reapplying the changes
made by a transaction that has been completed successfully to the database. This technique is
accomplished by using the log records stored in the transaction log to redo the changes made
by the transaction that was in progress at the time of the failure or error. The system uses the
log records to reapply the changes made by the transaction and restore the database to its
most recent consistent state.

Checkpoint Recovery Technique

Checkpoint recovery is a technique used to improve data integrity and system stability,
especially in databases and distributed systems. It entails preserving the system’s state at
regular intervals, known as checkpoints, at which all ongoing transactions are either
completed or not initiated. This saved state, which includes memory and CPU registers, is
kept in stable, non-volatile storage so that it can withstand system crashes. In the event of a
breakdown, the system can be restored to the most recent checkpoint, which reduces data loss
and downtime. The frequency of checkpoint formation is carefully regulated to decrease
system overhead while ensuring that recent data may be restored quickly.
Overall, recovery techniques are essential to ensure data consistency and availability in
Database Management System, and each technique has its own advantages and limitations
that must be considered in the design of a recovery system.

The System Log

There are both automatic and non-automatic ways of backing up data and recovering from failure situations. The techniques used to recover lost data due to system crashes, transaction errors, viruses, catastrophic failure, incorrect command execution, etc., are database recovery techniques. To prevent data loss, recovery techniques based on deferred updates and immediate updates, or on backing up data, can be used. Recovery techniques are heavily dependent upon the existence of a special file known as the system log. It contains information about the start and end of each transaction and any updates that occur during the transaction. The log keeps track of all transaction operations that affect the values of database items. This information is needed to recover from transaction failure.

The log is kept on disk. The main types of log entries are:

start_transaction(T): This log entry records that transaction T starts execution.

read_item(T, X): This log entry records that transaction T reads the value of database
item X.

write_item(T, X, old_value, new_value): This log entry records that transaction T changes the value of database item X from old_value to new_value. The old value is sometimes known as the before-image of X, and the new value is known as the after-image of X.

commit(T): This log entry records that transaction T has completed all accesses to the
database successfully and its effect can be committed (recorded permanently) to the database.

abort(T): This records that transaction T has been aborted.

checkpoint: A checkpoint is a mechanism where all the previous logs are removed from the
system and stored permanently in a storage disk. Checkpoint declares a point before which
the DBMS was in a consistent state, and all the transactions were committed.
A transaction T reaches its commit point when all of its operations that access the database have been executed successfully, i.e. the transaction has reached the point at which it will not abort (terminate without completing). Once committed, the transaction is permanently recorded in the database. Commitment always involves writing a commit entry to the log and writing the log to disk. At the time of a system crash, the log is searched backward for all transactions T that have written a start_transaction(T) entry into the log but have not written a commit(T) entry yet; these transactions may have to be rolled back to undo their effect on the database during the recovery process.

Undoing: If a transaction crashes, the recovery manager may undo the transaction, i.e. reverse its operations. This involves examining the log for each entry write_item(T, X, old_value, new_value) written by the transaction and setting the value of item X in the database back to old_value. There are two major techniques for recovery from non-catastrophic transaction failures: deferred updates and immediate updates.
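
A minimal sketch of this undo pass, assuming the log-entry formats listed above; the concrete log records and values are hypothetical.

# State of the database on disk after the crash (immediate updates applied)
database = {"X": 400, "Y": 500}

# The log: T1 committed, T2 did not
log = [
    ("start_transaction", "T1"),
    ("write_item", "T1", "X", 500, 400),   # (T, item, old_value, new_value)
    ("commit", "T1"),
    ("start_transaction", "T2"),
    ("write_item", "T2", "Y", 300, 500),
    # no commit(T2): the system crashed here
]

def undo_uncommitted(log, database):
    committed = {rec[1] for rec in log if rec[0] == "commit"}
    # Scan the log backward, restoring the before-image of every write
    # made by a transaction that never reached its commit point
    for rec in reversed(log):
        if rec[0] == "write_item" and rec[1] not in committed:
            _kind, _tx, item, old_value, _new_value = rec
            database[item] = old_value
    return database

print(undo_uncommitted(log, database))   # {'X': 400, 'Y': 300}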

Deferred Update: This technique does not physically update the database on disk until a
transaction has reached its commit point. Before reaching commit, all transaction updates are
recorded in the local transaction workspace. If a transaction fails before reaching its commit
point, it will not have changed the database in any way so UNDO is not needed. It may be
necessary to REDO the effect of the operations that are recorded in the local transaction
workspace, because their effect may not yet have been written in the database. Hence, a
deferred update is also known as the No-undo/redo algorithm.

Immediate Update: In the immediate update, the database may be updated by some
operations of a transaction before the transaction reaches its commit point. However, these
operations are recorded in a log on disk before they are applied to the database, making
recovery still possible. If a transaction fails to reach its commit point, the effect of its
operation must be undone i.e. the transaction must be rolled back hence we require both undo
and redo. This technique is known as the undo/redo algorithm.

Caching/Buffering: In this technique, one or more disk pages that include the data items to be updated are cached into main-memory buffers and then updated in memory before being written back to disk. A collection of in-memory buffers, called the DBMS cache, is kept under the control of the DBMS for holding these buffers. A directory is used to keep track of which database items are in the buffers. A dirty bit is associated with each buffer: it is 0 if the buffer has not been modified and 1 if it has.

Shadow Paging: It provides atomicity and durability. A directory with n entries is constructed, where the i-th entry points to the i-th database page on disk. When a transaction begins executing, the current directory is copied into a shadow directory, which is then never modified. When a page is to be modified, a new page is allocated and the changes are made there; when the transaction is ready to become durable, the directory entries that referred to the original pages are updated to refer to the new replacement pages.
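
A minimal sketch of shadow paging, modelling the page directory as a Python list and the page store as a dictionary; all names and values are illustrative assumptions.

pages = {0: "A", 1: "B"}   # page id -> contents (stand-in for disk pages)
next_page = 2              # next free page id
current_dir = [0, 1]       # the i-th entry points to the i-th database page

def run_transaction(updates):
    global next_page, current_dir
    shadow_dir = list(current_dir)   # saved copy; untouched if we crash
    new_dir = list(current_dir)
    for i, value in updates.items():
        # Never modify a page in place: write changes to a fresh page
        pages[next_page] = value
        new_dir[i] = next_page
        next_page += 1
    # Atomic switch: installing new_dir makes the transaction durable.
    # Before this point, shadow_dir still describes the old consistent state.
    current_dir = new_dir

run_transaction({1: "B'"})
print([pages[p] for p in current_dir])   # ['A', "B'"]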

Backward Recovery: The terms "rollback" and "UNDO" can also refer to backward recovery. When a backup of the data is not available and previous modifications need to be undone, this technique is helpful. With the backward recovery method, unwanted modifications are removed and the database is returned to its prior condition. All changes made during the previous transaction are reversed during backward recovery; in other words, it undoes the erroneous database updates while leaving valid transactions in place.

Forward Recovery: The terms "roll forward" and "REDO" refer to forward recovery. When a database needs to be brought up to date with all verified changes, this technique is helpful. The changes recorded for committed transactions are reapplied to the restored database to roll those modifications forward. In other words, the database is restored from preserved data (a backup), and the valid transactions recorded since it was saved are reapplied.

Backup Techniques

There are different types of Backup Techniques. Some of them are listed below.

Full Database Backup: In this, the full database, including the data and the database meta-information needed to restore the whole database (such as full-text catalogs), is backed up on a predefined schedule.

Differential Backup: It stores only the data changes that have occurred since the last full database backup. When some data has changed many times since the last full database backup, a differential backup stores the most recent version of the changed data. To restore from it, we first need to restore the last full database backup.
Transaction Log Backup: In this, all events that have occurred in the database, i.e. a record of every statement executed, are backed up. It is a backup of the transaction log entries and contains all transactions that have happened to the database. Through this, the database can be recovered to a specific point in time. It is even possible to recover from a transaction log backup when the data files are destroyed, without losing a single committed transaction.
