DBMS Unit-3
Definition: A transaction is a logical unit of work that must be either entirely completed or entirely
aborted; no intermediate states are acceptable. A successful transaction changes the database from one
consistent state to another. A consistent database state is one in which all data integrity constraints
are satisfied.
Example of a transaction: If a bank has to transfer $1000 from account A to account B, then the following
steps are required before the transfer is successful:
1. Subtract $1000 from the balance of account A.
2. Add $1000 to the balance of account B.
In database terms, the above steps require two statements or commands to be executed. If either or both
of the statements fail, the money transfer is not successful; therefore, in this example, these two
statements together form a transaction. A transaction is a unit of program execution that accesses and
possibly updates various data items. A transaction is initiated by a user program written in a high-level
data manipulation language, or in a programming language with embedded database accesses through JDBC
or ODBC. A transaction is delimited by statements of the form 'begin transaction' and 'end transaction'.
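The transfer can be sketched in code. The following is a minimal illustration only, using Python's
built-in sqlite3 module; the table name account, the function name transfer and the starting balances
are assumptions made for this sketch and do not come from the notes.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Run the two updates as one transaction: both take effect or neither does."""
    try:
        cur = conn.cursor()
        # Step 1: subtract the amount from the source account.
        cur.execute("UPDATE account SET balance = balance - ? WHERE name = ?", (amount, src))
        if cur.rowcount != 1:
            raise ValueError("source account not found")
        # Step 2: add the amount to the destination account.
        cur.execute("UPDATE account SET balance = balance + ? WHERE name = ?", (amount, dst))
        if cur.rowcount != 1:
            raise ValueError("destination account not found")
        conn.commit()      # end of transaction: make both changes permanent
        return True
    except Exception:
        conn.rollback()    # abort: undo whatever part was already done
        return False


# Hypothetical starting data for the sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER NOT NULL)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 5000), ("B", 2000)])
conn.commit()

ok = transfer(conn, "A", "B", 1000)                   # both statements succeed
failed = transfer(conn, "A", "no_such_account", 1000)  # second statement fails
balances = dict(conn.execute("SELECT name, balance FROM account"))
print(ok, failed, balances)
```

In the second call the debit of account A has already been executed when the credit fails, so the
rollback is what keeps the $1000 from disappearing.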
Transaction properties:
Each individual transaction must display atomicity, consistency, isolation and durability. These properties
are sometimes referred to as the ACID test. In addition, when executing multiple transactions, the DBMS
must schedule the concurrent execution of the transactions' operations. The schedule of such operations
must exhibit the property of serializability.
Atomicity: Atomicity requires that all operations of a transaction be completed; if not, the transaction is
aborted. If a transaction T1 has four SQL requests, all four requests must be successfully completed;
otherwise the entire transaction is aborted. In other words, a transaction is treated as a single,
indivisible logical unit of work.
Consistency: Consistency indicates the permanence of the database's consistent state. A transaction takes
a database from one consistent state to another consistent state. When a transaction is completed, the
database must be in a consistent state. If any part of the transaction violates an integrity constraint,
the entire transaction is aborted.
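As a rough illustration of how a violated integrity constraint aborts the whole transaction, the sketch
below uses Python's sqlite3 module with a CHECK constraint. The rule "a balance may not be negative", the
table layout and the function name move_funds are assumptions chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed integrity constraint for this sketch: a balance may never be negative.
conn.execute(
    "CREATE TABLE account ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 0)])
conn.commit()


def move_funds(conn, amount):
    """Credit B, then debit A; abort everything if a constraint is violated."""
    try:
        conn.execute("UPDATE account SET balance = balance + ? WHERE name = 'B'", (amount,))
        # This statement violates the CHECK constraint when amount > balance of A.
        conn.execute("UPDATE account SET balance = balance - ? WHERE name = 'A'", (amount,))
        conn.commit()
        return "committed"
    except sqlite3.IntegrityError:
        conn.rollback()   # the credit to B is undone as well
        return "aborted"


outcome = move_funds(conn, 500)
balances = dict(conn.execute("SELECT name, balance FROM account"))
print(outcome, balances)
```

The credit to B succeeds on its own, but because the debit of A breaks the constraint, the database
returns to the consistent state it had before the transaction started.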
Isolation: Isolation means that the data used during the execution of a transaction cannot be used by a
second transaction until the first one is completed. In other words, if a transaction T1 is being executed
and is using the data item X, that data item cannot be accessed by any other transaction until T1 ends.
This property is particularly useful in multiuser database environments because several users can access
and update the database at the same time.
Durability: Durability ensures that once a transaction's changes are done (committed), they cannot be
undone or lost, even in the event of a system failure.
Serializability: Serializability ensures that the schedule for the concurrent execution of the transactions
yields consistent results. This property is important in multiuser and distributed databases, where
multiple transactions are likely to be executed concurrently.
States of Transactions: In the absence of failures, all transactions complete successfully. However, a
transaction may not always complete its execution successfully; such a transaction is termed aborted. If
we are to ensure the atomicity property, an aborted transaction must have no effect on the state of the
database. Thus any changes that the aborted transaction made to the database must be undone. Once
the changes caused by an aborted transaction have been undone, we say that the transaction has been
rolled back. A transaction that completes its execution successfully is said to be committed. A committed
transaction that has performed updates transforms the database into a new consistent state, which
must persist even if there is a system failure. Once a transaction has committed, we cannot undo its
effects by aborting it. The only way to undo the effects of a committed transaction is to execute a
compensating transaction. For instance, if a transaction added $20 to an account, the compensating
transaction would subtract $20 from the account. However, it is not always possible to create such a
compensating transaction. Therefore the responsibility of writing and executing a compensating
transaction is left to the user.
States of Transaction:
Active: The initial state. The transaction stays in this state while it is executing.
Partially committed: When a transaction executes its final operation, it is said to be in the partially
committed state. At this point the transaction has completed its execution, but it is still possible that it
may have to be aborted.
Failed: A transaction is said to be in the failed state after the system determines that the transaction can
no longer proceed with its normal execution because of some hardware or software error. Such a
transaction must be rolled back.
Aborted: After the transaction has been rolled back, the database has been restored to its state prior to
the start of the transaction. Transactions in this state are called aborted. After a transaction aborts,
the database recovery module can select one of two operations: restart the transaction or kill it.
Committed: After successful completion. If a transaction executes all its operations successfully, it is
said to be committed.
Once a transaction has committed, we cannot undo its effects by aborting it. The only way to undo the
effects of a committed transaction is to execute a compensating transaction. For instance, if a transaction
added $20 to an account, the compensating transaction would subtract $20 from the account. However,
it is not always possible to create such a compensating transaction; therefore the responsibility of writing
and executing a compensating transaction is left to the user and is not handled by the database system.
A transaction must be in one of the following states: active, partially committed, failed, aborted, or
committed.
We say that a transaction has aborted only if it has entered the aborted state. A transaction is said to
have terminated if it has either committed or aborted.
A transaction starts in the active state. When it finishes its final statement, it enters the partially
committed state. At this point, the transaction has completed its execution, but it is still possible that it
may have to be aborted, since the actual output may still be temporarily residing in main memory, and
thus a hardware failure may preclude its successful completion.
The database system then writes out enough information to disk that, even in the event of a failure, the
updates performed by the transaction can be re-created when the system restarts after the failure.
When the last of this information is written out, the transaction enters the committed state.
A transaction enters the failed state after the system determines that the transaction can no longer
proceed with its normal execution. Such a transaction must be rolled back. Then it enters the aborted
state. At this point the system has two options:
1. It can restart the transaction, but only if the transaction was aborted as a result of some hardware or
software error that was not created through the internal logic of the transaction. A restarted transaction
is considered to be a new transaction.
2. It can kill the transaction. It usually does so because of some internal logical error that can be
corrected only by rewriting the application program, or because the input was bad, or because the
desired data were not found in the database.
By using the above techniques, the database system maintains the atomicity and durability of
transactions.
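The state transitions described above can be summarised as a small state machine. This is only a sketch:
the class and attribute names (State, Transaction, move_to, terminated) are invented for illustration, and
the table of allowed transitions is read off the description in these notes.

```python
from enum import Enum


class State(Enum):
    ACTIVE = "active"
    PARTIALLY_COMMITTED = "partially committed"
    FAILED = "failed"
    ABORTED = "aborted"
    COMMITTED = "committed"


# Allowed transitions, as described in the text above.
TRANSITIONS = {
    State.ACTIVE: {State.PARTIALLY_COMMITTED, State.FAILED},
    State.PARTIALLY_COMMITTED: {State.COMMITTED, State.FAILED},
    State.FAILED: {State.ABORTED},   # reached after rollback
    State.ABORTED: set(),            # terminal
    State.COMMITTED: set(),          # terminal
}


class Transaction:
    def __init__(self):
        self.state = State.ACTIVE    # every transaction starts active
        self.history = [State.ACTIVE]

    def move_to(self, new_state):
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError(f"illegal transition: {self.state.value} -> {new_state.value}")
        self.state = new_state
        self.history.append(new_state)

    @property
    def terminated(self):
        return self.state in (State.COMMITTED, State.ABORTED)


# A transaction that succeeds.
t_ok = Transaction()
t_ok.move_to(State.PARTIALLY_COMMITTED)
t_ok.move_to(State.COMMITTED)

# A transaction that fails after its final statement and is rolled back.
t_bad = Transaction()
t_bad.move_to(State.PARTIALLY_COMMITTED)
t_bad.move_to(State.FAILED)
t_bad.move_to(State.ABORTED)

# A committed transaction cannot be aborted.
try:
    t_ok.move_to(State.ABORTED)
    commit_is_final = False
except ValueError:
    commit_is_final = True

print([s.value for s in t_bad.history], commit_is_final)
```

The empty transition sets for the committed and aborted states reflect that both are terminal; restarting
an aborted transaction creates a new transaction rather than reviving the old one.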
Concurrent Executions:
Transaction processing systems usually allow multiple transactions to run concurrently. Allowing multiple
transactions to update data concurrently causes several complications with the consistency of the data.
Ensuring consistency in spite of concurrent execution of transactions requires extra work; it is far easier
to insist that transactions run serially, that is, one at a time, each starting only after the previous one has
completed. However, there are two good reasons for allowing concurrency:
1. Improved throughput and resource utilization: A transaction consists of many steps. Some involve I/O
activity; others involve CPU activity. The CPU and the disks in a computer system can operate in parallel.
Therefore I/O activity can be done in parallel with processing at the CPU. The parallelism
of the CPU and the I/O system can therefore be exploited to run multiple transactions in parallel.
While a read or write on behalf of one transaction is in progress on one disk, another transaction can
be running in the CPU, while another disk may be executing a read or write on behalf of a third
transaction. All of this increases the throughput of the system, that is, the number of transactions
executed in a given amount of time. Correspondingly, the processor and disk utilization also increase; in
other words, the processor and disk spend less time idle.
2. Reduced waiting time: There may be a mix of transactions running on a system, some short and some
long. If transactions run serially, a short transaction may have to wait for a preceding long transaction
to complete, which can lead to unpredictable delays in running a transaction. If the transactions are
operating on different parts of the database, it is better to let them run concurrently, sharing the CPU
cycles and disk accesses among them. Concurrent execution reduces the unpredictable delays in running
transactions. Moreover, it also reduces the average response time: the average time for a transaction to
be completed after it has been submitted.
The motivation for using concurrent execution in a database is essentially the same as the motivation for
using multiprogramming in an operating system. When several transactions run concurrently, the
isolation property may be violated, resulting in database consistency being destroyed despite the
correctness of each individual transaction. The concept of schedules helps identify those executions
that are guaranteed to ensure the isolation property and thus database consistency. The database
system must control the interaction among the concurrent transactions to prevent them from destroying
the consistency of the database. It does so through a variety of mechanisms called concurrency control
schemes.
Example
Consider a simplified banking system, which has several accounts and a set of transactions that access
and update those accounts.
Let T1 and T2 be two transactions that transfer funds from one account to another. Transaction T1
transfers $50 from account A to account B. It is defined as:
T1:
read(A);
A := A - 50;
write(A);
read(B);
B := B + 50;
write(B).
Transaction T2 transfers 10 percent of the balance from account A to account B. It is defined as:
T2:
read(A);
temp := A * 0.1;
A := A - temp;
write(A);
read(B);
B := B + temp;
write(B).
Suppose the current values of accounts A and B are $1000 and $2000, respectively. Suppose
also that the two transactions are executed one at a time in the order T1 followed by T2. This
execution sequence appears in the figure below. The sequence of instruction steps is in
chronological order from top to bottom, with instructions of T1 appearing in the left column
and instructions of T2 appearing in the right column. The final values of accounts A and B, after
the execution in the figure takes place, are $855 and $2145, respectively. Thus the total amount of
money in accounts A and B (that is, the sum A + B) is preserved after the execution of both
transactions.
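The arithmetic of this serial execution can be checked with a short simulation. This is a sketch under
stated assumptions: the database is modelled as a plain dictionary, balances are whole dollars, and 10
percent is computed with integer division so the result is exact. The names t1, t2 and run_serial are
introduced only for this illustration.

```python
def t1(db):
    """T1: transfer $50 from A to B."""
    a = db["A"]          # read(A)
    a = a - 50
    db["A"] = a          # write(A)
    b = db["B"]          # read(B)
    b = b + 50
    db["B"] = b          # write(B)


def t2(db):
    """T2: transfer 10 percent of A's balance from A to B."""
    a = db["A"]          # read(A)
    temp = a // 10       # 10 percent, exact for the whole-dollar values used here
    a = a - temp
    db["A"] = a          # write(A)
    b = db["B"]          # read(B)
    b = b + temp
    db["B"] = b          # write(B)


def run_serial(order):
    """Run the given transactions one after another on fresh balances."""
    db = {"A": 1000, "B": 2000}
    for transaction in order:
        transaction(db)
    return db


t1_then_t2 = run_serial([t1, t2])
t2_then_t1 = run_serial([t2, t1])
print(t1_then_t2, t2_then_t1)
```

Running T1 before T2 gives A = 855 and B = 2145, as stated above. Running them in the opposite order gives
different individual balances (A = 850, B = 2150), but in both serial orders the sum A + B remains $3000.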
Recovery system
An integral part of a database system is a recovery scheme that can restore the
database to the consistent state that existed before the failure. The recovery
scheme must also provide high availability; that is, it must minimize the time for
which the database is not usable after a failure.
Failure classification
There are various types of failure that may occur in a system, each of which
needs to be dealt with in a different manner.
The following are the types of failure:
1. Transaction failure: There are two types of error that may cause a
transaction to fail.
Logical error: The transaction can no longer continue with its normal
execution because of some internal condition, such as bad input, data not
found, overflow, or resource limit exceeded.
System error: The system has entered an undesirable state (for example,
deadlock), as a result of which a transaction cannot continue with its normal
execution. The transaction, however, can be re-executed.
2. System crash: There is a hardware malfunction, or a bug in the database
software or the operating system, that causes the loss of the content of
volatile storage and brings transaction processing to a halt. The content of
nonvolatile storage remains intact and is not corrupted.
The assumption that hardware errors and bugs in the software bring the system
to a halt, but do not corrupt the nonvolatile storage contents, is known as the
fail-stop assumption. Well-designed systems have numerous internal checks, at
the hardware and the software level, that bring the system to a halt when
there is an error. Hence the fail-stop assumption is a reasonable one.
3. Disk failure: A disk block loses its content as a result of either a head crash or
a failure during a data transfer operation. Copies of the data on other disks, or
archival backups on tertiary media such as DVDs or tapes, are used to recover
from the failure.
To determine how the system should recover from failures, we need to identify the
failure modes of the devices used for storing data. Next, we must consider how
these failure modes affect the contents of the database. We can then propose
algorithms to ensure database consistency and transaction atomicity despite
failures. These algorithms, known as recovery algorithms, have two parts:
1. Actions taken during normal transaction processing to ensure that enough
information exists to allow recovery from failures.
2. Actions taken after a failure to recover the database contents to a state that
ensures database consistency, transaction atomicity, and durability.
Storage
Various data items in the database may be stored and accessed in a number of
different storage media.
Storage media can be distinguished by their relative speed, capacity and resilience
to failure.
There are three categories of storage:
1. Volatile storage
2. Nonvolatile storage
3. Stable storage
Stable storage plays a critical role in recovery algorithms.
Stable storage implementation:
To implement stable storage, we need to replicate the needed information in
several nonvolatile storage media (usually disks) with independent failure modes,
and to update the information in a controlled manner to ensure that a failure
during data transfer does not damage the needed information. RAID
systems guarantee that the failure of a single disk will not result in loss of data.
The simplest and fastest form of RAID is the mirrored disk, which keeps two
copies of each block on separate disks.
RAID systems, however, cannot guard against data loss due to disasters such as
fires or flooding. Many systems store archival backups on tapes off site to guard
against such disasters. However, since tapes cannot be carried off site
continually, updates since the most recent time the tapes were carried off site
could be lost in such a disaster. More secure systems keep a copy of each block
of stable storage at a remote site, writing it out over a computer network, in
addition to storing the block on a local disk system. Since the blocks are
output to the remote system as and when they are output to local storage, once an
output operation is complete, the output is not lost, even in the event of a
disaster such as a fire or flood.
Block transfer between memory and disk storage can result in:
Successful completion: The transferred information arrived safely at its
destination.
Partial failure: A failure occurred in the midst of the transfer, and the
destination block has incorrect information.
Total failure: The failure occurred sufficiently early during the transfer
that the destination block remains intact.
If a data transfer failure occurs, the system detects it and invokes a recovery
procedure to restore the block to a consistent state. To do so, the system must
maintain two physical blocks for each logical database block. In the case of
mirrored disks, both blocks are at the same location; in the case of remote
backup, one of the blocks is local, whereas the other is at a remote site. An
output operation is executed as follows:
1. Write the information onto the first physical block.
2. When the first write completes successfully, write the same information onto
the second physical block.
3. The output is completed only after the second write completes successfully.
If the system fails while blocks are being written, it is possible that the two copies
of a block are inconsistent with each other. During recovery, for each block, the
system would need to examine the two copies of the block. If both are the same and
no detectable error exists, then no further actions are necessary.
If the system detects an error in one block, then it replaces its content with the
content of the other block. If both blocks contain no detectable error, but they
differ in content, then the system replaces the content of the first block with the
value of the second. This recovery procedure ensures that a write to stable storage
either succeeds completely or results in no change.
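The output and recovery procedure above can be sketched as follows. This is an illustrative model, not a
real storage implementation: the two physical blocks are held in memory, a CRC-32 checksum stands in for
whatever error detection the hardware provides, and the crash parameter is a hypothetical hook used only
to simulate a failure at a chosen point.

```python
import zlib


def _pack(data):
    """Store data with its checksum so that a damaged copy can be detected."""
    return (data, zlib.crc32(data))


def _torn(data):
    """Simulate a partial failure: half the data, with a checksum that cannot match."""
    half = data[: len(data) // 2]
    return (half, zlib.crc32(half) ^ 1)


def _is_valid(copy):
    data, checksum = copy
    return zlib.crc32(data) == checksum


class StableBlock:
    """One logical block kept as two physical copies."""

    def __init__(self, data):
        self.copies = [_pack(data), _pack(data)]

    def output(self, data, crash=None):
        if crash == "during_first":
            self.copies[0] = _torn(data)
            return
        self.copies[0] = _pack(data)          # step 1: write the first block
        if crash == "after_first":
            return
        if crash == "during_second":
            self.copies[1] = _torn(data)
            return
        self.copies[1] = _pack(data)          # step 2: write the second block

    def recover(self):
        first_ok = _is_valid(self.copies[0])
        second_ok = _is_valid(self.copies[1])
        if first_ok and not second_ok:
            self.copies[1] = self.copies[0]   # repair the damaged second copy
        elif second_ok and not first_ok:
            self.copies[0] = self.copies[1]   # repair the damaged first copy
        elif first_ok and second_ok and self.copies[0] != self.copies[1]:
            self.copies[0] = self.copies[1]   # both valid but different: second wins

    def read(self):
        return self.copies[0][0]


def run(crash):
    block = StableBlock(b"A=1000")
    block.output(b"A=950", crash=crash)
    block.recover()
    return block.read()


results = {c: run(c) for c in (None, "during_first", "after_first", "during_second")}
print(results)
```

In every simulated case the block ends up holding either the complete new value or the complete old value,
which is the all-or-nothing behaviour that a write to stable storage is meant to provide.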
The requirement of comparing every corresponding pair of blocks during recovery
is expensive to meet. We can reduce the cost greatly by keeping track of block
writes that are in progress, using a small amount of nonvolatile RAM. On
recovery, only blocks for which writes were in progress need to be compared.
We can extend this procedure easily to allow the use of an arbitrarily large
number of copies of each block of stable storage. Although a large number of
copies reduces the probability of a failure to even lower than two copies do, it is
usually reasonable to simulate stable storage with only two copies.
Data access
The database system resides permanently on nonvolatile storage, and only parts of the
database are in memory at any time. The database is partitioned into fixed-length
storage units called blocks. Blocks are the units of data transfer to and from disk
and may contain several data items. We shall assume that no data item spans two
or more blocks; this assumption is realistic for most data processing applications,
such as a bank or a university.
Transactions input information from the disk to main memory, and then output the
information back onto the disk. The input and output operations are done in block
units. The blocks residing on the disk are referred to as physical blocks; the blocks
residing temporarily in main memory are referred to as buffer blocks. The area of
memory where blocks reside temporarily is called the disk buffer.
Block movements between disk and main memory are initiated through the
following two operations:
1. input(B) transfers the physical block B to main memory.
2. output(B) transfers the buffer block B to the disk, and replaces the
appropriate physical block there.
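The two operations can be modelled with a small sketch. Here the disk and the disk buffer are both plain
dictionaries, and the names DiskBuffer, input_block and output_block are assumptions made for this
illustration (the method names avoid clashing with Python's built-in input).

```python
class DiskBuffer:
    """Toy model: physical blocks live on 'disk', buffer blocks in main memory."""

    def __init__(self, disk):
        self.disk = disk      # block id -> dict of data items (physical blocks)
        self.buffer = {}      # block id -> dict of data items (buffer blocks)

    def input_block(self, block_id):
        """input(B): copy physical block B into the disk buffer."""
        self.buffer[block_id] = dict(self.disk[block_id])

    def output_block(self, block_id):
        """output(B): copy buffer block B to disk, replacing the physical block."""
        self.disk[block_id] = dict(self.buffer[block_id])


# One block holding two data items, as in the banking example.
disk = {"B1": {"A": 1000, "B": 2000}}
buf = DiskBuffer(disk)

buf.input_block("B1")
buf.buffer["B1"]["A"] = 950            # update happens in the buffer block only
on_disk_before_output = disk["B1"]["A"]

buf.output_block("B1")
on_disk_after_output = disk["B1"]["A"]
print(on_disk_before_output, on_disk_after_output)
```

Until output(B) is executed, the change exists only in main memory, which is why a crash at that point
would lose it.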
Conceptually each transaction Ti has a private work area in which copies of
data items accessed and updated by Ti are kept. The system creates this
work area when the transaction is initiated, the system removes it when the
transaction.