Chapter 3
Transaction Processing
Concepts
Chapter Contents
Introduction
Transaction System Concepts
Properties of Transaction
Schedules and Recoverability
Serializability of Schedules
Transaction Support in SQL
Introduction
• Single-user System:
• At most one user at a time can use the database
management system.
• E.g. Personal computer system
• Multi-user System:
• Many users can access the DBMS concurrently.
• E.g. Airline reservation systems, banking systems, and the like
are operated by many users who submit transactions
concurrently to the system.
• This is achieved by multiprogramming, which allows the
computer to execute multiple programs/processes at the
same time.
Transaction System Concepts
A transaction is a unit of program execution that accesses and
possibly updates various data items.
A transaction must see a consistent database.
During transaction execution the database may be temporarily
inconsistent.
When the transaction completes successfully (is committed), the
database must be consistent.
After a transaction commits, the changes it has made to the
database persist, even if there are system failures.
Multiple transactions can execute in parallel.
Two main issues to deal with:
Failures of various kinds, such as hardware failures and system
crashes
Concurrent execution of multiple transactions
• Transaction
• Logical unit of database processing that includes one
or more access operations (read: retrieval; write:
insert, update, or delete).
• A transaction (set of operations) may be stand-alone,
specified in a high-level language such as SQL and submitted
interactively, or may be embedded within an
application program.
• Transaction boundaries:
• Any single transaction in an application program is
bounded with Begin and End statements.
• Application program may contain several transactions
separated by the Begin and End transaction boundaries.
SIMPLE MODEL OF A DATABASE (for purposes of discussing
transactions):
• A database is a collection of named data items
• Granularity of data - a field, a record , or a whole disk block
(Concepts are independent of granularity)
• Basic operations are read and write
• read_item(X): Reads a database item named X into a
program variable. To simplify our notation, we assume
that the program variable is also named X.
• write_item(X): Writes the value of program variable X
into the database item named X.
Basic Operations of Transaction Processing
READ AND WRITE OPERATIONS:
• Basic unit of data transfer from the disk to the computer
main memory is one block.
• In general, a data item (what is read or written) will be the
field of some record in the database, although it may be a
larger unit such as a record or even a whole block.
• read_item(X) command includes the following steps:
• Find the address of the disk block that contains item X.
• Copy that disk block into a buffer in main memory (if that disk
block is not already in some main memory buffer).
• Copy item X from the buffer to the program variable named X.
• write_item(X) command includes the following steps:
• Find the address of the disk block that contains item X.
• Copy that disk block into a buffer in main memory (if that
disk block is not already in some main memory buffer).
• Copy item X from the program variable named X into its
correct location in the buffer.
• Store the updated block from the buffer back to disk
(either immediately or at some later point in time).
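The read_item and write_item steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name SimpleDB, the block size, and the explicit flush() call are hypothetical and stand in for whatever buffer manager a real DBMS uses.

```python
BLOCK_SIZE = 4  # items per disk block (hypothetical value)


class SimpleDB:
    """Toy model: a 'disk' of blocks plus main-memory buffers."""

    def __init__(self, items):
        self.disk = {}       # block number -> {item name: value}
        self.block_of = {}   # item name -> block number (the "address")
        for i, name in enumerate(sorted(items)):
            block = i // BLOCK_SIZE
            self.disk.setdefault(block, {})[name] = items[name]
            self.block_of[name] = block
        self.buffers = {}    # block number -> in-memory copy of the block
        self.dirty = set()   # buffered blocks not yet written back

    def _fetch(self, name):
        block = self.block_of[name]            # find the block address
        if block not in self.buffers:          # copy block to a buffer if absent
            self.buffers[block] = dict(self.disk[block])
        return block

    def read_item(self, name):
        block = self._fetch(name)
        return self.buffers[block][name]       # copy item to program variable

    def write_item(self, name, value):
        block = self._fetch(name)
        self.buffers[block][name] = value      # copy variable into the buffer
        self.dirty.add(block)                  # store to disk at a later point

    def flush(self):
        """Store every updated block from the buffer back to disk."""
        for block in self.dirty:
            self.disk[block] = dict(self.buffers[block])
        self.dirty.clear()
```

Until flush() runs, the updated value exists only in the buffer, which is why a crash at that point can lose the update.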
• Two sample transactions
(a) Transaction T1
(b) Transaction T2
Properties of Transactions
ACID Properties: provide a mechanism to ensure the correctness and
consistency of a database, such that each transaction is a group
of operations that acts as an atomic unit, produces consistent results, acts in
isolation from other operations, and makes updates that are durably
stored.
• Atomicity: A transaction is an atomic unit of processing; it is either
performed entirely or not performed at all.
• Consistency preservation: A correct execution of the transaction must
take the database from one consistent state to another.
• Isolation: A transaction should not make its updates visible to other
transactions until it is committed. When enforced strictly, this solves the
temporary update problem and makes cascading rollbacks of
transactions unnecessary.
• Durability or permanency: Once a transaction changes the database
and the changes are committed, these changes must never be lost
because of subsequent failure.
Atomicity
• By this, we mean that either the entire transaction takes place
at once or doesn’t happen at all.
• There is no midway; i.e. transactions do not occur partially.
• Each transaction is considered as one unit and either runs to
completion or is not executed at all.
• It involves the following two operations.
—Abort: If a transaction aborts, changes made to database are
not visible.
—Commit: If a transaction commits, changes made are
visible.
• Atomicity is also known as the ‘All or nothing rule’.
Consistency
• This means that integrity constraints must be maintained so that the
database is consistent before and after the transaction.
• It refers to the correctness of a database.
• Referring to the example above, the total amount before and after
the transaction must be the same.
• Total before T occurs = 500 + 200 = 700.
• Total after T occurs = 400 + 300 = 700.
• Therefore, the database is consistent.
• Inconsistency occurs in case T1 completes but T2 fails. As a result, T
is incomplete.
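The totals above (500 + 200 = 700 before, 400 + 300 = 700 after) can be checked with a small sketch using Python's built-in sqlite3 module. The table name account, the account names X and Y, and the transfer amount of 100 are assumptions made for illustration; the point is only that a failed transfer is rolled back as a whole, so the total stays at 700.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one all-or-nothing transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
            (balance,) = conn.execute(
                "SELECT balance FROM account WHERE name = ?", (src,)
            ).fetchone()
            if balance < 0:
                raise ValueError("insufficient balance")
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
        return True
    except ValueError:
        return False


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
    conn.executemany("INSERT INTO account VALUES (?, ?)", [("X", 500), ("Y", 200)])
    conn.commit()
    ok = transfer(conn, "X", "Y", 100)       # succeeds: 400 and 300
    failed = transfer(conn, "X", "Y", 1000)  # debit is undone by the rollback
    balances = dict(conn.execute("SELECT name, balance FROM account"))
    conn.close()
    return ok, failed, balances
```

After demo(), the first transfer has committed and the second has left no trace, so the sum of the two balances is unchanged.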
Isolation
• This property ensures that multiple transactions can
occur concurrently without leading to the
inconsistency of database state.
• Transactions occur independently without interference.
• Changes occurring in a particular transaction will not
be visible to any other transaction until that particular
change in that transaction is written to memory or has
been committed.
• This property ensures that the concurrent execution of
transactions results in a state that is equivalent to a
state achieved if they were executed serially in some
order.
Isolation continued..
• Let X = 500, Y = 500.
• Consider two transactions T and T''.
• Suppose T has been executed till Read(Y) and then T'' starts.
• As a result, interleaving of operations takes place, due to which T'' reads
the correct value of X but an incorrect value of Y, and the sum computed by
T'': (X + Y = 50,000 + 500 = 50,500)
is thus not consistent with the sum at the end of transaction
T: (X + Y = 50,000 + 450 = 50,450).
• This results in database inconsistency, due to a loss of 50 units.
• Hence, transactions must take place in isolation and changes should be
visible only after they have been made to the main memory.
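The interleaving described above can be replayed step by step. The sketch below assumes, based only on the values quoted (X going from 500 to 50,000 and Y from 500 to 450), that T multiplies X by 100 and subtracts 50 from Y, and that T'' reads X and Y and adds them; the function and variable names are hypothetical.

```python
def interleaved():
    """T runs up to Read(Y), then T'' runs completely, then T finishes."""
    db = {"X": 500, "Y": 500}

    # T: Read(X); X := X * 100; Write(X); Read(Y)
    t_x = db["X"]
    t_x = t_x * 100
    db["X"] = t_x
    t_y = db["Y"]

    # T'': Read(X); Read(Y); Z := X + Y
    seen_sum = db["X"] + db["Y"]

    # T resumes: Y := Y - 50; Write(Y)
    t_y = t_y - 50
    db["Y"] = t_y

    final_sum = db["X"] + db["Y"]
    return seen_sum, final_sum


def serial():
    """T runs to completion before T'' starts."""
    db = {"X": 500, "Y": 500}
    db["X"] = db["X"] * 100
    db["Y"] = db["Y"] - 50
    seen_sum = db["X"] + db["Y"]   # T'' now sees only the finished state
    return seen_sum, db["X"] + db["Y"]
```

In the interleaved run T'' computes a sum that no serial order would produce, while in the serial run the two sums agree.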
Durability
• This property ensures that once the transaction has
completed execution, the updates and
modifications to the database are stored in and
written to disk and they persist even if a system
failure occurs.
• These updates now become permanent and are
stored in non-volatile memory.
• The effects of the transaction, thus, are never lost.
What causes a Transaction to fail ?
1. Computer failure (system crash):
• A hardware or software error occurs in the computer system during
transaction execution.
• If the hardware crashes, the contents of the computer’s internal memory
may be lost.
2. Transaction or system error:
• Some operation in the transaction may cause it to fail, such as integer
overflow or division by zero.
• Transaction failure may also occur because of erroneous parameter values or
because of a logical programming error.
• In addition, the user may interrupt the transaction during its execution.
3. Exception conditions detected by the transaction:
• Certain conditions force cancellation of the transaction.
• For example, data for the transaction may not be found, or an
insufficient account balance in a banking database may cause a
transaction, such as a fund withdrawal from that account, to be canceled.
4. Concurrency control enforcement:
• The concurrency control method may decide to abort
the transaction, to be restarted later, because it
violates serializability or because several transactions
are in a state of deadlock.
5. Disk failure:
• Some disk blocks may lose their data because of a read
or write malfunction or because of a disk read/write
head crash. This may happen during a read or a write
operation of the transaction.
6. Physical problems and catastrophes:
• This refers to an endless list of problems that includes
power or air-conditioning failure, fire, theft,
overwriting disks or tapes by mistake.
Recovery manager keeps track of the
following operations
• begin_transaction: This marks the beginning of transaction
execution.
• read or write: These specify read or write operations on the
database items that are executed as part of a transaction.
• end_transaction: This specifies that read and write transaction
operations have ended and marks the end limit of transaction
execution.
• Commit_transaction: This signals a successful end of the
transaction so that any changes (updates) executed by the
transaction can be safely committed to the database and will not
be undone.
• Rollback (abort): This signals that the transaction has ended
unsuccessfully, so that any changes or effects that the transaction
may have applied to the database must be undone.
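One way to picture how these operations are tracked is a log that records the old value of every item a transaction writes, so that a rollback can undo the writes in reverse order. The sketch below is a simplified illustration, not a full recovery algorithm; the class name RecoveryLog and the record layout are assumptions, and every item is assumed to exist in the database before it is written.

```python
class RecoveryLog:
    """Toy recovery manager: logs operations and undoes writes on rollback."""

    def __init__(self, db):
        self.db = db     # dict of item name -> value
        self.log = []    # chronological list of log records

    def begin_transaction(self, txn):
        self.log.append(("begin", txn))

    def write(self, txn, item, new_value):
        # Record the old value before overwriting it.
        self.log.append(("write", txn, item, self.db[item], new_value))
        self.db[item] = new_value

    def commit_transaction(self, txn):
        self.log.append(("commit", txn))

    def rollback(self, txn):
        # Undo this transaction's writes, most recent first.
        for record in reversed(self.log):
            if record[0] == "write" and record[1] == txn:
                _, _, item, old_value, _ = record
                self.db[item] = old_value
        self.log.append(("abort", txn))
```

A committed transaction is never touched by rollback() of another transaction, because only records carrying the aborted transaction's name are undone.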
Transaction States
• A transaction is an atomic unit of work that is either completed in
its entirety or not done at all.
• For recovery purposes, the system needs to keep track of when
the transaction starts, terminates, and commits or aborts.
• Transaction states:
• Active state: the transaction has begun execution
• Partially committed state: the read/write operations have
ended, but permanent modification of the database is not yet
ensured
• Committed state: all the changes made to the database by
the transaction have been recorded persistently
• Failed state: entered when a transaction is aborted during its
active state or when one of the checks fails
• Terminated state: corresponds to the transaction leaving the
system
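The states listed above can be written down as a small state machine. The allowed transitions below follow the usual textbook state diagram (active to partially committed or failed, partially committed to committed or failed, committed or failed to terminated) and are an assumption, since the list itself names only the states; the class names are hypothetical.

```python
from enum import Enum, auto


class State(Enum):
    ACTIVE = auto()
    PARTIALLY_COMMITTED = auto()
    COMMITTED = auto()
    FAILED = auto()
    TERMINATED = auto()


# Assumed transitions, taken from the common textbook diagram.
ALLOWED = {
    State.ACTIVE: {State.PARTIALLY_COMMITTED, State.FAILED},
    State.PARTIALLY_COMMITTED: {State.COMMITTED, State.FAILED},
    State.COMMITTED: {State.TERMINATED},
    State.FAILED: {State.TERMINATED},
    State.TERMINATED: set(),
}


class Transaction:
    def __init__(self):
        self.state = State.ACTIVE  # a transaction starts in the active state

    def move_to(self, new_state):
        """Change state, rejecting transitions the diagram does not allow."""
        if new_state not in ALLOWED[self.state]:
            raise ValueError(f"{self.state.name} -> {new_state.name} not allowed")
        self.state = new_state
```

For example, a transaction cannot go straight from ACTIVE to COMMITTED; it has to pass through PARTIALLY_COMMITTED first.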
Schedules
Schedule: a sequence of instructions that specifies the
chronological order in which instructions of concurrent transactions
are executed.
• A schedule for a set of transactions must consist of all
instructions of those transactions.
• It must preserve the order in which the instructions appear in
each individual transaction.
A transaction that successfully completes its execution will have
a commit instruction as its last statement (omitted if it is
obvious).
A transaction that fails to complete its execution successfully will
have an abort instruction as its last statement (omitted if it is
obvious).
Schedule 1
Let T1 transfer $50 from A to B, and T2 transfer 10% of the balance
from A to B.
A serial schedule in which T1 is followed by T2:
Schedule 2
• A serial schedule where T2 is followed by T1
Schedule 3
Let T1 and T2 be the transactions defined previously. The following
schedule is not a serial schedule, but it is equivalent to Schedule 1.
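The effect of the two serial orders and of an interleaved schedule can be compared with a short simulation. The starting balances (A = 1000, B = 2000) and the particular interleaving used here are assumptions for illustration: each transaction is split into its work on A and its work on B, and the interleaved schedule alternates between T1 and T2.

```python
def t1_on_a(db, local):
    db["A"] = db["A"] - 50                 # T1: read(A); A := A - 50; write(A)


def t1_on_b(db, local):
    db["B"] = db["B"] + 50                 # T1: read(B); B := B + 50; write(B)


def t2_on_a(db, local):
    local["temp"] = db["A"] * 10 // 100    # T2: read(A); temp := 10% of A
    db["A"] = db["A"] - local["temp"]      # A := A - temp; write(A)


def t2_on_b(db, local):
    db["B"] = db["B"] + local["temp"]      # T2: read(B); B := B + temp; write(B)


def run(schedule, a=1000, b=2000):
    """Execute the steps in the given chronological order."""
    db = {"A": a, "B": b}
    local = {}                             # T2's local variable temp
    for step in schedule:
        step(db, local)
    return db


SERIAL_T1_T2 = [t1_on_a, t1_on_b, t2_on_a, t2_on_b]
SERIAL_T2_T1 = [t2_on_a, t2_on_b, t1_on_a, t1_on_b]
INTERLEAVED = [t1_on_a, t2_on_a, t1_on_b, t2_on_b]
```

Under these assumptions the interleaved schedule ends in the same state as the serial schedule with T1 first, and all three schedules preserve the sum A + B.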
Conflict Serializability (Cont.)
Example of a schedule that is not conflict serializable:
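Example Schedule (Schedule A) + Precedence Graph
[Figure: Schedule A over transactions T1, T2, T3, T4, T5, shown with its
precedence graph. Operations in chronological order: read(X), read(Y),
read(Z), read(V), read(W), read(W), read(Y), write(Y), write(Z), read(U),
read(Y), write(Y), read(Z), write(Z), read(U), write(U)]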
Test for Conflict Serializability
A schedule is conflict serializable if and only if its precedence graph
is acyclic.
Cycle-detection algorithms exist which take order n^2 time, where n is
the number of vertices in the graph. (Better algorithms take order n + e,
where e is the number of edges.)
If the precedence graph is acyclic, the serializability order can be
obtained by a topological sorting of the graph. This is a linear order
consistent with the partial order of the graph.
For example, a serializability order for Schedule A would be
T5 → T1 → T3 → T2 → T4
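The test can be sketched with Python's standard graphlib module (Python 3.9 or later). The representation of a schedule as (transaction, operation, item) tuples and the function names are assumptions for illustration; an edge Ti → Tj is added whenever an operation of Ti precedes a conflicting operation of Tj, that is, both access the same item and at least one of them is a write.

```python
from graphlib import CycleError, TopologicalSorter


def precedence_graph(schedule):
    """Return {txn: set of txns that must precede it} for a schedule.

    `schedule` is a chronological list of (txn, op, item) tuples,
    where op is "read" or "write".
    """
    preds = {txn: set() for txn, _, _ in schedule}
    for i, (ti, op_i, item_i) in enumerate(schedule):
        for tj, op_j, item_j in schedule[i + 1:]:
            conflict = item_i == item_j and "write" in (op_i, op_j)
            if ti != tj and conflict:
                preds[tj].add(ti)          # edge ti -> tj
    return preds


def serial_order(schedule):
    """Return an equivalent serial order, or None if the graph has a cycle."""
    graph = precedence_graph(schedule)
    try:
        return list(TopologicalSorter(graph).static_order())
    except CycleError:
        return None
```

For instance, the schedule read(A) by T1, write(A) by T2, write(A) by T1 yields edges T1 → T2 and T2 → T1, so serial_order returns None; if T1 finishes with A before T2 touches it, the only edge is T1 → T2 and the order T1, T2 is returned.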
Test for View Serializability
The precedence graph test for conflict serializability
cannot be used directly to test for view serializability.
An extension of the test to view serializability has cost
exponential in the size of the precedence graph.
The problem of checking whether a schedule is view
serializable falls in the class of NP-complete problems.
Thus the existence of an efficient algorithm is
extremely unlikely.
However, practical algorithms that just check some
sufficient conditions for view serializability can
still be used.
Recoverable Schedules
• Need to address the effect of transaction failures on
concurrently running transactions.
Recoverable schedule — if a transaction T reads a data
j