Transaction Management
Transaction Concepts
Introduction
Concurrent Executions
Serializability
Recoverability
Concurrency Control
Lock-Based Protocols
Timestamp-Based Protocols
Deadlock Handling
Recovery
Failure Classification
Log-Based Recovery
Transaction Concept
A transaction is the execution of a program
unit that accesses and/or updates data in a
database.
Before the transaction is executed, the
database must be consistent.
During transaction execution the database
may be inconsistent.
When the transaction is committed, the
database must be consistent.
Two main issues to deal with:
Concurrent execution of multiple transactions
Recovery from failures of various kinds, such
as hardware failures and system crashes
ACID Properties
To preserve integrity of data, the database system must ensure:
Atomicity. Either all operations of the
transaction are executed or none of them is.
Consistency. Execution of a transaction
preserves the consistency of the database.
Isolation. A transaction should not make its
updates visible to other transactions until the
transaction is committed.
Durability. After a transaction completes
successfully, the changes it has made to the
database persist, even if there are system
failures.
Example of Fund Transfer
Transaction to transfer $50 from account A to
account B:
1. read(A)
2. A := A - 50
3. write(A)
4. read(B)
5. B := B + 50
6. write(B)
Consistency requirement: the sum of A and B is
unchanged by the execution of the transaction.
Atomicity requirement: if the transaction fails
after step 3 and before step 6, the system should
ensure that its updates are not reflected in the
database; otherwise an inconsistency will result.
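The atomicity requirement can be illustrated with a minimal sketch. It assumes an in-memory dictionary standing in for the database and a hypothetical `fail_after_step` parameter that simulates a failure; neither comes from the slides, and a real system would use a log rather than a full snapshot.

```python
def transfer(db, amount, fail_after_step=None):
    """Move `amount` from account A to account B, all or nothing."""
    snapshot = dict(db)          # old values, kept so partial updates can be undone
    try:
        a = db["A"]              # 1. read(A)
        a = a - amount           # 2. A := A - amount
        db["A"] = a              # 3. write(A)
        if fail_after_step == 3:
            raise RuntimeError("simulated failure after step 3")
        b = db["B"]              # 4. read(B)
        b = b + amount           # 5. B := B + amount
        db["B"] = b              # 6. write(B)
    except RuntimeError:
        db.clear()               # undo: restore the state before the transaction
        db.update(snapshot)
        return False
    return True


db = {"A": 100, "B": 200}
failed = transfer(db, 50, fail_after_step=3)   # partial update is rolled back
after_failure = dict(db)
succeeded = transfer(db, 50)                   # full transfer is applied
```

In both cases the sum of A and B is the same before and after the call, which is the consistency requirement stated above.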
Example of Fund Transfer (Cont.)
[start_transaction, T]
[write_item, T, x, old_value, new_value]
[read_item, T, x]
[commit, T], [abort, T]
Commit point: reached when the effect of all operations of the
transaction has been recorded in the log.
Checkpoint: a record written into the log periodically, when
the system writes out to the DB all write operations
of committed transactions.
Conflict Serializability
Instructions li and lj of transactions Ti and Tj
respectively, conflict if and only if there exists some
item Q accessed by both li and lj, and at least one of
these instructions wrote Q.
1. li = read(Q), lj = read(Q). li and lj don’t conflict.
2. li = read(Q), lj = write(Q). They conflict.
3. li = write(Q), lj = read(Q). They conflict.
4. li = write(Q), lj = write(Q). They conflict.
Intuitively, a conflict between li and lj forces a
(logical) temporal order between them. If li and lj
are consecutive in a schedule and they do not
conflict, their results would remain the same even if
they had been interchanged in the schedule.
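The four cases above reduce to a single predicate. The sketch below assumes each instruction is represented as a hypothetical `(transaction, action, item)` tuple; the representation is illustrative only.

```python
def conflicts(op_i, op_j):
    """True if two instructions of different transactions conflict."""
    txn_i, action_i, item_i = op_i
    txn_j, action_j, item_j = op_j
    return (
        txn_i != txn_j                         # instructions of Ti and Tj
        and item_i == item_j                   # both access the same item Q
        and "write" in (action_i, action_j)    # at least one of them wrote Q
    )


case1 = conflicts(("Ti", "read", "Q"), ("Tj", "read", "Q"))
case2 = conflicts(("Ti", "read", "Q"), ("Tj", "write", "Q"))
case3 = conflicts(("Ti", "write", "Q"), ("Tj", "read", "Q"))
case4 = conflicts(("Ti", "write", "Q"), ("Tj", "write", "Q"))
```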
Conflict Serializability (Cont.)
If a schedule S can be transformed into a schedule S´
by a series of swaps of non-conflicting instructions,
we say that S and S´ are conflict equivalent.
We say that a schedule S is conflict serializable if it is
conflict equivalent to a serial schedule.
Example of a schedule that is not conflict serializable:
T3          T4
read(Q)
            write(Q)
write(Q)
Schedule: r1(A)w1(A)r2(A)c2r1(A)a1
If T8 should abort, T9 would have read (and possibly
shown to the user) an inconsistent database state.
Hence the database must ensure that schedules are
recoverable.
Recoverability (Cont.)
Schedule: r1(A)w1(A)r2(A)w2(A)c2a1
r1(A)w1(A)r2(A)w2(A)c1c2
r1(A)w1(A)c1r2(A)w2(A)c2
It is desirable to restrict the schedules to those that
are cascadeless.
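The schedules listed here can be checked mechanically. The sketch below assumes the usual textbook definitions, which these slides do not spell out: a schedule is recoverable if every transaction commits only after all transactions it read from have committed, and cascadeless if transactions read only values written by already committed transactions. The `parse` and `classify` names are illustrative.

```python
import re


def parse(schedule):
    """Turn 'r1(A)w1(A)c1' into [('r', '1', 'A'), ('w', '1', 'A'), ('c', '1', None)]."""
    ops = []
    for kind, txn, item in re.findall(r"([rwca])(\d+)(?:\((\w+)\))?", schedule):
        ops.append((kind, txn, item or None))
    return ops


def classify(schedule):
    last_writer = {}     # item -> transaction that wrote it most recently
    committed = set()
    reads_from = []      # (reader, writer) pairs
    dirty_read = False
    recoverable = True
    for kind, txn, item in parse(schedule):
        if kind == "w":
            last_writer[item] = txn
        elif kind == "r":
            writer = last_writer.get(item)
            if writer is not None and writer != txn:
                reads_from.append((txn, writer))
                if writer not in committed:
                    dirty_read = True          # read of uncommitted data
        elif kind == "c":
            # Committing before a transaction we read from violates recoverability.
            if any(r == txn and w not in committed for r, w in reads_from):
                recoverable = False
            committed.add(txn)
        # Simplification: an abort ('a') does not restore earlier values here.
    if not recoverable:
        return "not recoverable"
    return "recoverable" if dirty_read else "cascadeless"
```

Applied to the schedules on these slides, the two schedules in which T2 commits before T1 aborts come out as not recoverable, the one ending in c1c2 as recoverable but not cascadeless, and the one in which T1 commits before T2 reads as cascadeless.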
Implementation of Isolation
Schedules must be conflict or view serializable,
and recoverable, for the sake of database
consistency, and preferably cascadeless.
A policy in which only one transaction can
execute at a time generates serial schedules, but
provides a poor degree of concurrency.
Concurrency-control schemes trade off
the amount of concurrency they allow against the
amount of overhead that they incur.
Some schemes allow only conflict-serializable
schedules to be generated, while others allow
view-serializable schedules that are not conflict-
serializable.
Transaction Definition in SQL
Data manipulation language must include a
construct for specifying the set of actions that
comprise a transaction.
In SQL, a transaction begins implicitly.
A transaction in SQL ends by:
Commit work commits current transaction and
begins a new one.
Rollback work causes current transaction to
abort.
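As a rough illustration of implicit begin, commit, and rollback, the sketch below uses Python's standard `sqlite3` module, whose default mode opens a transaction implicitly before data-modifying statements. The `account` table, the `transfer` and `balances` helpers, and the insufficient-funds rule are assumptions made for the example.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Transfer funds; commit on success, roll back if the balance goes negative."""
    try:
        cur = conn.cursor()
        # The first UPDATE implicitly begins a transaction.
        cur.execute("UPDATE account SET balance = balance - ? WHERE name = ?",
                    (amount, src))
        (balance,) = cur.execute("SELECT balance FROM account WHERE name = ?",
                                 (src,)).fetchone()
        if balance < 0:
            raise ValueError("insufficient funds")
        cur.execute("UPDATE account SET balance = balance + ? WHERE name = ?",
                    (amount, dst))
        conn.commit()        # ends the transaction, making the updates permanent
        return True
    except ValueError:
        conn.rollback()      # aborts the transaction, undoing the first UPDATE
        return False


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM account"))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 200)])
conn.commit()
```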
Testing for Serializability
Consider some schedule of a set of transactions
T1, T2, ..., Tn
Precedence graph: a directed graph where the
vertices are the transactions (names).
We draw an arc from Ti to Tj if the two
transactions conflict, and Ti accessed the data
item on which the conflict arose earlier.
We may label the arc by the item that was
accessed.
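The construction just described can be sketched as follows. The schedule representation, a list of hypothetical `(transaction, action, item)` tuples in execution order, and the three-step example schedule are assumptions for illustration.

```python
def precedence_graph(schedule):
    """Return {(Ti, Tj): items} with an arc Ti -> Tj labelled by the conflicting items."""
    arcs = {}
    for i, (ti, action_i, item_i) in enumerate(schedule):
        for tj, action_j, item_j in schedule[i + 1:]:
            same_item = item_i == item_j
            one_writes = "write" in (action_i, action_j)
            if ti != tj and same_item and one_writes:
                # Ti accessed the item earlier, so the arc goes from Ti to Tj.
                arcs.setdefault((ti, tj), set()).add(item_i)
    return arcs


# Hypothetical schedule: T3 reads Q, T4 writes Q, then T3 writes Q.
schedule = [("T3", "read", "Q"), ("T4", "write", "Q"), ("T3", "write", "Q")]
graph = precedence_graph(schedule)
```

For this schedule the graph has arcs in both directions between T3 and T4, so it contains a cycle.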
Example 1
(figure not reproduced: arcs labeled x and y)
Example Schedule (Schedule A)
T1        T2        T3        T4        T5
          read(X)
read(Y)
read(Z)
                                        read(V)
                                        read(W)
                                        read(W)
          read(Y)
          write(Y)
                    write(Z)
read(U)
                              read(Y)
                              write(Y)
                              read(Z)
                              write(Z)
read(U)
write(U)
Precedence Graph for Schedule A
(figure not reproduced: graph over vertices T1, T2, T3, T4)
Test for Conflict Serializability
A schedule is conflict serializable if and only if its
precedence graph is acyclic.
Cycle-detection algorithms exist which take order
n^2 time, where n is the number of vertices in the
graph. (Better algorithms take order n + e, where e
is the number of edges.)
If the precedence graph is acyclic, the serializability
order can be obtained by a topological sorting of the
graph. This is a linear order consistent with the
partial order of the graph.
For example, a serializability order for Schedule A
would be
T5 → T1 → T3 → T2 → T4.
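The acyclicity test and the topological sort can be done in one pass of order n + e. The sketch below uses Kahn's algorithm, one of several suitable algorithms; the edge set in the example is hypothetical and is not read off the Schedule A figure.

```python
from collections import deque


def serial_order(vertices, edges):
    """Return a topological order of the graph, or None if it contains a cycle."""
    indegree = {v: 0 for v in vertices}
    successors = {v: [] for v in vertices}
    for u, v in edges:
        successors[u].append(v)
        indegree[v] += 1
    ready = deque(v for v in vertices if indegree[v] == 0)
    order = []
    while ready:
        u = ready.popleft()
        order.append(u)
        for v in successors[u]:
            indegree[v] -= 1
            if indegree[v] == 0:
                ready.append(v)
    # Vertices on a cycle never reach indegree 0, so they are missing from order.
    return order if len(order) == len(vertices) else None


vertices = ["T1", "T2", "T3", "T4", "T5"]
edges = [("T1", "T2"), ("T1", "T3"), ("T2", "T4"), ("T3", "T4")]  # assumed arcs
order = serial_order(vertices, edges)
cyclic = serial_order(["T3", "T4"], [("T3", "T4"), ("T4", "T3")])
```

A topological order is generally not unique, so the order returned here may differ from another valid serializability order for the same graph.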
Test for View Serializability
The precedence graph test for conflict
serializability must be modified to apply to a test
for view serializability.
The problem of checking if a schedule is view
serializable falls in the class of NP-complete
problems. Thus existence of an efficient algorithm
is unlikely.
However, practical algorithms that just check some
sufficient conditions for view serializability can
still be used.
Concurrency Control vs. Serializability Tests
– First Phase:
can acquire a lock-S on item
can acquire a lock-X on item
can convert a lock-S to a lock-X (upgrade)
– Second Phase:
can release a lock-S
can release a lock-X
can convert a lock-X to a lock-S (downgrade)
This protocol assures serializability, but it still relies
on the programmer to insert the various locking
instructions.
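Since the locking instructions are inserted by hand, it can help to check that a transaction's lock actions respect the two phases. The sketch below assumes a hypothetical list of `(action, item)` pairs with action names chosen for this example.

```python
FIRST_PHASE = {"lock-S", "lock-X", "upgrade"}          # acquire or upgrade
SECOND_PHASE = {"unlock-S", "unlock-X", "downgrade"}   # release or downgrade


def is_two_phase(actions):
    """True if no first-phase action follows a second-phase action."""
    in_second_phase = False
    for action, _item in actions:
        if action in SECOND_PHASE:
            in_second_phase = True
        elif action in FIRST_PHASE and in_second_phase:
            return False   # acquiring or upgrading after a release or downgrade
    return True


ok = is_two_phase([("lock-S", "A"), ("upgrade", "A"), ("lock-X", "B"),
                   ("downgrade", "A"), ("unlock-X", "B"), ("unlock-S", "A")])
bad = is_two_phase([("lock-S", "A"), ("unlock-S", "A"), ("lock-X", "B")])
```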
Implementation of Locking
A lock manager can be implemented as a separate
process to which transactions send lock and unlock
requests.
The lock manager replies to a lock request by
sending a lock grant message (or a message asking
the transaction to roll back, in case of a deadlock).
The requesting transaction waits until its request is
answered.
The lock manager maintains a data structure called a
lock table to record granted locks and pending
requests.
The lock table is usually implemented as an in-
memory hash table indexed on the name of the data
item being locked.
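A minimal, single-threaded sketch of such a lock table follows. The class and method names are hypothetical, the usual rule that only shared (S) locks are compatible with each other is assumed, and deadlock handling is left out.

```python
from collections import defaultdict, deque


class LockTable:
    def __init__(self):
        self.granted = defaultdict(dict)    # item name -> {transaction: mode}
        self.pending = defaultdict(deque)   # item name -> queue of (transaction, mode)

    def _compatible(self, item, txn, mode):
        others = [m for t, m in self.granted[item].items() if t != txn]
        return not others or (mode == "S" and all(m == "S" for m in others))

    def request(self, txn, item, mode):
        # Queue behind earlier waiters so that they are not starved.
        if not self.pending[item] and self._compatible(item, txn, mode):
            self.granted[item][txn] = mode
            return "granted"
        self.pending[item].append((txn, mode))
        return "waiting"

    def release(self, txn, item):
        """Release a lock and return the transactions whose requests are now granted."""
        self.granted[item].pop(txn, None)
        woken = []
        while self.pending[item]:
            waiter, mode = self.pending[item][0]
            if not self._compatible(item, waiter, mode):
                break
            self.pending[item].popleft()
            self.granted[item][waiter] = mode
            woken.append(waiter)
        return woken


table = LockTable()
replies = [table.request("T1", "Q", "S"), table.request("T2", "Q", "S"),
           table.request("T3", "Q", "X"), table.request("T4", "Q", "S")]
```

Python dictionaries are hash tables, so indexing `granted` and `pending` by item name mirrors the in-memory hash table described above.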
Timestamp-Based Protocols
Each transaction is issued a timestamp when it enters the
system. If an old transaction Ti has timestamp TS(Ti), a new
transaction Tj is assigned timestamp TS(Tj) such that
TS(Ti) < TS(Tj).
The protocol manages concurrent execution such that the
timestamps determine the serializability order.
In order to assure such behavior, the protocol maintains for
each data item Q two timestamp values:
W-timestamp(Q) is the largest timestamp of any transaction that
executed write(Q) successfully.
R-timestamp(Q) is the largest timestamp of any transaction that
executed read(Q) successfully.
Timestamp-Based Protocols (Cont.)
The timestamp ordering protocol ensures that any conflicting
read and write operations are executed in timestamp order.
Suppose a transaction Ti issues a read(Q)
(figure not reproduced: transaction with smaller timestamp, transaction with larger timestamp)
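The read and write tests are not reproduced on these slides. The sketch below assumes the standard textbook rules: a read is rejected if a younger transaction has already written Q, and a write is rejected if a younger transaction has already read or written Q. The class name and the "ok" and "rollback" return values are illustrative.

```python
class TimestampOrdering:
    def __init__(self):
        self.r_timestamp = {}   # Q -> largest timestamp that read Q successfully
        self.w_timestamp = {}   # Q -> largest timestamp that wrote Q successfully

    def read(self, ts, q):
        if ts < self.w_timestamp.get(q, 0):
            return "rollback"   # Ti would read a value that was already overwritten
        self.r_timestamp[q] = max(self.r_timestamp.get(q, 0), ts)
        return "ok"

    def write(self, ts, q):
        if ts < self.r_timestamp.get(q, 0) or ts < self.w_timestamp.get(q, 0):
            return "rollback"   # a younger transaction already read or wrote Q
        self.w_timestamp[q] = ts
        return "ok"


protocol = TimestampOrdering()
results = [
    protocol.read(2, "Q"),    # no write yet, so the read succeeds
    protocol.write(1, "Q"),   # older than a reader of Q, so it is rejected
    protocol.write(3, "Q"),   # newer than every access so far
    protocol.read(2, "Q"),    # older than the last writer, so it is rejected
    protocol.read(4, "Q"),
]
```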
T1                      T2
lock-X on X
write(X)
                        lock-X on Y
                        write(Y)
                        wait for lock-X on X
wait for lock-X on Y
Deadlock Handling
A system is deadlocked if there is a set of transactions
such that every transaction in the set is waiting for
another transaction in the set.
Deadlock prevention protocols ensure that the system
will never enter into a deadlock state. Some prevention
strategies:
Require that each transaction locks all its data items
before it begins execution (predeclaration).
Impose partial ordering of all data items and require that a
transaction can lock data items only in the order specified
by the partial order (graph-based protocol).
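The definition of a deadlocked set can be checked by looking for a cycle among waiting transactions. The sketch below assumes a hypothetical `wait_for` mapping from each transaction to the transactions it is waiting on; this is a detection check, not one of the prevention strategies listed above.

```python
def find_deadlock(wait_for):
    """Return a list of transactions forming a waiting cycle, or None."""
    UNVISITED, ACTIVE, DONE = 0, 1, 2
    state = {t: UNVISITED for t in wait_for}

    def visit(t, path):
        state[t] = ACTIVE
        path.append(t)
        for u in wait_for.get(t, ()):
            if state.get(u, UNVISITED) == ACTIVE:
                return path[path.index(u):]      # u is on the current path: a cycle
            if state.get(u, UNVISITED) == UNVISITED:
                cycle = visit(u, path)
                if cycle:
                    return cycle
        state[t] = DONE
        path.pop()
        return None

    for t in list(wait_for):
        if state[t] == UNVISITED:
            cycle = visit(t, [])
            if cycle:
                return cycle
    return None


# T1 waits for a lock held by T2, and T2 waits for a lock held by T1.
deadlocked = find_deadlock({"T1": {"T2"}, "T2": {"T1"}})
no_deadlock = find_deadlock({"T1": {"T2"}, "T2": set()})
```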
More Deadlock Prevention Strategies
(figure not reproduced: time axis marked Tc and Tf, transactions T1 to T4, T4 undone)
System log: keeps track of all transaction operations
affecting the data values. It is kept on disk.
Log entries:
[start_transaction, T]
[write_item, T, X, old_value, new_value]
[read_item, T, X]
[commit, T], [abort, T]
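The old and new values stored in each write entry are what make recovery possible. The sketch below assumes one common undo/redo scheme, which is not taken from these slides: writes of transactions without a commit entry are undone from the old values, and writes of committed transactions are redone from the new values. Log entries are modelled as Python tuples and checkpoints are ignored.

```python
def recover(log, db):
    """Bring `db` (item -> value as found on disk) to a consistent state."""
    committed = {entry[1] for entry in log if entry[0] == "commit"}
    # Undo pass: scan backwards, restoring old values of uncommitted writes.
    for entry in reversed(log):
        if entry[0] == "write_item" and entry[1] not in committed:
            _, _, item, old_value, _ = entry
            db[item] = old_value
    # Redo pass: scan forwards, reapplying new values of committed writes.
    for entry in log:
        if entry[0] == "write_item" and entry[1] in committed:
            _, _, item, _, new_value = entry
            db[item] = new_value
    return db


log = [
    ("start_transaction", "T1"),
    ("write_item", "T1", "X", 100, 50),
    ("commit", "T1"),
    ("start_transaction", "T2"),
    ("write_item", "T2", "Y", 200, 250),   # T2 never reached its commit point
]
# Assumed disk state at the crash: T1's write was not flushed, T2's write was.
recovered = recover(log, {"X": 100, "Y": 250})
```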