Transaction Management

The document discusses transaction management concepts including concurrency control, recovery, and ACID properties. It covers topics such as lock-based and timestamp-based concurrency control protocols, log-based recovery, and failure classification for recovery. Transaction concepts such as serializability and recoverability are explained.


Transaction Management

 Transaction Concepts
 Introduction
 Concurrent Executions
 Serializability
 Recoverability
 Concurrency Control
 Lock-Based Protocols
 Timestamp-Based Protocols
 Deadlock Handling
 Recovery
 Failure Classification
 Log-Based Recovery
Transaction Concept
 A transaction is the execution of a program
unit that accesses and/or updates data in a
database.
 Before the transaction is executed, the
database must be consistent.
 During transaction execution the database
may be inconsistent.
 When the transaction is committed, the
database must be consistent.
 Two main issues to deal with:
 Concurrent execution of multiple transactions
 Recovery from failures of various kinds, such
as hardware failures and system crashes
ACID Properties
To preserve integrity of data, the database system must ensure:
 Atomicity. Either all operations of the
transaction are executed or none of them is.
 Consistency. Execution of a transaction
preserves the consistency of the database.
 Isolation. A transaction should not make its
updates visible to other transactions until the
transaction is committed.
 Durability. After a transaction completes
successfully, the changes it has made to the
database persist, even if there are system
failures.
Example of Fund Transfer
 Transaction to transfer $50 from account A to
account B:
1. read(A)
2. A := A – 50
3. write(A)
4. read(B)
5. B := B + 50
6. write(B)
 Consistency requirement – the sum of A and B is
unchanged by the execution of the transaction.
 Atomicity requirement — if the transaction fails
after step 3 and before step 6, the system should
ensure that its updates are not reflected in the
database, else an inconsistency will result.
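The six steps above can be sketched in Python. The dict-based database and the snapshot-based rollback are illustrative assumptions; a real DBMS enforces atomicity through logging, not by copying the data.

```python
# Minimal sketch of the fund-transfer transaction (illustrative only).
# 'db' stands in for the database; a real DBMS would use a log, not a copy.

def transfer(db, amount=50):
    snapshot = dict(db)          # remember old values for rollback (atomicity)
    try:
        a = db["A"]              # 1. read(A)
        a = a - amount           # 2. A := A - 50
        db["A"] = a              # 3. write(A)
        b = db["B"]              # 4. read(B)
        b = b + amount           # 5. B := B + 50
        db["B"] = b              # 6. write(B)
    except Exception:
        db.clear()
        db.update(snapshot)      # undo partial updates on failure
        raise

db = {"A": 100, "B": 200}
transfer(db)
print(db["A"], db["B"], db["A"] + db["B"])  # sum A + B is preserved: 50 250 300
```

If a failure is injected between steps 3 and 6, the except branch restores the snapshot, so the consistency requirement (A + B unchanged) still holds.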
Example of Fund Transfer (Cont.)
 Durability requirement — once the user has been
notified that the transaction has completed (i.e.,
the transfer of the $50 has taken place), the
updates to the database by the transaction must
persist despite failures.
 Isolation requirement — if between steps 3 and 6,
another transaction is allowed to access the
partially updated database, it will see an
inconsistent database (the sum A + B will be less
than it should be).
 Isolation can be ensured trivially by running
transactions serially, that is, one after the other.
However, executing multiple transactions
concurrently has significant benefits, as we will see.
Transaction States
 Active, the initial state; the transaction stays in
this state while it is executing
 Partially committed, after the final statement has
been executed.
 Failed, after the discovery that normal execution
can no longer proceed.
 Aborted, after the transaction has been rolled back
and the database restored to its state prior to the
start of the transaction. Two options after it has
been aborted:
 restart the transaction – only if no internal logical
error
 kill the transaction
 Committed, after successful completion.
Transaction State (Cont.)
(figure: transaction state-transition diagram)
Concurrent Executions
 Multiple transactions are allowed to run
concurrently in the system. Advantages are:
 increased processor and disk utilization, leading to
better transaction throughput: one transaction can
be using the CPU while another is reading from or
writing to the disk
 reduced average response time for transactions:
short transactions need not wait behind long ones.
 Concurrency control schemes – mechanisms to
achieve isolation, i.e., to control the interaction
among the concurrent transactions in order to
prevent them from destroying the consistency of
the database
Schedules
 Schedules – sequences that indicate the
chronological order in which instructions of
concurrent transactions are executed
 a schedule for a set of transactions must consist of all
instructions of those transactions
 must preserve the order in which the instructions
appear in each individual transaction.
Example Schedules
 Let T1 transfer $50 from A to B, and T2 transfer 10% of
the balance from A to B. The following is a serial
schedule, in which T1 is followed by T2.
Example Schedule (Cont.)
 Let T1 and T2 be the transactions defined
previously. The following schedule is not a serial
schedule, but it is equivalent to the previous schedule.
 In both Schedules 1 and 2, the sum A + B is preserved.
Example Schedules (Cont.)
 The following concurrent schedule (Schedule 3)
does not preserve the value of the sum A + B.
Serializability
 Basic Assumption – Each transaction preserves
database consistency.
 Thus serial execution of a set of transactions
preserves database consistency.
 A (possibly concurrent) schedule is serializable if it
is equivalent to a serial schedule. Different forms
of schedule equivalence give rise to the notions of:
1. conflict serializability
2. view serializability
 We ignore operations other than read and write
instructions, and we assume that transactions may
perform arbitrary computations on data in local
buffers in between reads and writes. Our simplified
schedules consist of only read and write
instructions.
Transaction Actions
 x can be a tuple, a set of attributes, or a table.
 Read(x): reads a database item named x into a
program variable.
 Find the disk address for x, copy the disk block ->
buffer -> program variable.
 Write(x): writes the value of a program variable into
the database item named x.
 Find the disk address
 Copy the disk block -> memory
 Copy the value from variable -> buffer, write to disk.
 System log: keeps track of all transaction
operations affecting the data values. It is kept on
disk.
System Log
 System log entries:
[start_transaction, T]
[write_item, T, x, old_value, new_value]
[read_item, T, x]
[commit, T], [abort, T]
 Commit point: the point at which the effects of all
operations of the transaction have been recorded in
the log.
 Checkpoint: written into the log periodically, when
the system has written out to the database all write
operations of committed transactions.
Conflict Serializability
 Instructions li and lj of transactions Ti and Tj
respectively, conflict if and only if there exists some
item Q accessed by both li and lj, and at least one of
these instructions wrote Q.
1. li = read(Q), lj = read(Q). li and lj don’t conflict.
2. li = read(Q), lj = write(Q). They conflict.
3. li = write(Q), lj = read(Q). They conflict
4. li = write(Q), lj = write(Q). They conflict
 Intuitively, a conflict between li and lj forces a
(logical) temporal order between them. If li and lj
are consecutive in a schedule and they do not
conflict, their results would remain the same even if
they had been interchanged in the schedule.
Conflict Serializability (Cont.)
 If a schedule S can be transformed into a schedule S´
by a series of swaps of non-conflicting instructions,
we say that S and S´ are conflict equivalent.
 We say that a schedule S is conflict serializable if it is
conflict equivalent to a serial schedule
 Example of a schedule that is not conflict serializable:

T3 T4
read(Q)
write(Q)
write(Q)

 We are unable to swap instructions in the above
schedule to obtain either the serial schedule < T3, T4 >,
or the serial schedule < T4, T3 >.
Conflict Serializability (Cont.)
 The schedule below can be transformed into a
serial schedule where T2 follows T1, by a series of
swaps of non-conflicting instructions. Therefore
the schedule is conflict serializable.
View Serializability
 Let S and S´ be two schedules with the same set of
transactions. S and S´ are view equivalent if the
following three conditions are met:
1. For each data item Q, if transaction Ti reads the initial
value of Q in schedule S, then transaction Ti must, in
schedule S´, also read the initial value of Q.
2. For each data item Q if transaction Ti executes read(Q)
in schedule S, and that value was produced by
transaction Tj (if any), then transaction Ti must in
schedule S´ also read the value of Q that was produced
by transaction Tj .
3. For each data item Q, the transaction (if any) that
performs the final write(Q) operation in schedule S must
perform the final write(Q) operation in schedule S´.
As can be seen, view equivalence is also based purely
on reads and writes alone.
View Serializability (Cont.)
 A schedule S is view serializable if it is view equivalent to a
serial schedule.
 Every conflict serializable schedule is also view
serializable.
 There are schedules which are view serializable but not
conflict serializable.
 Every view serializable schedule that is not conflict
serializable has blind writes (the value written by an
operation w(x) in T is independent of the old value of x
in the database).
Other Notions of Serializability
 The schedule given below produces the same outcome as
the serial schedule < T1, T5 >, yet is not conflict
equivalent or view equivalent to it.
 Determining such equivalence requires analysis
of operations other than read and write.
Why recovery is needed
 A computer failure (system crash):
 A hardware or software error occurs during transaction
execution. If the hardware crashes, the contents of the internal
memory may be lost.
 A transaction or system error:
 Some operation in the transaction may cause it to fail, e.g.,
division by zero. Transaction failure may also occur because of
erroneous parameter values or because of a logical programming
error.
 Local errors or exception conditions detected by the
transaction:
 Certain conditions necessitate cancellation of the transaction.
For example, data for the transaction may not be found.
 A programmed abort in the transaction causes it to fail.
 Concurrency control enforcement:
 The concurrency control method may decide to abort the
transaction, to be restarted later, because it violates
serializability or because several transactions are in a state of
deadlock.
Logging
 Recovery manager keeps track of the following
operations:
 begin_transaction: This marks the beginning of
transaction execution.
 read or write: These specify read or write operations
on the database items that are executed as part of a
transaction.
 end_transaction: This specifies that read and write
transaction operations have ended and marks the
end limit of transaction execution.
 At this point it may be necessary to check whether
the changes introduced by the transaction can be
permanently applied to the database or whether
the transaction has to be aborted because it
violates concurrency control or for some other
reason.
Recoverability
 Need to address the effect of transaction failures on
concurrently running transactions.
 Recoverable schedule — if a transaction Tj reads a data
item previously written by a transaction Ti, the commit
operation of Ti appears before the commit operation of Tj.
 The following schedule is not recoverable if T2 commits
immediately after the read:

Schedule: r1(A) w1(A) r2(A) c2 r1(A) a1
 If T1 should abort, T2 would have read (and possibly
shown to the user) an inconsistent database state.
Hence the database must ensure that schedules are
recoverable.
Recoverability (Cont.)
 Cascading rollback – a single transaction failure
leads to a series of transaction rollbacks.
Consider the following schedule where none of
the transactions has yet committed (so the
schedule is recoverable):

If T10 fails, T11 and T12 must also be rolled back.
 Can lead to the undoing of a significant amount
of work.
Recoverability (Cont.)
 Cascadeless schedules — cascading rollbacks
cannot occur; for each pair of transactions Ti and Tj
such that Tj reads a data item previously written by
Ti, the commit operation of Ti appears before the
read operation of Tj.
 Every cascadeless schedule is also recoverable

Schedule: r1(A)w1(A)r2(A)w2(A)c2a1
r1(A)w1(A)r2(A)w2(A)c1c2
r1(A)w1(A)c1r2(A)w2(A)c2
 It is desirable to restrict the schedules to those that
are cascadeless
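Both definitions can be checked mechanically from a schedule. In this sketch, the tuple encoding of operations, e.g. ("r", 1, "A") for r1(A), is an assumption made for illustration:

```python
# Check whether a schedule (list of operations) is recoverable and cascadeless.
# An operation is ("r", txn, item), ("w", txn, item), ("c", txn) or ("a", txn).
# Illustrative checker, not drawn from any particular DBMS.

def analyze(schedule):
    last_writer = {}      # item -> txn that last wrote it
    committed = set()
    reads_from = []       # (writer, reader) pairs
    commit_pos = {}       # txn -> position of its commit in the schedule
    cascadeless = True
    for pos, op in enumerate(schedule):
        kind, txn = op[0], op[1]
        if kind == "r":
            w = last_writer.get(op[2])
            if w is not None and w != txn:
                reads_from.append((w, txn))
                if w not in committed:
                    cascadeless = False   # read of uncommitted ("dirty") data
        elif kind == "w":
            last_writer[op[2]] = txn
        elif kind == "c":
            committed.add(txn)
            commit_pos[txn] = pos
    # recoverable: every committed reader commits after the writer it read from
    recoverable = all(
        writer in commit_pos and commit_pos[writer] < commit_pos[reader]
        for writer, reader in reads_from if reader in commit_pos
    )
    return recoverable, cascadeless

# First schedule above: r1(A) w1(A) r2(A) w2(A) c2 a1 -- not recoverable
s = [("r",1,"A"),("w",1,"A"),("r",2,"A"),("w",2,"A"),("c",2),("a",1)]
print(analyze(s))   # (False, False)
```

The third schedule above, r1(A) w1(A) c1 r2(A) w2(A) c2, comes out (True, True): T2 reads A only after T1 has committed.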
Implementation of Isolation
 Schedules must be conflict or view serializable,
and recoverable, for the sake of database
consistency, and preferably cascadeless.
 A policy in which only one transaction can
execute at a time generates serial schedules, but
provides a poor degree of concurrency.
 Concurrency-control schemes tradeoff between
the amount of concurrency they allow and the
amount of overhead that they incur.
 Some schemes allow only conflict-serializable
schedules to be generated, while others allow
view-serializable schedules that are not conflict-
serializable.
Transaction Definition in SQL
 Data manipulation language must include a
construct for specifying the set of actions that
comprise a transaction.
 In SQL, a transaction begins implicitly.
 A transaction in SQL ends by:
 Commit work commits current transaction and
begins a new one.
 Rollback work causes current transaction to
abort.
Testing for Serializability
 Consider some schedule of a set of transactions
T1, T2, ..., Tn
 Precedence graph — a directed graph where the
vertices are the transactions (names).
 We draw an arc from Ti to Tj if the two
transactions conflict, and Ti accessed the data
item on which the conflict arose earlier.
 We may label the arc by the item that was
accessed.
Example Schedule (Schedule A)
T1 T2 T3 T4 T5
read(X)
read(Y)
read(Z)
read(V)
read(W)
read(W)
read(Y)
write(Y)
write(Z)
read(U)
read(Y)
write(Y)
read(Z)
write(Z)
read(U)
write(U)
Precedence Graph for Schedule A
(figure: precedence graph for Schedule A over T1 … T5)
Test for Conflict Serializability
 A schedule is conflict serializable if and only if its
precedence graph is acyclic.
 Cycle-detection algorithms exist which take order
n2 time, where n is the number of vertices in the
graph. (Better algorithms take order n + e where e
is the number of edges.)
 If precedence graph is acyclic, the serializability
order can be obtained by a topological sorting of the
graph. This is a linear order consistent with the
partial order of the graph.
For example, a serializability order for Schedule A
would be T5 → T1 → T3 → T2 → T4.
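The test above can be sketched directly: build the precedence graph from pairwise conflicts, then attempt a topological sort. The tuple encoding of operations is an illustrative assumption, and this simple sort runs in O(n²), not the better O(n + e) algorithm:

```python
# Build the precedence graph of a schedule and test conflict serializability.
# Operations are ("r", txn, item) or ("w", txn, item); illustrative sketch.

def precedence_graph(schedule):
    edges = set()
    txns = set()
    for i, (k1, t1, q1) in enumerate(schedule):
        txns.add(t1)
        for k2, t2, q2 in schedule[i + 1:]:
            # two operations conflict if they access the same item,
            # come from different transactions, and at least one is a write
            if q1 == q2 and t1 != t2 and "w" in (k1, k2):
                edges.add((t1, t2))   # arc Ti -> Tj: Ti accessed the item first
    return txns, edges

def serial_order(txns, edges):
    """Topological sort; returns None if the graph has a cycle."""
    order = []
    remaining = set(txns)
    while remaining:
        free = [t for t in remaining
                if not any(u in remaining and v == t for u, v in edges)]
        if not free:
            return None               # cycle: not conflict serializable
        t = min(free)                 # pick smallest id among the unconstrained
        order.append(t)
        remaining.discard(t)
    return order

# Like the T3/T4 example: read(Q) by T3, write(Q) by T4, write(Q) by T3
s = [("r", 3, "Q"), ("w", 4, "Q"), ("w", 3, "Q")]
txns, edges = precedence_graph(s)
print(serial_order(txns, edges))   # None: the graph has the cycle T3 -> T4 -> T3
```

For an acyclic graph, the returned list is a serializability order consistent with the partial order of the graph.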
Test for View Serializability
 The precedence graph test for conflict
serializability must be modified to apply to a test
for view serializability.
 The problem of checking if a schedule is view
serializable falls in the class of NP-complete
problems. Thus existence of an efficient algorithm
is unlikely.
However practical algorithms that just check some
sufficient conditions for view serializability can
still be used.
Concurrency Control vs. Serializability Tests
 Testing a schedule for serializability after it has
executed is a little too late!
 Goal – to develop concurrency control protocols
that will assure serializability. They will generally
not examine the precedence graph as it is being
created; instead a protocol will impose a discipline
that avoids nonserializable schedules.
 Tests for serializability help us understand why a
concurrency control protocol is correct.
Lock-Based Protocols
 A lock is a mechanism to control concurrent access
to a data item.
 Data items can be locked in two modes:
1. exclusive (X) mode. The data item can be both read
and written. An X-lock is requested using the lock-X
instruction.
2. shared (S) mode. The data item can only be read. An
S-lock is requested using the lock-S instruction.
 Lock requests are made to the concurrency-control
manager. A transaction can proceed only after the request
is granted.
Lock-Based Protocols (Cont.)
 Lock-compatibility matrix (an entry is true if the two
modes are compatible):

          S       X
    S    true    false
    X    false   false
 A transaction may be granted a lock on an item if the
requested lock is compatible with locks already held
on the item by other transactions.
 Any number of transactions can hold shared locks on
an item, but if any transaction holds an exclusive lock on
the item no other transaction may hold any lock on
the item.
 If a lock cannot be granted, the requesting
transaction is made to wait till all incompatible locks
held by other transactions have been released. The
lock is then granted.
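The grant rule above can be sketched as a check against the compatibility matrix; the function and data layout are illustrative assumptions, not a real lock manager's API:

```python
# Sketch of the lock-grant decision using the S/X compatibility matrix.
# `held` is a list of (txn, mode) locks currently held on one data item.

COMPATIBLE = {("S", "S"): True, ("S", "X"): False,
              ("X", "S"): False, ("X", "X"): False}

def can_grant(held, requester, mode):
    """Grant iff the requested mode is compatible with every lock
    held by a *different* transaction on the item."""
    return all(COMPATIBLE[(h_mode, mode)]
               for h_txn, h_mode in held if h_txn != requester)

held = [("T1", "S"), ("T2", "S")]          # two shared locks on the item
print(can_grant(held, "T3", "S"))          # True: S is compatible with S
print(can_grant(held, "T3", "X"))          # False: T3 must wait
```

Note that a transaction's own locks are excluded from the check, so re-requesting an item it already holds never blocks it against itself.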
Lock-Based Protocols (Cont.)
 Example of a transaction performing locking:
T2: lock-S(A);
read (A);
unlock(A);
lock-S(B);
read (B);
unlock(B);
display(A+B)
 Locking as above is not sufficient to guarantee
serializability — if A and B get updated in-between the read
of A and B, the displayed sum would be wrong.
 A locking protocol is a set of rules followed by all
transactions while requesting and releasing locks. Locking
protocols restrict the set of possible schedules.
Pitfalls of Lock-Based Protocols
 Consider the partial schedule

T3: lock-X(B)
T4: lock-S(A)
T4: lock-S(B)   (T4 waits)
T3: lock-X(A)   (T3 waits)

 Neither T3 nor T4 can make progress — executing lock-S(B)
causes T4 to wait for T3 to release its lock on B, while
executing lock-X(A) causes T3 to wait for T4 to release its
lock on A.
 Such a situation is called a deadlock.
 To handle a deadlock one of T3 or T4 must be rolled back
and its locks released.
Pitfalls of Lock-Based Protocols (Cont.)
 The potential for deadlock exists in most locking
protocols. Deadlocks are a necessary evil.
 Starvation is also possible if concurrency control
manager is badly designed. For example:
 A transaction may be waiting for an X-lock on an item,
while a sequence of other transactions request and
are granted an S-lock on the same item.
 The same transaction is repeatedly rolled back due to
deadlocks.
 Concurrency control manager can be designed to
prevent starvation.
The Two-Phase Locking Protocol
 This is a protocol which ensures conflict-
serializable schedules.
 Phase 1: Growing Phase
 transaction may obtain locks
 transaction may not release locks
 Phase 2: Shrinking Phase
 transaction may release locks
 transaction may not obtain locks
 The protocol assures serializability. It can be proved
that the transactions can be serialized in the order
of their lock points (i.e. the point where a
transaction acquired its final lock).
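A transaction's lock/unlock sequence can be checked for the two-phase property with a one-pass scan; the tuple encoding of actions is an illustrative assumption:

```python
# Check that a transaction's operation sequence obeys two-phase locking:
# once it releases any lock, it must not acquire another. Illustrative sketch.

def is_two_phase(ops):
    """ops is a sequence of ("lock", item) / ("unlock", item) actions."""
    shrinking = False
    for action, _item in ops:
        if action == "unlock":
            shrinking = True            # the shrinking phase has begun
        elif action == "lock" and shrinking:
            return False                # acquired a lock after releasing one
    return True

print(is_two_phase([("lock", "A"), ("lock", "B"),
                    ("unlock", "A"), ("unlock", "B")]))   # True
print(is_two_phase([("lock", "A"), ("unlock", "A"),
                    ("lock", "B")]))                      # False
```

The second sequence is exactly the pattern of transaction T2 on the earlier slide (unlock(A) before lock-S(B)), which is why that schedule is not guaranteed serializable.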
The Two-Phase Locking Protocol (Cont.)
 Two-phase locking does not ensure freedom from
deadlocks.
 Cascading roll-back is possible under two-phase locking.
To avoid this, follow a modified protocol called strict two-
phase locking. Here a transaction must hold all its
exclusive locks till it commits/aborts.
 Rigorous two-phase locking is even stricter: here all locks
are held till commit/abort. In this protocol transactions
can be serialized in the order in which they commit.
The Two-Phase Locking Protocol (Cont.)
 There can be conflict serializable schedules that
cannot be obtained if two-phase locking is used.
 However, in the absence of extra information (e.g.,
ordering of access to data), two-phase locking is
needed for conflict serializability in the following
sense:
Given a transaction Ti that does not follow two-
phase locking, we can find a transaction Tj that
uses two-phase locking, and a schedule for Ti and Tj
that is not conflict serializable.
Lock Conversions
 Two-phase locking with lock conversions:

– First Phase:
 can acquire a lock-S on item
 can acquire a lock-X on item
 can convert a lock-S to a lock-X (upgrade)

– Second Phase:
 can release a lock-S
 can release a lock-X
 can convert a lock-X to a lock-S (downgrade)
 This protocol assures serializability. But still relies
on the programmer to insert the various locking
instructions.
Implementation of Locking
 A Lock manager can be implemented as a separate
process to which transactions send lock and unlock
requests
 The lock manager replies to a lock request by
sending a lock grant messages (or a message asking
the transaction to roll back, in case of a deadlock)
 The requesting transaction waits until its request is
answered
 The lock manager maintains a data structure called a
lock table to record granted locks and pending
requests
 The lock table is usually implemented as an in-
memory hash table indexed on the name of the data
item being locked
Timestamp-Based Protocols
 Each transaction is issued a timestamp when it enters the
system. If an old transaction Ti has time-stamp TS(Ti), a new
transaction Tj is assigned time-stamp TS(Tj) such that TS(Ti)
<TS(Tj).
 The protocol manages concurrent execution such that the
time-stamps determine the serializability order.
 In order to assure such behavior, the protocol maintains for
each data item Q two timestamp values:
 W-timestamp(Q) is the largest time-stamp of any transaction that
executed write(Q) successfully.
 R-timestamp(Q) is the largest time-stamp of any transaction that
executed read(Q) successfully.
Timestamp-Based Protocols (Cont.)
 The timestamp ordering protocol ensures that any conflicting
read and write operations are executed in timestamp order.
 Suppose a transaction Ti issues a read(Q):
1. If TS(Ti) < W-timestamp(Q), then Ti needs to read a value of Q
that was already overwritten. Hence, the read operation is
rejected, and Ti is rolled back.
2. If TS(Ti) ≥ W-timestamp(Q), then the read operation is
executed, and R-timestamp(Q) is set to the maximum of
R-timestamp(Q) and TS(Ti).
Timestamp-Based Protocols (Cont.)
 Suppose that transaction Ti issues write(Q):
 If TS(Ti) < R-timestamp(Q), then the value of Q that Ti is
producing was needed previously, and the system
assumed that that value would never be produced.
Hence, the write operation is rejected, and Ti is rolled
back.
 If TS(Ti) < W-timestamp(Q), then Ti is attempting to write
an obsolete value of Q. Hence, this write operation is
rejected, and Ti is rolled back.
 Otherwise, the write operation is executed, and W-
timestamp(Q) is set to TS(Ti).
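The read and write rules above can be sketched as follows; the class and method names are illustrative assumptions, and timestamps are assumed to start at 1 (0 stands for "never accessed"):

```python
# Sketch of the timestamp-ordering read/write rules described above.
# A rejected operation (transaction rollback) is signalled by returning False.

class TimestampManager:
    def __init__(self):
        self.r_ts = {}   # item -> largest TS of a successful read
        self.w_ts = {}   # item -> largest TS of a successful write

    def read(self, ts, item):
        if ts < self.w_ts.get(item, 0):
            return False                     # value already overwritten: roll back
        self.r_ts[item] = max(self.r_ts.get(item, 0), ts)
        return True

    def write(self, ts, item):
        if ts < self.r_ts.get(item, 0):      # a later txn already read the old value
            return False
        if ts < self.w_ts.get(item, 0):      # obsolete write: roll back
            return False
        self.w_ts[item] = ts
        return True

m = TimestampManager()
print(m.write(2, "Q"))   # True
print(m.read(1, "Q"))    # False: TS 1 would read a value written at TS 2
print(m.read(3, "Q"))    # True
print(m.write(2, "Q"))   # False: the transaction with TS 3 already read Q
```

A rejected transaction would be restarted with a new, larger timestamp, as described later under recoverability.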
Example Use of the Protocol
A partial schedule for several data items, for transactions
with timestamps 1, 2, 3, 4, 5:
T1 T2 T3 T4 T5
read(X)
read(Y)
read(Y)
write(Y)
write(Z)
read(Z)
read(X)
abort
read(X)
write(Z)
abort
write(Y)
write(Z)
Correctness of Timestamp-Ordering Protocol
 The timestamp-ordering protocol guarantees
serializability since all the arcs in the precedence
graph are of the form:

transaction with smaller timestamp → transaction with larger timestamp

 Thus, there will be no cycles in the precedence
graph.
 Timestamp protocol ensures freedom from deadlock
as no transaction ever waits.
 But the schedule may not be cascade-free, and may
not even be recoverable.
Recoverability and Cascade
Freedom
 Problem with timestamp-ordering protocol:
 Suppose Ti aborts, but Tj has read a data item written by Ti
 Then Tj must abort; if Tj had been allowed to commit earlier,
the schedule is not recoverable.
 Further, any transaction that has read a data item written by
Tj must abort
 This can lead to cascading rollback --- that is, a chain of
rollbacks
 Solution:
 A transaction is structured such that its writes are all
performed at the end of its processing
 All writes of a transaction form an atomic action; no
transaction may execute while a transaction is being written
 A transaction that aborts is restarted with a new timestamp
Multiple Granularity
 Allow data items to be of various sizes and define a
hierarchy of data granularities, where the small
granularities are nested within larger ones
 Can be represented graphically as a tree (but don't
confuse with tree-locking protocol)
 When a transaction locks a node in the tree
explicitly, it implicitly locks all the node's
descendents in the same mode.
 Granularity of locking (level in tree where locking is
done):
 fine granularity (lower in tree): high concurrency, high
locking overhead
 coarse granularity (higher in tree): low locking
overhead, low concurrency
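Implicit locking in such a hierarchy can be sketched by walking from a node up to the root; the example tree mirrors a database/area/file/record hierarchy, and the function name is an illustrative assumption:

```python
# Sketch of implicit locking in a granularity hierarchy: an explicit lock
# on a node implicitly locks its whole subtree. `parent` encodes the tree.

parent = {"area1": "db", "file1": "area1", "file2": "area1",
          "rec1": "file1", "rec2": "file1"}

def is_locked(node, explicit_locks):
    """A node is locked if it or any ancestor holds an explicit lock."""
    while node is not None:
        if node in explicit_locks:
            return True
        node = parent.get(node)      # climb toward the root
    return False

locks = {"file1"}                    # one explicit lock on file1
print(is_locked("rec1", locks))      # True: implicitly locked via file1
print(is_locked("rec2", locks))      # True
print(is_locked("file2", locks))     # False: outside the locked subtree
```

A coarse lock on "db" would cover every node with a single table entry, which is exactly the low-overhead, low-concurrency trade-off described above.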
Example of Granularity Hierarchy
 The highest level in the example hierarchy is the
entire database.
 The levels below are of type area, file and record, in
that order.
Deadlock Handling
 Consider the following two transactions:

T1: write(X)        T2: write(Y)
    write(Y)            write(X)

 Schedule with deadlock:

T1: lock-X on X; write(X)
T2: lock-X on Y; write(Y)
T1: wait for lock-X on Y
T2: wait for lock-X on X
Deadlock Handling
 System is deadlocked if there is a set of transactions
such that every transaction in the set is waiting for
another transaction in the set.
 Deadlock prevention protocols ensure that the system
will never enter into a deadlock state. Some prevention
strategies :
 Require that each transaction locks all its data items
before it begins execution (predeclaration).
 Impose partial ordering of all data items and require that a
transaction can lock data items only in the order specified
by the partial order (graph-based protocol).
More Deadlock Prevention Strategies
 The following schemes use transaction timestamps for
the sake of deadlock prevention alone.
 wait-die scheme — non-preemptive
 older transaction may wait for younger one to release
data item. Younger transactions never wait for older
ones; they are rolled back instead.
 a transaction may die several times before acquiring
needed data item
 wound-wait scheme — preemptive
 older transaction wounds (forces rollback) of younger
transaction instead of waiting for it. Younger
transactions may wait for older ones.
 may be fewer rollbacks than wait-die scheme.
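The two schemes reduce to a simple comparison of timestamps (smaller timestamp = older transaction); these helper functions are an illustrative sketch:

```python
# Decide the outcome when transaction `req` requests a lock held by `holder`,
# under the two timestamp-based prevention schemes. Smaller TS = older txn.

def wait_die(req_ts, holder_ts):
    # non-preemptive: an older requester waits, a younger one dies (rolls back)
    return "wait" if req_ts < holder_ts else "die"

def wound_wait(req_ts, holder_ts):
    # preemptive: an older requester wounds (rolls back) the holder,
    # a younger one waits
    return "wound holder" if req_ts < holder_ts else "wait"

print(wait_die(1, 5))     # older requester: "wait"
print(wait_die(5, 1))     # younger requester: "die"
print(wound_wait(1, 5))   # older requester: "wound holder"
print(wound_wait(5, 1))   # younger requester: "wait"
```

Because a rolled-back transaction keeps its original timestamp when restarted, it eventually becomes the oldest and can no longer be killed, which is why starvation is avoided.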
Deadlock prevention (Cont.)
 Both in wait-die and in wound-wait schemes, a
rolled-back transaction is restarted with its
original timestamp. Older transactions thus have
precedence over newer ones, and starvation is
hence avoided.
 Timeout-Based Schemes :
 a transaction waits for a lock only for a specified
amount of time. After that, the wait times out and the
transaction is rolled back.
 thus deadlocks are not possible
 simple to implement; but starvation is possible. Also
difficult to determine good value of the timeout
interval.
Deadlock Detection
 Deadlocks can be described by a wait-for graph, which
consists of a pair G = (V, E):
 V is a set of vertices (all the transactions in the system)
 E is a set of edges; each element is an ordered pair Ti → Tj.
 If Ti → Tj is in E, then there is a directed edge from Ti to
Tj, implying that Ti is waiting for Tj to release a data
item.
 When Ti requests a data item currently being held by Tj,
then the edge Ti → Tj is inserted in the wait-for graph.
This edge is removed only when Tj is no longer holding
a data item needed by Ti.
 The system is in a deadlock state if and only if the
wait-for graph has a cycle. Must invoke a deadlock-
detection algorithm periodically to look for cycles.
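Deadlock detection then amounts to cycle detection on the wait-for graph, for example by depth-first search; the edge representation here is an illustrative assumption:

```python
# Maintain a wait-for graph and detect deadlock by looking for a cycle.
# edges[t] is the set of transactions t is currently waiting for.

from collections import defaultdict

def has_cycle(edges):
    WHITE, GREY, BLACK = 0, 1, 2      # unvisited / on DFS stack / finished
    colour = defaultdict(int)

    def visit(t):
        colour[t] = GREY
        for u in edges.get(t, ()):
            if colour[u] == GREY:
                return True           # back edge: cycle found
            if colour[u] == WHITE and visit(u):
                return True
        colour[t] = BLACK
        return False

    return any(colour[t] == WHITE and visit(t) for t in list(edges))

wait_for = {"T1": {"T2"}, "T2": {"T3"}}
print(has_cycle(wait_for))            # False: no deadlock yet
wait_for["T3"] = {"T1"}               # T3 now waits for T1
print(has_cycle(wait_for))            # True: deadlock detected
```

In a real system this check would run periodically, and on a positive result the recovery step below chooses a victim on the cycle to roll back.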
Deadlock Detection (Cont.)
(figures: a wait-for graph without a cycle, and a wait-for graph with a cycle)
Deadlock Recovery
 When deadlock is detected:
 Some transaction will have to be rolled back (made a
victim) to break the deadlock. Select as victim the
transaction that will incur minimum cost.
 Rollback -- determine how far to roll back transaction
 Total rollback: Abort the transaction and then
restart it.
 More effective to roll back transaction only as far as
necessary to break deadlock.
 Starvation happens if the same transaction is always
chosen as victim. Include the number of rollbacks in
the cost factor to avoid starvation.
Failure Classification
 Transaction failure :
 Logical errors: transaction cannot complete due to
some internal error condition
 System errors: the database system must terminate an
active transaction due to an error condition (e.g.,
deadlock)
 System crash: a power failure or other hardware or
software failure causes the system to crash.
 Fail-stop assumption: non-volatile storage contents are
assumed to not be corrupted by system crash
 Database systems have numerous integrity checks
to prevent corruption of disk data
 Disk failure: a head crash or similar disk failure
destroys all or part of disk storage
 Destruction is assumed to be detectable: disk drives
use checksums to detect failures
Recovery Algorithms
 Recovery algorithms are techniques to ensure
database consistency and transaction atomicity
and durability despite failures
 Recovery algorithms have two parts
1. Actions taken during normal transaction processing to
ensure enough information exists to recover from
failures
2. Actions taken after a failure to recover the database
contents to a state that ensures atomicity,
consistency and durability
Log-Based Recovery
 A log is kept on stable storage.
 The log is a sequence of log records, and maintains a
record of update activities on the database.
 Undo (T): restore the values of all data items written
by transaction T to the old values in the reverse order
in which WRITES were recorded in the log.
 Redo (T): set the value of all data items written by
transaction T to the new values in the order in which
the WRITES were recorded in the log.
 Streamline recovery procedure by periodically
performing checkpointing
1. Output all log records currently residing in main memory
onto stable storage.
2. Output all modified buffer blocks to the disk.
3. Write a log record < checkpoint> onto stable storage.
4. During recovery we need to consider only the recent
transactions.
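The undo/redo procedure above can be sketched as a redo pass over committed transactions followed by an undo pass, in reverse, over the rest. The log-record tuples are an illustrative encoding, and this sketch rebuilds only the items mentioned in the log rather than patching a real database:

```python
# Sketch of redo/undo recovery from a write-ahead log, as described above.
# Log records: ("start", T), ("write", T, item, old, new), ("commit", T).

def recover(log):
    committed = {rec[1] for rec in log if rec[0] == "commit"}
    db = {}
    # redo phase: replay writes of committed transactions in forward order
    for rec in log:
        if rec[0] == "write" and rec[1] in committed:
            _, _t, item, _old, new = rec
            db[item] = new
    # undo phase: restore old values of uncommitted transactions,
    # in the reverse of the order in which the writes were logged
    for rec in reversed(log):
        if rec[0] == "write" and rec[1] not in committed:
            _, _t, item, old, _new = rec
            db[item] = old
    return db

log = [("start", "T1"), ("write", "T1", "A", 100, 50),
       ("start", "T2"), ("write", "T2", "B", 200, 250),
       ("commit", "T1")]                     # crash before T2 commits
print(recover(log))    # {'A': 50, 'B': 200}: T1 is redone, T2 is undone
```

With checkpointing, only the suffix of the log after the last < checkpoint> record would need to be scanned.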
Example of Checkpoints
(figure: timeline with a checkpoint at time Tc and a system failure at time Tf;
T1 completes before the checkpoint, T2 and T3 commit after the checkpoint
but before the failure, and T4 is still active at the failure)

 T1 can be ignored (updates already output to disk due
to the checkpoint)
 T2 and T3 are redone.
 T4 is undone.
Log-Based Recovery (Cont.)
 System log: keeps track of all transaction operations
affecting the data values. It is kept on disk.
 Log entries:
 [start_transaction, T]
 [write_item, T, X, old_value, new_value]
 [read_item, T, X]
 [commit, T], [abort, T]
 If X is modified, its corresponding log record is always
written to the log on disk first, and only then is X actually
written to the database (write-ahead logging).
 Commit point: the point at which the effects of all operations
of the transaction have been recorded in the log.
Recover from Failure
 Undo: undo the effect of write operations by tracing
the log backwards.
 Redo: redo the effect of write operations by tracing
the log forwards.
 Checkpoint: written into the log periodically, when
the system writes out to the DB all write operations
of committed transactions.
Deferred Update
 Do not update the database until after the transaction
reaches its commit point.
 Before commit, all transaction updates are recorded
in the buffer.
 During commit, the updates are first recorded in the
log and then written to the database.
 No undo is needed. Redo may be necessary.
Immediate Update
 The database may be updated by a transaction before it
reaches its commit point.
 Operations are typically recorded in the log before
they are applied to the database.
 If a transaction fails after recording some changes in
the database but before reaching its commit point, the
effect of its operations on the database must be undone.
 Both undo and redo may be necessary.
