Adbms CH 1.c

Database Recovery Techniques

Recovery from transaction failures means that the database is restored to the most recent consistent state just before the failure.
It is a service provided by the DBMS to ensure that the database is reliable and remains in a consistent state in the presence of failures.
Failure Classification

 System crash: a hardware, software, or network error occurs during transaction execution.
 Transaction error: some operation in the transaction (e.g. division by zero) may cause it to fail.
 Exception conditions: e.g. data not found.
 Concurrency control enforcement: e.g. a transaction is aborted because it violates serializability or to break a deadlock.
 Physical problems: e.g. theft.
Concepts Used in Recovery

The System Log (Audit trail or DBMS journal)


 A log is used to keep track of all transaction operations that
affect the values of database items.
 The log is a sequential, append-only file that is kept on disk, so
it is not affected by any type of failure except for disk or
catastrophic failure.
Cont’d

 Main memory buffers hold the last part of the log file
 The log entries are first added to the main memory buffer. When
the main memory(MM) log buffer is filled, or when certain other
conditions occur, the MM log buffer is appended to the end of
the log file on disk.
 In addition, the log file from disk is periodically backed up to
archival storage (tape) to guard against catastrophic failures.
Types of Entries (Log Records)

[start_transaction, T]: Indicates that transaction T has started execution.
[write_item, T, X, old_value, new_value]: Indicates that transaction T has changed the value of database item X from old_value to new_value.
[read_item, T, X]: Indicates that transaction T has read the value of database item X.
[commit, T]: Indicates that transaction T has completed successfully, and confirms that its effect can be committed (recorded permanently) to the database.
[abort, T]: Indicates that transaction T has been aborted.


Cont’d…

 Recovery from a transaction failure amounts to either undoing or redoing transaction operations individually from the log.
 Undo: the effect of the WRITE operations of a transaction T is undone by tracing backward through the log and resetting all items changed by a WRITE operation of T to their old_values.
 Redo: redoing an operation may also be necessary if a transaction has its updates recorded in the log but a failure occurs before the system can be sure that all these new_values have been written from the main memory buffers to the actual database on disk.
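The undo and redo passes described above can be sketched roughly as follows. This is only an illustration under simplifying assumptions: the database is a Python dict, the log is a list of tuples shaped like the log records listed earlier, and the function names `undo` and `redo` are hypothetical.

```python
# Hypothetical in-memory model: the "database" is a dict and the log is a list
# of tuples shaped like [write_item, T, X, old_value, new_value].

def undo(log, db, txn):
    """Trace backward through the log, resetting items written by txn to old_value."""
    for record in reversed(log):
        if record[0] == "write_item" and record[1] == txn:
            _, _, item, old_value, _new_value = record
            db[item] = old_value

def redo(log, db, txn):
    """Scan forward through the log, reapplying the new_value of each write by txn."""
    for record in log:
        if record[0] == "write_item" and record[1] == txn:
            _, _, item, _old_value, new_value = record
            db[item] = new_value

log = [
    ("start_transaction", "T1"),
    ("write_item", "T1", "X", 10, 20),
    ("write_item", "T1", "X", 20, 30),
    ("start_transaction", "T2"),
    ("write_item", "T2", "Y", 5, 7),
    ("commit", "T2"),
]

# Assumed state at the crash: T1's writes reached disk, T2's write did not.
db = {"X": 30, "Y": 5}
undo(log, db, "T1")   # T1 never committed, so its writes are rolled back
redo(log, db, "T2")   # T2 committed, so its write is reapplied
print(db)             # {'X': 10, 'Y': 7}
```

Note that undo must run backward so that X ends at the value it had before T1's first write.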
Commit Point of a Transaction

A transaction T reaches its commit point when all its operations that access the
database have been executed successfully and the effect of all the transaction
operations on the database has been recorded in the log.
- If a system failure occurs, we can search back in the log for all transactions T that
have written a [start_transaction, T] record into the log but have not yet written their
[commit, T] record; these transactions may have to be rolled back to undo their
effect on the database during the recovery process.
- Transactions that have written their commit record in the log must also have
recorded all their WRITE operations in the log, so their effect on the database can
be redone from the log records.
Commit Point of a Transaction

 One or more blocks of the log file are kept in main memory buffers, called the log buffer.
 The log buffer is written back to disk once, when it has filled with log entries, rather than every time a log entry is added. This saves the overhead of multiple disk writes.
 Because of this buffering, when a transaction is about to reach its commit point, any portion of the log that has not yet been written to disk must be written to disk at that time.
 This process is called force-writing the log buffer before committing a transaction.
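A minimal sketch of a log buffer with force-writing at commit might look like the following. The class name `Log`, the `buffer_capacity` parameter, and the use of Python lists to stand in for the buffer and the on-disk log file are all assumptions made for illustration.

```python
class Log:
    """Toy log: records collect in a memory buffer and reach 'disk' in batches."""

    def __init__(self, buffer_capacity=4):
        self.buffer = []                    # log records still in main memory
        self.disk = []                      # stands in for the log file on disk
        self.buffer_capacity = buffer_capacity
        self.disk_writes = 0                # counts batched writes to disk

    def flush(self):
        """Append the whole buffer to the log file in one write."""
        if self.buffer:
            self.disk.extend(self.buffer)
            self.buffer.clear()
            self.disk_writes += 1

    def append(self, record):
        """Add a record to the buffer; write to disk only when the buffer is full."""
        self.buffer.append(record)
        if len(self.buffer) >= self.buffer_capacity:
            self.flush()

    def commit(self, txn):
        """Force-write: the commit record and all earlier records go to disk now."""
        self.append(("commit", txn))
        self.flush()

log = Log(buffer_capacity=4)
log.append(("start_transaction", "T1"))
log.append(("write_item", "T1", "X", 1, 2))
print(len(log.disk))      # 0, because the buffer is not yet full
log.commit("T1")
print(len(log.disk))      # 3, after the force-write at commit
```

The three records reach disk in a single write, which is the saving the log buffer is meant to provide.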
Caching (Buffering) of Disk Blocks

 Disk pages (blocks) containing the database items to be updated are cached in main memory buffers.
 A directory for the cache is used to keep track of which DB items are in the buffers.
• It is a table of <disk page address, buffer location> entries.
• The DBMS cache holds the database disk blocks, including:
o Data blocks
o Index blocks
o Log blocks
Caching (Buffering) of Disk Blocks

 When the DBMS requests an action on some item:
• It first checks the cache directory to determine whether the corresponding disk page is in the cache.
• If not, the item must be located on disk and the appropriate disk pages are copied into the cache.
• It may be necessary to replace (flush) some of the cache buffers to make space available for the new item.
Dirty bit

 A dirty bit is associated with each buffer in the cache.
 It indicates whether or not the buffer has been modified.
 The dirty bit is set to 0 when the page is first read from disk into the buffer cache.
 The dirty bit is set to 1 as soon as the corresponding buffer is modified.
 When the buffer content is replaced (flushed) from the cache, it is written back to the corresponding disk page only if the dirty bit is 1.
 The dirty bit can be included in the directory entry.
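The cache directory and dirty bit can be sketched as below. This is a simplified illustration, not a real buffer manager: the class `BufferCache`, the FIFO replacement choice, and the dict standing in for the disk are assumptions, and write-back uses in-place updating.

```python
class BufferCache:
    """Toy buffer cache: the directory maps a page address to [contents, dirty_bit]."""

    def __init__(self, disk, capacity):
        self.disk = disk            # dict: page address -> page contents
        self.capacity = capacity    # number of buffers available
        self.directory = {}         # page address -> [contents, dirty_bit]
        self.disk_writes = 0

    def _flush_one(self):
        # Replace the oldest cached page (FIFO is an assumption; dicts keep insertion order).
        victim = next(iter(self.directory))
        contents, dirty = self.directory.pop(victim)
        if dirty:                   # write back only if the buffer was modified
            self.disk[victim] = contents
            self.disk_writes += 1

    def _load(self, addr):
        if addr not in self.directory:              # check the cache directory first
            if len(self.directory) >= self.capacity:
                self._flush_one()                   # make space for the new page
            self.directory[addr] = [self.disk[addr], 0]   # dirty bit = 0 on first read
        return self.directory[addr]

    def read(self, addr):
        return self._load(addr)[0]

    def write(self, addr, contents):
        entry = self._load(addr)
        entry[0] = contents
        entry[1] = 1                # dirty bit = 1 as soon as the buffer is modified

disk = {"P1": "a", "P2": "b", "P3": "c"}
cache = BufferCache(disk, capacity=2)
cache.write("P1", "A")   # P1 is cached and marked dirty
cache.read("P2")         # P2 is cached and clean
cache.read("P3")         # P1 is replaced; it is dirty, so it is written back
cache.read("P1")         # P2 is replaced; it is clean, so no disk write occurs
print(disk["P1"], cache.disk_writes)   # A 1
```

Only one disk write occurs for two replacements, because the clean page is simply dropped.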
Cont’d

 Two strategies can be used when flushing occurs:
1. In-place updating
 Writes the buffer back to the same original disk location, overwriting the old value on disk.
2. Shadowing
 Writes the updated buffer to a different disk location.
 Multiple versions of data items can be maintained.
 The old value, called the before image (BFIM), and the new value, called the after image (AFIM), are both kept on disk, so a log is not strictly necessary for recovery.
Cont’d
Pin-unpin bit
 If a page cannot be written back to disk yet, the page is pinned, i.e. its pin-unpin bit is set to 1.
Checkpoints

 Without further measures, the recovery manager would have to examine the entire log after a failure, which is time-consuming.
 Checkpoints are a quick way to limit the amount of log that must be scanned on recovery.
 A [checkpoint] record is written into the log periodically, at the point when the system writes out to the database on disk all DBMS buffers that have been modified.
Cont’d
 Hence, transactions that have a [commit, T] entry in the log before the [checkpoint] entry do not need to have their WRITE operations redone in case of a crash.
 The recovery manager must decide at what intervals (measured in time) to take a checkpoint.
Cont’d
 Taking a checkpoint consists of the following actions:
 Suspend execution of transactions temporarily.
 Force-write all main memory buffers that have been modified to disk.
 Write a [checkpoint] record to the log, and force-write the log to disk.
 Resume executing transactions.
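The four checkpoint actions can be sketched as a single function. The `system` dict and its keys (`accepting_transactions`, `buffers`, `disk`, `log_buffer`, `log_disk`) are hypothetical names chosen for this illustration only.

```python
def take_checkpoint(system):
    """Perform the four checkpoint actions on a toy system state (a plain dict)."""
    # 1. Suspend execution of transactions temporarily.
    system["accepting_transactions"] = False

    # 2. Force-write all modified main memory buffers to disk.
    for addr, page in system["buffers"].items():
        if page["dirty"]:
            system["disk"][addr] = page["value"]
            page["dirty"] = False

    # 3. Write a [checkpoint] record to the log and force-write the log to disk.
    system["log_buffer"].append(("checkpoint",))
    system["log_disk"].extend(system["log_buffer"])
    system["log_buffer"].clear()

    # 4. Resume executing transactions.
    system["accepting_transactions"] = True

system = {
    "accepting_transactions": True,
    "buffers": {"P1": {"value": 42, "dirty": True},
                "P2": {"value": 7, "dirty": False}},
    "disk": {"P1": 1, "P2": 7},
    "log_buffer": [("commit", "T1")],
    "log_disk": [("start_transaction", "T1"), ("write_item", "T1", "P1", 1, 42)],
}
take_checkpoint(system)
print(system["disk"])          # {'P1': 42, 'P2': 7}
print(system["log_disk"][-1])  # ('checkpoint',)
```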
Database Recovery Techniques

Recovery algorithms are techniques to ensure transaction atomicity and durability despite failures.
 The recovery subsystem, using a recovery algorithm, ensures atomicity by undoing the actions of transactions that do not commit, and durability by making sure that all actions of committed transactions survive even if failures occur.
Two main approaches in the recovery process:
 Log-based recovery using the WAL protocol.
 Shadow paging.
Write-Ahead Logging (WAL)

 The Write-Ahead Logging (WAL) protocol ensures that a record (log entry) of every change to the DB is available when attempting to recover from a crash.
 WAL ensures that the BFIM of the data item is recorded in the appropriate log entry and that the log entry is flushed to disk before the BFIM is overwritten with the AFIM in the database on disk.
Cont’d

 Suppose that the BFIM of a data item on disk has been overwritten by the AFIM on disk and a crash occurs.
 Unless the BFIM was recorded in the appropriate log entry, and that log entry was flushed to disk before the BFIM was overwritten with the AFIM in the DB on disk, recovery will not be possible.
Cont’d

 Suppose a transaction made a change and committed with some of its changes not yet written to disk.
 Without a record of these changes written to disk, there would be no way to ensure that the changes of the committed transaction survive crashes.
Cont’d

 When a log record is written, it is stored in the current log block in the DBMS cache and is then written to disk as soon as is feasible.
 With Write-Ahead Logging, when a particular data block is updated, the corresponding log blocks must be written to disk before the data block itself can be written back to disk.
 IBM DB2, Informix, Microsoft SQL Server, Oracle 8, and Sybase ASE all use a WAL scheme for recovery.
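The ordering rule behind WAL can be illustrated with a small sketch. The class `WalStore` and its attributes are hypothetical, and this is not how any of the products named above implement it. As a conservative simplification, the whole log buffer is flushed before any data item is written back.

```python
class WalStore:
    """Toy store that never writes a data item to disk before its log records."""

    def __init__(self):
        self.log_buffer = []   # log records still in the DBMS cache
        self.log_disk = []     # log records already on disk
        self.cache = {}        # cached (possibly updated) data items
        self.db_disk = {}      # database on disk
        self.events = []       # order of disk writes, kept for illustration

    def write_item(self, txn, item, new_value):
        """Update the cached item and record BFIM and AFIM in the log buffer."""
        old_value = self.cache.get(item, self.db_disk.get(item))
        self.log_buffer.append(("write_item", txn, item, old_value, new_value))
        self.cache[item] = new_value

    def flush_log(self):
        if self.log_buffer:
            self.log_disk.extend(self.log_buffer)
            self.log_buffer.clear()
            self.events.append("log")

    def flush_item(self, item):
        """WAL rule: the log reaches disk before the BFIM on disk is overwritten."""
        self.flush_log()
        self.db_disk[item] = self.cache[item]
        self.events.append("data:" + item)

store = WalStore()
store.db_disk["X"] = 10
store.write_item("T1", "X", 20)
store.flush_item("X")
print(store.events)     # ['log', 'data:X']
print(store.log_disk)   # [('write_item', 'T1', 'X', 10, 20)]
```

Because the old value 10 is on the log disk before X is overwritten, an undo of T1 remains possible after a crash.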
Cont’d

 To facilitate the recovery process, the DBMS recovery subsystem may need to maintain a number of lists:
 List of active transactions: transactions started but not yet committed.
 List of committed transactions since the last checkpoint.
 List of aborted transactions since the last checkpoint.
Steal/no-steal-- Force/no-force Approaches

 No-steal approach
 A cache page updated by a transaction cannot be written to
disk before the transaction commits.
 Deferred update follows this approach.
 The pin-unpin bit indicates if a page cannot be written
back to disk
Cont’d

 Steal approach
 An updated buffer can be written to disk before the
transaction commits.
 Used when the buffer manager replaces an existing page in
the cache, that has been updated by a transaction not yet
committed, by another page requested by another transaction.
 Advantage: avoid the need for a very large buffer space.
Cont’d

 Force approach
 All pages updated by a transaction are immediately written
to disk before the transaction commits.
Cont’d

 No-force approach
 Pages updated by a transaction need not be written to disk immediately when the transaction commits.
 Advantage: an updated page of a committed transaction may still be in the buffer when another transaction needs to update it, saving I/O cost.
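How the two policy choices determine what recovery has to do can be summarized in a few lines. The function name `recovery_requirements` is hypothetical. The mapping reflects the usual textbook reasoning: with steal, uncommitted updates may be on disk, so UNDO is needed; with no-force, committed updates may be missing from disk, so REDO is needed.

```python
def recovery_requirements(steal: bool, force: bool) -> str:
    """Return which recovery actions a buffer policy combination requires."""
    # Steal: pages of uncommitted transactions may reach disk, so they may need undoing.
    undo = "UNDO" if steal else "NO-UNDO"
    # Force: all pages of a committed transaction are on disk, so nothing needs redoing.
    redo = "NO-REDO" if force else "REDO"
    return f"{undo}/{redo}"

for steal in (False, True):
    for force in (False, True):
        label = ("steal" if steal else "no-steal") + "/" + ("force" if force else "no-force")
        print(label, "->", recovery_requirements(steal, force))
```

The no-steal/no-force combination gives NO-UNDO/REDO and steal/no-force gives UNDO/REDO.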
Main Recovery Techniques

1. Deferred Update Techniques


 A transaction cannot change the database on disk until it reaches its
commit point.
 A transaction does not reach its commit point until all its update
operations are recorded in the log and the log is force-written to
disk –i.e. WAL.
 During commit, the updates are first recorded in the log and then
written to the DB.
Cont’d

 If a transaction fails before reaching its commit point, no UNDO is needed because the transaction has not affected the database on disk in any way.
 If there is a crash, it may be necessary to REDO the effects of committed transactions from the log, because their effect may not yet have been recorded in the database.
 Deferred update is also known as the NO-UNDO/REDO algorithm.
Recovery Using Deferred Update in a Multiuser Environment

Assume
- Log includes checkpoints.
- Strict 2PL concurrency control protocol is used.
1. Use two lists of transactions maintained by the system: the transactions committed since the last checkpoint, and the active transactions.
2. Apply the REDO operation to all the write_item operations of the committed transactions from the log, in the order in which they were written to the log.
Cont’d

3. For each uncommitted active transaction T, write an [abort, T] entry to the log and restart T, either automatically by the recovery process or manually by the user.

4. The NO-UNDO/REDO algorithm can be made more efficient by redoing only the last update of each item X.
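The NO-UNDO/REDO steps above might be sketched as follows. The function name `recover_deferred`, the dict used as the database, and the tuple layout of the log are assumptions. The log passed in is taken to hold only the records written since the last checkpoint.

```python
def recover_deferred(log, db):
    """NO-UNDO/REDO recovery over the log records since the last checkpoint."""
    # Step 1: build the committed and active transaction lists from the log.
    committed = {rec[1] for rec in log if rec[0] == "commit"}
    finished = committed | {rec[1] for rec in log if rec[0] == "abort"}
    active = [rec[1] for rec in log
              if rec[0] == "start_transaction" and rec[1] not in finished]

    # Step 2: redo the writes of committed transactions in log order.
    for rec in log:
        if rec[0] == "write_item" and rec[1] in committed:
            _, _, item, _old_value, new_value = rec
            db[item] = new_value

    # Step 3: active transactions never touched the disk, so there is nothing
    # to undo; just record their abort so they can be restarted.
    for txn in active:
        log.append(("abort", txn))
    return active

log = [
    ("start_transaction", "T1"),
    ("write_item", "T1", "A", 1, 2),
    ("commit", "T1"),
    ("start_transaction", "T2"),
    ("write_item", "T2", "B", 5, 6),
    ("start_transaction", "T3"),
    ("write_item", "T3", "A", 2, 3),
    ("commit", "T3"),
]
db = {"A": 1, "B": 5}            # assumed disk state at the crash
to_restart = recover_deferred(log, db)
print(db)                        # {'A': 3, 'B': 5}
print(to_restart)                # ['T2']
```

This sketch redoes every committed write in forward order, which leaves the last value in place. The optimization in step 4 would skip the earlier write to A.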
Advantages of Deferred update Techniques

 A transaction does not record any changes in the DB on disk until it commits, so it never has to be rolled back because of a transaction failure during execution.
 A transaction will never read the value of an item that is written by an uncommitted transaction; hence no cascading rollback will occur.
Drawbacks of Deferred update Techniques

 Limits the concurrent execution of transactions, because all items remain locked until the transaction reaches its commit point (due to strict 2PL).
 Requires excessive buffer space to hold all updated items until the transactions commit.
Cont’d
2. Immediate Update Techniques

 In these techniques, the DB on disk can be updated immediately, without any need to wait for the transaction to reach its commit point.
 However, the update operation must still be recorded in the log (on disk) before it is applied to the database, using the WAL protocol, so that we can recover in case of failure.
 If a transaction fails before reaching its commit point, it must be rolled back by undoing the effect of its operations on the DB.
Recovery Using Immediate Update in a Multiuser Environment

Assume
- Log includes checkpoints.
- Strict 2PL concurrency control protocol is used.

1. Use two lists of transactions maintained by the system: the transactions committed since the last checkpoint, and the active transactions.
Cont’d
2. Undo all the write_item operations of the active transactions from the log, in the reverse of the order in which they were written into the log, and write a log record [abort, T] for each such transaction T.

3. Redo the write_item operations of the committed transactions from the log, in the order in which they were written into the log.
Cont’d

4. REDO is more efficiently done by starting from the end of the log and redoing only the last update of each item X.

5. Whenever an item is redone, it is added to a list of redone items and is not redone again.
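The UNDO/REDO procedure, including the backward redo scan of steps 4 and 5, can be sketched as below. The function name `recover_immediate`, the dict used as the database, and the tuple log layout are assumptions. The log is taken to cover the records since the last checkpoint, and strict 2PL is assumed as stated above.

```python
def recover_immediate(log, db):
    """UNDO/REDO recovery over the log records since the last checkpoint."""
    # Step 1: build the committed and active transaction lists from the log.
    committed = {rec[1] for rec in log if rec[0] == "commit"}
    finished = committed | {rec[1] for rec in log if rec[0] == "abort"}
    active = {rec[1] for rec in log
              if rec[0] == "start_transaction" and rec[1] not in finished}

    # Step 2: undo writes of active transactions in reverse log order.
    for rec in reversed(log):
        if rec[0] == "write_item" and rec[1] in active:
            db[rec[2]] = rec[3]              # restore old_value

    # Steps 3 to 5: redo committed writes, scanning from the end of the log
    # and applying only the last update of each item.
    redone = set()
    for rec in reversed(log):
        if rec[0] == "write_item" and rec[1] in committed and rec[2] not in redone:
            db[rec[2]] = rec[4]              # apply new_value
            redone.add(rec[2])

    for txn in sorted(active):
        log.append(("abort", txn))           # record the rollback in the log
    return sorted(active)

log = [
    ("start_transaction", "T1"),
    ("write_item", "T1", "A", 1, 2),
    ("write_item", "T1", "A", 2, 4),
    ("commit", "T1"),
    ("start_transaction", "T2"),
    ("write_item", "T2", "B", 5, 6),
    ("write_item", "T2", "C", 7, 8),
]
db = {"A": 2, "B": 6, "C": 7}    # assumed disk state at the crash
rolled_back = recover_immediate(log, db)
print(db)                        # {'A': 4, 'B': 5, 'C': 7}
print(rolled_back)               # ['T2']
```

Item A is redone once, with its last value 4, and the earlier write of 2 is skipped.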
3. Shadow Paging

 The DB is made up of n fixed-size disk pages (blocks).
 A directory with n entries is used, where the ith entry points to the ith DB page on disk.
 All references (reads or writes) to the DB pages on disk go through the directory.
 The directory is kept in main memory if it is not too large.
 The current directory entries point to the most recent (current) DB pages on disk.
Cont’d

 When a transaction begins executing, the current directory is copied into a shadow directory and the shadow directory is saved on disk.
 During transaction execution, all updates are performed using the current directory and the shadow directory is never modified.
Cont’d
 When a write_item operation is performed
 A new copy of the modified DB page is created and the old
copy is not overwritten. Two versions of the page updated by
the transaction are kept.
 The new page is written elsewhere on some unused disk block
 The current directory entry is modified to point to the new
disk block.
 The shadow directory is not modified.
Cont’d

To recover from a failure:
o Free the modified database pages and discard the current directory.
o The state of the database before transaction execution is available through the shadow directory and is recovered by reinstating the shadow directory.
o This is a NO-UNDO/NO-REDO technique, since recovery requires neither undoing nor redoing of data items.
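Shadow paging for a single transaction can be sketched as follows. The class `ShadowPagedDB` and its attributes are hypothetical. Disk blocks are modeled as a dict, and the `commit` method (discarding the shadow directory and leaving old blocks uncollected) is an assumption of this sketch.

```python
class ShadowPagedDB:
    """Toy shadow paging: a directory maps page i to the disk block holding it."""

    def __init__(self, pages):
        self.blocks = dict(enumerate(pages))      # disk block number -> contents
        self.current = list(range(len(pages)))    # current directory
        self.shadow = None                        # shadow directory, set at begin()
        self.new_blocks = []                      # blocks allocated by the transaction
        self.next_block = len(pages)

    def begin(self):
        self.shadow = list(self.current)          # copy of the directory; never modified
        self.new_blocks = []

    def read(self, page):
        return self.blocks[self.current[page]]    # all references go through the directory

    def write(self, page, value):
        block = self.next_block                   # write the new copy to an unused block
        self.next_block += 1
        self.blocks[block] = value                # the old copy is not overwritten
        self.new_blocks.append(block)
        self.current[page] = block                # repoint the current directory only

    def commit(self):
        self.shadow = None                        # garbage collection of old blocks omitted
        self.new_blocks = []

    def recover(self):
        for block in self.new_blocks:             # free the modified pages
            del self.blocks[block]
        self.current = list(self.shadow)          # reinstate the shadow directory
        self.new_blocks = []

db = ShadowPagedDB(["a", "b", "c"])
db.begin()
db.write(1, "B")
print(db.read(1))    # B
db.recover()         # failure before commit
print(db.read(1))    # b
```

Recovery touches only the directory and the newly allocated blocks; no data item is undone or redone.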
Cont’d
Drawbacks
 The updated pages change location on disk, so it is difficult to keep related DB pages close together on disk without complex storage management strategies.
 The overhead of writing shadow directories to disk as transactions start is very high.
 Garbage collection of old pages when a transaction commits is complicated.