Steal Force

This document provides an overview of the ARIES recovery algorithm used in database systems. It discusses how ARIES uses write-ahead logging and three phases (analysis, redo, undo) to recover from system crashes while allowing for high concurrency. The analysis phase determines the necessary redo starting point and dirty pages. The redo phase replays log records to restore the database state. The undo phase removes effects of uncommitted transactions by writing compensating log records. ARIES achieves flexibility through physiological logging at the record level within database pages.

Introduction to Database Systems

CSE 444

Lectures 11-12
Transactions: Recovery (Aries)

Magda Balazinska - CSE 444, Fall 2010 1


Readings
•  Material in today’s lecture NOT in the book

•  Instead, read Sections 1, 2.2, and 3.2 of:


Michael J. Franklin. Concurrency Control and
Recovery. The Handbook of Computer
Science and Engineering, A. Tucker, ed.,
CRC Press, Boca Raton, 1997.



Transaction Management
Two parts:

•  Recovery from crashes: ACID (atomicity and durability)

•  Concurrency control: ACID (isolation)

Both operate on the buffer pool

Our current focus: recovery


Buffer Manager
[Figure: page requests (READ, WRITE) come from higher-level code (files and access methods) to the buffer pool manager. The buffer pool is in main memory and holds disk pages and free frames. INPUT and OUTPUT move pages between the buffer pool and the disk, which is handled by the disk space manager; the choice of frame is dictated by the replacement policy. Disk = collection of blocks; 1 page corresponds to 1 disk block.]
•  Data must be in RAM for DBMS to operate on it!
•  Buffer pool = table of <frame#, pageid> pairs
Buffer Manager Policies
•  STEAL or NO-STEAL
–  Can an update made by an uncommitted transaction overwrite
the most recent committed value of a data item on disk?

•  FORCE or NO-FORCE
–  Should all updates of a transaction be forced to disk before the
transaction commits?

•  Easiest for recovery: NO-STEAL/FORCE


•  Highest performance: STEAL/NO-FORCE
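One way to read the two policy choices is in terms of what recovery must then be able to do: with STEAL, uncommitted updates may already be on disk, so recovery needs undo; with NO-FORCE, committed updates may still be missing from disk, so recovery needs redo. The sketch below only encodes that reading; the function name `recovery_needs` is made up for illustration.

```python
def recovery_needs(steal: bool, force: bool) -> dict:
    """Map a buffer manager policy to the recovery actions it requires.

    steal=True  -> uncommitted data may reach disk  -> must be able to UNDO
    force=False -> committed data may not be on disk -> must be able to REDO
    """
    return {"undo": steal, "redo": not force}


# The four combinations from the slide's two choices.
for steal in (False, True):
    for force in (True, False):
        name = ("STEAL" if steal else "NO-STEAL") + "/" + ("FORCE" if force else "NO-FORCE")
        print(name, recovery_needs(steal, force))
```

Under this simplified view, NO-STEAL/FORCE needs neither undo nor redo, which is why it is the easiest for recovery, while STEAL/NO-FORCE needs both.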
Comparison Undo/Redo
•  Undo logging: Steal/Force
–  OUTPUT must be done early
–  If <COMMIT T> is seen, T definitely has written all its data to
disk (hence, don’t need to redo) – inefficient
•  Redo logging: No-Steal/No-Force
–  OUTPUT must be done late
–  If <COMMIT T> is not seen, T definitely has not written any
of its data to disk (hence there is no dirty data on disk, no
need to undo) – inflexible
•  Would like more flexibility on when to OUTPUT:
undo/redo logging (next): Steal/No-Force



Aries Recovery Algorithm
•  An UNDO/REDO log with lots of clever details



Write-Ahead Log
•  Enables the use of STEAL and NO-FORCE
•  Log: append-only file containing log records
•  For every update, commit, or abort operation
–  Write physical, logical, or physiological log record (more later)
–  Note: multiple transactions run concurrently, log records are
interleaved
•  After a system crash, use log to:
–  Redo some transactions that did commit
–  Undo other transactions that didn’t commit



Write-Ahead Log
•  All log records pertaining to a page are written to disk
before the page is overwritten on disk

•  All log records for a transaction are written to disk
before the transaction is considered committed
–  Why is this faster than FORCE policy?

•  Committed transaction: a transaction whose commit
log record has been written to disk
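The two write-ahead rules can be sketched as two checks on a "flushed up to" position in the log. This is a minimal illustration, not a real log manager: the class `Log`, its `flushed_lsn` field, and the helpers `write_page_to_disk` and `commit` are all hypothetical names, and LSNs are simply 1, 2, 3, ...

```python
class Log:
    """Append-only log with a marker for how much of it is on disk."""

    def __init__(self):
        self.records = []     # all log records, oldest first
        self.flushed_lsn = 0  # highest LSN known to be on disk

    def append(self, record) -> int:
        self.records.append(record)
        return len(self.records)  # the new record's LSN

    def flush(self, up_to_lsn: int) -> None:
        # One sequential write of the log tail up to up_to_lsn.
        self.flushed_lsn = max(self.flushed_lsn, up_to_lsn)


def write_page_to_disk(log: Log, page_lsn: int, do_write) -> None:
    # Rule 1: log records for the page reach disk before the page does.
    log.flush(page_lsn)
    do_write()


def commit(log: Log, trans_id: str) -> int:
    # Rule 2: the transaction is committed once its commit record
    # (and everything before it) is on disk.
    lsn = log.append(("COMMIT", trans_id))
    log.flush(lsn)
    return lsn
```

In this sketch a commit only forces the tail of the log, which is a sequential append, rather than every page the transaction touched; that is one common answer to the question on the slide.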



Granularity in ARIES
•  Page-oriented logging for REDO (element=one page)
•  Logical logging for UNDO (element=one record)
•  Result: logs logical operations within a page
•  This is called physiological logging
•  Why this choice?
–  Must do page-oriented REDO since cannot guarantee that db is in
an action-consistent state after crash
–  Must do logical undo because ARIES will only undo loser
transactions (this also facilitates ROLLBACKs)



ARIES Method
Recovery from a system crash is done in 3 passes:
1.  Analysis pass
–  Figure out what was going on at time of crash
–  List of dirty pages and active transactions
2.  Redo pass (repeating history principle)
–  Redo all operations, even for transactions that will not commit
–  Get back to state at the moment of the crash
3.  Undo pass
–  Remove effects of all uncommitted transactions
–  Log changes during undo in case of another crash during undo



ARIES Method Illustration
[Figure from [Franklin97] illustrating the ARIES method; annotation in the figure: “May be in reverse order”.]
ARIES Data Structures
•  Active Transactions Table
–  Lists all running transactions (active transactions)
–  For each txn: lastLSN = most recent update by transaction
•  Dirty Page Table
–  Lists all dirty pages
–  For each dirty page: recoveryLSN (recLSN)= first LSN that
caused page to become dirty
•  Write Ahead Log contains log records
–  LSN, prevLSN = previous LSN for same transaction
–  other attributes



ARIES Data Structures

Dirty pages
pageID  recLSN
P5      102
P6      103
P7      101

Log
LSN  prevLSN  transID  pageID  Log entry
101  -        T100     P7
102  -        T200     P5
103  102      T200     P6
104  101      T100     P5

Active transactions
transID  lastLSN
T100     104
T200     103

Buffer Pool
page  PageLSN
P5    104
P6    103
P7    101
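The example state on this slide can be written down as plain Python data, which also shows how lastLSN and prevLSN chain a transaction's records together. This is only a sketch: the names `LogRecord`, `LOG`, `ACTIVE_TRANSACTIONS`, `DIRTY_PAGES`, and `transaction_history` are chosen for illustration, and the log entry payload is left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LogRecord:
    lsn: int
    prev_lsn: Optional[int]  # previous LSN of the same transaction, if any
    trans_id: str
    page_id: str


# The four log records from the slide, indexed by LSN.
LOG = {r.lsn: r for r in [
    LogRecord(101, None, "T100", "P7"),
    LogRecord(102, None, "T200", "P5"),
    LogRecord(103, 102, "T200", "P6"),
    LogRecord(104, 101, "T100", "P5"),
]}

ACTIVE_TRANSACTIONS = {"T100": 104, "T200": 103}   # transID -> lastLSN
DIRTY_PAGES = {"P5": 102, "P6": 103, "P7": 101}    # pageID  -> recLSN


def transaction_history(trans_id: str) -> list:
    """Follow lastLSN, then prevLSN links, newest record first."""
    chain = []
    lsn = ACTIVE_TRANSACTIONS[trans_id]
    while lsn is not None:
        chain.append(lsn)
        lsn = LOG[lsn].prev_lsn
    return chain
```

Walking the chain backwards like this is the access pattern the undo pass relies on.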
The LSN
•  Each log entry receives a unique Log
Sequence Number, LSN
–  The LSN is written in the log entry
–  Entries belonging to the same transaction are
chained in the log via prevLSN
–  LSNs help us find the end of a circular log file:

After crash, log file = (22, 23, 24, 25, 26, 18, 19, 20, 21)
Where is the end of the log? Right after 26, just before 18: 26 is the most recent record and 18 the oldest one still in the file.
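Because LSNs only increase as records are appended, the newest record in a circular log file is the one just before the position where the LSN drops. A minimal sketch of that search, assuming the file can be read as a list of LSNs in physical order (the function name `end_of_log` is made up):

```python
def end_of_log(lsns: list) -> int:
    """Return the index of the most recent record in a circular log.

    Scans for the first position where the next LSN is smaller than the
    current one; if there is no such drop, the log has not wrapped and
    the last position is the newest.
    """
    for i in range(len(lsns) - 1):
        if lsns[i + 1] < lsns[i]:
            return i
    return len(lsns) - 1


log_file = [22, 23, 24, 25, 26, 18, 19, 20, 21]
i = end_of_log(log_file)
print(i, log_file[i])  # index and LSN of the newest record
```

For the slide's example the drop is between 26 and 18, so 26 is the last record written before the crash.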



ARIES Method Details
•  Steps under normal operations
–  Add log record
–  Update transactions table
–  Update dirty page table
–  Update pageLSN
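The four steps can be sketched as one function that is called for every update. This is an illustration under simplifying assumptions: the state is a plain dictionary, the record payload is omitted, and the names `new_state` and `log_update` are hypothetical.

```python
def new_state(first_lsn: int) -> dict:
    return {"next_lsn": first_lsn, "log": [], "active": {}, "dirty": {}, "page_lsn": {}}


def log_update(state: dict, trans_id: str, page_id: str) -> int:
    lsn = state["next_lsn"]
    state["next_lsn"] += 1
    # 1. Add log record; prevLSN is the transaction's previous lastLSN (or None).
    prev_lsn = state["active"].get(trans_id)
    state["log"].append((lsn, prev_lsn, trans_id, page_id))
    # 2. Update transactions table: lastLSN = this record.
    state["active"][trans_id] = lsn
    # 3. Update dirty page table: recLSN is set only when the page first becomes dirty.
    state["dirty"].setdefault(page_id, lsn)
    # 4. Update pageLSN of the page in the buffer pool.
    state["page_lsn"][page_id] = lsn
    return lsn


state = new_state(101)
for trans_id, page_id in [("T100", "P7"), ("T200", "P5"), ("T200", "P6"), ("T100", "P5")]:
    log_update(state, trans_id, page_id)
```

Replaying these four updates yields the same tables as the ARIES Data Structures example: P5 keeps recLSN 102 from its first update while its pageLSN moves on to 104.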



ARIES Method
•  More details and long example on the board

•  Please TAKE NOTES!



Checkpoints
•  Write into the log
–  Entire active transactions table
–  Entire dirty page table

•  Very fast! No waiting, no END CKPT

•  But, effectiveness is limited by dirty pages
–  There is a background process that periodically
sends dirty pages to disk
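A checkpoint of this kind amounts to appending a snapshot of the two tables to the log, without touching any data page. The sketch below assumes the log is a Python list and the tables are dictionaries; `write_checkpoint` and the record layout are hypothetical.

```python
def write_checkpoint(log: list, active_transactions: dict, dirty_pages: dict) -> int:
    """Append a checkpoint record and return its position in the log."""
    record = {
        "type": "CHECKPOINT",
        # Copies, so later changes to the live tables do not alter the record.
        "active_transactions": dict(active_transactions),
        "dirty_pages": dict(dirty_pages),
    }
    log.append(record)
    return len(log) - 1


log = []
active = {"T100": 104, "T200": 103}
dirty = {"P5": 102, "P6": 103, "P7": 101}
position = write_checkpoint(log, active, dirty)
```

Nothing is flushed here, so a page that has been dirty for a long time keeps an old recLSN in the snapshot and recovery still has to start redo far back in the log; that is the sense in which dirty pages limit the checkpoint's effectiveness.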
1. Analysis Phase
•  Goal
–  Determine point in log where to start REDO
–  Determine set of dirty pages when crashed
•  Conservative estimate of dirty pages
–  Identify active transactions when crashed

•  Approach
–  Rebuild active transactions table and dirty pages table
–  Reprocess the log from the beginning (or checkpoint)
•  Only update the two data structures
–  Compute: firstLSN = smallest of all recoveryLSN
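The approach above can be sketched as a single forward scan. Several things here are assumptions made for illustration: log records are dictionaries with `type`, `lsn`, `trans`, and `page` keys, only `UPDATE` and `COMMIT` records exist, a `COMMIT` simply drops the transaction from the table, and `log` holds the records after the checkpoint (or the whole log when there is none).

```python
def analysis(log: list, checkpoint: dict = None):
    """Rebuild both tables and compute firstLSN, touching no data pages."""
    active = dict(checkpoint["active"]) if checkpoint else {}  # transID -> lastLSN
    dirty = dict(checkpoint["dirty"]) if checkpoint else {}    # pageID  -> recLSN
    for rec in log:
        if rec["type"] == "COMMIT":
            active.pop(rec["trans"], None)
        else:
            active[rec["trans"]] = rec["lsn"]
            # Keep the earliest LSN that dirtied the page.
            dirty.setdefault(rec["page"], rec["lsn"])
    first_lsn = min(dirty.values(), default=None)
    return active, dirty, first_lsn


log = [
    {"type": "UPDATE", "lsn": 101, "trans": "T100", "page": "P7"},
    {"type": "UPDATE", "lsn": 102, "trans": "T200", "page": "P5"},
    {"type": "UPDATE", "lsn": 103, "trans": "T200", "page": "P6"},
    {"type": "UPDATE", "lsn": 104, "trans": "T100", "page": "P5"},
    {"type": "COMMIT", "lsn": 105, "trans": "T200"},
]
active, dirty, first_lsn = analysis(log)
```

The dirty page set is conservative because the scan cannot see which pages were written to disk before the crash; the redo pass sorts that out by comparing LSNs.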



1. Analysis Phase
[Figure: timeline of the log marking firstLSN, the checkpoint, and the crash; labels in the figure: replay history, dirty pages, active transactions.]



2. Redo Phase
Main principle: replay history
•  Process Log forward, starting from firstLSN
•  Read every log record, sequentially
•  Redo actions are not recorded in the log
•  Needs the Dirty Page Table



2. Redo Phase: Details
For each log entry, with log sequence number LSN:
•  If the affected page is not in the Dirty Page Table, then
do not update
•  If the page’s recoveryLSN > LSN, then no update
•  Read page from disk;
if pageLSN >= LSN, then no update (the change is already on the page)
•  Otherwise perform update
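The redo test can be sketched as a function that runs the checks in order, from cheapest to most expensive. The names `should_redo` and `read_page_lsn` are hypothetical; `read_page_lsn` stands in for reading the page from disk and returning its pageLSN. The sketch treats a pageLSN equal to the record's LSN as "already applied", since that update is the last one the page has seen.

```python
def should_redo(lsn: int, page_id: str, dirty_pages: dict, read_page_lsn) -> bool:
    # Check 1: the page was not dirty at the crash, so it is up to date on disk.
    if page_id not in dirty_pages:
        return False
    # Check 2: the page became dirty only after this record.
    if dirty_pages[page_id] > lsn:
        return False
    # Check 3: only now pay for the disk read and compare with pageLSN.
    if read_page_lsn(page_id) >= lsn:
        return False
    return True


dirty_pages = {"P5": 102, "P6": 103}          # pageID -> recLSN
page_lsn_on_disk = {"P5": 102, "P6": 0, "P7": 101}
print(should_redo(104, "P5", dirty_pages, page_lsn_on_disk.get))
```

The first two checks use only the Dirty Page Table, which is why the redo pass needs it: they avoid reading pages whose updates are known to be on disk.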



3. Undo Phase
Main principle: “logical” undo
•  Start from the end of the log, move
backwards
•  Read only affected log entries
•  Undo actions are written in the Log as special
entries: CLR (Compensating Log Records)
•  CLRs are redone, but never undone



3. Undo Phase: Details
•  “Loser transactions” = uncommitted transactions in
Active Transactions Table
•  ToUndo = set of lastLSN of loser transactions
•  While ToUndo not empty:
–  Choose most recent (largest) LSN in ToUndo
–  If LSN = regular record: undo; write a CLR where
CLR.undoNextLSN = LSN.prevLSN; if LSN.prevLSN is not null,
insert it in ToUndo; otherwise, write <END TRANSACTION> in log
–  If LSN = CLR record: (don’t undo!)
if CLR.undoNextLSN is not null, insert it in ToUndo;
otherwise, write <END TRANSACTION> in log
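The ToUndo loop can be sketched as follows. The representation is an assumption made for illustration: the log is a dictionary from LSN to a record with `type`, `trans`, and either `prev` (regular records) or `undo_next` (CLRs), the actual page change is not performed, and the function name `undo_pass` is made up.

```python
def undo_pass(log: dict, losers: dict) -> list:
    """Undo loser transactions; returns the LSNs undone, in order.

    log:    LSN -> record; CLR and END records are added to it.
    losers: transID -> lastLSN of each loser transaction.
    """
    to_undo = set(losers.values())
    next_lsn = max(log) + 1
    undone = []
    while to_undo:
        lsn = max(to_undo)          # most recent record first
        to_undo.remove(lsn)
        rec = log[lsn]
        if rec["type"] == "CLR":
            # Never undo a CLR; just follow its undoNextLSN.
            next_to_undo = rec["undo_next"]
        else:
            undone.append(lsn)      # the logical undo would happen here
            next_to_undo = rec["prev"]
            log[next_lsn] = {"type": "CLR", "trans": rec["trans"], "undo_next": next_to_undo}
            next_lsn += 1
        if next_to_undo is not None:
            to_undo.add(next_to_undo)
        else:
            log[next_lsn] = {"type": "END", "trans": rec["trans"]}
            next_lsn += 1
    return undone
```

If the system crashes again during undo, the CLRs already written point past the records they compensated, so a second undo pass skips those records instead of undoing them twice.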



Handling Crashes during Undo

[Figure 4 from Franklin97]

