Database Management System Chapter 3
Database Storage
• How does a disk-oriented DBMS manage its memory and move data back
and forth from disk?
Database Storage
• Spatial Control (space):
• Where to write pages on disk.
• The goal is to keep pages that are used together often as physically close
together as possible on disk.
• Temporal Control (Time):
• When to read pages into memory, and when to write them to disk.
• The goal is to minimize the number of stalls from having to read data from disk.
Topics
• Buffer Pool Manager
• Replacement Policies
• Allocation Policies
Buffer Pool
• Memory region organized as an array of fixed-size pages.
• An array entry is called a frame.
• When the DBMS requests a page, an exact copy is placed into one of
these frames.
Buffer Pool Meta-Data
• The page table keeps track of pages that are currently in memory.
• Also maintains additional meta-data per page:
• Dirty Flag
• Pin/Reference Counter
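The pieces above (frames, page table, dirty flag, pin counter) can be put together in a minimal sketch. Everything named here is an assumption made for illustration: the class and method names (`BufferPool`, `fetch_page`, `unpin_page`), the 4 KB `PAGE_SIZE`, and the use of a plain dictionary as a stand-in for the database file. Victim selection is deliberately naive (first unpinned frame); choosing a victim well is the job of a replacement policy.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

PAGE_SIZE = 4096  # assumed page size in bytes


@dataclass
class Frame:
    page_id: Optional[int] = None  # page currently held, or None if empty
    data: bytearray = field(default_factory=lambda: bytearray(PAGE_SIZE))
    dirty: bool = False            # True if the copy was modified in memory
    pin_count: int = 0             # number of users currently holding the page


class BufferPool:
    def __init__(self, num_frames: int, disk: Dict[int, bytes]):
        self.frames = [Frame() for _ in range(num_frames)]
        self.page_table: Dict[int, int] = {}  # page id -> frame index
        self.disk = disk                      # stand-in for the database file

    def _find_victim(self) -> int:
        # Prefer an empty frame, otherwise take the first unpinned one.
        for i, frame in enumerate(self.frames):
            if frame.page_id is None:
                return i
        for i, frame in enumerate(self.frames):
            if frame.pin_count == 0:
                return i
        raise RuntimeError("all frames are pinned")

    def _evict(self, idx: int) -> None:
        frame = self.frames[idx]
        if frame.page_id is None:
            return
        if frame.dirty:
            # A dirty page must be written back before its frame is reused.
            self.disk[frame.page_id] = bytes(frame.data)
        del self.page_table[frame.page_id]
        frame.page_id, frame.dirty = None, False

    def fetch_page(self, page_id: int) -> Frame:
        """Return the frame holding page_id, pinned for the caller."""
        idx = self.page_table.get(page_id)
        if idx is None:
            idx = self._find_victim()
            self._evict(idx)
            frame = self.frames[idx]
            frame.page_id = page_id
            frame.data[:] = self.disk[page_id]  # exact copy of the on-disk page
            self.page_table[page_id] = idx
        frame = self.frames[idx]
        frame.pin_count += 1
        return frame

    def unpin_page(self, page_id: int, is_dirty: bool = False) -> None:
        frame = self.frames[self.page_table[page_id]]
        frame.pin_count -= 1
        frame.dirty = frame.dirty or is_dirty
```

A caller is expected to pair every `fetch_page` with an `unpin_page`; a pinned page is never chosen as a victim, and a dirty page is written back before its frame is reused.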
Page Table vs. Page Directory
• The page directory is the mapping from page ids to page locations in
the database files.
• All changes must be recorded on disk so that the DBMS can find the pages again on restart.
• The page table is the mapping from page ids to a copy of the page in
buffer pool frames.
• This is an in-memory data structure that does not need to be stored on disk.
Why?
OS Page Cache
• Most disk operations go through the OS API.
• Unless you tell it not to, the OS maintains its own filesystem cache.
• Most DBMSs use direct I/O (O_DIRECT) to bypass the OS's cache:
• Redundant copies of pages.
• Different eviction policies.
Buffer Replacement Policies
• When the DBMS needs to free up a frame to make room for a new
page, it must decide which page to evict from the buffer pool.
• Goals:
• Correctness
• Accuracy
• Speed
• Meta-data overhead
FIFO
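• Keep the pages in a queue and evict them in first-in, first-out order: the page
that was loaded earliest is evicted first, regardless of how often it is used.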
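A minimal sketch of FIFO replacement, assuming a hypothetical `FIFOReplacer` class that only tracks page ids (no page contents, no pinning). `access` returns the evicted page id, or `None` when nothing had to be evicted.

```python
from collections import deque


class FIFOReplacer:
    def __init__(self, capacity):
        self.capacity = capacity
        self.queue = deque()   # page ids in arrival order, oldest on the left
        self.resident = set()  # same page ids, for O(1) membership checks

    def access(self, page_id):
        """Record an access; return the evicted page id or None."""
        if page_id in self.resident:
            return None        # a hit does not change the FIFO order
        victim = None
        if len(self.queue) == self.capacity:
            victim = self.queue.popleft()  # oldest arrival leaves first
            self.resident.remove(victim)
        self.queue.append(page_id)
        self.resident.add(page_id)
        return victim
```

With capacity 3 and the trace 1, 2, 3, 1, 4, the access to page 4 evicts page 1 even though page 1 was just used, because FIFO looks only at arrival order.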
Least-Recently Used (LRU)
• Maintain a timestamp of when each page was last accessed.
• When the DBMS needs to evict a page, select the one with the oldest
timestamp.
• Keep the pages in sorted order to reduce the search time on eviction.
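A minimal sketch of LRU, assuming a hypothetical `LRUReplacer` class that tracks page ids only. Instead of storing explicit timestamps, it relies on the ordering of an `OrderedDict`: moving a page to the end on every access keeps the pages sorted from least to most recently used, so the victim is always at the front.

```python
from collections import OrderedDict


class LRUReplacer:
    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()  # page id -> None, least recently used first

    def access(self, page_id):
        """Record an access; return the evicted page id or None."""
        if page_id in self.pages:
            self.pages.move_to_end(page_id)  # now the most recently used
            return None
        victim = None
        if len(self.pages) == self.capacity:
            victim, _ = self.pages.popitem(last=False)  # oldest access
        self.pages[page_id] = None
        return victim
```

With capacity 3 and the trace 1, 2, 3, 1, 4, the access to page 4 evicts page 2, because the second access to page 1 made it more recent than page 2.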
FIFO vs. LRU
Clock
• Approximation of LRU without needing a separate timestamp per
page.
• Each page has a reference bit.
• When a page is accessed, its bit is set to 1.
• Organize the pages in a circular buffer with a "clock hand":
• As the hand sweeps, check whether each page's bit is set to 1.
• If yes, set it to 0 and move on. If no, evict the page.
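The sweep can be sketched as follows, assuming a hypothetical `ClockReplacer` class that tracks page ids only. One design choice here is an assumption rather than part of the policy as stated: a newly loaded page starts with its reference bit set to 1, since loading it counts as an access.

```python
class ClockReplacer:
    def __init__(self, capacity):
        self.capacity = capacity
        self.slots = [None] * capacity  # page id held by each frame
        self.ref = [0] * capacity       # reference bit per frame
        self.index = {}                 # page id -> frame position
        self.hand = 0                   # current position of the clock hand

    def access(self, page_id):
        """Record an access; return the evicted page id or None."""
        if page_id in self.index:
            self.ref[self.index[page_id]] = 1
            return None
        # Sweep until an empty frame or a frame with a cleared bit is found.
        while True:
            slot = self.hand
            self.hand = (self.hand + 1) % self.capacity
            if self.slots[slot] is None or self.ref[slot] == 0:
                break
            self.ref[slot] = 0          # give the page a second chance
        victim = self.slots[slot]
        if victim is not None:
            del self.index[victim]
        self.slots[slot] = page_id
        self.ref[slot] = 1              # assumption: a load counts as an access
        self.index[page_id] = slot
        return victim
```

With capacity 3 and the trace 1, 2, 3, 4, 2, 5: loading page 4 clears every bit and evicts page 1; the later hit on page 2 sets its bit again, so loading page 5 skips page 2 and evicts page 3 instead.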
Problems
• LRU and CLOCK replacement policies are susceptible to sequential
flooding.
• A query performs a sequential scan that reads every page.
• This pollutes the buffer pool with pages that are read once and then never
again.
• In such a workload, the most recently used page is actually the least needed one.
• Ex:
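A small simulation can illustrate the effect. The trace, the buffer size, and the helper name `run_lru` are all made up for this sketch: three hot pages (0, 1, 2) are accessed repeatedly, a sequential scan then reads ten other pages once each, and the hot pages are accessed again.

```python
from collections import OrderedDict


def run_lru(capacity, trace):
    """Replay a trace of page ids through an LRU buffer; return hit flags."""
    pool = OrderedDict()  # page id -> None, least recently used first
    hits = []
    for page_id in trace:
        if page_id in pool:
            pool.move_to_end(page_id)
            hits.append(True)
        else:
            if len(pool) == capacity:
                pool.popitem(last=False)  # evict the least recently used page
            pool[page_id] = None
            hits.append(False)
    return hits


hot = [0, 1, 2]
scan = list(range(100, 110))     # pages read once by the sequential scan
trace = hot * 3 + scan + hot
hits = run_lru(4, trace)

hits_before_scan = hits[3:9]     # second and third pass over the hot pages
hits_after_scan = hits[-3:]      # hot pages accessed again after the scan
print(hits_before_scan, hits_after_scan)
```

Before the scan every repeated access to a hot page is a hit. After the scan all three hot pages miss, because the scan pages, which are never read again, pushed them out of the four-frame buffer.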
Better Policy: LRU-K
• Track the timestamps of the last K references to each page and
compute the interval between subsequent accesses.
• The DBMS then uses this history to estimate the next time that page
is going to be accessed.
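One common way to turn this history into an eviction decision is the backward K-distance: evict the page whose K-th most recent access lies furthest in the past, and treat pages with fewer than K recorded accesses as the first candidates. The sketch below assumes that formulation; the class name `LRUKReplacer` and its logical clock are illustrative, and unlike a full implementation it forgets a page's history as soon as the page is evicted.

```python
from collections import deque


class LRUKReplacer:
    def __init__(self, capacity, k=2):
        self.capacity = capacity
        self.k = k
        self.history = {}  # page id -> last k access times, oldest first
        self.now = 0       # logical clock, incremented on every access

    def _eviction_key(self, page_id):
        times = self.history[page_id]
        # Pages with fewer than k accesses sort first (key -1); among equals,
        # the page with the oldest recorded access is preferred as victim.
        kth_recent = times[0] if len(times) == self.k else -1
        return (kth_recent, times[0])

    def access(self, page_id):
        """Record an access; return the evicted page id or None."""
        victim = None
        if page_id not in self.history:
            if len(self.history) == self.capacity:
                victim = min(self.history, key=self._eviction_key)
                del self.history[victim]
            self.history[page_id] = deque(maxlen=self.k)
        self.history[page_id].append(self.now)
        self.now += 1
        return victim
```

With capacity 3, K = 2, and the trace 1, 2, 1, 2, 10, 11, 12, the scan pages 10 and 11 are evicted in turn while pages 1 and 2 stay resident, since only pages 1 and 2 have two recorded accesses. Plain LRU would have evicted page 1 when page 11 arrived.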
Allocation Policies
• Global Policies:
• Make decisions for the buffer pool as a whole, considering all active txns together.
• Local Policies:
• Allocate frames to a specific txn without considering the behavior of
concurrent txns.
• Still need to support sharing pages.