Lecture 14
Overview
• File Organization
• Files of records
• Page Formats
• Record Formats
• Indexing
• Memory hierarchy
• Buffer management
Files
• FILE: A collection of pages, each containing a collection of records.
• Must support:
• insert/delete/modify record
• read a particular record (specified using record id)
• scan all records (possibly with some conditions on the records to be retrieved)
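The three required operations can be sketched as a toy in-memory file. This is only an illustration: the class name `RecordFile`, the `slots_per_page` parameter, and the use of Python lists as pages are assumptions made for the sketch, not part of the lecture.

```python
class RecordFile:
    """Toy in-memory file: a list of pages, each page a list of record slots."""

    def __init__(self, slots_per_page=4):
        self.slots_per_page = slots_per_page
        self.pages = []  # each page is a list of records (None = empty slot)

    def insert(self, record):
        # Reuse the first empty slot; otherwise append a new page.
        for page_id, page in enumerate(self.pages):
            for slot, existing in enumerate(page):
                if existing is None:
                    page[slot] = record
                    return (page_id, slot)
        self.pages.append([record] + [None] * (self.slots_per_page - 1))
        return (len(self.pages) - 1, 0)

    def read(self, rid):
        # Read a particular record, specified by its record id.
        page_id, slot = rid
        return self.pages[page_id][slot]

    def modify(self, rid, record):
        page_id, slot = rid
        self.pages[page_id][slot] = record

    def delete(self, rid):
        page_id, slot = rid
        self.pages[page_id][slot] = None

    def scan(self, predicate=lambda record: True):
        # Scan all records, optionally filtered by a condition.
        for page_id, page in enumerate(self.pages):
            for slot, record in enumerate(page):
                if record is not None and predicate(record):
                    yield (page_id, slot), record
```

Here the record id is a (page number, slot number) pair, so `read` is a direct lookup while `scan` touches every page.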
Alternative File Organization
• Several options (w/ trade-offs):
• Heap files: Suitable when typical access is a file scan retrieving all records.
• Sorted Files: (later)
• Index File Organizations: (later)
Heap File using Lists
• The header page id and heap file name must be stored someplace.
• Each page contains 2 'pointers' plus data.
Any problems?
Heap File Using a Page Directory
✔ insertions
✔ deletions
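One way to see why a directory helps with insertions and deletions: if each directory entry records how much free space its page has, finding room for a new record means reading the directory rather than walking a list of data pages. The sketch below assumes exactly that; the class `PageDirectory`, its `page_size` parameter, and the per-page free-byte counter are hypothetical choices for illustration.

```python
class PageDirectory:
    """Toy directory: one entry per data page, recording its free bytes."""

    def __init__(self, page_size=4096):
        self.page_size = page_size
        self.free_bytes = []  # index = page id, value = free bytes on that page

    def find_page_for(self, record_size):
        """Return a page id with enough room, allocating a new page if needed."""
        for page_id, free in enumerate(self.free_bytes):
            if free >= record_size:
                return page_id
        self.free_bytes.append(self.page_size)
        return len(self.free_bytes) - 1

    def insert(self, record_size):
        if record_size > self.page_size:
            raise ValueError("record larger than a page")
        page_id = self.find_page_for(record_size)
        self.free_bytes[page_id] -= record_size
        return page_id

    def delete(self, page_id, record_size):
        # Deleting a record just gives its bytes back to the page's entry.
        self.free_bytes[page_id] += record_size
```

Only the directory is consulted on insert and delete; the data pages themselves are not scanned.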
Variable Length Records
• Slotted Page
• pack them
• keep ptrs to them
• rec-id = <page-id, slot#>
• mark start of free space
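A minimal sketch of a slotted page, under simplifying assumptions: the slot directory is kept as a Python list rather than inside the page bytes, and the names `SlottedPage`, `free_start`, and `compact` are made up for this example.

```python
class SlottedPage:
    def __init__(self, size=4096):
        self.data = bytearray(size)
        self.slots = []      # slot# -> (offset, length), or None if deleted
        self.free_start = 0  # marks the start of free space

    def insert(self, record: bytes) -> int:
        if self.free_start + len(record) > len(self.data):
            raise ValueError("page full")
        offset = self.free_start
        self.data[offset:offset + len(record)] = record
        self.free_start += len(record)
        # Reuse a deleted slot number if one exists.
        for slot, entry in enumerate(self.slots):
            if entry is None:
                self.slots[slot] = (offset, len(record))
                return slot
        self.slots.append((offset, len(record)))
        return len(self.slots) - 1

    def read(self, slot: int) -> bytes:
        entry = self.slots[slot]
        if entry is None:
            raise KeyError(f"slot {slot} is empty")
        offset, length = entry
        return bytes(self.data[offset:offset + length])

    def delete(self, slot: int) -> None:
        self.slots[slot] = None
        self.compact()

    def compact(self) -> None:
        """Pack live records to the front; slot numbers do not change."""
        packed = bytearray(len(self.data))
        pos = 0
        for slot, entry in enumerate(self.slots):
            if entry is None:
                continue
            offset, length = entry
            packed[pos:pos + length] = self.data[offset:offset + length]
            self.slots[slot] = (pos, length)
            pos += length
        self.data = packed
        self.free_start = pos
```

Because callers hold <page-id, slot#> rather than a byte offset, records can be moved within the page during packing without invalidating any rec-id.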
Record Formats: Fixed Length
• Information about field types same for all records in a file; stored in
system catalogs.
• Finding i’th field done via arithmetic.
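The arithmetic can be sketched with Python's `struct` module. The three-field schema in `FIELD_FORMATS` is a made-up stand-in for what the system catalog would store; `field_offset` and `read_field` are hypothetical helper names.

```python
import struct

# Hypothetical catalog entry: field types shared by every record in the file.
# "<" in the formats below means little-endian with no padding between fields.
FIELD_FORMATS = ["i", "10s", "d"]  # 4-byte int, 10-byte string, 8-byte float

def field_offset(i):
    """Offset of field i = sum of the sizes of the fields before it."""
    return sum(struct.calcsize("<" + fmt) for fmt in FIELD_FORMATS[:i])

def read_field(record, i):
    """Extract field i directly, without looking at the other fields."""
    (value,) = struct.unpack_from("<" + FIELD_FORMATS[i], record, field_offset(i))
    return value

record = struct.pack("<i10sd", 42, b"widget", 9.5)
```

With this schema, `field_offset(2)` is 4 + 10 = 14, so the third field is read from byte 14 of every record.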
Record Formats
• Fixed length records: straightforward - store info in catalog
• Variable length records: encode the length of each field?
• Store the length
• Use delimiter
Variable Length records
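• Two alternative formats (# fields is fixed): separate fields with a delimiter, or store field lengths/offsets in the record.
• Storing lengths/offsets is the more popular choice.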
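Both encodings of a variable-length record can be sketched in a few lines. The `$` delimiter, the 2-byte offset width, and the function names are assumptions chosen for the example, not a layout taken from the slides.

```python
import struct

def encode_delimited(fields, delim=b"$"):
    # Assumes the delimiter byte never occurs inside a field.
    return delim.join(fields) + delim

def decode_delimited(record, delim=b"$"):
    # Finding field i requires scanning past all earlier delimiters.
    return record.split(delim)[:-1]

def encode_with_offsets(fields):
    # Header: one 2-byte end offset per field, followed by the field bytes.
    pos = 2 * len(fields)
    ends = []
    for field in fields:
        pos += len(field)
        ends.append(pos)
    return struct.pack(f"<{len(fields)}H", *ends) + b"".join(fields)

def get_field(record, i, num_fields):
    # Direct access: field i lies between the end of field i-1 and its own end.
    ends = struct.unpack_from(f"<{num_fields}H", record, 0)
    start = 2 * num_fields if i == 0 else ends[i - 1]
    return record[start:ends[i]]
```

The offset version reaches any field in constant time and represents an empty field as two equal offsets, at the cost of a small header per record.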
The Storage Hierarchy
• Main memory (RAM) for currently
used data.
• Disk for the main database
(secondary storage).
• Tapes for archiving older versions of
the data (tertiary storage).
Motivation of Initial RDBMS architecture: Disk
is relatively VERY slow
• READ: disk-> main memory (RAM)
• WRITE: main memory (RAM) -> disk
• Both are high-cost operations, relative to in-memory operations, so
must be planned carefully
Rules of Thumb
• Memory access much faster than disk I/O (~ 1000x)
• “Sequential” I/O faster than “random” I/O (~ 10x)
• seek time: moving arms to position disk head on track
• rotational delay: waiting for block to rotate under head
• seek time and rotational delay dominate
• transfer time: actually moving data to/from disk surface
• SSD?
• Sequential and random I/O perform similarly
• Reading is much faster than writing
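For a spinning disk, the three cost components above combine into a simple estimate. The function `read_time_ms` and its default timings are invented for illustration and do not describe any particular drive; the actual sequential/random ratio depends entirely on the numbers plugged in.

```python
def read_time_ms(num_blocks, sequential,
                 seek_ms=4.0, rotation_ms=2.0, transfer_ms=0.1):
    """Rough time to read num_blocks blocks (all timings are assumed values)."""
    if sequential:
        # Position the head once, then stream the blocks.
        return seek_ms + rotation_ms + num_blocks * transfer_ms
    # Random access pays seek and rotational delay for every block.
    return num_blocks * (seek_ms + rotation_ms + transfer_ms)
```

Comparing `read_time_ms(1000, True)` with `read_time_ms(1000, False)` shows the random pattern coming out far slower, because positioning cost is paid once in the first case and 1000 times in the second.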
Disk Arrays: RAID (Redundant Array of
Inexpensive Disks)
OS virtual memory
OS file system
Can we leverage OS for DB storage
management?
• Unfortunately, OS often gets in the way of DBMS
• DBMS needs to do things “its own way”
• Control over buffer replacement policy
• LRU not always best (sometimes worst!)
• Control over flushing data to disk
• Write-ahead logging (WAL) protocol requires flushing log entries to disk
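To illustrate what "control over flushing" means at the system-call level, here is a small sketch that appends a log record and forces it out of the OS cache before returning. The function name `append_log_record` and the idea of a single append-only log file are assumptions for the example; `flush` and `os.fsync` are standard Python calls.

```python
import os

def append_log_record(log_path, record: bytes) -> None:
    """Append a log record and force it to storage before returning."""
    with open(log_path, "ab") as log:
        log.write(record)
        log.flush()             # push Python's user-space buffer to the OS
        os.fsync(log.fileno())  # ask the OS to write its cached data to the device
```

Without the explicit `fsync`, the OS decides when the bytes actually reach the disk, which is exactly the decision a DBMS wants to make itself for log entries.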
Organize Disk Space into Pages
• A table is stored as one or more files; a file contains one or more
pages
• Higher levels call upon this layer to:
• allocate/de-allocate a page
• read/write a page
• Best if requested pages are stored sequentially on disk! Higher levels
don’t need to know if/how this is done, nor how free space is
managed.
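The interface this layer offers to higher levels can be sketched over a single OS file. Everything here is a simplifying assumption: the class name `DiskSpaceManager`, the 4096-byte `PAGE_SIZE`, and the in-memory free list (a real system would persist it).

```python
import os

PAGE_SIZE = 4096  # assumed page size for this sketch

class DiskSpaceManager:
    def __init__(self, path):
        mode = "r+b" if os.path.exists(path) else "w+b"
        self.file = open(path, mode)
        self.free_pages = []  # deallocated page ids (kept in memory only here)

    def allocate_page(self) -> int:
        if self.free_pages:
            return self.free_pages.pop()
        # No reusable page: grow the file by one zero-filled page.
        self.file.seek(0, os.SEEK_END)
        page_id = self.file.tell() // PAGE_SIZE
        self.file.write(bytes(PAGE_SIZE))
        return page_id

    def deallocate_page(self, page_id: int) -> None:
        self.free_pages.append(page_id)

    def write_page(self, page_id: int, data: bytes) -> None:
        if len(data) != PAGE_SIZE:
            raise ValueError("data must be exactly one page")
        self.file.seek(page_id * PAGE_SIZE)
        self.file.write(data)

    def read_page(self, page_id: int) -> bytes:
        self.file.seek(page_id * PAGE_SIZE)
        return self.file.read(PAGE_SIZE)

    def close(self) -> None:
        self.file.close()
```

Callers see only page ids; where a page lives in the file and how freed pages are tracked stays hidden behind the four operations.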
Buffer Management
• Each frame in the buffer pool is either pinned or unpinned.
Buffer Management
• Data must be in RAM for DBMS to operate on it!
• Buffer Mgr hides the fact that not all data is in RAM
When a Page is Requested ...
• Buffer pool information table contains: NOT FOUND <?,?,?>
• If requested page is not in pool and the pool is not full:
• Read requested page into chosen frame
• Pin the page and return its address
• If requested page is not in pool and the pool is full:
• Choose an (un-pinned) frame for replacement
• If frame is “dirty”, write it to disk
• Read requested page into chosen frame
• Pin the page and return its address
• Buffer pool information table now contains:
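The request-handling steps above can be sketched as a small buffer pool. This is an illustration under assumptions: a Python dict stands in for the disk, the names `BufferPool`, `request_page`, and `release_page` are hypothetical, and the replacement choice (first unpinned frame found) is a placeholder, since the steps only say to choose an un-pinned frame.

```python
class BufferPool:
    def __init__(self, num_frames, disk):
        self.num_frames = num_frames
        self.disk = disk   # dict: page_id -> data (stand-in for the disk)
        self.frames = {}   # page_id -> {"data", "pin_count", "dirty"}

    def request_page(self, page_id):
        frame = self.frames.get(page_id)
        if frame is None:
            if len(self.frames) >= self.num_frames:
                self._evict_one()  # pool is full: make room first
            # Read the requested page into the chosen frame.
            frame = {"data": self.disk[page_id], "pin_count": 0, "dirty": False}
            self.frames[page_id] = frame
        frame["pin_count"] += 1  # pin the page and return it
        return frame

    def release_page(self, page_id, dirty=False):
        frame = self.frames[page_id]
        frame["pin_count"] -= 1
        frame["dirty"] = frame["dirty"] or dirty

    def _evict_one(self):
        # Choose an unpinned frame (placeholder policy: first one found).
        victim = next((pid for pid, f in self.frames.items()
                       if f["pin_count"] == 0), None)
        if victim is None:
            raise RuntimeError("all frames are pinned")
        frame = self.frames.pop(victim)
        if frame["dirty"]:
            self.disk[victim] = frame["data"]  # write the dirty page back
```

A caller pins a page with `request_page`, works on `frame["data"]`, and then calls `release_page(page_id, dirty=True)` if it changed the contents, so the page is written back before its frame is reused.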