Database Files

The document discusses the architecture of a relational database management system (RDBMS). It describes how an SQL query is parsed, optimized, and executed using relational operators to operate on files and indexes. The RDBMS manages memory buffers and disk space to provide an abstraction of data operating in main memory while the physical data is stored on disk in pages. Key issues for RDBMS are concurrency control and recovery.

Cow book (3rd): Chapters 8,9

Chapter 1 – DBMS
DATA STORAGE
DISKS, BUFFERS, FILES

COMP7104-DASC7104 1
Recall: traditional RDBMS
Structured data (relational), centralized, disk-based, OLTP workloads, ACID

COMP7104 - DASC7104 2
Recall: RDBMS anatomy
SQL Client
Completed

We will unpack a data system and explore modular system design.

Query Parsing
& Optimization
Relational Operators
Files and Index Management
Database
Management
Buffer Management
System
Disk Space Management

You are here


Database
COMP7104 - DASC7104 3
Architecture of a RDBMS
SQL Client
SQL
(by examples, HW1)

Today and next lectures:


RDBMS
How is an SQL query run?

Database
COMP7104-DASC7104 4
Example schema from the cow book
• Sailors (sid: integer, sname: string, rating: integer, age: real)
• Boats (bid: integer, bname: string, color: string)
• Reserves (sid: integer, bid: integer, day: date, rname: string)

COMP7104-DASC7104 5
Architecture of a RDBMS
Query Parsing & Optimization: parse, check (syntax), and verify the query

SELECT age, count(*)
FROM Sailors S, Reserves R
WHERE S.sid = R.sid AND S.age > 30
GROUP BY age

and translate it into an efficient relational query plan
GroupBy (Age) RDBMS

Indexed Join

Heap Scan Indexed Scan Database


Reserves Sailors 6
COMP7104-DASC7104
Architecture of a DBMS
SQL Client
Execute a dataflow by operating
on records and files
Query Parsing
& Optimization

GroupBy (Age) Relational Operators

Indexed Join

RDBMS
Heap Scan Indexed Scan
Reserves Sailors

Database
COMP7104-DASC7104 7
Architecture of a DBMS
Organizing tables and records as groups of pages in a logical file

SSN   Last Name   First Name   Age   Salary
123   Adams       Elmo         31    $400
443   Grouch      Oscar        32    $300
244   Oz          Bert         55    $140
134   Sanders     Ernie        55    $400

(Figure: the Files and Index Management layer stores the table as pages, each starting with a page header.)
COMP7104-DASC7104 8
Architecture of a DBMS
Illusion of operating in main memory
SQL Client

Page 1 Page 2 Page 3 Query Parsing


& Optimization

Relational Operators

RAM
Database
Files and Index Management
Management
Buffer Management

RDBMS
Disk Space Management

Database
COMP7104-DASC7104 9
Architecture of a DBMS
Translating page requests into physical addresses on disk(s)
SQL Client

Query Parsing
& Optimization
Block 1 Block 2 Block 3 Relational Operators

Files and Index Management


Database
Management
Buffer Management
System
Disk Space Management

Database
COMP7104-DASC7104 10
Recall: Abstraction at each level
Query Parsing & Optimization: What → How
Relational Operators: How → Dataflow on files
Files and Index Management: Files → Blocks in main memory
Buffer Management: Memory blocks → Disk pages
Disk Space Management (you are here): Pages on disk → Bytes

Each level hides the complexity of the next.

COMP7104 - DASC7104 11
Architecture of a DBMS
• Organized in layers
• Each layer abstracts the layer below
– Manage complexity
– Performance assumptions
• Good systems design!

(Figure: layers from top to bottom: SQL Client, Query Parsing & Optimization, Relational Operators, Files and Index Management, Buffer Management, Disk Space Management, Database.)
COMP7104-DASC7104 12
Architecture of a DBMS
• Two crucial issues related to storage and memory management in an RDBMS:
– Concurrency control
– Recovery

(Figure: both issues are drawn alongside the Files and Index Management, Buffer Management, and Disk Space Management layers.)
COMP7104-DASC7104 13
Architecture of a DBMS
Completed SQL Client
Completed

Query Parsing
& Optimization

Relational Operators

Files and Index Management


Database
Today’s lecture Management
Buffer Management
System
Disk Space Management

You are here


Database
COMP7104-DASC7104 14
FILE REPRESENTATION

COMP7104-DASC7104 15
Overview: representations
(Figure: a table with columns SSN, Last Name, First Name, Age, Salary is stored as a file of pages (Page 1 to Page 6); each slotted page has a page header and holds records; each record, with field types such as Varchar, Char, and Int, has a byte representation that starts with a record header.)

COMP7104-DASC7104 16
Files of pages of records
• Tables stored as logical files consisting of pages each containing a
collection of records
• Pages are managed
– on disk by the disk space manager: pages read / written to physical
disk/files
• subject of today’s lecture
– in memory by the buffer manager: higher levels of DBMS only
operate in memory
• coming soon
• Main ideas in this lecture
– Block / page – granularity of reasoning (because of disk!)
– Exploit access patterns in memory management
– Efficient binary representations of data
COMP7104-DASC7104 17
Page Page
Header Header

Page Page
Header Header

Page Page
Header Header

COMP7104-DASC7104 18
DISK SPACE MANAGEMENT

COMP7104-DASC7104 19
Disks and files
• Most DBMSs store information on hard disks and SSDs.
– Recall: hard disks are a mechanical anachronism (slow!)
– Recall: SSDs are faster, but still slow relative to main memory, with costly writes

• DBMS interfaces to storage at the block level


– Read and write large chunks of sequential bytes
– Sequentially: “next” disk block is fastest
– Maximize usage of data per read/write
• “Amortize” seek delays (HDDs) and writes (SSDs)
– Predict future behavior
• Cache popular blocks
• Pre-fetch likely-to-be-accessed blocks

COMP7104-DASC7104 20
Note on terminology
• Block = Unit of transfer for disk read/write
– 64KB – 128KB is a good number today
– Cow book says 4KB

• Page = fixed size contiguous chunk of memory


– Assume same size as block
– Refer to corresponding blocks on disk

• For simplicity, we use block and page interchangeably

COMP7104-DASC7104 21
Arranging blocks on disk
• Next block concept (for locality):
– sequential blocks on same track, followed by
– blocks on same cylinder, followed by
– blocks on adjacent cylinder

• Arrange file pages sequentially by “next” on disk
– This will minimize seek and rotational delays
• For a sequential scan, pre-fetch several blocks at a time!
• Read large consecutive blocks

(Figure: disk anatomy showing spindle, tracks, disk head, arm movement, and arm assembly.)
COMP7104-DASC7104 22
Disk space management
• Lowest layer of DBMS, manages space on disk
– Mapping pages to locations on disk
– Loading pages from disk to memory
– Saving pages back to disk & ensuring writes
• Higher levels call upon this layer to:
– Read/write a page
– Allocate/de-allocate logical pages
• Request for a sequence of pages best satisfied by pages stored sequentially on disk
– Physical details hidden from higher levels of system
– Higher levels may “safely” assume next page is fast, so they will simply expect sequential runs of pages to be quick to scan.
COMP7104-DASC7104 23
Quick quiz
• Which of the following aspects is NOT a benefit of sequential
layout of data on disk?

A. Fast access to sequentially ordered pages


B. Amortization of seek cost for large sequential writes
C. Adjustable page sizes for the wider outer tracks
D. Ability to pre-fetch pages for sequential scans
E. All of them

A B C D E

24
COMP7104-DASC7104
Disk space management: implementation

• Proposal 1: Talk to the storage device directly


– Could be very fast if you knew the device well
– What happens when devices change?

• Proposal 2: Run over filesystem (FS)


– Allocate single large “contiguous” file on an empty disk region,
and assume sequential/nearby byte access are fast
– Most FS optimize disk layout for sequential access
• Gives more or less what we want if we start with an empty disk
– DBMS “file” may span multiple FS files on multiple disks/machines

COMP7104-DASC7104 25
Using local filesystem
Get Page 4 Get Page 5

Disk Space Management

Big File 1 Big File 2 Big File 3

Page 1 Page 2 Page 3 Page 4 Page 5 Page 6 Page 7 Page 8 Page 9

File System File System File System

COMP7104-DASC7104 26
Disk space management
• Provide API to read and write pages to device
• Pages: block level organization of bytes on disk
• Provides “next” locality and abstracts FS/device details

(Figure: the Disk Space Management layer sitting on top of Pages 1 to 6.)

COMP7104-DASC7104 27
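As a rough illustration of "Proposal 2" (running over the filesystem), the sketch below maps page ids to byte offsets inside one large OS file. The class name `DiskSpaceManager`, its method names, and the 4 KB `PAGE_SIZE` are assumptions made for this example, not an API from the slides or from any real DBMS.

```python
import os
import tempfile

PAGE_SIZE = 4096  # assumed page size in bytes


class DiskSpaceManager:
    """Hypothetical page-level API over a single OS file."""

    def __init__(self, path):
        # Open for binary read/write, creating the file if it does not exist.
        mode = "r+b" if os.path.exists(path) else "w+b"
        self.f = open(path, mode)

    def num_pages(self):
        self.f.seek(0, os.SEEK_END)
        return self.f.tell() // PAGE_SIZE

    def allocate_page(self):
        """Append a zeroed page and return its page id."""
        page_id = self.num_pages()
        self.write_page(page_id, bytes(PAGE_SIZE))
        return page_id

    def write_page(self, page_id, data):
        assert len(data) == PAGE_SIZE
        self.f.seek(page_id * PAGE_SIZE)   # page id -> byte offset in the file
        self.f.write(data)
        self.f.flush()
        os.fsync(self.f.fileno())          # ask the OS to push the write to the device

    def read_page(self, page_id):
        self.f.seek(page_id * PAGE_SIZE)
        data = self.f.read(PAGE_SIZE)
        assert len(data) == PAGE_SIZE, "page not allocated"
        return data

    def close(self):
        self.f.close()


path = os.path.join(tempfile.mkdtemp(), "bigfile.db")
dsm = DiskSpaceManager(path)
p0 = dsm.allocate_page()
p1 = dsm.allocate_page()
dsm.write_page(p1, b"hello".ljust(PAGE_SIZE, b"\x00"))
page1 = dsm.read_page(p1)
dsm.close()
```

Because consecutive page ids map to consecutive byte ranges, "next page" stays cheap as long as the filesystem keeps the file mostly contiguous, which is the assumption the slides make.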
DATABASE FILES
files(pages(records))

COMP7104-DASC7104 28
Files of pages of records
files(pages(records))
• Higher levels of DBMS operate on pages of records and files of pages

• DB FILE: A collection of pages, each containing a collection of records


• API for higher layers of the DBMS:
– insert/delete/modify record
– fetch a particular record by record Id …
• like a pointer encoding pair of (pageID & location on page)
– scan all records (possibly with conditions on records to be retrieved)

• Could span multiple OS files, disk devices, and even machines

COMP7104-DASC7104 29
Many kinds of database files
• Unordered Heap Files
– Records placed arbitrarily across pages
• Clustered Heap Files and Hash Files
– Records and pages are grouped
• Sorted Files
– Pages and records are in sorted order
• Index Files
– B+ Trees, Hash Tables, …
– May contain records or point to records in other files

COMP7104-DASC7104 30
Unordered heap files
• Collection of records with no particular order
– Not to be confused with the “heap” data structure

• As file shrinks / grows, pages are (de)allocated

• To support record level operations, we must


– keep track of the pages in a file
– keep track of free space on pages
– keep track of the records on a page

• Many alternatives for keeping track of this, we’ll consider two

COMP7104-DASC7104 31
Heap file implemented as list
(Figure: a header page points to two linked lists of data pages: full pages, and pages with free space.)

• Header page ID and heap file name stored elsewhere
– Database catalog
• Each page contains 2 “pointers” plus free space and data
• What is wrong with this?
– How do I find a page with enough space for a 20 byte record?
COMP7104-DASC7104 32
Better: use a page directory
(Figure: header pages form a DIRECTORY whose entries point to Data Page 1, Data Page 2, …, Data Page N.)

• Directory entries include # of free bytes on the referenced page
• Header pages accessed often → likely in cache
• Finding a page to fit a record requires far fewer page loads than
linked list (Why?)
– One header page load reveals free space of many pages
COMP7104-DASC7104 33
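A minimal sketch of the page directory idea, assuming each directory entry is just a (page id, free bytes) pair held in memory. `DirEntry` and `find_page_with_space` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass
class DirEntry:
    page_id: int
    free_bytes: int


def find_page_with_space(directory, needed):
    """Return the id of the first page with at least `needed` free bytes, or None."""
    for entry in directory:          # scans directory entries, not the data pages
        if entry.free_bytes >= needed:
            return entry.page_id
    return None


directory = [DirEntry(1, 0), DirEntry(2, 12), DirEntry(3, 64)]
target = find_page_with_space(directory, 20)   # page for a 20 byte record
```

The scan is still linear in the number of entries, but many entries fit on one header page, so no data page is loaded until a suitable one has been found.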
Quick quiz
(Figure: the linked-list heap file layout and the page directory layout, side by side.)

• Which of the following is NOT TRUE?

A. In the list scheme, the header page is mostly empty
B. The page directory scheme is better for finding a page with k bytes of free space
C. Page directories require O(log #data pages) lookup to find a page having at least k bytes of free space
D. All of the above
E. None of the above

A B C D E

COMP7104-DASC7104 34
Indexes: quick preview
• A heap file allows us to retrieve records:
– by specifying the record Id (often as page Id + offset)
– by scanning all records sequentially

• We would like to fetch records by value, e.g.,
– Find all students in the “CS” department
– Find all students with a “GPA” > 3 AND “blue hair”
• Indexes: file structures for efficient value-based queries

COMP7104-DASC7104 35
Summary
(Figure: a File made of Page 1 to Page 6, storing the table below.)

SSN   Last Name   First Name   Age   Salary
123   Adams       Elmo         31    $400
443   Grouch      Oscar        32    $300
244   Oz          Bert         55    $140
134   Sanders     Ernie        55    $400

Table encoded as files which are collections of pages

COMP7104-DASC7104 36
Next
How do we store records on a page?

COMP7104-DASC7104 37
PAGE LAYOUT
files(pages(records))

COMP7104-DASC7104 38
Page basics: the header
Page Header

• Header may contain:


– Number of records
– Free space
– Maybe a next/last pointer
– Bitmaps, slot table… more soon 39
COMP7104-DASC7104
Things to address
Page Header

• Record length? Fixed or variable


• Find records by record id?
– Record id = (page, location in page)
• How do we add and delete records?
COMP7104-DASC7104 40
Options for page layouts
• Depends on

– Record length (fixed or variable)

– Page packing (packed or unpacked)

COMP7104-DASC7104 41
A note on memory illustrations
• Data is stored in linear order
– 1 byte per position
– Memory addresses are ordered
– Disk addresses are ordered

• This does not fit well on slides


– So we will illustrate the linear order as a rectangle

COMP7104-DASC7104 42
Fixed length records, packed
(Figure: a page header followed by densely packed records; Record Id (Page 2, Record 4) points at the fourth record.)

• Pack records densely
• Record Id = (pageId, “location in page”)?
– (pageId, record number in page)!
– we know the offset from start of page!
• Easy to add: just append
• Delete?
COMP7104-DASC7104 43
Fixed length records: packed
Record Id: Page Header Record
(Page 2, Record 4)
Record

Record Record

Record

• Pack records densely


• Record Id “location in page”?
– record number in page
• Easy to add: just append
• Delete? COMP7104-DASC7104 44
Fixed length records: packed
Record Id: Page Header Record
(Page 2, Record 4)
Record Record

Wrong Record Record Record

• Pack records densely


• Record Id “location in page”?
– record number in page
• Easy to add: just append
• Delete? Packed → re-arrange (necessary!) 45
COMP7104-DASC7104
Fixed length records: unpacked
Record Id:
(Page 2, Record 4) Page Header Bitmap

Record Record

Record Record

Record

• Bitmap denotes “slots” with records


• Record Id: record number in page
• Insert: find first empty slot
• Delete: clear bit
COMP7104-DASC7104 46
Fixed length records: unpacked
Record Id:
(Page 2, Record 4) Page Header Bitmap

Record Record

Record Record

Record

• Bitmap denotes “slots” with records


• Record Id: record number in page
• Insert: find first empty slot
• Delete: Clear bit
COMP7104-DASC7104 47
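The unpacked layout can be sketched as follows. This is only an in-memory illustration: `UnpackedPage` is a hypothetical class, the bitmap is a Python list rather than packed bits in a page header, and slot numbers here start at 0.

```python
class UnpackedPage:
    """Sketch of a fixed-length, unpacked page with a slot bitmap."""

    def __init__(self, record_size, num_slots):
        self.record_size = record_size
        self.bitmap = [False] * num_slots            # True = slot holds a record
        self.data = bytearray(record_size * num_slots)

    def insert(self, record):
        assert len(record) == self.record_size
        for slot, used in enumerate(self.bitmap):    # find first empty slot
            if not used:
                start = slot * self.record_size      # offset found by arithmetic
                self.data[start:start + self.record_size] = record
                self.bitmap[slot] = True
                return slot
        raise ValueError("page full")

    def delete(self, slot):
        self.bitmap[slot] = False                    # clear bit; bytes stay in place

    def get(self, slot):
        if not self.bitmap[slot]:
            return None
        start = slot * self.record_size
        return bytes(self.data[start:start + self.record_size])


page = UnpackedPage(record_size=4, num_slots=3)
a = page.insert(b"AAAA")
b = page.insert(b"BBBB")
c = page.insert(b"CCCC")
page.delete(b)
d = page.insert(b"DDDD")   # reuses the freed slot
```

Deleting never moves another record, so record ids of the remaining records stay valid, unlike in the packed layout.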
Quick quiz
(Figure: a packed page and an unpacked page of fixed-length records.)

• Which of the following is NOT TRUE?

A. You can delete a record from a packed page, but it is more work than for an unpacked page
B. In unpacked pages, deleting a record requires overwriting the record
C. In packed pages, deleting a record can change the recordId of other records
D. All of the above
E. None of the above

A B C D E

COMP7104-DASC7104 48
Variable length records
Page Header

Record Record

Record Record

Record

• How do we know where each record begins?


• What happens when we add and delete records?

COMP7104-DASC7104 49
First: Relocate metadata to footer

Record Record

Record Record

Record

Page Footer

• We’ll see why this is handy shortly…

COMP7104-DASC7104 50
Slotted page
Record Record
Record Id:
Record Record
(Page 2, Record 4)
Record

Footer 18 12 32 24 16 Slot directory

• Introduce slot directory in footer


– Pointer to free space
– Length + pointer to beginning of record (reverse order)
• Record Id = location in slot table (from right)
• Delete? (e.g., 3rd record on the page)
COMP7104-DASC7104 51
Slotted page: delete record
Record Record
Record Id:
Record Record
(Page 2, Record 4)
Record

Footer 18 12 24 16 Slot directory

• Delete record (page 2, record 3): Set 3rd slot directory


pointer to null

COMP7104-DASC7104 52
Slotted page: delete record
Record Record
Record Id:
Record Record
(Page 2, Record 4)
Record

Footer 18 12 24 16 Slot directory

• Delete record (page 2, record 3): Set 3rd slot directory


pointer to null
– Doesn’t affect pointers to other records

COMP7104-DASC7104 53
Slotted page: insert record
Record Record
Record Id:
Record Record
(Page 2, Record 4)
Record

Footer 18 12 24 16 Slot directory

• Insert:

COMP7104-DASC7104 54
Slotted page: insert record
Record Record
Record Id:
Record Record
(Page 2, Record 4)
Record Record

Footer 18 12 24 16 Slot directory

• Insert:
– Place record in free space on page

COMP7104-DASC7104 55
Slotted page: insert record
Record Record
Record Id:
Record Record
(Page 2, Record 4)
Record Record

Footer 18 12 42 24 16 Slot directory

• Insert:
– Place record in free space on page
– Create pointer/length pair in next open slot in slot directory

COMP7104-DASC7104 56
Slotted page: insert record
Record Record
Record Id:
Record Record
(Page 2, Record 4)
Record Record

Footer 18 12 42 24 16 Slot directory

• Insert:
– Place record in free space on page
– Create pointer/length pair in next open slot in slot directory
– Update the free space pointer

COMP7104-DASC7104 57
Quick quiz
(Figure: slotted page whose footer slot directory reads 18 12 42 24 16.)

Which of the following statements about slotted pages is NOT TRUE

A. Slotted pages can work for both fixed or variable length records
B. A deletion operation requires no searching on the page
C. Deleting a record can change the recordId of other records
D. All of the above
E. None of the above

A B C D E

COMP7104-DASC7104 58
Slotted page: insert record
Record Record
Record Id:
Record Record
(Page 2, Record 4)
Record Record

Footer 18 12 42 24 16 Slot directory

• Insert:
– Place record in free space on page
– Create pointer/length pair in next open slot in slot directory
– Update the free space pointer
– Fragmentation?
COMP7104-DASC7104 59
Slotted page: insert record
Record Record
Record Id:
Record Record
(Page 2, Record 4)
Record

Footer 18 12 42 24 16 Slot directory

• Insert:
– Place record in free space on page
– Create pointer/length pair in next open slot in slot directory
– Update the free space pointer
– Reorganize data on page.
• Is this safe? When should I reorganize? What if we need more slots?
COMP7104-DASC7104 60
Slotted page: growing slots
Record Record

Record Record

Record

Footer 18 12 42 24 16 5 Slot directory

• Tracking count of slots in slot directory (empty or full)

COMP7104-DASC7104 61
Slotted page: growing slots
(Figure: slotted page; records fill the page from the beginning, and the footer holds the slot directory plus a slot count of 5.)

• Tracking number of slots in slot directory (empty or full)


• Extend slot directory
– Slots grow from end of page inward, records grow from
beginning of page inward
– Easy!
COMP7104-DASC7104 62
Slotted page: growing slots
(Figure: the same slotted page after one more slot is added; the slot count in the footer is now 6.)

• Tracking number of slots in slot directory (empty or full)


• Extend slot directory
– Slots grow from end of page inward, records grow from
beginning of page inward.
– Easy!
• And update counter COMP7104-DASC7104 63
Slotted page: summary
(Figure: slotted page; records fill the page from the beginning, and the footer holds the slot directory plus a slot count of 5.)

• Typically use slotted page


– Good for variable and fixed length records
• Not bad for fixed length records too. Why?
– Re-arrange (e.g., sort) and squash null fields
– But for a whole table of fixed-length non-null records, can be
worth the optimization of fixed-length format 64
COMP7104-DASC7104
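The insert and delete steps for slotted pages can be sketched in a few lines. This is a simplified model under stated assumptions: `SlottedPage` is a hypothetical class, the slot directory is kept as a Python list instead of being serialized into the page footer, each slot is assumed to cost 4 bytes of footer space, and slot numbers start at 0.

```python
class SlottedPage:
    """Sketch of a slotted page: records grow from the front, slots from the back."""

    SLOT_BYTES = 4   # assumed footer cost of one (offset, length) entry

    def __init__(self, size=128):
        self.size = size
        self.data = bytearray(size)
        self.free_ptr = 0    # offset of the first free byte
        self.slots = []      # slot i -> (offset, length), or None if deleted

    def free_space(self):
        return self.size - self.free_ptr - self.SLOT_BYTES * len(self.slots)

    def insert(self, record):
        # Reuse a deleted slot if there is one; otherwise extend the directory.
        slot = next((i for i, s in enumerate(self.slots) if s is None), None)
        extra = 0 if slot is not None else self.SLOT_BYTES
        if len(record) + extra > self.free_space():
            raise ValueError("not enough free space; reorganize or use another page")
        self.data[self.free_ptr:self.free_ptr + len(record)] = record
        entry = (self.free_ptr, len(record))
        if slot is None:
            self.slots.append(entry)
            slot = len(self.slots) - 1
        else:
            self.slots[slot] = entry
        self.free_ptr += len(record)      # update the free space pointer
        return slot

    def delete(self, slot):
        self.slots[slot] = None           # other slots, and so other record ids, are untouched

    def get(self, slot):
        entry = self.slots[slot]
        if entry is None:
            return None
        offset, length = entry
        return bytes(self.data[offset:offset + length])


page = SlottedPage()
r0 = page.insert(b"Bob")
r1 = page.insert(b"Boulevard")
r2 = page.insert(b"M")
page.delete(r1)
r3 = page.insert(b"Alice")   # reuses slot 1, but lands in fresh free space
```

Note that the bytes of the deleted record are not reclaimed here, which is the fragmentation issue raised on the slides; a fuller version would compact the records and rewrite the slot offsets when space runs low.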
Log-structured pages
• Instead of storing tuples in pages, we keep
change log records
– Insert: log entire tuple
– Delete: mark the tuple as deleted
– Update: keep delta of modified attributes

COMP7104-DASC7104 65
Log-structured pages
The log records complement the database file: to read a record, the
system “replays” the log backwards and recreates that record
• Use indexes to quickly scan log
• Periodically compact / coalesce records (lots of redundant ones
otherwise)
• Used in HBase, RocksDB, Cassandra

Log entries (read direction: newest to oldest):
INSERT id=1, val=B
INSERT id=2, val=A
DELETE id=3
UPDATE val=C (id=1)
66
COMP7104-DASC7104
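Reading from a log-structured page can be sketched as a backwards scan. The tuple format `(operation, id, attributes)` and the function `read_record` are assumptions made for this example; real systems such as those named on the slide use their own formats plus indexes and compaction.

```python
def read_record(log, record_id):
    """Rebuild one record by scanning the log from newest entry to oldest."""
    fields = {}
    for op, rid, payload in reversed(log):
        if rid != record_id:
            continue
        if op == "DELETE":
            return None                       # newest relevant entry wins
        for name, value in payload.items():
            fields.setdefault(name, value)    # keep the newest value per attribute
        if op == "INSERT":
            return fields                     # full tuple has been reconstructed
    return None                               # no INSERT found for this id


# The four log entries from the slide, oldest first.
log = [
    ("INSERT", 1, {"val": "B"}),
    ("INSERT", 2, {"val": "A"}),
    ("DELETE", 3, {}),
    ("UPDATE", 1, {"val": "C"}),
]
rec1 = read_record(log, 1)
rec2 = read_record(log, 2)
rec3 = read_record(log, 3)
```

Writes only append to the log, while a read may have to walk far back to reach the original INSERT, which is why periodic compaction matters.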
Quick quiz
• Which of the following is NOT TRUE for log-structured pages
A. Updates are fast
B. Reads may be slow
C. Well adapted for append-only storage
D. All of the above
E. None of the above

A B C D E

COMP7104-DASC7104 67
Record
Formats

RECORD LAYOUT
files(pages(records))

COMP7104-DASC7104 68
Record formats
• Relational model → each record in table has some fixed type
• Assume system catalog stores the schema
– No need to store type information with records (save space!)
– Catalog is just another table …
• Goals:
– Records should be compact in memory & disk format
– Fast access to fields (why?)

• Easy case: fixed length fields


• Interesting case: variable length fields

COMP7104-DASC7104 69
Record formats: fixed length
Field:   INTEGER   DOUBLE   BOOLEAN   INTEGER   CHARACTER(7)
Bytes:   4         8        1         4         7               (total: 24)
Value:   3         3.142    T         3         HELLO_W

• Field types same for all records in a file.
– Type info stored separately in system catalog
• Disk byte representation same as in memory
• Finding i’th field?
– done via arithmetic (fast)
• Compact? (Nulls?)
COMP7104-DASC7104 70
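The "finding the i’th field via arithmetic" point can be tried out with Python's standard `struct` module. The format string below is one assumed encoding of the slide's example record (little-endian, no alignment padding); `read_field` is a hypothetical helper.

```python
import struct

# "<" = little-endian with no alignment padding; i = 4-byte int,
# d = 8-byte double, ? = 1-byte boolean, 7s = 7-byte character field.
FIELD_FORMATS = ["i", "d", "?", "i", "7s"]
RECORD_FORMAT = "<" + "".join(FIELD_FORMATS)

sizes = [struct.calcsize("<" + f) for f in FIELD_FORMATS]
offsets = [sum(sizes[:i]) for i in range(len(sizes))]   # field i starts here

record = struct.pack(RECORD_FORMAT, 3, 3.142, True, 3, b"HELLO_W")


def read_field(buf, i):
    """Jump straight to field i using its precomputed offset."""
    return struct.unpack_from("<" + FIELD_FORMATS[i], buf, offsets[i])[0]
```

The offsets depend only on the schema, so they can be computed once from the catalog and reused for every record in the file.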
Record formats: variable length
What happens if fields are variable length?
Record:  Bob | Big, St. | M | 32 | 94703
Types:   VARCHAR | VARCHAR | CHAR | INT | INT

Could store with padding? (Fixed Length)
Types:   CHAR(20) | CHAR(18) | CHAR | INT | INT
– Wasted space: “Bob” and “Big, St.” are padded out to CHAR(20) and CHAR(18)
– Field not big enough: in the record (Alice, Boulevard of the Allies, M, 32, 94703), “Boulevard of the Allies” does not fit in CHAR(18)
COMP7104-DASC7104 71
Record formats: variable length
What happens if fields are variable length?
Record
Bob Big, St. M 32 94703
VARCHAR VARCHAR CHAR INT INT

Could use delimiters (e.g., CSV):

Comma Separated Values (CSV)


Bob , Big, St. , M , 32 , 94703
VARCHAR VARCHAR CHAR INT INT

• Issues?

COMP7104-DASC7104 72
Record formats: variable length
What happens if fields are variable length?
Record
Bob Big, St. M 32 94703
VARCHAR VARCHAR CHAR INT INT

Could use delimiters (e.g., CSV):

Comma Separated Values (CSV)


Bob , Big, St. , M , 32 , 94703
VARCHAR VARCHAR CHAR INT INT

• Requires scan to access field


• What if text contains commas?
COMP7104-DASC7104 73
Record formats: variable length
What happens if fields are variable length?
Record
Bob Big, St. M 32 94703
VARCHAR VARCHAR CHAR INT INT

Store length information before fields:


Variable Length Fields with Offsets
3 Bob 8 Big, St. M 32 94703
VARCHAR VARCHAR CHAR INT INT

• Requires scan to access field


• What if text contains commas?
COMP7104-DASC7104 74
Record formats: variable length
What happens if fields are variable length?
Record
Bob Big, St. M 32 94703
VARCHAR VARCHAR CHAR INT INT

Store length information before fields:


Variable Length Fields with Offsets
Move all variable length fields to end → enable fast access
M 32 94703 3 Bob 8 Big, St.
CHAR INT INT VARCHAR VARCHAR

• Requires scan to access some fields


• What if text contains commas?
COMP7104-DASC7104 75
Record formats: variable length
What happens if fields are variable length?
Record
Bob Big, St. M 32 94703
VARCHAR VARCHAR CHAR INT INT

Introduce a record header

Header M 32 94703 Bob Big, St.


CHAR INT INT VARCHAR VARCHAR

• Requires scan to access field. Why?


• What if text contains commas?
COMP7104-DASC7104 76
Record formats: variable length
What happens if fields are variable length?
Record
Bob Big, St. M 32 94703
VARCHAR VARCHAR CHAR INT INT

Introduce a record header

Header M 32 94703 Bob Big, St.


CHAR INT INT VARCHAR VARCHAR

• Direct access & no “escaping”, other advantages?


– Handle null fields easily → useful for fixed length records too!
COMP7104-DASC7104 77
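One way to picture the record header is as a small array of end offsets for the variable-length fields. The layout below (header, then fixed fields, then variable-length bytes) and the names `encode` and `get_var_field` are assumptions for illustration; the slides do not specify the header's exact contents.

```python
import struct

FIXED_FORMAT = "<cii"                  # CHAR(1), INT, INT from the example record
NUM_VAR = 2                            # two VARCHAR fields
HEADER_FORMAT = "<" + "H" * NUM_VAR    # assumed: one 2-byte end offset per VARCHAR
BODY_START = struct.calcsize(HEADER_FORMAT) + struct.calcsize(FIXED_FORMAT)


def encode(sex, age, zipcode, *var_fields):
    ends, pos = [], BODY_START
    for v in var_fields:
        pos += len(v)
        ends.append(pos)               # where this field ends within the record
    header = struct.pack(HEADER_FORMAT, *ends)
    fixed = struct.pack(FIXED_FORMAT, sex, age, zipcode)
    return header + fixed + b"".join(var_fields)


def get_var_field(record, i):
    """Fetch the i-th VARCHAR from the header offsets, without scanning."""
    ends = struct.unpack_from(HEADER_FORMAT, record, 0)
    start = BODY_START if i == 0 else ends[i - 1]
    return record[start:ends[i]]


rec = encode(b"M", 32, 94703, b"Bob", b"Big, St.")
street = get_var_field(rec, 1)
```

Since field boundaries come from the header rather than from delimiters, a comma inside "Big, St." needs no escaping.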
Summary
(Figure: the table (SSN, Last Name, First Name, Age, Salary) is stored as a file of pages (Page 1 to Page 6); each slotted page has a page header and holds records; the record (Bob, Harmon, M, 32, 94703) with types (Varchar, Varchar, Char, Int, Int) is laid out in bytes as a header followed by M, 32, 94703, Bob, Harmon.)
COMP7104-DASC7104 78
System catalogs
• For each relation:
– name, file location, file structure (e.g., heap file)
– attribute name and type, for each attribute
– index name, for each index
– integrity constraints
• For each index:
– structure (e.g., B+ tree) and search key fields
• For each view:
– view name and definition
• Plus statistics, authorization, buffer pool size, etc

Catalogs are themselves stored as relations!


COMP7104-DASC7104 79
Disks and files: summary
• Spinning disk (hard) drives (HDD) and SSDs
– Basic HDD mechanics
– SSD write amplification
– Concept of “near” pages and how it relates to cost of access
– Relative cost of
• Random vs. sequential disk access (10x)
• Disk vs RAM vs. registers
– Huge differences!
• DB file storage
– Typically over FS file(s)
• Disk space manager loads and stores pages
– Block level reasoning
– Abstracts device and file system; provides fast “next” 80
COMP7104-DASC7104
Disks and files: summary (continued)
• files(pages(records))
• DBMS “file” contains pages, and records within pages
– Heap files: unordered records organized with directories

• Page layouts
– Fixed-length packed and unpacked
– Variable length records in slotted pages, with intra-page reorg

• Variable length record format


– Direct access to i’th field and null values
• Note: most DBMS do not allow tuples larger than pages (or have
overflow pages for them)
• Catalog relations store information about relations, indexes and views.
COMP7104-DASC7104 81
Record
Formats

FILE ORGANIZATIONS

COMP7104-DASC7104 82
Architecture of a DBMS
Completed SQL Client
Completed

Query Parsing
& Optimization

Relational Operators

Files and Index Management


We are here Database
Management
Buffer Management
System
Completed Disk Space Management

You are here


Database
COMP7104-DASC7104 83
Recall: heap files
• Unordered collection of records

• Add/Remove particular recordId: easy (cost?)

• Scan: easy (cost?)

• Find a record?
– Given a recordId: (pageId, slot)?
– Matching username = “sarahmanning”?

COMP7104-DASC7104 84
Multiple file organizations

Many alternatives exist, each good in some situations and not so
good in others. (This is often the case in DB systems work!)
• Heap files: Suitable when typical access is a full scan of all
records
• Sorted files: Best for retrieval in order, or when a range of
records is needed

• Clustered files & indexes: Group data into blocks to enable fast
lookup and efficient modifications. (more on this soon …)

COMP7104-DASC7104 85
Bigger questions

• What is the “best” file organization?


– Depends on access patterns …
– How? What are common access patterns anyway?

• Can we be quantitative about tradeoffs?


– Better → How much?

COMP7104-DASC7104 86
Goals for what follows
• Big picture overheads for data access
– We’ll simplify performance models to provide insight, not to get
perfect performance
– Still, a bit of discipline:
• Clearly identify assumptions up front
• Then estimate cost in a principled way

• Foundation for query optimization


– Can’t choose the fastest scheme without an estimate of speed!

COMP7104-DASC7104 87
COST MODEL AND ANALYSIS

COMP7104-DASC7104 88
Cost model for analysis
• B: The number of data blocks in the file
• R: Number of records per block
• D: (Average) time to read/write disk block

• Focus: Average case analysis for uniform random workloads

• For now, we will ignore


– Sequential vs Random I/O
– Pre-fetching
– Any in-memory costs

Good enough to show the overall trends!


COMP7104-DASC7104 89
More assumptions
• Single record insert and delete
• Equality selection – exactly one match
• For Heap Files:
– Insert always appends to end of file
• For Sorted Files:
– Packed: Files compacted after deletions.
– Sorted according to search key

• After understanding these slides …


you should question all these assumptions and rework
– Good exercise to study for tests, and generate ideas

COMP7104-DASC7104 90
Heap files & sorted files
Heap File

2, 5 1, 6 4, 7 3, 10 8, 9

Sorted File

1, 2 3, 4 5, 6 7, 8 9, 10

For illustration, records are just integers

• B: The number of data blocks = 5


• R: Number of records per block = 2
• D: (Average) time to read/write disk block = 5ms 91
COMP7104-DASC7104
Cost of operations
Operation          Heap File    Sorted File
Scan all records
Equality Search
Range Search
Insert
Delete

• B: The number of data blocks


• R: Number of records per block
• D: Average time to read/write disk block
COMP7104-DASC7104 92
Quick quiz
Which file scheme will be more efficient for scanning all
records?

A B C
Heap File Sorted File Tie

COMP7104-DASC7104 94
Scan all records
Heap File

2, 5 1, 6 4, 7 3, 10 8, 9

Sorted File

1, 2 3, 4 5, 6 7, 8 9, 10

Pages touched: ?
Time to read the record: ?

• B: The number of data blocks
• R: Number of records per block
• D: Average time to read/write disk block
COMP7104-DASC7104 95
Cost of operations
Operation          Heap File    Sorted File
Scan all records   B*D          B*D
Equality Search
Range Search
Insert
Delete

• B: The number of data blocks


• R: Number of records per block
• D: Average time to read/write disk block 96
COMP7104-DASC7104
Quick quiz
Which file scheme will be more efficient for finding
a particular record by key (e.g., 8)?

A B C
Heap File Sorted File Tie

COMP7104-DASC7104 98
Find key 8
Heap File

2, 5 1, 6 4, 7 3, 10 8, 9

Pages touched on average?


• P(i): Probability that key is on page i is 1/B
• T(i): Number of pages touched if key on page i is i
• Therefore the expected number of pages touched is
$$\sum_{i=1}^{B} T(i)\,P(i) = \sum_{i=1}^{B} i \cdot \frac{1}{B} = \frac{B(B+1)}{2B} \approx \frac{B}{2}$$
COMP7104-DASC7104 99
Find key 8
Heap File

2, 5 1, 6 4, 7 3, 10 8, 9

Pages touched on average: B/2

• Breaking an assumption
– What if there could be more than one copy of each value?
A. B/2   B. log2 B   C. sqrt(B)   D. B

COMP7104-DASC7104 100
Find key 8
Sorted File

1, 2 3, 4 5, 6 7, 8 9, 10

• Worst-case: Pages touched in binary search
– log2 B
• Average-case: Pages touched in binary search
– log2 B?

COMP7104-DASC7104 101
Average case binary search
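(Figure: binary search decision tree over the pages; pages reached at level 1 cost 1 IO, level 2 cost 2 IOs, level 3 cost 3 IOs, level 4 cost 4 IOs.)

Expected Number of Reads: 1 (1/B) + 2 (2/B) + 3 (4/B) + 4 (8/B)

$$\sum_{i=1}^{\log_2 B} i\,\frac{2^{i-1}}{B} = \frac{1}{B}\sum_{i=1}^{\log_2 B} i\,2^{i-1} = \log_2 B - \frac{B-1}{B}$$

Average ≈ Worst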
102
COMP7104-DASC7104
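A quick numeric check of the two equality-search estimates, under the slides' simplifying assumptions: for a heap file the key is equally likely to be on any page and page i costs i reads, giving (B + 1)/2; for a sorted file, 2^(i-1) pages are assumed to be reached after exactly i reads, giving log2 B - (B - 1)/B. The function names are hypothetical.

```python
import math


def heap_expected_pages(B):
    # Key is equally likely on each page; finding it on page i touches i pages.
    return sum(i * (1 / B) for i in range(1, B + 1))


def sorted_expected_pages(B):
    # Slides' model: 2**(i-1) of the pages are found after exactly i reads.
    levels = int(math.log2(B))
    return sum(i * 2 ** (i - 1) / B for i in range(1, levels + 1))


B = 1024
heap = heap_expected_pages(B)      # compare with (B + 1) / 2
srt = sorted_expected_pages(B)     # compare with log2(B) - (B - 1) / B
```

For B = 1024 the heap file needs about 512 page reads on average, while binary search needs about 9, barely below its worst case of 10.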
Cost of operations
Operation          Heap File    Sorted File
Scan all records   B*D          B*D
Equality Search    0.5*B*D      (log2 B)*D
Range Search
Insert
Delete

• B: The number of data blocks


• R: Number of records per block
• D: Average time to read/write disk block
COMP7104-DASC7104 103
Quick quiz
Which file scheme will be more efficient for finding a range of records by key (e.g., 7-9)?

A: Heap File    B: Sorted File    C: Tie

Find keys between 7 and 9
Heap File

2, 5 1, 6 4, 7 3, 10 8, 9

• Always touch all blocks. Why?

Sorted File

1, 2 3, 4 5, 6 7, 8 9, 10

• Find beginning of range


• Scan right
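
The two strategies can be sketched as follows. This is a minimal illustration, assuming pages are non-empty Python lists and counting one read per page access with no buffering. As a result the sorted-file version re-reads the start page after the binary search and may read one extra page to detect the end of the range. All names are hypothetical.

```python
def range_search_heap(pages, lo, hi):
    """Unordered file: every page must be read, since matches can be anywhere."""
    result, reads = [], 0
    for page in pages:
        reads += 1
        result.extend(k for k in page if lo <= k <= hi)
    return result, reads

def range_search_sorted(pages, lo, hi):
    """Sorted file: binary search for the start of the range, then scan right."""
    reads = 0
    left, right = 0, len(pages) - 1
    start = len(pages)
    # Find the first page whose largest key is >= lo.
    while left <= right:
        mid = (left + right) // 2
        reads += 1
        if pages[mid][-1] >= lo:
            start = mid
            right = mid - 1
        else:
            left = mid + 1
    result = []
    # Scan right until a page starts beyond hi.
    for page in pages[start:]:
        reads += 1
        if page[0] > hi:
            break
        result.extend(k for k in page if lo <= k <= hi)
    return result, reads

heap_file = [[2, 5], [1, 6], [4, 7], [3, 10], [8, 9]]
sorted_file = [[1, 2], [3, 4], [5, 6], [7, 8], [9, 10]]
print(range_search_heap(heap_file, 7, 9))
print(range_search_sorted(sorted_file, 7, 9))
```

With only five pages the read counts are close. The gap widens as B grows, because the heap file always reads B pages while the sorted file reads about log2B plus the pages that hold matches.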
Cost of operations
                     Heap File        Sorted File
Scan all records     B*D              B*D
Equality Search      0.5*B*D          (log2B)*D
Range Search         B*D              ((log2B)+pages)*D
Insert
Delete

• B: The number of data blocks
• R: Number of records per block
• D: Average time to read/write disk block
Quick quiz
Which scheme will be faster for inserting a key (e.g., 4.5)?

A: Heap File    B: Sorted File    C: Tie

Insert 4.5
Heap File

2, 5 1, 6 4, 7 3, 10 8, 9 4.5,

Stick at the end of the file. Cost = 2*D Why 2?

Insert 4.5
Heap File

2, 5 1, 6 4, 7 3, 10 8, 9 4.5,

Read last page, append, write. Cost = 2*D


Sorted File

1, 2 3, 4 5, 6 7, 8 9, 10

• Find location for record: log2B

Insert 4.5
Heap File

2, 5 1, 6 4, 7 3, 10 8, 9 4.5,

Read last page, append, write. Cost = 2*D


Sorted File

1, 2 3, 4 4.5, 5 6, 7 8, 9 10, _

• Find location for record: log2B


• Insert and shift rest of file Cost? 2*B/2 Why?
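
The shift step can be sketched in Python to show where the 2*(B/2) comes from: every page from the insertion point to the end of the file is read and written back, and on average the insertion point is in the middle of the file. This sketch assumes a packed file (all pages full except possibly the last) and a fixed page capacity. `insert_sorted` and its parameters are illustrative names.

```python
import bisect

def insert_sorted(pages, key, capacity):
    """Insert key into a packed sorted file.

    Returns (new_pages, reads, writes) for the shift step only; the
    binary search that locates the position is not counted here.
    """
    records = [k for page in pages for k in page]
    pos = bisect.bisect_left(records, key)
    first_page = pos // capacity              # first page whose contents change
    records.insert(pos, key)
    new_pages = [records[i:i + capacity]
                 for i in range(0, len(records), capacity)]
    reads = max(len(pages) - first_page, 0)   # existing pages from there to the end
    writes = len(new_pages) - first_page      # rewritten pages, plus any overflow page
    return new_pages, reads, writes

sorted_file = [[1, 2], [3, 4], [5, 6], [7, 8], [9, 10]]
new_pages, reads, writes = insert_sorted(sorted_file, 4.5, capacity=2)
print(new_pages)
print(reads, writes)
```

For the slide's example, the key lands on the third page, so the last three pages are read and rewritten and one new page is written for the record that overflows.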
Cost of operations
                     Heap File        Sorted File
Scan all records     B*D              B*D
Equality Search      0.5*B*D          (log2B)*D
Range Search         B*D              ((log2B)+pages)*D
Insert               2*D              ((log2B)+B)*D
Delete

• B: The number of data blocks
• R: Number of records per block
• D: Average time to read/write disk block
Quick quiz
Which scheme will be faster for deleting a previously-inserted key (e.g., 4.5)?

A: Heap File    B: Sorted File    C: Tie

Delete 4.5
Heap File

2, 5 1, 6 4, 7 3, 10 8, 9 4.5,

Average case to find the record: B/2 reads


Delete record from page
Cost? = (B/2+1)*D Why +1?
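
A short sketch of heap-file deletion shows the source of the +1: the pages scanned to find the record are reads, and the one page that held the record must then be written back. It assumes a unique key and pages modelled as Python lists. `delete_heap` is a hypothetical helper name.

```python
def delete_heap(pages, key):
    """Delete key from a heap file; return total page I/Os (reads plus one write)."""
    for reads, page in enumerate(pages, start=1):
        if key in page:
            page.remove(key)     # modify the page in memory
            return reads + 1     # +1 for writing the modified page back
    return len(pages)            # key absent: all pages read, nothing written

# Heap file from the slide after 4.5 was appended on a new last page.
heap_file = [[2, 5], [1, 6], [4, 7], [3, 10], [8, 9], [4.5]]
print(delete_heap(heap_file, 4.5))
```

Because 4.5 was appended last, this particular delete scans the whole file; for a key on a random page, the expected number of reads is about B/2.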

Delete 4.5
Heap File

2, 5 1, 6 4, 7 3, 10 8, 9 4.5,

Average case runtime: (B/2+1) * D

Sorted File

1, 2 3, 4 __, 5 6, 7 8, 9 10, _

• Find location for record: log2B


• Delete record in page → gap
Delete 4.5
Heap File

2, 5 1, 6 4, 7 3, 10 8, 9 4.5,

Average case runtime: (B/2+1) * D

Sorted File

1, 2 3, 4 5, 6 7, 8 9, 10

• Find location for record: log2B


• Shift rest of file left by 1 record: 2 * (B/2)
Cost of operations
                     Heap File        Sorted File
Scan all records     B*D              B*D
Equality Search      0.5*B*D          (log2B)*D
Range Search         B*D              ((log2B)+pages)*D
Insert               2*D              ((log2B)+B)*D
Delete               (0.5*B+1)*D      ((log2B)+B)*D

Which is better?

• B: The number of data blocks
• R: Number of records per block
• D: Average time to read/write disk block
Cost of operations
                     Heap File        Sorted File
Scan all records     B*D              B*D
Equality Search      0.5*B*D          (log2B)*D
Range Search         B*D              ((log2B)+pages)*D
Insert               2*D              ((log2B)+B)*D
Delete               (0.5*B+1)*D      ((log2B)+B)*D

Can we do better? Indexes!

• B: The number of data blocks


• R: Number of records per block
• D: Average time to read/write disk block
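
The cost table can be turned into a small calculator for comparing the two file organizations at a given file size. This is a sketch of the table's formulas only. The function names are illustrative, and `range_pages` is taken here to mean the number of pages holding matching records, which is an assumption about the "pages" term.

```python
import math

def heap_costs(B, D):
    """I/O cost of each operation on a heap file with B blocks and block time D."""
    return {
        "scan": B * D,
        "equality": 0.5 * B * D,
        "range": B * D,
        "insert": 2 * D,
        "delete": (0.5 * B + 1) * D,
    }

def sorted_costs(B, D, range_pages):
    """Same operations on a sorted file; range_pages = pages with matching records."""
    log_b = math.log2(B)
    return {
        "scan": B * D,
        "equality": log_b * D,
        "range": (log_b + range_pages) * D,
        "insert": (log_b + B) * D,
        "delete": (log_b + B) * D,
    }

B, D = 1024, 1   # D = 1 means costs are expressed in block I/Os
heap, srt = heap_costs(B, D), sorted_costs(B, D, range_pages=3)
for op in heap:
    print(f"{op:10s} heap={heap[op]:8.1f} sorted={srt[op]:8.1f}")
```

With B = 1024 and D = 1, equality search costs 512 on the heap file against 10 on the sorted file, while insert costs 2 against 1034. Neither organization wins on every operation, which is what motivates indexes.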