08 Storage
ADVANCED
DATABASE
SYSTEMS
Storage Models &
Data Layout
@Andy_Pavlo // 15-721 // Spring 2020
DATA ORGANIZATION
[Figure: Index, Fixed-Length Data Blocks, Variable-Length Data Blocks]
DATA ORGANIZATION
Type Representation
Data Layout / Alignment
Storage Models
System Catalogs
DATA REPRESENTATION
INTEGER/BIGINT/SMALLINT/TINYINT
→ C/C++ Representation
FLOAT/REAL vs. NUMERIC/DECIMAL
→ IEEE-754 Standard / Fixed-point Decimals
TIME/DATE/TIMESTAMP
→ 32/64-bit int of (micro/milli)seconds since Unix epoch
VARCHAR/VARBINARY/TEXT/BLOB
→ Pointer to another location if the type is ≥ 64 bits
→ Header with length and address of the next location (if
segmented), followed by data bytes.
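The TIME/DATE/TIMESTAMP bullet above can be illustrated with a minimal sketch. It assumes a 64-bit count of microseconds and assumes that `std::chrono::system_clock` counts from the Unix epoch (guaranteed only since C++20, but true in practice on common platforms). The function names `to_epoch_micros` and `from_epoch_micros` are hypothetical.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

using SystemTime = std::chrono::system_clock::time_point;

// Encode a point in time as a 64-bit count of microseconds since the epoch.
int64_t to_epoch_micros(SystemTime tp) {
    return std::chrono::duration_cast<std::chrono::microseconds>(
               tp.time_since_epoch())
        .count();
}

// Decode the 64-bit integer back into a time point.
SystemTime from_epoch_micros(int64_t micros) {
    return SystemTime(
        std::chrono::duration_cast<std::chrono::system_clock::duration>(
            std::chrono::microseconds(micros)));
}
```

With this encoding, comparing or sorting timestamps reduces to plain integer comparison on the stored value.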
VARIABLE PRECISION NUMBERS
Rounding Example
#include <stdio.h>
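The rounding example is cut off after its include line, so the following is a sketch in the same spirit, not the original program. It assumes IEEE-754 arithmetic and contrasts it with a hypothetical two-digit fixed-point type (`Fixed2`), which stores the value multiplied by 100 as an integer.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>

// Returns true if 0.1 + 0.2 compares equal to 0.3 as IEEE-754 doubles.
bool double_sum_is_exact() {
    volatile double x = 0.1;  // volatile keeps the compiler from folding the sum
    volatile double y = 0.2;
    return (x + y) == 0.3;
}

// Hypothetical fixed-point decimal with two fractional digits:
// the stored integer is the value multiplied by 100.
struct Fixed2 {
    int64_t scaled;
};

Fixed2 add(Fixed2 a, Fixed2 b) { return Fixed2{a.scaled + b.scaled}; }

// Prints the sums with many digits so any rounding error becomes visible.
void print_rounding_demo() {
    float x = 0.1f;
    float y = 0.2f;
    std::printf("float x+y  = %.20f\n", x + y);
    std::printf("double 0.3 = %.20f\n", 0.3);
    Fixed2 sum = add(Fixed2{10}, Fixed2{20});  // 0.10 + 0.20
    std::printf("fixed x+y  = %lld.%02lld\n",
                static_cast<long long>(sum.scaled / 100),
                static_cast<long long>(sum.scaled % 100));
}
```

Calling `print_rounding_demo()` prints the float and double values with 20 fractional digits. The fixed-point sum stays exact because it is ordinary integer addition.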
DATA LAYOUT
CREATE TABLE AndySux (
  id INT PRIMARY KEY,
  value BIGINT
);
char[]: header | id | value
reinterpret_cast<int32_t*>(address)
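The layout above treats a tuple as a byte array and interprets a slice of it as a typed value. A minimal sketch follows. The 8-byte header and the 4 bytes of padding after `id` are assumptions, since the slide does not give sizes. The helpers use `std::memcpy` in place of the slide's `reinterpret_cast` so the sketch stays valid for any buffer alignment.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical fixed layout for AndySux(id INT, value BIGINT):
// [ 8-byte header | 4-byte id | 4 bytes padding | 8-byte value ]
constexpr std::size_t kHeaderSize = 8;
constexpr std::size_t kIdOffset = kHeaderSize;
constexpr std::size_t kValueOffset = kIdOffset + 8;  // id plus padding
constexpr std::size_t kTupleSize = kValueOffset + sizeof(int64_t);

// Writes both attributes into the tuple's byte array at their offsets.
void write_tuple(char* tuple, int32_t id, int64_t value) {
    std::memset(tuple, 0, kTupleSize);
    std::memcpy(tuple + kIdOffset, &id, sizeof(id));
    std::memcpy(tuple + kValueOffset, &value, sizeof(value));
}

// Interprets the bytes at the id offset as a 32-bit integer.
int32_t read_id(const char* tuple) {
    int32_t id;
    std::memcpy(&id, tuple + kIdOffset, sizeof(id));
    return id;
}

// Interprets the bytes at the value offset as a 64-bit integer.
int64_t read_value(const char* tuple) {
    int64_t value;
    std::memcpy(&value, tuple + kValueOffset, sizeof(value));
    return value;
}
```

Because the schema is fixed-length, the offset of every attribute is known from the catalog and no per-tuple metadata is needed to locate `id` or `value`.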
VARIABLE-LENGTH FIELDS
CREATE TABLE AndySux (
  value VARCHAR(1024)
);
INSERT INTO AndySux
  VALUES ("Andy has the worst hygiene that I have ever seen. I hate him so much.");
char[]: header | id | [Andy|64-BIT POINTER]
Variable-Length Data Blocks: LENGTH | NEXT | Andy has the worst hygiene that I have ever seen. I hate him so much.
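The variable-length layout above (a pointer in the tuple, then LENGTH and NEXT in front of the data bytes) can be sketched as a chain of blocks. The block capacity of 32 bytes and the names `VarlenBlock`, `VarlenPool`, and `load` are hypothetical. The inline prefix shown in the tuple is left out to keep the sketch short.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <memory>
#include <string>
#include <vector>

constexpr std::size_t kBlockCapacity = 32;  // assumed payload bytes per block

struct VarlenBlock {
    uint32_t length = 0;          // LENGTH: bytes used in data
    VarlenBlock* next = nullptr;  // NEXT: following segment, or nullptr
    char data[kBlockCapacity];
};

class VarlenPool {
public:
    // Copies value into chained blocks and returns the first block.
    // The fixed-length tuple would store this pointer.
    VarlenBlock* store(const std::string& value) {
        VarlenBlock* head = nullptr;
        VarlenBlock* tail = nullptr;
        std::size_t pos = 0;
        do {
            blocks_.push_back(std::make_unique<VarlenBlock>());
            VarlenBlock* block = blocks_.back().get();
            std::size_t n = std::min(kBlockCapacity, value.size() - pos);
            std::memcpy(block->data, value.data() + pos, n);
            block->length = static_cast<uint32_t>(n);
            if (tail != nullptr) tail->next = block; else head = block;
            tail = block;
            pos += n;
        } while (pos < value.size());
        return head;
    }

private:
    std::vector<std::unique_ptr<VarlenBlock>> blocks_;
};

// Follows the NEXT pointers and reassembles the full value.
std::string load(const VarlenBlock* block) {
    std::string out;
    for (; block != nullptr; block = block->next) {
        out.append(block->data, block->length);
    }
    return out;
}
```

The tuple itself stays fixed-length because it only holds the 64-bit pointer returned by `store`.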
NULL DATA TYPES
DISCLAIMER
WORD-ALIGNED TUPLES
WORD-ALIGNMENT: PADDING
WORD-ALIGNMENT: REORDERING
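Padding and reordering can be illustrated with C++ structs, where the compiler applies the same idea a DBMS applies to tuples. This is an analogy under stated assumptions: the field names are hypothetical and the exact sizes depend on the platform ABI.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Fields in declaration order. The compiler inserts padding so that each
// field starts at an address that is a multiple of its alignment.
struct Padded {
    char    flag;
    int64_t created;
    char    grade;
    int32_t zip;
};

// The same fields sorted from widest to narrowest, which removes most of
// the padding between them.
struct Reordered {
    int64_t created;
    int32_t zip;
    char    flag;
    char    grade;
};

// Reordering never makes the tuple larger than the padded layout.
static_assert(sizeof(Reordered) <= sizeof(Padded),
              "reordering should not increase the tuple size");

// Bytes of padding in a layout whose fields occupy 8 + 4 + 1 + 1 bytes.
constexpr std::size_t kPayloadBytes = 8 + 4 + 1 + 1;
constexpr std::size_t padded_waste() { return sizeof(Padded) - kPayloadBytes; }
constexpr std::size_t reordered_waste() { return sizeof(Reordered) - kPayloadBytes; }
```

On a typical x86-64 ABI I would expect `sizeof(Padded)` to be 24 and `sizeof(Reordered)` to be 16, but those numbers are not guaranteed by the language.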
CMU-DB ALIGNMENT EXPERIMENT
Processor: 1 socket, 4 cores w/ 2×HT
Workload: Insert Microbenchmark
Layout               Avg. Throughput
No Alignment         0.523 MB/sec
Padding              11.7 MB/sec
Padding + Sorting    814.8 MB/sec
Source: Tianyu Li
STORAGE MODELS
N-ARY STORAGE MODEL (NSM)
Advantages
→ Fast inserts, updates, and deletes.
→ Good for queries that need the entire tuple.
→ Can use index-oriented physical storage.
Disadvantages
→ Not good for scanning large portions of the table and/or
a subset of the attributes.
DECOMPOSITION STORAGE MODEL (DSM)
Advantages
→ Reduces the amount of wasted work because the DBMS
only reads the data that it needs.
→ Better compression.
Disadvantages
→ Slow for point queries, inserts, updates, and deletes
because of tuple splitting/stitching.
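The NSM and DSM trade-offs listed above can be sketched with an array of structs versus one array per attribute. The three `int32_t` attributes and all names are hypothetical. Tuples in the DSM table are identified by their offset, which is one of several possible schemes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Row {
    int32_t a;
    int32_t b;
    int32_t c;
};

// NSM: all attributes of a tuple are stored contiguously.
using RowTable = std::vector<Row>;

// DSM: one contiguous array per attribute.
struct ColumnTable {
    std::vector<int32_t> a, b, c;

    // Tuple splitting: one insert touches every column.
    void insert(const Row& row) {
        a.push_back(row.a);
        b.push_back(row.b);
        c.push_back(row.c);
    }

    // Tuple stitching: a point query gathers one value per column.
    Row stitch(std::size_t offset) const {
        return Row{a[offset], b[offset], c[offset]};
    }
};

// NSM scan: reads whole tuples even though only b is needed.
int64_t sum_b_nsm(const RowTable& table) {
    int64_t sum = 0;
    for (const Row& row : table) sum += row.b;
    return sum;
}

// DSM scan: reads only the b column.
int64_t sum_b_dsm(const ColumnTable& table) {
    int64_t sum = 0;
    for (int32_t value : table.b) sum += value;
    return sum;
}
```

Both scans return the same answer. The difference is how many bytes each one pulls through the cache to compute it.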
DSM SYSTEM HISTORY
Tuple Identification
Data Organization
Update Policy
Buffering Location
DSM: TUPLE IDENTIFICATION
DSM: DATA ORGANIZATION
CASPER DELTA STORE
INSERT INTO xxx VALUES (a2, b1, c5);
INSERT INTO xxx VALUES (a2, b2, c6);
[Figure: Data Table with columns A, B, C (rows 0-9) and a Shallow Index mapping key→partition]
OBSERVATION
HYBRID STORAGE MODEL
SEPARATE EXECUTION ENGINES
FRACTURED MIRRORS
[Figure: Transactions run against the NSM (Primary) copy; Analytical Queries run against the DSM (Mirror) copy]
DELTA STORE
[Figure: Transactions write to the NSM Delta Store; Historical Data is kept in DSM]
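A delta store can be sketched as a row-oriented buffer in front of columnar historical data. This is a single-threaded sketch that assumes an explicit `merge()` step moves tuples from the delta into the columns. The class name `DeltaStoreTable` and its methods are hypothetical, and a real system would run the migration in the background.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Tuple {
    int32_t id;
    int32_t value;
};

class DeltaStoreTable {
public:
    // Transactions append whole tuples to the NSM delta store.
    void insert(const Tuple& tuple) { delta_.push_back(tuple); }

    // Migrates every delta tuple into the DSM historical columns.
    void merge() {
        for (const Tuple& tuple : delta_) {
            ids_.push_back(tuple.id);
            values_.push_back(tuple.value);
        }
        delta_.clear();
    }

    // Analytical scan: must read both the columns and the unmerged delta.
    int64_t sum_values() const {
        int64_t sum = 0;
        for (int32_t value : values_) sum += value;
        for (const Tuple& tuple : delta_) sum += tuple.value;
        return sum;
    }

    std::size_t delta_size() const { return delta_.size(); }
    std::size_t historical_size() const { return values_.size(); }

private:
    std::vector<Tuple> delta_;     // NSM: recent tuples
    std::vector<int32_t> ids_;     // DSM: historical data, one array
    std::vector<int32_t> values_;  //      per attribute
};
```

The scan result is the same before and after `merge()`. What changes is how much of the data the scan can read in columnar form.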
PELOTON ADAPTIVE STORAGE
SELECT AVG(B)
FROM AndySux
WHERE C = "yyy"
PELOTON ADAPTIVE STORAGE
[Figure: Execution Time (ms) across alternating Scan and Insert phases for Row Layout, Column Layout, and Adaptive Layout]
SYSTEM CATALOGS
SCHEMA CHANGES
ADD COLUMN:
→ NSM: Copy tuples into new region of memory.
→ DSM: Just create the new column segment.
DROP COLUMN:
→ NSM #1: Copy tuples into new region of memory.
→ NSM #2: Mark column as "deprecated", clean up later.
→ DSM: Just drop the column and free memory.
CHANGE COLUMN:
→ Check whether the conversion can happen. Depends on
default values.
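The ADD COLUMN and DROP COLUMN cases above can be sketched for both storage models. The sketch assumes every column is an `int32_t` and that new columns are filled with a default value. `DsmTable` and `NsmTable` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// DSM: column name -> column segment.
struct DsmTable {
    std::size_t num_tuples = 0;
    std::map<std::string, std::vector<int32_t>> columns;

    // ADD COLUMN: just create the new column segment.
    void add_column(const std::string& name, int32_t default_value) {
        columns[name] = std::vector<int32_t>(num_tuples, default_value);
    }

    // DROP COLUMN: drop the segment and free its memory.
    void drop_column(const std::string& name) { columns.erase(name); }
};

// NSM: tuples packed back to back in row-major order.
struct NsmTable {
    std::size_t num_columns = 0;
    std::vector<int32_t> data;

    std::size_t num_tuples() const {
        return num_columns == 0 ? 0 : data.size() / num_columns;
    }

    int32_t get(std::size_t tuple, std::size_t column) const {
        return data[tuple * num_columns + column];
    }

    // ADD COLUMN: copy every tuple into a new, wider region of memory.
    void add_column(int32_t default_value) {
        std::vector<int32_t> wider;
        wider.reserve(num_tuples() * (num_columns + 1));
        for (std::size_t t = 0; t < num_tuples(); ++t) {
            for (std::size_t c = 0; c < num_columns; ++c) {
                wider.push_back(get(t, c));
            }
            wider.push_back(default_value);
        }
        data = std::move(wider);
        ++num_columns;
    }
};
```

The NSM path touches every existing tuple, while the DSM path only allocates one new array, which matches the cost difference the list describes.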
INDEXES
CREATE INDEX:
→ Scan the entire table and populate the index.
→ Must record changes made by txns that modified the table
while another txn was building the index.
→ When the scan completes, lock the table and resolve
changes that were missed after the scan started.
DROP INDEX:
→ Just drop the index logically from the catalog.
→ It only becomes "invisible" when the txn that dropped it
commits. All existing txns will still have to update it.
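The CREATE INDEX steps above can be sketched as two phases. This is a single-threaded simulation, not a concurrent implementation. It assumes the scan reads a consistent snapshot, so logged changes are never already reflected in the scan, and it assumes the caller holds the table lock during the second phase. `Change`, `scan_table`, and `resolve_changes` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

using Index = std::multimap<int32_t, std::size_t>;  // key -> tuple slot

// A modification made by another txn while the index was being built.
struct Change {
    enum Kind { Insert, Delete };
    Kind kind;
    int32_t key;
    std::size_t slot;
};

// Phase 1: scan the table snapshot and populate the index.
Index scan_table(const std::vector<int32_t>& snapshot) {
    Index index;
    for (std::size_t slot = 0; slot < snapshot.size(); ++slot) {
        index.emplace(snapshot[slot], slot);
    }
    return index;
}

// Phase 2: replay the changes that were recorded after the scan started.
void resolve_changes(Index& index, const std::vector<Change>& log) {
    for (const Change& change : log) {
        if (change.kind == Change::Insert) {
            index.emplace(change.key, change.slot);
            continue;
        }
        // Delete: remove the entry that points at this exact slot.
        auto range = index.equal_range(change.key);
        for (auto it = range.first; it != range.second; ++it) {
            if (it->second == change.slot) {
                index.erase(it);
                break;
            }
        }
    }
}
```

Keeping the log separate from the scan lets other txns keep modifying the table during phase 1. The lock is only needed for the short replay in phase 2.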
SEQUENCES
PARTING THOUGHTS