Query Execution, Part II
Intro to Database Systems, Andy Pavlo
Lecture 13
ADMINISTRIVIA
QUERY EXECUTION
We discussed last class how to compose operators together to execute a query plan.

SELECT R.id, S.cdate
  FROM R JOIN S
    ON R.id = S.id
 WHERE S.value > 100
CMU 15-445/645 (Fall 2019)
WHY CARE ABOUT PARALLEL EXECUTION?
Increased performance.
→ Throughput
→ Latency
Parallel DBMSs:
→ Resources are physically close to each other.
→ Resources communicate with high-speed interconnect.
→ Communication is assumed to be cheap and reliable.
Distributed DBMSs:
→ Resources can be far from each other.
→ Resources communicate using slow(er) interconnect.
→ Communication cost and problems cannot be ignored.
TODAY'S AGENDA
Process Models
Execution Parallelism
I/O Parallelism
PROCESS MODEL
PROCESS MODELS
[Figure: Dispatcher, Worker]
PROCESS POOL
[Figure: Worker Threads]
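A minimal sketch of the pool idea, assuming worker threads rather than OS processes. `run_query` and `dispatch` are hypothetical names used only for illustration; a real DBMS dispatcher does far more (admission control, scheduling, shared state).

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for executing one query; a real DBMS would
# parse, plan, and run the query here.
def run_query(query_id: int) -> str:
    return f"result of query {query_id}"

def dispatch(query_ids, num_workers=4):
    """Hand each incoming query to whichever pool worker is free."""
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        # map() returns results in the same order as the inputs.
        return list(pool.map(run_query, query_ids))

results = dispatch([1, 2, 3])
```

The pool size bounds how many queries run at once, which is the main design knob in this model.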
PROCESS MODELS
SCHEDULING
INTER-QUERY PARALLELISM
INTRA-QUERY PARALLELISM
PARALLEL GRACE HASH JOIN
INTRA-QUERY PARALLELISM
INTRA-OPERATOR PARALLELISM
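SELECT * FROM A
 WHERE A.value > 99

[Figure: pages 1-5 of table A are divided among three workers (A1, A2, A3). Each worker applies σ value>99 to its pages, and an Exchange operator above them combines the results.]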
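The figure can be approximated with a short sketch. It assumes a gather-style exchange, round-robin assignment of pages to workers, and toy data; `PAGES`, `scan_fragment`, and `exchange_gather` are hypothetical names, not part of any DBMS.

```python
from concurrent.futures import ThreadPoolExecutor

# Toy table A: five "pages", each a list of (id, value) tuples.
PAGES = [
    [(1, 50), (2, 120)],
    [(3, 99), (4, 100)],
    [(5, 7)],
    [(6, 300)],
    [(7, 101), (8, 98)],
]

def scan_fragment(pages):
    """One worker: scan its pages and keep tuples with value > 99."""
    return [t for page in pages for t in page if t[1] > 99]

def exchange_gather(fragments):
    """Gather-style exchange: one worker per fragment, outputs merged."""
    with ThreadPoolExecutor(max_workers=len(fragments)) as pool:
        outputs = list(pool.map(scan_fragment, fragments))
    return [t for out in outputs for t in out]

# Assumption: pages are assigned to workers A1, A2, A3 round-robin.
fragments = [PAGES[i::3] for i in range(3)]
result = exchange_gather(fragments)
```

Operators above the exchange see one stream of tuples and do not need to know how many workers produced it.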
EXCHANGE OPERATOR
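INTRA-OPERATOR PARALLELISM
SELECT A.id, B.value
  FROM A JOIN B
    ON A.id = B.id
 WHERE A.value < 99
   AND B.value > 100

[Figure: parallel hash join plan. Three workers scan A1, A2, A3, apply σ, and build the hash table (Build HT) below an Exchange. Two workers scan B1 and B2, apply σ, and partition their output (Partition) below a second Exchange. Four workers probe the hash table (Probe HT) for the join ⨝, and a final Exchange collects their output for the projection π.]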
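A rough sketch of a parallel hash join for this query. It assumes a simplified variant in which both inputs are hash-partitioned on the join key and each worker builds and probes its own partition, which is not necessarily the exact plan in the figure. The data and function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

A = [(1, 10), (2, 200), (3, 30), (4, 40)]      # (id, value)
B = [(1, 500), (2, 600), (3, 50), (5, 700)]    # (id, value)

def partition(rows, n):
    """Split rows into n buckets by hashing the join key (id)."""
    parts = [[] for _ in range(n)]
    for row in rows:
        parts[hash(row[0]) % n].append(row)
    return parts

def join_partition(pair):
    """One worker: build a hash table on its A rows, then probe with B rows."""
    a_rows, b_rows = pair
    table = {}
    for a_id, a_val in a_rows:
        table.setdefault(a_id, []).append(a_val)
    # Emit (A.id, B.value) once per matching A row.
    return [(b_id, b_val) for b_id, b_val in b_rows for _ in table.get(b_id, [])]

def parallel_hash_join(a_rows, b_rows, n=3):
    # Apply the filters first, as the selection operators do in the plan.
    a_rows = [r for r in a_rows if r[1] < 99]
    b_rows = [r for r in b_rows if r[1] > 100]
    pairs = list(zip(partition(a_rows, n), partition(b_rows, n)))
    with ThreadPoolExecutor(max_workers=n) as pool:
        outputs = list(pool.map(join_partition, pairs))
    # Final exchange: concatenate the per-worker outputs.
    return [row for out in outputs for row in out]

result = parallel_hash_join(A, B)
```

Because matching keys always hash to the same partition, the workers never need to look at each other's hash tables.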
INTER-OPERATOR PARALLELISM
SELECT A.id, B.value
  FROM A JOIN B
    ON A.id = B.id
 WHERE A.value < 99
   AND B.value > 100

Worker 1 (join ⨝):
  for r1 ∊ outer:
    for r2 ∊ inner:
      emit(r1 ⨝ r2)

Worker 2 (projection π):
  for r ∊ incoming:
    emit(π r)

[Figure: query plan π over ⨝ over σ(A) and σ(B). Worker 1 runs the join and worker 2 runs the projection on the tuples the join emits.]
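The two loops can be run as a pipeline in which each operator has its own thread. This is a minimal sketch assuming a queue between the operators and a sentinel to mark the end of the stream. The filters are omitted for brevity, and all names and data are hypothetical.

```python
import queue
import threading

DONE = object()  # sentinel marking the end of the tuple stream

def join_worker(outer, inner, out_q):
    """Worker 1: nested-loop join on id, emitting each match downstream."""
    for r1 in outer:
        for r2 in inner:
            if r1[0] == r2[0]:
                out_q.put(r1 + r2)  # emit(r1 join r2)
    out_q.put(DONE)

def project_worker(in_q, results):
    """Worker 2: consume joined tuples as they arrive and project them."""
    while True:
        r = in_q.get()
        if r is DONE:
            break
        results.append((r[0], r[3]))  # keep A.id and B.value

def run_pipeline(outer, inner):
    q = queue.Queue()
    results = []
    t1 = threading.Thread(target=join_worker, args=(outer, inner, q))
    t2 = threading.Thread(target=project_worker, args=(q, results))
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    return results

A = [(1, 10), (2, 20)]               # (id, value)
B = [(2, 150), (1, 300), (3, 400)]   # (id, value)
pipeline_out = run_pipeline(A, B)
```

The projection starts working as soon as the first joined tuple is available, without waiting for the join to finish.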
BUSHY PARALLELISM
OBSERVATION
I/O PARALLELISM
MULTI-DISK PARALLELISM
RAID 0 (Striping)
MULTI-DISK PARALLELISM
RAID 1 (Mirroring)
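The difference between the two RAID levels can be shown as a page-placement sketch. It assumes striping at the granularity of one database page and zero-based disk numbers. Real arrays stripe in fixed-size blocks, and the function names here are hypothetical.

```python
def raid0_location(page_id: int, num_disks: int):
    """RAID 0 (striping): each page lives on exactly one disk, round-robin."""
    return page_id % num_disks, page_id // num_disks  # (disk, slot on that disk)

def raid1_locations(page_id: int, num_disks: int):
    """RAID 1 (mirroring): every disk holds a copy of every page."""
    return [(disk, page_id) for disk in range(num_disks)]

stripe = raid0_location(4, 3)
mirrors = raid1_locations(4, 2)
```

Under striping, reads of consecutive pages hit different disks. Under mirroring, any disk can serve a read, but every write must go to all disks.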
DATABASE PARTITIONING
PARTITIONING
VERTICAL PARTITIONING
Store a table’s attributes in a separate location (e.g., file, disk volume).
Have to store tuple information to reconstruct the original record.

CREATE TABLE foo (
  attr1 INT,
  attr2 INT,
  attr3 INT,
  attr4 TEXT
);

Partition #1                       Partition #2
Tuple#1  attr1  attr2  attr3       Tuple#1  attr4
Tuple#2  attr1  attr2  attr3       Tuple#2  attr4
Tuple#3  attr1  attr2  attr3       Tuple#3  attr4
Tuple#4  attr1  attr2  attr3       Tuple#4  attr4
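A small sketch of the split shown above, assuming each partition is keyed by a tuple id so that the original record can be stitched back together. The sample rows and function names are hypothetical.

```python
# Each row is (tuple_id, attr1, attr2, attr3, attr4).
rows = [
    (1, 10, 11, 12, "a"),
    (2, 20, 21, 22, "b"),
]

def split_vertical(rows):
    """Put attr1..attr3 in one partition and attr4 in another, keyed by tuple id."""
    part1 = {r[0]: (r[1], r[2], r[3]) for r in rows}
    part2 = {r[0]: r[4] for r in rows}
    return part1, part2

def reconstruct(part1, part2, tuple_id):
    """Rebuild the original record by looking up the same tuple id in both partitions."""
    return part1[tuple_id] + (part2[tuple_id],)

part1, part2 = split_vertical(rows)
record = reconstruct(part1, part2, 2)
```

Queries that touch only attr1 through attr3 never read the larger TEXT column, at the cost of an extra lookup when the whole record is needed.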
HORIZONTAL PARTITIONING
Divide the tuples of a table up into disjoint segments based on some partitioning key.
→ Hash Partitioning
→ Range Partitioning
→ Predicate Partitioning

CREATE TABLE foo (
  attr1 INT,
  attr2 INT,
  attr3 INT,
  attr4 TEXT
);

Partition #1                              Partition #2
Tuple#1  attr1  attr2  attr3  attr4       Tuple#3  attr1  attr2  attr3  attr4
Tuple#2  attr1  attr2  attr3  attr4       Tuple#4  attr1  attr2  attr3  attr4
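The three schemes listed above differ only in how a tuple is mapped to a partition number. A minimal sketch, with hypothetical function names and integer keys assumed for the hash case:

```python
import bisect

def hash_partition(key, num_partitions):
    """Hash partitioning: partition number is the key's hash modulo the count."""
    return hash(key) % num_partitions

def range_partition(key, boundaries):
    """Range partitioning over sorted boundaries.

    Keys below boundaries[0] go to partition 0; a key equal to a
    boundary goes to the higher partition.
    """
    return bisect.bisect_right(boundaries, key)

def predicate_partition(row, predicates):
    """Predicate partitioning: index of the first predicate the row satisfies."""
    for i, pred in enumerate(predicates):
        if pred(row):
            return i
    raise ValueError("row matches no partition predicate")

predicates = [lambda r: r["attr1"] < 0, lambda r: r["attr1"] >= 0]
chosen = predicate_partition({"attr1": 5}, predicates)
```

Hash partitioning spreads keys evenly but scatters ranges. Range partitioning keeps neighboring keys together, which helps range scans but can skew load.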
CONCLUSION
MIDTERM EXAM
Who: You
What: Midterm Exam
When: Wed Oct 16th @ 12:00pm ‐ 1:20pm
Where: MM 103
Why: https://youtu.be/GHPB1eCROSA
MIDTERM EXAM
What to bring:
→ CMU ID
→ Calculator
→ One 8.5x11" page of handwritten notes (double-sided)
RELATIONAL MODEL
Integrity Constraints
Relational Algebra
SQL
Basic operations:
→ SELECT / INSERT / UPDATE / DELETE
→ WHERE predicates
→ Output control
More complex operations:
→ Joins
→ Aggregates
→ Common Table Expressions
STORAGE
HASHING
Static Hashing
→ Linear Probing
→ Robin Hood
→ Cuckoo Hashing
Dynamic Hashing
→ Extendible Hashing
→ Linear Hashing
TREE INDEXES
B+Tree
→ Insertions / Deletions
→ Splits / Merges
→ Difference with B-Tree
→ Latch Crabbing / Coupling
Radix Trees
SORTING
JOINS
QUERY PROCESSING
Processing Models
→ Advantages / Disadvantages
Parallel Execution
→ Inter- vs. Intra-Operator Parallelism
NEXT CLASS