Lecture 15: Query Planning & Optimization
15-445/645 FALL 2024 PROF. ANDY PAVLO
ADMINISTRIVIA
Project #3 is due Sunday Nov 17th @ 11:59pm
→ Recitation will be next week
LAST CLASS
We talked about how to design the DBMS's
architecture to execute queries in parallel.
MOTIVATION
SELECT DISTINCT ename
FROM Emp E JOIN Dept D
ON E.did = D.did
WHERE D.dname = 'Toy'
Catalog
→ Emp(ssn,ename,addr,sal,did): 10,000 records, 1,000 pages (indexes: one clustered, two unclustered)
→ Dept(did,dname,floor,mgr): 500 records, 50 pages (indexes: one clustered, one unclustered)
Plan: π ename (σ dname = 'Toy' (σ Emp.did = Dept.did (Emp × Dept)))
MOTIVATION
Plan: π ename (σ dname = 'Toy' (σ Emp.did = Dept.did (Emp × Dept))), with estimated I/O costs:
→ Emp × Dept: (50 + 50,000) reads + 1,000,000 writes (write temp file T1, 5 tuples per page in T1)
→ σ Emp.did = Dept.did: 1,000,000 reads + 2,000 writes (FK join, 10k tuples in temp T2)
→ σ dname = 'Toy': 2,000 reads + 4 writes (10K/500 = 20 emps per dept)
These steps alone add up to more than 2 million I/Os.
MOTIVATION
SELECT DISTINCT ename
FROM Emp E JOIN Dept D
ON E.did = D.did
WHERE D.dname = 'Toy'
Plan: π ename (σ dname = 'Toy' (Emp ⋈ Emp.did = Dept.did Dept))
→ ⋈ Emp.did = Dept.did: (50 + 50,000) reads + 2,000 writes
→ σ dname = 'Toy': 2,000 reads + 4 writes (read temp T1, write temp T2)
→ π ename: 4 reads + 4 writes (read temp T2)
Total: 54k I/Os
MOTIVATION
SELECT DISTINCT ename
FROM Emp E JOIN Dept D
ON E.did = D.did
WHERE D.dname = 'Toy'
Plan: π ename (σ dname = 'Toy' (Emp ⋈ Emp.did = Dept.did Dept))
→ ⋈ Emp.did = Dept.did: 3×(|Emp| + |Dept|) = 3,150 reads + 2,000 writes
→ σ dname = 'Toy': 2,000 reads + 4 writes (read temp T1, write temp T2)
→ π ename: 4 reads + 4 writes (read temp T2)
Total: 7,159 I/Os
MOTIVATION
SELECT DISTINCT ename
FROM Emp E JOIN Dept D
ON E.did = D.did
WHERE D.dname = 'Toy'
Plan: π ename (σ dname = 'Toy' (Emp ⋈ Emp.did = Dept.did Dept)), join cost 3×(|Emp| + |Dept|) = 3,150 reads
→ Materialization Model: Total: 7,159 I/Os
→ Vectorization Model: Total: 3,151 I/Os
MOTIVATION
SELECT DISTINCT ename
FROM Emp E JOIN Dept D
ON E.did = D.did
WHERE D.dname = 'Toy'
Plan: π ename (σ dname = 'Toy' (Emp ⋈ Emp.did = Dept.did Dept))
Swap the join inputs: π ename (σ dname = 'Toy' (Dept ⋈ Emp.did = Dept.did Emp))
MOTIVATION
SELECT DISTINCT ename
FROM Emp E JOIN Dept D
ON E.did = D.did
WHERE D.dname = 'Toy'
Plan: π ename ((σ dname = 'Toy' (Dept)) ⋈ Emp.did = Dept.did Emp)
→ σ dname = 'Toy' on Dept (Access: Index(dname)): 3 reads + 1 write
→ ⋈ Emp.did = Dept.did (Index Nested-Loop Join): 1 + 3 (idx) + 20 (ptr chase) reads + 4 writes
→ π ename: 4 reads + 1 write (read temp T2)
Total: 37 I/Os
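The I/O figures on these slides can be reproduced with a few lines of arithmetic. The sketch below is only an illustration: the function names are hypothetical, and it assumes that the "(50 + 50,000) reads" figure comes from a page-oriented nested-loop join with Dept as the outer table.

```python
def page_nested_loop_reads(outer_pages: int, inner_pages: int) -> int:
    """Reads for a page-oriented nested-loop join (assumed formula):
    scan the outer once, and scan the inner once per outer page."""
    return outer_pages + outer_pages * inner_pages


def three_pass_join_reads(left_pages: int, right_pages: int) -> int:
    """Reads for a join that makes three passes over both inputs: 3 x (|L| + |R|)."""
    return 3 * (left_pages + right_pages)


EMP_PAGES, DEPT_PAGES = 1_000, 50  # from the catalog on the slide

nlj_reads = page_nested_loop_reads(DEPT_PAGES, EMP_PAGES)        # 50 + 50,000
three_pass_reads = three_pass_join_reads(EMP_PAGES, DEPT_PAGES)  # 3 x 1,050

# Index nested-loop plan: (reads, writes) per operator, copied from the slide.
index_plan = {
    "select dname = 'Toy' via Index(dname)": (3, 1),
    "index nested-loop join": (1 + 3 + 20, 4),
    "project ename": (4, 1),
}
index_plan_total = sum(reads + writes for reads, writes in index_plan.values())

print(nlj_reads, three_pass_reads, index_plan_total)
```

The gap between the first and last numbers is the point of the example: the same query and the same data differ by several orders of magnitude in I/O depending on the plan.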
TODAY'S AGENDA
Background
Heuristic / Rule-based Optimization
Cost-based Optimization
Cost Model Estimation
ARCHITECTURE OVERVIEW
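Components: Application, Parser, Binder, Optimizer, System Catalog, Cost Model
Pipeline: Application → Parser → (2) Abstract Syntax Tree → Binder (Name→Internal ID) → (3) Logical Plan → Optimizer → (4) Physical Plan
→ The System Catalog supplies Schema Info.
→ The Cost Model feeds the Optimizer.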
QUERY OPTIMIZATION
Heuristics / Rules
→ Rewrite the query to remove (guessed) inefficiencies.
→ Examples: always do selections first or push down
projections as early as possible.
→ These techniques may need to examine the catalog, but they do
not need to examine data.
Cost-based Search
→ Use a model to estimate the cost of executing a plan.
→ Enumerate multiple equivalent plans for a query and pick
the one with the lowest cost.
PREDICATE PUSHDOWN
Before: π ename (σ dname = 'Toy' (Dept ⋈ Emp.did = Dept.did Emp))
Rewrite: π ename (Emp ⋈ Emp.did = Dept.did σ dname = 'Toy' (Dept))
Related rewrite: replace a selection over a Cartesian product with a join.
Before: σ Emp.did = Dept.did (Emp × Dept)
Rewrite: Emp ⋈ Emp.did = Dept.did Dept
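A rule like predicate pushdown can be sketched as a recursive rewrite over a plan tree. The node classes and the `push_down` function below are hypothetical names for illustration, and the sketch assumes each selection predicate references exactly one table.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Scan:
    table: str


@dataclass(frozen=True)
class Select:
    table: str       # the single table the predicate references (assumption)
    predicate: str
    child: "Plan"


@dataclass(frozen=True)
class Join:
    condition: str
    left: "Plan"
    right: "Plan"


@dataclass(frozen=True)
class Project:
    columns: tuple
    child: "Plan"


Plan = Union[Scan, Select, Join, Project]


def tables(plan) -> set:
    """Return the set of base tables that appear under a plan node."""
    if isinstance(plan, Scan):
        return {plan.table}
    if isinstance(plan, Join):
        return tables(plan.left) | tables(plan.right)
    return tables(plan.child)


def push_down(plan):
    """Move each Select below any Join whose one side contains its table."""
    if isinstance(plan, Scan):
        return plan
    if isinstance(plan, Project):
        return Project(plan.columns, push_down(plan.child))
    if isinstance(plan, Join):
        return Join(plan.condition, push_down(plan.left), push_down(plan.right))
    child = push_down(plan.child)  # plan is a Select
    if isinstance(child, Join):
        if plan.table in tables(child.left):
            pushed = push_down(Select(plan.table, plan.predicate, child.left))
            return Join(child.condition, pushed, child.right)
        if plan.table in tables(child.right):
            pushed = push_down(Select(plan.table, plan.predicate, child.right))
            return Join(child.condition, child.left, pushed)
    return Select(plan.table, plan.predicate, child)


before = Project(("ename",),
                 Select("Dept", "dname = 'Toy'",
                        Join("Emp.did = Dept.did", Scan("Dept"), Scan("Emp"))))
after = push_down(before)
print(after)
```

The rule needs only the plan shape and table names, which matches the point that heuristic rewrites may consult the catalog but never the data.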
PROJECTION PUSHDOWN
Before: π ename ( … ⋈ Emp.did = Dept.did Emp)
Rewrite: π ename ( … ⋈ Emp.did = Dept.did π ename,did (Emp))
BOTTOM-UP OPTIMIZATION
Use static rules to perform initial optimization.
Then use dynamic programming to determine
the best join order for tables using a
divide-and-conquer search method.
SYSTEM R OPTIMIZER
Break query into blocks and generate
logical operators for each block.
For each logical operator, generate a
set of physical operators that
implement it.
→ All combinations of join algorithms and
access paths
Then, iteratively construct a “left-deep”
join tree that minimizes the estimated
amount of work to execute the plan.
Left-Deep Tree: ((A ⋈ B) ⋈ C) ⋈ D, where A is the outer and B is the inner input of the first join
Bushy Tree: a join tree over A, B, C, D in which both inputs of a join may themselves be joins
SYSTEM R OPTIMIZER
SELECT ARTIST.NAME
FROM ARTIST, APPEARS, ALBUM
WHERE ARTIST.ID=APPEARS.ARTIST_ID
AND APPEARS.ALBUM_ID=ALBUM.ID
AND ALBUM.NAME='Andy''s OG Remix'
ORDER BY ARTIST.ID
Step #1: Choose the best access paths to each table
→ ARTIST: Sequential Scan
→ APPEARS: Sequential Scan
→ ALBUM: Index Look-up on NAME
Step #2: Enumerate all possible join orderings for tables
→ ARTIST ⨝ APPEARS ⨝ ALBUM
→ APPEARS ⨝ ALBUM ⨝ ARTIST
→ ALBUM ⨝ APPEARS ⨝ ARTIST
→ APPEARS ⨝ ARTIST ⨝ ALBUM
→ ARTIST × ALBUM ⨝ APPEARS
→ ALBUM × ARTIST ⨝ APPEARS
→ ⋮
Step #3: Determine the join ordering with the lowest cost
[Figure: search over join orders for ARTIST, APPEARS, and ALBUM, starting from ALBUM⨝APPEARS and then ARTIST, with steps labeled by the join predicates ARTIST.ID=APPEARS.ARTIST_ID and APPEARS.ALBUM_ID=ALBUM.ID]
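Steps #2 and #3 can be sketched as dynamic programming over sets of tables, restricted to left-deep trees. This is a simplified illustration rather than System R's actual algorithm. The cardinalities, join selectivities, and cost measure (sum of intermediate result sizes) are all made-up assumptions, and the sketch ignores physical operators and the sort order required by ORDER BY.

```python
from fractions import Fraction
from itertools import combinations


def best_left_deep_order(card, sel):
    """card: table -> row count.
    sel: frozenset({a, b}) -> join selectivity; a missing pair is a cross product.
    Returns (cost, output_rows, join_order) for the cheapest left-deep plan."""
    names = sorted(card)
    # best[S] holds the cheapest left-deep plan found for the set of tables S.
    best = {frozenset([t]): (Fraction(0), Fraction(card[t]), (t,)) for t in names}
    for size in range(2, len(names) + 1):
        for subset in combinations(names, size):
            s = frozenset(subset)
            for inner in subset:           # table joined last, as the inner input
                rest = s - {inner}
                cost, rows, order = best[rest]
                factor = Fraction(1)
                for t in rest:             # apply every predicate linking inner to rest
                    factor *= sel.get(frozenset((t, inner)), Fraction(1))
                out_rows = rows * card[inner] * factor
                total = cost + out_rows    # toy cost: sum of intermediate sizes
                if s not in best or total < best[s][0]:
                    best[s] = (total, out_rows, order + (inner,))
    return best[frozenset(names)]


# Hypothetical statistics: ALBUM is already filtered to one row by the index look-up.
card = {"ARTIST": 1000, "APPEARS": 5000, "ALBUM": 1}
sel = {
    frozenset(("ARTIST", "APPEARS")): Fraction(1, 1000),
    frozenset(("APPEARS", "ALBUM")): Fraction(1, 500),
}
cost, rows, order = best_left_deep_order(card, sel)
print(cost, rows, order)
```

Under these assumed numbers, the cheapest orders join APPEARS with ALBUM first and bring in ARTIST last, while orders that begin with the ARTIST × ALBUM cross product are priced out.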
TOP-DOWN OPTIMIZATION
Start with a logical plan of what we want the query
to be. Perform a branch-and-bound search to
traverse the plan tree by converting logical
operators into physical operators.
→ Keep track of global best plan during search.
→ Treat physical properties of data as first-class entities
during planning.
→ Logical→Logical:
JOIN(A,B) to JOIN(B,A)
→ Logical→Physical:
JOIN(A,B) to HASH_JOIN(A,B)
[Figure: top-down search tree over the sub-plans ARTIST⨝APPEARS, ALBUM⨝APPEARS, and ARTIST⨝ALBUM, with physical alternatives HASH_JOIN(A1,A2) and MERGE_JOIN(A1,A2)]
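The two kinds of rules can be sketched as small functions over tuple-encoded expressions. This only illustrates rule application for a single join while tracking the best plan seen so far. It is not a full top-down search, and the table sizes and cost function are hypothetical.

```python
def commute(expr):
    """Logical->logical rule: JOIN(A,B) to JOIN(B,A)."""
    op, a, b = expr
    return ("JOIN", b, a) if op == "JOIN" else None


def implement(expr):
    """Logical->physical rule: JOIN(A,B) to HASH_JOIN(A,B) or MERGE_JOIN(A,B)."""
    op, a, b = expr
    if op != "JOIN":
        return []
    return [("HASH_JOIN", a, b), ("MERGE_JOIN", a, b)]


def physical_alternatives(expr):
    """Apply the logical rule first, then implement every logical form."""
    logical = [expr]
    swapped = commute(expr)
    if swapped is not None and swapped not in logical:
        logical.append(swapped)
    return [plan for form in logical for plan in implement(form)]


def pick_best(expr, cost):
    """Keep track of the cheapest physical plan seen during the search."""
    best, best_cost = None, float("inf")
    for plan in physical_alternatives(expr):
        c = cost(plan)
        if c < best_cost:
            best, best_cost = plan, c
    return best, best_cost


SIZES = {"ARTIST": 1000, "APPEARS": 5000}  # hypothetical page counts


def toy_cost(plan):
    """Made-up cost model: a hash join pays twice for its build (first) input."""
    op, a, b = plan
    if op == "HASH_JOIN":
        return 2 * SIZES[a] + SIZES[b]
    return 3 * (SIZES[a] + SIZES[b])


best_plan, best_cost = pick_best(("JOIN", "ARTIST", "APPEARS"), toy_cost)
print(best_plan, best_cost)
```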
OBSERVATION
Applications often execute nested queries.
→ We could optimize each block using the methods we have
discussed.
→ However, this may be inefficient since we optimize each
block separately without a global approach.
NESTED SUB-QUERIES
The DBMS treats nested sub-queries in the WHERE
clause as functions that take parameters and return
a single value or set of values.
SELECT name
FROM sailors AS S, reserves AS R
WHERE S.sid = R.sid
AND R.day = '2022-10-25'
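To see what flattening buys, the join form above can be compared with a nested form using Python's built-in `sqlite3` module. The nested `EXISTS` query, the `bid` column, and the sample rows are assumptions made for this illustration and do not come from the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sailors (sid INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE reserves (sid INTEGER, bid INTEGER, day TEXT);
    INSERT INTO sailors VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO reserves VALUES (1, 101, '2022-10-25'),
                                (2, 102, '2022-10-26'),
                                (3, 103, '2022-10-25');
""")

# Assumed nested form: the inner block is correlated with S.sid.
nested = """
    SELECT name FROM sailors AS S
    WHERE EXISTS (SELECT * FROM reserves AS R
                  WHERE S.sid = R.sid AND R.day = '2022-10-25')
"""

# Flattened form, as shown on the slide.
flattened = """
    SELECT name
    FROM sailors AS S, reserves AS R
    WHERE S.sid = R.sid
      AND R.day = '2022-10-25'
"""

nested_rows = sorted(conn.execute(nested).fetchall())
flat_rows = sorted(conn.execute(flattened).fetchall())
print(nested_rows, flat_rows)
```

The two forms agree here because each sailor has at most one reservation on that day. If a sailor had several, the join would return duplicate names, so a rewrite of this kind has to account for duplicates.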
DECOMPOSING QUERIES
For harder queries, the optimizer breaks up queries
into blocks and then concentrates on one block at a
time.
EXPRESSION REWRITING
An optimizer transforms a query’s expressions (e.g.,
WHERE/ON clause predicates) into the minimal set of
expressions.
Impossible / Unnecessary Predicates
SELECT * FROM A WHERE 1 = 0;
Rewritten as:
SELECT * FROM A WHERE false;
SELECT * FROM A WHERE NOW() IS NULL;
Rewritten as:
SELECT * FROM A WHERE false;
Merging Predicates
SELECT * FROM A
WHERE val BETWEEN 1 AND 100
OR val BETWEEN 50 AND 150;
Rewritten as:
SELECT * FROM A
WHERE val BETWEEN 1 AND 150;
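Merging OR-ed BETWEEN predicates on the same column amounts to merging overlapping intervals. A minimal sketch follows; the function name is hypothetical, and it only merges ranges that actually overlap.

```python
def merge_between_ranges(ranges):
    """Merge inclusive (low, high) ranges that are OR-ed together on one column."""
    merged = []
    for low, high in sorted(ranges):
        if merged and low <= merged[-1][1]:
            # Overlaps the previous range: extend its upper bound if needed.
            merged[-1] = (merged[-1][0], max(merged[-1][1], high))
        else:
            merged.append((low, high))
    return merged


# val BETWEEN 1 AND 100 OR val BETWEEN 50 AND 150
print(merge_between_ranges([(1, 100), (50, 150)]))
```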
OBSERVATION
We have formulas for the operator
algorithms (e.g., the cost formulas for
hash join, sort-merge join, …), but we
also need to estimate the size of the
output that an operator produces.
[Figure: plan π ename over ⋈ Emp.did = Dept.did with inputs Dept and Emp; the output sizes of ⋈ and π are marked ???]
COST ESTIMATION
The DBMS uses a cost model to predict the
behavior of a query plan given a database state.
→ This is an internal cost that allows the DBMS to compare
one plan with another.
STATISTICS
The DBMS stores internal statistics about tables,
attributes, and indexes in its internal catalog.
Different systems update them at different times.
Manual invocations:
→ Postgres/SQLite: ANALYZE
→ Oracle/MySQL: ANALYZE TABLE
→ SQL Server: UPDATE STATISTICS
→ DB2: RUNSTATS
SELECTION CARDINALITY
The selectivity (sel) of a predicate P
is the fraction of tuples that qualify.
Equality Predicate: A=constant
→ sel(A=constant) = #occurrences / |R|
→ Example: sel(age=9) = 4/45
SELECT * FROM people
WHERE age = 9
[Figure: histogram of # of occurrences per distinct value of attribute age (1 to 15); SC(age=9)=4]
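The equality formula is a single division once per-value counts are known. In the sketch below, the `ages` column is a hypothetical stand-in built to match the slide's example of 4 matching tuples out of 45.

```python
from collections import Counter
from fractions import Fraction


def equality_selectivity(values, constant):
    """sel(A = constant) = #occurrences / |R|, computed from exact counts."""
    counts = Counter(values)
    return Fraction(counts[constant], len(values))


# Hypothetical column: 45 tuples, 4 of which have age = 9.
ages = [9] * 4 + [30] * 41
sel = equality_selectivity(ages, 9)
print(sel)
```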
SELECTION CARDINALITY
Assumption #1: Uniform Data
→ The distribution of values (except for the heavy hitters) is
the same.
CORRELATED ATTRIBUTES
Consider a database of automobiles:
→ # of Makes = 10, # of Models = 100
And the following query:
→ (make=“Honda” AND model=“Accord”)
With the independence and uniformity
assumptions, the selectivity is:
→ 1/10 × 1/100 = 0.001
But since only Honda makes Accords the real
selectivity is 1/100 = 0.01
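The size of the error is easy to work out. The sketch below uses the numbers from the slide; the variable names are illustrative.

```python
from fractions import Fraction

num_makes, num_models = 10, 100

sel_make = Fraction(1, num_makes)     # uniformity: each make is equally likely
sel_model = Fraction(1, num_models)   # uniformity: each model is equally likely

estimated = sel_make * sel_model      # independence: multiply the selectivities
actual = sel_model                    # model = 'Accord' already implies make = 'Honda'

underestimate_factor = actual / estimated
print(float(estimated), float(actual), underestimate_factor)
```

The estimate is off by a factor equal to the number of makes. Errors like this compound when the result feeds the estimates for operators higher in the plan.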
STATISTICS
Choice #1: Histograms
→ Maintain an occurrence count per value (or range of
values) in a column.
HISTOGRAMS
Our formulas are nice, but we assume that data
values are uniformly distributed.
[Figure: histogram of # of occurrences (y-axis) per value, 1 to 15 (x-axis)]
EQUI-WIDTH HISTOGRAM
Maintain counts for a group of values instead of
each unique key. All buckets have the same width
(i.e., same # of values).
Equi-Width Histogram
→ Bucket #1 (1-3): Count=8
→ Bucket #2 (4-6): Count=4
→ Bucket #3 (7-9): Count=15
→ Bucket #4 (10-12): Count=3
→ Bucket #5 (13-15): Count=14
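A short sketch shows how such buckets are built and used. The per-value counts below are hypothetical, chosen only so that they add up to the bucket counts listed on the slide.

```python
# Hypothetical per-value counts for values 1..15.
counts = {1: 2, 2: 3, 3: 3, 4: 1, 5: 2, 6: 1, 7: 5, 8: 6, 9: 4,
          10: 1, 11: 1, 12: 1, 13: 4, 14: 5, 15: 5}


def equi_width(counts, width):
    """Group consecutive values into buckets that each span `width` values."""
    lo, hi = min(counts), max(counts)
    buckets = []
    for start in range(lo, hi + 1, width):
        end = min(start + width - 1, hi)
        total = sum(counts.get(v, 0) for v in range(start, end + 1))
        buckets.append(((start, end), total))
    return buckets


def estimate_equal(buckets, value):
    """Estimate #occurrences of one value, assuming uniformity inside its bucket."""
    for (start, end), total in buckets:
        if start <= value <= end:
            return total / (end - start + 1)
    return 0.0


buckets = equi_width(counts, 3)
print(buckets)
print(estimate_equal(buckets, 9), counts[9])
```

With these assumed counts the histogram estimates 5 occurrences of the value 9 where the true count is 4: the bucket is smaller to store than per-value counts, but less accurate.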
EQUI-DEPTH HISTOGRAMS
Vary the width of buckets so that the total number
of occurrences for each bucket is roughly the same.
Histogram (Quantiles)
→ Bucket Ranges: 1-5, 6-8, 9-13, 14-15
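One simple way to pick the boundaries is a greedy pass that closes a bucket once it reaches the target depth. This is a sketch of one possible construction, not necessarily the method a given DBMS uses, and the per-value counts are hypothetical.

```python
# Hypothetical per-value counts for values 1..15.
counts = {1: 2, 2: 3, 3: 3, 4: 1, 5: 2, 6: 1, 7: 5, 8: 6, 9: 4,
          10: 1, 11: 1, 12: 1, 13: 4, 14: 5, 15: 5}


def equi_depth(counts, num_buckets):
    """Greedily build buckets whose total counts are roughly equal."""
    target = sum(counts.values()) / num_buckets
    values = sorted(counts)
    buckets, start, running = [], None, 0
    for v in values:
        if start is None:
            start = v
        running += counts[v]
        # Close the bucket at the target depth; the last bucket takes the remainder.
        if running >= target and len(buckets) < num_buckets - 1:
            buckets.append(((start, v), running))
            start, running = None, 0
    if start is not None:
        buckets.append(((start, values[-1]), running))
    return buckets


depth_buckets = equi_depth(counts, 4)
print(depth_buckets)
```

With these assumed counts, the greedy pass reproduces the bucket ranges 1-5, 6-8, 9-13, and 14-15, each holding between 10 and 12 occurrences.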
SKETCHES
Probabilistic data structures that generate
approximate statistics about a data set.
The cost model can replace histograms with sketches to
improve the accuracy of its selectivity estimates.
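As one illustration of a sketch, the code below implements a small Count-Min Sketch, which approximates per-value frequency counts in fixed space. The choice of this particular sketch, the table dimensions, and the SHA-256-based hashing are assumptions made for the example, not something the slide specifies.

```python
import hashlib


class CountMinSketch:
    """Approximate frequency counts; estimates never fall below the true count."""

    def __init__(self, width=64, depth=4):
        self.width, self.depth = width, depth
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, row, item):
        # One hash function per row, derived by salting the input with the row number.
        digest = hashlib.sha256(f"{row}:{item}".encode()).digest()
        return int.from_bytes(digest[:8], "big") % self.width

    def add(self, item, count=1):
        for row in range(self.depth):
            self.table[row][self._index(row, item)] += count

    def estimate(self, item):
        # Collisions only inflate counters, so the minimum is the tightest bound.
        return min(self.table[row][self._index(row, item)]
                   for row in range(self.depth))


# Hypothetical per-value counts for an age column.
true_counts = {1: 2, 2: 3, 3: 3, 7: 5, 8: 6, 9: 4, 14: 5, 15: 5}
cms = CountMinSketch()
for value, n in true_counts.items():
    cms.add(value, n)

print(cms.estimate(9), true_counts[9])
```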
SAMPLING
Modern DBMSs also collect samples
from tables to estimate selectivities.
Update samples when the underlying
tables change significantly.
SELECT AVG(age)
FROM people
WHERE age > 50
people (1 billion tuples)
id name age status
1001 Obama 63 Rested
1002 Swift 34 Paid
1003 Tupac 25 Dead
1004 Bieber 30 Crunk
1005 Andy 43 Illin
1006 TigerKing 61 Jailed
⋮
Table Sample
1001 Obama 63 Rested
1003 Tupac 25 Dead
1005 Andy 43 Illin
sel(age>50) = 1/3
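The sample-based estimate can be reproduced directly from the three sampled rows. The helper name below is hypothetical, and scaling the result to the full table assumes the sample is representative.

```python
from fractions import Fraction

# The table sample from the slide: (id, name, age, status).
sample = [
    (1001, "Obama", 63, "Rested"),
    (1003, "Tupac", 25, "Dead"),
    (1005, "Andy", 43, "Illin"),
]


def sample_selectivity(rows, predicate):
    """Fraction of sampled rows that satisfy the predicate."""
    matches = sum(1 for row in rows if predicate(row))
    return Fraction(matches, len(rows))


sel = sample_selectivity(sample, lambda row: row[2] > 50)
estimated_rows = sel * 1_000_000_000  # scale to the 1 billion tuple table
print(sel, int(estimated_rows))
```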
CONCLUSION
Query optimization is critical for a database system.
→ SQL → Logical Plan → Physical Plan
→ Flatten queries before going to the optimization part.
Expression handling is also important.
→ Estimate costs using models based on summarizations.
NEXT CLASS
Transactions!
→ aka the second hardest part about database systems