Query Planning & Optimization: Intro To Database Systems Andy Pavlo
14 Optimization
Part I
Intro to Database Systems Andy Pavlo
15-445/15-645
Fall 2019 AP Computer Science
Carnegie Mellon University
ADMINISTRIVIA
QUERY OPTIMIZATION
IBM SYSTEM R
QUERY OPTIMIZATION
Heuristics / Rules
→ Rewrite the query to remove stupid / inefficient things.
→ These techniques may need to examine the catalog, but they
do not need to examine data.
Cost-based Search
→ Use a model to estimate the cost of executing a plan.
→ Evaluate multiple equivalent plans for a query and pick
the one with the lowest cost.
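A toy sketch of cost-based search, assuming a very crude cost model: the cost of a plan is the number of tuples its operators produce. The table sizes, the selectivity, the one-match-per-row join assumption, and all names below are invented for illustration; real cost models are far more detailed.

```python
# Catalog statistics (all numbers are invented for illustration).
STATS = {"student": 1_000, "enrolled": 10_000}
GRADE_A_SELECTIVITY = 0.2  # assumed fraction of enrolled rows with grade = 'A'


def cost_filter_after_join():
    """Join first, then filter on grade."""
    scans = STATS["student"] + STATS["enrolled"]
    join_out = STATS["enrolled"]  # assume each enrolled row matches one student
    filter_out = round(join_out * GRADE_A_SELECTIVITY)
    return scans + join_out + filter_out


def cost_filter_before_join():
    """Filter enrolled on grade first, then join."""
    scans = STATS["student"] + STATS["enrolled"]
    filter_out = round(STATS["enrolled"] * GRADE_A_SELECTIVITY)
    join_out = filter_out  # same one-match-per-row assumption
    return scans + filter_out + join_out


# Evaluate the equivalent plans and keep the one with the lowest estimate.
candidates = {
    "filter after join": cost_filter_after_join(),
    "filter before join": cost_filter_before_join(),
}
best_plan = min(candidates, key=candidates.get)
print(candidates, "->", best_plan)
```

Under these assumed numbers the plan that filters before the join gets the lower estimate, because fewer tuples flow through the join.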
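ARCHITECTURE OVERVIEW
[Figure: query processing pipeline. Labels recovered from the diagram, in pipeline order:]
→ Application
→ SQL Rewriter (Optional)
→ (2) SQL Query
→ Parser
→ (3) Abstract Syntax Tree
→ Binder (Name→Internal ID, looked up in the System Catalog)
→ (4) Logical Plan
→ Tree Rewriter (Optional; Schema Info from the System Catalog)
→ Optimizer (Schema Info from the System Catalog; cost estimates from the Cost Model)
→ (6) Physical Plan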
CMU 15-445/645 (Fall 2019)
QUERY OPTIMIZATION IS NP-HARD
TODAY'S AGENDA
RELATIONAL ALGEBRA EQUIVALENCES
PREDICATE PUSHDOWN
SELECT s.name, e.cid
FROM student AS s, enrolled AS e
WHERE s.sid = e.sid
AND e.grade = 'A'
πname, cid(σgrade='A'(student⋈enrolled))
Plan before pushdown (filter above the join):
π s.name,e.cid
  σ grade='A'
    ⨝ s.sid=e.sid
      student
      enrolled
Plan after pushdown (filter below the join):
π s.name,e.cid
  ⨝ s.sid=e.sid
    student
    σ grade='A'
      enrolled
RELATIONAL ALGEBRA EQUIVALENCES
For the same query, these two expressions are equivalent:
πname, cid(σgrade='A'(student⋈enrolled))
=
πname, cid(student⋈(σgrade='A'(enrolled)))
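The equivalence can be checked on sample data. The sketch below uses Python's built-in sqlite3 module; the rows are invented, and the subquery form is just one way to write the pushed-down filter in SQL. Matching results on one dataset illustrate the equivalence but do not prove it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student  (sid INT PRIMARY KEY, name TEXT);
    CREATE TABLE enrolled (sid INT, cid TEXT, grade TEXT);
    -- Sample rows are invented for illustration.
    INSERT INTO student  VALUES (1, 'Ada'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO enrolled VALUES (1, '15-445', 'A'), (2, '15-445', 'B'),
                                (3, '15-721', 'A'), (1, '15-721', 'C');
""")

# Filter written alongside the join: pi(sigma(student JOIN enrolled)).
filter_after_join = conn.execute("""
    SELECT s.name, e.cid
      FROM student AS s, enrolled AS e
     WHERE s.sid = e.sid
       AND e.grade = 'A'
""").fetchall()

# Filter pushed below the join: pi(student JOIN sigma(enrolled)).
filter_before_join = conn.execute("""
    SELECT s.name, e.cid
      FROM student AS s
      JOIN (SELECT * FROM enrolled WHERE grade = 'A') AS e
        ON s.sid = e.sid
""").fetchall()

# Compare as sorted lists because SQL does not guarantee row order.
same_rows = sorted(filter_after_join) == sorted(filter_before_join)
print(same_rows, sorted(filter_after_join))
```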
RELATIONAL ALGEBRA EQUIVALENCES
Selections:
→ Perform filters as early as possible.
→ Reorder predicates so that the DBMS applies the most
selective one first.
→ Break a complex predicate into its conjuncts and push each one down:
σp1∧p2∧…∧pn(R) = σp1(σp2(…σpn(R)))
→ Simplify a complex predicate:
(X=Y AND Y=3) → (X=3 AND Y=3)
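A minimal sketch of the first three rules: a conjunction is evaluated as a cascade of single-predicate filters, and the order of the cascade changes how many predicate evaluations are needed. The Predicate class, the cascade helper, and the selectivity estimates are hypothetical names and values chosen for this example.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Predicate:
    name: str
    test: Callable[[dict], bool]
    selectivity: float  # assumed estimate: fraction of rows that pass


def cascade(rows, predicates):
    """Apply predicates one after another; return (survivors, evaluations)."""
    evaluations = 0
    current = list(rows)
    for pred in predicates:
        evaluations += len(current)  # each surviving row is tested once
        current = [r for r in current if pred.test(r)]
    return current, evaluations


rows = [{"val": v} for v in range(100)]
lt50 = Predicate("val < 50", lambda r: r["val"] < 50, 0.5)
mult10 = Predicate("val % 10 = 0", lambda r: r["val"] % 10 == 0, 0.1)

# Most selective predicate (lowest pass fraction) goes first.
by_selectivity = sorted([lt50, mult10], key=lambda p: p.selectivity)
good_rows, good_evals = cascade(rows, by_selectivity)

# Same predicates in the opposite order, for comparison.
bad_rows, bad_evals = cascade(rows, [lt50, mult10])

# Reference: the unsplit conjunction p1 AND p2.
combined = [r for r in rows if lt50.test(r) and mult10.test(r)]
print(good_evals, bad_evals, len(combined))
```

Both orders return the same rows as the unsplit conjunction; only the amount of work differs.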
RELATIONAL ALGEBRA EQUIVALENCES
Projections:
→ Perform them early to create smaller tuples and reduce
intermediate results (if duplicates are eliminated)
→ Project out all attributes except the ones requested or
required (e.g., joining keys)
PROJECTION PUSHDOWN
SELECT s.name, e.cid
FROM student AS s, enrolled AS e
WHERE s.sid = e.sid
AND e.grade = 'A'
Plan before projection pushdown:
π s.name,e.cid
  ⨝ s.sid=e.sid
    student
    σ grade='A'
      enrolled
Plan after projection pushdown:
π s.name,e.cid
  ⨝ s.sid=e.sid
    π sid,name
      student
    π sid,cid
      σ grade='A'
        enrolled
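The projection-pushdown plan can be written out as SQL subqueries and compared with the original query. This is a sketch using Python's built-in sqlite3 module; the extra student columns (login, age) and all rows are invented so that the early projection has something to drop.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- login and age are invented columns that the query never needs.
    CREATE TABLE student  (sid INT PRIMARY KEY, name TEXT, login TEXT, age INT);
    CREATE TABLE enrolled (sid INT, cid TEXT, grade TEXT);
    INSERT INTO student  VALUES (1, 'Ada', 'ada@cs', 20), (2, 'Bob', 'bob@cs', 21);
    INSERT INTO enrolled VALUES (1, '15-445', 'A'), (2, '15-445', 'B'),
                                (2, '15-721', 'A');
""")

original = conn.execute("""
    SELECT s.name, e.cid
      FROM student AS s, enrolled AS e
     WHERE s.sid = e.sid
       AND e.grade = 'A'
""").fetchall()

# Each join input keeps only the join key plus the requested attribute.
pushed_down = conn.execute("""
    SELECT s.name, e.cid
      FROM (SELECT sid, name FROM student) AS s
      JOIN (SELECT sid, cid FROM enrolled WHERE grade = 'A') AS e
        ON s.sid = e.sid
""").fetchall()

# Width of the student tuples entering the join, before and after.
full_width = len(conn.execute("SELECT * FROM student").description)
narrow_width = len(conn.execute("SELECT sid, name FROM student").description)

same_rows = sorted(original) == sorted(pushed_down)
print(same_rows, full_width, narrow_width)
```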
MORE EXAMPLES
CREATE TABLE A (
  id INT PRIMARY KEY,
  val INT NOT NULL );
Join Elimination
SELECT A1.*
  FROM A AS A1 JOIN A AS A2
    ON A1.id = A2.id;
→ Rewritten as:
SELECT * FROM A;
Ignoring Projections
SELECT * FROM A AS A1
 WHERE EXISTS(SELECT val FROM A AS A2
               WHERE A1.id = A2.id);
→ Rewritten as:
SELECT * FROM A;
Merging Predicates
SELECT * FROM A
 WHERE val BETWEEN 1 AND 100
    OR val BETWEEN 50 AND 150;
→ Rewritten as:
SELECT * FROM A
 WHERE val BETWEEN 1 AND 150;
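Each of the three rewrites can be sanity-checked by running both forms against the same table. The sketch below uses Python's built-in sqlite3 module with invented rows; the rows helper and the rewrites dictionary are names made up for this example. Agreement on sample data is a check, not a proof, and the first two rewrites rely on id being a non-null primary key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (id INT PRIMARY KEY, val INT NOT NULL);
    -- Invented rows, chosen to sit on and around the range boundaries.
    INSERT INTO A VALUES (1, 0), (2, 1), (3, 75), (4, 100), (5, 150), (6, 151);
""")


def rows(sql):
    """Run a query and return its rows in a canonical (sorted) order."""
    return sorted(conn.execute(sql).fetchall())


# name -> (original query, rewritten query)
rewrites = {
    "join elimination": (
        "SELECT A1.* FROM A AS A1 JOIN A AS A2 ON A1.id = A2.id",
        "SELECT * FROM A",
    ),
    "ignoring projections": (
        "SELECT * FROM A AS A1 WHERE EXISTS("
        "SELECT val FROM A AS A2 WHERE A1.id = A2.id)",
        "SELECT * FROM A",
    ),
    "merging predicates": (
        "SELECT * FROM A WHERE val BETWEEN 1 AND 100 OR val BETWEEN 50 AND 150",
        "SELECT * FROM A WHERE val BETWEEN 1 AND 150",
    ),
}

results = {name: rows(before) == rows(after)
           for name, (before, after) in rewrites.items()}
merged_count = len(rows(rewrites["merging predicates"][1]))
print(results, merged_count)
```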
RELATIONAL ALGEBRA EQUIVALENCES
Joins:
→ Commutative: R⋈S = S⋈R
→ Associative: (R⋈S)⋈T = R⋈(S⋈T)
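Both properties can be illustrated with a small natural join over in-memory relations. This is a sketch: natural_join and as_set are hypothetical helpers, relations are lists of dicts, and the sample tuples are invented.

```python
def natural_join(left, right):
    """Natural join of two relations given as lists of dicts."""
    out = []
    for l_row in left:
        for r_row in right:
            shared = l_row.keys() & r_row.keys()
            # Rows match when they agree on every shared attribute.
            if all(l_row[k] == r_row[k] for k in shared):
                out.append({**l_row, **r_row})
    return out


def as_set(relation):
    """Order-insensitive view: a set of frozen (attribute, value) pairs."""
    return {frozenset(row.items()) for row in relation}


# Invented sample relations R(a, b), S(b, c), T(c, d).
R = [{"a": 1, "b": 10}, {"a": 2, "b": 20}]
S = [{"b": 10, "c": 100}, {"b": 20, "c": 200}, {"b": 30, "c": 300}]
T = [{"c": 100, "d": "x"}]

commutative = as_set(natural_join(R, S)) == as_set(natural_join(S, R))

left_first = natural_join(natural_join(R, S), T)   # (R join S) join T
right_first = natural_join(R, natural_join(S, T))  # R join (S join T)
associative = as_set(left_first) == as_set(right_first)

# Same final answer, but the intermediate results differ in size.
intermediate_sizes = (len(natural_join(R, S)), len(natural_join(S, T)))
print(commutative, associative, intermediate_sizes)
```

The two orders agree on the final result while producing intermediate results of different sizes, which is the kind of difference a cost-based search compares.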
RELATIONAL ALGEBRA EQUIVALENCES
CONCLUSION
NEXT CLASS
MID-TERM EXAM!
→ Seriously, this is not a joke.