Query Optimization
How do we determine the “best” execution plan?
Strategy 1
Π_ENAME(σ_{DUR>37 ∧ EMP.ENO=ASG.ENO}(EMP × ASG))
Strategy 2
Π_ENAME(EMP ⋈_ENO (σ_{DUR>37}(ASG)))
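The difference between the two strategies can be sketched on toy data. The relations below are hypothetical (EMP as (ENO, ENAME) pairs, ASG as (ENO, DUR) pairs) and are not taken from the slides; the sketch only illustrates that both strategies return the same names while materializing very different intermediate results.

```python
from itertools import product

# Hypothetical toy relations (assumptions): EMP(ENO, ENAME), ASG(ENO, DUR).
EMP = [(1, "Ann"), (2, "Bob"), (3, "Cy")]
ASG = [(1, 40), (2, 12), (3, 48), (3, 24)]

def strategy_1(emp, asg):
    """Cartesian product first, then selection, then projection on ENAME."""
    pairs = list(product(emp, asg))  # |EMP| * |ASG| intermediate tuples
    kept = [(e, a) for e, a in pairs if a[1] > 37 and e[0] == a[0]]
    names = {e[1] for e, _ in kept}
    return names, len(pairs)

def strategy_2(emp, asg):
    """Selection on ASG first, then join on ENO, then projection on ENAME."""
    long_asg = [a for a in asg if a[1] > 37]  # selection shrinks ASG early
    joined = [(e, a) for e in emp for a in long_asg if e[0] == a[0]]
    names = {e[1] for e, _ in joined}
    # Report the largest intermediate result that is materialized.
    return names, max(len(long_asg), len(joined))

names_1, size_1 = strategy_1(EMP, ASG)
names_2, size_2 = strategy_2(EMP, ASG)
print(names_1 == names_2, size_1, size_2)
```

On this toy input Strategy 1 materializes the full product before filtering, while Strategy 2 never holds more tuples than survive the selection.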
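Plan with local selections and joins (result assembled at Site 5):
    Site 1: ASG1’ = σ_{DUR>37}(ASG1), sent to Site 3
    Site 2: ASG2’ = σ_{DUR>37}(ASG2), sent to Site 4
    Site 3: EMP1’ = EMP1 ⋈_ENO ASG1’, sent to Site 5
    Site 4: EMP2’ = EMP2 ⋈_ENO ASG2’, sent to Site 5
    Site 5: result = EMP1’ ∪ EMP2’
Plan that ships all fragments to Site 5:
    Site 1, Site 2, Site 3, Site 4: send ASG1, ASG2, EMP1, EMP2 to Site 5
    Site 5: result2 = (EMP1 ∪ EMP2) ⋈_ENO σ_{DUR>37}(ASG1 ∪ ASG2)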
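A rough way to compare the two distributed plans is to count the tuples each one ships between sites. The fragment cardinalities and the number of tuples satisfying DUR>37 below are made-up assumptions, as is the simplification that each selected ASG tuple joins with exactly one EMP tuple; the sketch is illustrative, not a cost model from the slides.

```python
# Hypothetical fragment statistics (assumptions, not taken from the slides).
FRAGMENT_CARD = {"ASG1": 200, "ASG2": 200, "EMP1": 200, "EMP2": 200}
SELECTED = {"ASG1": 10, "ASG2": 10}  # tuples with DUR > 37 per ASG fragment

def tuples_shipped_local_first():
    """Select at Sites 1-2, join at Sites 3-4, union at Site 5."""
    # Only the selected ASG tuples travel to the join sites.
    to_join_sites = SELECTED["ASG1"] + SELECTED["ASG2"]
    # Assumption: each selected ASG tuple matches exactly one EMP tuple,
    # so the join results have the same cardinality as the selections.
    to_result_site = SELECTED["ASG1"] + SELECTED["ASG2"]
    return to_join_sites + to_result_site

def tuples_shipped_all_to_site_5():
    """Ship every fragment unchanged to Site 5 and evaluate there."""
    return sum(FRAGMENT_CARD.values())

local_first = tuples_shipped_local_first()
ship_all = tuples_shipped_all_to_site_5()
print(local_first, ship_all)
```

Under these assumed numbers the first plan moves far fewer tuples, which is the usual argument for pushing selections to the sites that store the data.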
Assume relations of cardinality n and sequential scan.

Operation                                  Complexity
Select                                     O(n)
Project (without duplicate elimination)    O(n)
Project (with duplicate elimination)       O(n log n)
Group                                      O(n log n)
Join                                       O(n log n)
Semi-join                                  O(n log n)
Division                                   O(n log n)
Set Operators                              O(n log n)
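The gap between the two projection costs comes from duplicate elimination. A minimal sketch, assuming rows are Python dicts and that duplicates are removed by sorting (a hash-based variant would behave differently and is not what the complexity figures above assume):

```python
def project(rows, columns):
    """Projection without duplicate elimination: one sequential scan, O(n)."""
    return [tuple(row[c] for c in columns) for row in rows]

def project_distinct(rows, columns):
    """Projection with sort-based duplicate elimination: O(n log n)."""
    projected = sorted(project(rows, columns))  # the sort dominates the cost
    result = []
    for t in projected:
        # After sorting, equal tuples are adjacent, so one pass removes them.
        if not result or result[-1] != t:
            result.append(t)
    return result

# Hypothetical ASG rows used only for illustration.
asg = [
    {"ENO": 1, "PNO": "P1"},
    {"ENO": 1, "PNO": "P2"},
    {"ENO": 2, "PNO": "P1"},
]
with_duplicates = project(asg, ["ENO"])
without_duplicates = project_distinct(asg, ["ENO"])
```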
Relation
    cardinality
    size of a tuple
    fraction of tuples participating in a join with another relation
Common assumptions
    independence between different attribute values
    uniform distribution of attribute values within their domain
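These statistics and assumptions translate into simple cardinality estimates. The function names and the sample numbers below are illustrative assumptions; the formulas are the standard ones that follow from uniformity (equal share per distinct value) and independence (selectivities multiply).

```python
import math

def selection_cardinality(card, distinct_values):
    """Estimate card(σ_{A=v}(R)) under uniform distribution of A's values."""
    return card / distinct_values

def conjunctive_selectivity(*selectivities):
    """Under independence, the selectivity of p1 ∧ p2 ∧ ... is the product."""
    return math.prod(selectivities)

def semijoin_cardinality(card_r, participating_fraction):
    """Tuples of R that join with the other relation: fraction * card(R)."""
    return participating_fraction * card_r

# Hypothetical statistics: 1000 tuples, 50 distinct values of the attribute.
estimated_matches = selection_cardinality(1000, 50)
combined = conjunctive_selectivity(0.5, 0.25)
joining_tuples = semijoin_cardinality(1000, 0.25)
```

Real data often violates both assumptions (skewed values, correlated attributes), so such estimates are best read as starting points.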
[Figure: query processing layers. Query → Decomposition (uses GLOBAL SCHEMA) → Fragment Query → Global Optimization (uses STATS ON FRAGMENTS) → Optimized Local Queries]
Response Time
Do as many things as possible in parallel
May increase total time because of increased total activity
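The trade-off between response time and total time can be sketched with a small calculation. A plan is modeled here as a set of branches that run in parallel, each a sequence of step costs, followed by one final step; this model and all cost numbers are hypothetical assumptions chosen only to illustrate the two bullets above.

```python
def total_time(branches, final_step):
    """Sum of all work done anywhere in the system."""
    return sum(sum(branch) for branch in branches) + final_step

def response_time(branches, final_step):
    """Elapsed time: parallel branches overlap, so only the slowest counts."""
    return max(sum(branch) for branch in branches) + final_step

# Hypothetical costs in arbitrary time units.
serial_plan = [[60]]                  # one site does all the work in sequence
parallel_plan = [[10, 30], [10, 30]]  # two site pairs work at the same time
FINAL = 5                             # assembling the result at the query site

serial = (total_time(serial_plan, FINAL), response_time(serial_plan, FINAL))
parallel = (total_time(parallel_plan, FINAL), response_time(parallel_plan, FINAL))
print(serial, parallel)
```

With these assumed costs the parallel plan answers sooner but performs more work overall, because the extra transfers and duplicated steps add to total activity.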