4-Query - Processing (1) - PTIT
Systems
© 2020 1
Outline
Introduction
Distributed and parallel database design
Distributed data control
Distributed Query Processing
Distributed Transaction Processing
Data Replication
Database Integration – Multidatabase Systems
Parallel Database Systems
Peer-to-Peer Data Management
Big Data Processing
NoSQL, NewSQL and Polystores
Web Data Management
Outline
Distributed Query Processing
Query Decomposition and Localization
Join Ordering
Distributed Query Optimization
Adaptive Query Processing
Query Processing in a DDBMS
Query
Processor
Query Processing Components
Query language
SQL: “intergalactic dataspeak”
Query execution
The steps that one goes through in executing high-level
(declarative) user queries.
Query optimization
How do we determine the “best” execution plan?
Selecting Alternatives
SELECT ENAME
FROM EMP NATURAL JOIN ASG
WHERE RESP = "Manager"
Strategy 1
ΠENAME(σRESP=“Manager” ∧ EMP.ENO=ASG.ENO(EMP × ASG))
Strategy 2
ΠENAME(EMP ⋈ENO (σRESP=“Manager”(ASG)))
Cost of Alternatives
Assume
size(EMP) = 400, size(ASG) = 1000
tuple access cost = 1 unit; tuple transfer cost = 10 units
Strategy 1
produce ASG': (10+10) × tuple access cost = 20
transfer ASG' to the sites of EMP: (10+10) × tuple transfer cost = 200
produce EMP': (10+10) × tuple access cost × 2 = 40
transfer EMP' to result site: (10+10) × tuple transfer cost = 200
Total Cost 460
Strategy 2
transfer EMP to site 5: 400 × tuple transfer cost = 4,000
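The arithmetic on this slide can be checked with a short sketch. It uses only the unit costs and tuple counts stated above. The variable names are illustrative, and only the single Strategy 2 step that the slide shows is computed.

```python
# Unit costs stated on the slide.
TUPLE_ACCESS = 1
TUPLE_TRANSFER = 10

# Strategy 1: every step touches (10 + 10) tuples, as given on the slide.
produce_asg = (10 + 10) * TUPLE_ACCESS        # select to produce ASG'
transfer_asg = (10 + 10) * TUPLE_TRANSFER     # ship ASG' to the sites of EMP
produce_emp = (10 + 10) * TUPLE_ACCESS * 2    # join to produce EMP'
transfer_emp = (10 + 10) * TUPLE_TRANSFER     # ship EMP' to the result site
strategy1_total = produce_asg + transfer_asg + produce_emp + transfer_emp

# Strategy 2, first step only: ship the whole EMP relation to site 5.
SIZE_EMP = 400
transfer_emp_whole = SIZE_EMP * TUPLE_TRANSFER

print(strategy1_total)     # 460
print(transfer_emp_whole)  # 4000
```

Shipping EMP alone already costs far more than all of Strategy 1, which is why transfer cost dominates the comparison.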
Query Optimization Objectives
Minimize a cost function
I/O cost + CPU cost + communication cost
These might have different weights in different distributed
environments
Wide area networks
Communication cost may dominate or vary widely
Bandwidth
Speed
Protocol overhead
Local area networks
Communication cost is not that dominant, so the total cost function
should be considered
Can also maximize throughput
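A minimal sketch of the weighted cost function, assuming a simple linear combination of the three components. The weights and the two example plans are hypothetical numbers chosen only to show how the preferred plan can flip between a LAN-like and a WAN-like setting.

```python
from dataclasses import dataclass


@dataclass
class PlanCost:
    io: float    # I/O cost units
    cpu: float   # CPU cost units
    comm: float  # communication cost units


def total_cost(c: PlanCost, w_io: float = 1.0, w_cpu: float = 1.0,
               w_comm: float = 1.0) -> float:
    """Weighted sum: I/O cost + CPU cost + communication cost."""
    return w_io * c.io + w_cpu * c.cpu + w_comm * c.comm


# Hypothetical plans: A does more local work, B ships more data.
plan_a = PlanCost(io=100, cpu=50, comm=10)
plan_b = PlanCost(io=40, cpu=20, comm=60)

# LAN-like setting (assumed): all components weighted equally.
lan_a, lan_b = total_cost(plan_a), total_cost(plan_b)

# WAN-like setting (assumed): communication is 10 times more expensive.
wan_a = total_cost(plan_a, w_comm=10.0)
wan_b = total_cost(plan_b, w_comm=10.0)

print(lan_a, lan_b)  # 160.0 120.0 -> plan B wins on the LAN
print(wan_a, wan_b)  # 250.0 660.0 -> plan A wins on the WAN
```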
Complexity of Relational Operations
Assume
Relations of cardinality n
Sequential scan
Operation                                         Complexity
Select, Project (without duplicate elimination)   O(n)
Project (with duplicate elimination), Group       O(n log n)
Join, Semi-join, Division, Set Operators          O(n log n)
Types Of Optimizers
Exhaustive search
Cost-based
Optimal
Combinatorial complexity in the number of relations
Heuristics
Not optimal
Regroup common sub-expressions
Perform selection, projection first
Replace a join by a series of semijoins
Reorder operations to reduce intermediate relation size
Optimize individual operations
Optimization Granularity
Optimization Timing
Static
Compilation: optimize prior to execution
Difficult to estimate the size of intermediate results ⇒ error
propagation
Can amortize over many executions
Dynamic
Run time optimization
Exact information on the intermediate relation sizes
Have to reoptimize for multiple executions
Hybrid
Compile using a static algorithm
If the error in estimated sizes > threshold, reoptimize at run time
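The hybrid trigger can be sketched as a small check. This is only an illustration: the relative-error measure, the function names, and the default threshold of 0.5 are assumptions, not something the slides prescribe.

```python
def relative_error(estimated: float, actual: float) -> float:
    """Relative error of a size estimate against the observed size."""
    if actual == 0:
        return 0.0 if estimated == 0 else float("inf")
    return abs(estimated - actual) / actual


def should_reoptimize(estimated: float, actual: float,
                      threshold: float = 0.5) -> bool:
    """Reoptimize at run time only when the estimate is off by more than
    the threshold (hypothetical default)."""
    return relative_error(estimated, actual) > threshold


# Estimate close to the observed intermediate size: keep the static plan.
print(should_reoptimize(estimated=1000, actual=1100))  # False
# Estimate far off: trigger run-time reoptimization.
print(should_reoptimize(estimated=1000, actual=5000))  # True
```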
Statistics
Relation
Cardinality
Size of a tuple
Fraction of tuples participating in a join with another relation
Attribute
Cardinality of domain
Actual number of distinct values
Simplifying assumptions
Independence between different attribute values
Uniform distribution of attribute values within their domain
Optimization Decision Sites
Centralized
Single site determines the “best” schedule
Simple
Need knowledge about the entire distributed database
Distributed
Cooperation among sites to determine the schedule
Need only local information
Cost of cooperation
Hybrid
One site determines the global schedule
Each site optimizes the local subqueries
Network Topology
Distributed Query Processing
Methodology
Outline
Distributed Query Processing
Query Decomposition and Localization
Distributed Query Optimization
Join Ordering
Adaptive Query Processing
Step 1 – Query Decomposition
Same as centralized query processing
Input: Calculus query on global relations
Normalization
Manipulate query quantifiers and qualification
Analysis
Detect and reject “incorrect” queries
Simplification
Eliminate redundant predicates
Restructuring
Calculus query ⇒ algebraic query
Use transformation rules
Step 2 – Data Localization
Example
Assume
EMP is fragmented as follows:
EMP1 = σENO≤“E3”(EMP)
EMP2 = σ“E3”<ENO≤“E6”(EMP)
EMP3 = σENO>“E6”(EMP)
ASG fragmented as follows:
ASG1 = σENO≤“E3”(ASG)
ASG2 = σENO>“E3”(ASG)
In any query
Replace EMP by (EMP1 ∪ EMP2 ∪ EMP3)
Replace ASG by (ASG1 ∪ ASG2)
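The localization step can be illustrated with toy data. The sketch below assumes eight hypothetical employees E1 to E8 and takes EMP3 to hold ENO > "E6", so that the three fragments are disjoint. It then checks that the union of the fragments gives back EMP.

```python
# Hypothetical EMP relation with employee numbers E1..E8.
emp = [{"ENO": f"E{i}", "ENAME": f"name{i}"} for i in range(1, 9)]

# Horizontal fragments defined by selection predicates on ENO.
emp1 = [t for t in emp if t["ENO"] <= "E3"]
emp2 = [t for t in emp if "E3" < t["ENO"] <= "E6"]
emp3 = [t for t in emp if t["ENO"] > "E6"]  # assumed strict, for disjointness

# Localization program: EMP is replaced by the union of its fragments.
localized_emp = emp1 + emp2 + emp3

same_tuples = sorted(t["ENO"] for t in localized_emp) == sorted(
    t["ENO"] for t in emp
)
print(len(emp1), len(emp2), len(emp3), same_tuples)  # 3 3 2 True
```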
Reduction for PHF
Reduction with selection
Relation R and FR = {R1, R2, …, Rw} where Rj = σpj(R)
SELECT *
FROM EMP
WHERE ENO="E5"
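For this query, reduction with selection amounts to dropping every fragment whose defining predicate contradicts ENO = "E5". A minimal sketch, assuming the EMP fragmentation from the earlier example (with EMP3 read as ENO > "E6") and a hypothetical helper `relevant_fragments`:

```python
# Fragment predicates on ENO, taken from the EMP fragmentation example.
fragment_predicates = {
    "EMP1": lambda eno: eno <= "E3",
    "EMP2": lambda eno: "E3" < eno <= "E6",
    "EMP3": lambda eno: eno > "E6",
}


def relevant_fragments(eno_value: str) -> list[str]:
    """Keep only fragments that can contain a tuple with ENO = eno_value."""
    return [name for name, pred in fragment_predicates.items()
            if pred(eno_value)]


# The selection ENO = "E5" contradicts the predicates of EMP1 and EMP3,
# so the query only needs to be sent to the site holding EMP2.
print(relevant_fragments("E5"))  # ['EMP2']
```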
Reduction for VF
Find useless (not empty) intermediate relations
Relation R defined over attributes A = {A1, ..., An} vertically
fragmented as Ri = ΠA'(R) where A' ⊆ A:
ΠD,K(Ri) is useless if the set of projection attributes D is not in A'
Example: EMP1 = ΠENO,ENAME(EMP); EMP2 = ΠENO,TITLE(EMP)
SELECT ENAME
FROM EMP
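The rule can be sketched for this example: a vertical fragment is kept only if it supplies at least one requested non-key attribute. The helper name and the handling of key-only projections are assumptions made for illustration.

```python
KEY = {"ENO"}

# Attribute sets of the vertical fragments from the example.
fragment_attrs = {
    "EMP1": {"ENO", "ENAME"},
    "EMP2": {"ENO", "TITLE"},
}


def useful_fragments(projection_attrs: set[str]) -> list[str]:
    """Fragments that contribute a non-key projection attribute."""
    needed = projection_attrs - KEY
    if not needed:
        # Assumption: for a key-only projection any single fragment suffices.
        return [next(iter(fragment_attrs))]
    return [name for name, attrs in fragment_attrs.items() if needed & attrs]


# SELECT ENAME FROM EMP: EMP2 holds no requested attribute, so it is useless.
print(useful_fragments({"ENAME"}))           # ['EMP1']
print(useful_fragments({"ENAME", "TITLE"}))  # ['EMP1', 'EMP2']
```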
Reduction for DHF
Rule :
Distribute joins over unions
Apply the join reduction for horizontal fragmentation
Example
ASG1: ASG ⋉ENO EMP1
ASG2: ASG ⋉ENO EMP2
EMP1: σTITLE=“Programmer”(EMP)
EMP2: σTITLE≠“Programmer”(EMP)
Query
SELECT *
FROM EMP NATURAL JOIN ASG
WHERE EMP.TITLE = "Mech. Eng."
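The two rules can be traced on this query with a small sketch. It assumes EMP2 holds the tuples with TITLE ≠ "Programmer" and relies on ASGi being derived from EMPi, so ASGi can only join with EMPi. The function name `reduced_joins` is hypothetical.

```python
# Predicates defining the primary fragments of EMP.
emp_fragments = {
    "EMP1": lambda title: title == "Programmer",
    "EMP2": lambda title: title != "Programmer",  # assumed complement of EMP1
}

# ASGi = ASG semijoin EMPi, so ASGi matches tuples of EMPi only.
derived_from = {"ASG1": "EMP1", "ASG2": "EMP2"}


def reduced_joins(selected_title: str) -> list[tuple[str, str]]:
    """Fragment joins that remain after pushing the selection down and
    distributing the join over the unions."""
    surviving = [name for name, pred in emp_fragments.items()
                 if pred(selected_title)]
    return [(emp, asg) for asg, emp in derived_from.items()
            if emp in surviving]


# TITLE = "Mech. Eng." contradicts EMP1, so only EMP2 joined with ASG2 remains.
print(reduced_joins("Mech. Eng."))  # [('EMP2', 'ASG2')]
```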
Reduction for DHF
Generic query
Selections first
Reduction for DHF
Joins over unions
Reduction for Hybrid Fragmentation
Reduction for HF
Example
Consider the following hybrid
fragmentation:
EMP1 = σENO≤"E4"(ΠENO,ENAME(EMP))
Outline
Distributed Query Processing
Query Decomposition and Localization
Distributed Query Optimization
Join Ordering
Adaptive Query Processing
Step 3 – Global Query Optimization
Query Optimization Process
Input Query
Equivalent QEP
Best QEP
Components
Search space
The set of equivalent algebra expressions (query trees)
Cost model
I/O cost + CPU cost + communication cost
These might have different weights in different distributed
environments (LAN vs WAN)
Can also maximize throughput
Search algorithm
How do we move inside the solution space?
Exhaustive search, heuristic algorithms (iterative improvement,
simulated annealing, genetic,…)
Join Trees
Characterize the search space
for optimization
For N relations, there are O(N!)
equivalent join trees that can be
obtained by applying
commutativity and associativity
rules
SELECT ENAME,RESP
FROM EMP
NATURAL JOIN ASG
NATURAL JOIN PROJ
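To get a feel for the growth of the search space, the sketch below enumerates the left-deep join orders for the three relations in this query. It only counts permutations. Bushy trees and the fact that some orders (for example EMP with PROJ first) imply a Cartesian product are deliberately ignored.

```python
from functools import reduce
from itertools import permutations

relations = ["EMP", "ASG", "PROJ"]


def left_deep_orders(rels: list[str]) -> list[str]:
    """All left-deep join trees, one per permutation of the relations."""
    return [
        reduce(lambda left, right: f"({left} JOIN {right})", order)
        for order in permutations(rels)
    ]


orders = left_deep_orders(relations)
print(len(orders))  # 6, i.e. 3!
print(orders[0])    # ((EMP JOIN ASG) JOIN PROJ)
```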
Search Strategy
Outline
Distributed Query Processing
Query Decomposition and Localization
Distributed Query Optimization
Join Ordering
Adaptive Query Processing
Join Ordering
Join Ordering – Example
Consider
PROJ ⋈PNO ASG ⋈ENO EMP
Execution alternatives
5. EMP → Site 2
PROJ → Site 2
Site 2 computes EMP ⋈ PROJ ⋈ ASG
Semijoin-based Ordering
Perform the join
Send R to Site 2
Site 2 computes R ⋈A S
Consider semijoin (R ⋉A S) ⋈A S
S' = ΠA(S)
S' → Site 1
Site 1 computes R' = R ⋉A S'
R' → Site 2
Site 2 computes R' ⋈A S
Semijoin is better if
size(ΠA(S)) + size(R ⋉A S) < size(R)
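The condition can be checked on toy data. In this sketch R sits at Site 1 and S at Site 2, size is measured as the number of attribute values shipped (a stand-in for bytes), and the relation contents are hypothetical.

```python
def size(relation: list[dict]) -> int:
    """Size as the number of attribute values (a stand-in for bytes)."""
    return sum(len(t) for t in relation)


def project(relation: list[dict], attr: str) -> list[dict]:
    """Projection on one attribute, with duplicate elimination."""
    return [{attr: v} for v in sorted({t[attr] for t in relation})]


def semijoin(r: list[dict], s_proj: list[dict], attr: str) -> list[dict]:
    """Tuples of r whose join attribute value appears in s_proj."""
    keys = {t[attr] for t in s_proj}
    return [t for t in r if t[attr] in keys]


# Hypothetical data: R has 10 tuples of 3 attributes, S has 2 tuples.
R = [{"A": i, "B": f"b{i}", "C": f"c{i}"} for i in range(10)]
S = [{"A": i, "D": f"d{i}"} for i in (2, 5)]

s_proj = project(S, "A")             # S' shipped from Site 2 to Site 1
r_reduced = semijoin(R, s_proj, "A") # R' shipped from Site 1 to Site 2

join_transfer = size(R)                            # ship all of R
semijoin_transfer = size(s_proj) + size(r_reduced)
print(join_transfer, semijoin_transfer)   # 30 8
print(semijoin_transfer < join_transfer)  # True: semijoin is better here
```

With a less selective join, R' approaches R and the extra transfer of S' makes the semijoin program the more expensive choice.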
Full Reducer
Full Reducer – Example
Consider
ET (ENO, ENAME, TITLE, CITY)
AT (ENO, PNO, RESP, DUR, CITY)
PT (PNO, PNAME, BUDGET, CITY)
Join versus Semijoin-based Ordering
Semijoin-based induces more operators, but possibly on
smaller operands
Distributed Cost Model
Cost functions
Total Time (or Total Cost)
Reduce each cost (in terms of time) component individually
Do as little of each cost component as possible
Optimizes resource utilization and increases system throughput
Response Time
Do as many things as possible in parallel
May increase total time because of increased total activity
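The difference between the two cost functions can be shown with a communication-only sketch. It assumes that Sites 1 and 2 send x and y units of data to Site 3 in parallel, with a fixed cost per message (`T_MSG`) and a cost per unit of data transferred (`T_TR`). The parameter names and all numbers are illustrative assumptions, not values from the slides.

```python
T_MSG = 5.0  # assumed fixed cost of initiating and receiving one message
T_TR = 1.0   # assumed cost of transferring one unit of data

x = 100  # data units sent from Site 1 to Site 3
y = 40   # data units sent from Site 2 to Site 3

# Total time adds up all work, wherever and whenever it happens.
total_time = 2 * T_MSG + T_TR * (x + y)

# Response time only counts the slower of the two parallel transfers.
response_time = max(T_MSG + T_TR * x, T_MSG + T_TR * y)

print(total_time)     # 150.0
print(response_time)  # 105.0
```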
Total Time
Response Time
Example
Database Statistics
Statistics
For each relation R[A1, A2, …, An] fragmented as R1, …, Rr
length of each attribute: length(Ai)
the number of distinct values for each attribute in each fragment:
card(ΠAi(Rj))
maximum and minimum values in the domain of each attribute:
min(Ai), max(Ai)
the cardinality of each domain: card(dom[Ai])
the cardinality of each fragment: card(Rj)
Selectivity factor of each operator on relations
See centralized query optimization statistics
Distributed Query Optimization
Dynamic approach
Distributed INGRES
No static cost estimation, only runtime cost information
Static approach
System R*
Static cost model
Hybrid approach
2-step
Dynamic Approach
1. Execute all monorelation queries (e.g., selection,
projection)
2. Reduce the multirelation query to produce irreducible
subqueries q1 → q2 → … → qn such that there is only
one relation between qi and qi+1
3. Choose qi involving the smallest fragments to execute
(call MRQ')
4. Find the best execution strategy for MRQ'
1. Determine processing site
2. Determine fragments to move
5. Repeat 3 and 4
Static Approach
Static Approach – Performing Joins
Ship whole
Larger data transfer
Smaller number of messages
Better if relations are small
Fetch as needed
Number of messages = O(cardinality of external relation)
Data transfer per message is minimal
Better if relations are large and the selectivity is good
Static Approach – Vertical Partitioning & Joins
1. Move outer relation tuples to the site of the inner relation
(a) Retrieve outer tuples
(b) Send them to the inner relation site
(c) Join them as they arrive
Static Approach – Vertical Partitioning & Joins
2. Move inner relation to the site of outer relation
Cannot join as they arrive; they need to be stored
Total cost = cost(retrieving qualified outer tuples)
+ no. of outer tuples fetched * cost(retrieving matching inner
tuples from temporary storage)
+ cost(retrieving qualified inner tuples)
+ cost(storing all qualified inner tuples in temporary storage)
+ msg. cost * no. of inner tuples fetched * avg. inner tuple
size/msg. size
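The cost formula for this strategy translates term by term into a function. The parameter names and the sample values are hypothetical. Only the structure of the sum follows the slide.

```python
def cost_move_inner_to_outer(
    retrieve_outer: float,      # cost(retrieving qualified outer tuples)
    n_outer: int,               # no. of outer tuples fetched
    probe_temp: float,          # cost(retrieving matching inner tuples
                                #      from temporary storage), per outer tuple
    retrieve_inner: float,      # cost(retrieving qualified inner tuples)
    store_inner: float,         # cost(storing all qualified inner tuples)
    msg_cost: float,            # cost of one message
    n_inner: int,               # no. of inner tuples fetched
    avg_inner_size: float,      # avg. inner tuple size
    msg_size: float,            # message size
) -> float:
    """Total cost of moving the inner relation to the outer relation's site."""
    return (
        retrieve_outer
        + n_outer * probe_temp
        + retrieve_inner
        + store_inner
        + msg_cost * n_inner * avg_inner_size / msg_size
    )


# Hypothetical numbers, chosen only to exercise the formula.
cost = cost_move_inner_to_outer(
    retrieve_outer=100, n_outer=50, probe_temp=2,
    retrieve_inner=200, store_inner=80,
    msg_cost=10, n_inner=400, avg_inner_size=100, msg_size=1000,
)
print(cost)  # 880.0
```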
Static Approach – Vertical Partitioning & Joins
3. Move both inner and outer relations to another site
Total cost = cost(retrieving qualified outer tuples)
+ cost(retrieving qualified inner tuples)
+ cost(storing inner tuples in storage)
+ msg. cost * (no. of outer tuples fetched * avg. outer tuple
size)/msg. size
+ msg. cost * (no. of inner tuples fetched * avg. inner tuple
size)/msg. size
+ no. of outer tuples fetched * cost(retrieving inner tuples from
temporary storage)
Static Approach – Vertical Partitioning & Joins
4. Fetch inner tuples as needed
(a) Retrieve qualified tuples at outer relation site
(b) Send request containing join column value(s) for outer tuples to
inner relation site
(c) Retrieve matching inner tuples at inner relation site
(d) Send the matching inner tuples to outer relation site
(e) Join as they arrive
Total Cost = cost(retrieving qualified outer tuples)
+ msg. cost * (no. of outer tuples fetched)
+ no. of outer tuples fetched * no. of inner tuples fetched * avg.
inner tuple size * (msg. cost / msg. size)
+ no. of outer tuples fetched * cost(retrieving matching inner tuples
for one outer value)
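The fetch-as-needed formula can be written out the same way. The parameter names and sample values are hypothetical, and "no. of inner tuples fetched" is read here as the number of inner tuples returned for one outer tuple, which is an interpretation rather than something the slide states.

```python
def cost_fetch_as_needed(
    retrieve_outer: float,   # cost(retrieving qualified outer tuples)
    msg_cost: float,         # cost of one message
    n_outer: int,            # no. of outer tuples fetched (one request each)
    n_inner: int,            # no. of inner tuples fetched (assumed per request)
    avg_inner_size: float,   # avg. inner tuple size
    msg_size: float,         # message size
    probe_inner: float,      # cost(retrieving matching inner tuples
                             #      for one outer value)
) -> float:
    """Total cost of fetching inner tuples as needed."""
    return (
        retrieve_outer
        + msg_cost * n_outer
        + n_outer * n_inner * avg_inner_size * (msg_cost / msg_size)
        + n_outer * probe_inner
    )


# Hypothetical numbers, chosen only to exercise the formula.
cost = cost_fetch_as_needed(
    retrieve_outer=100, msg_cost=10, n_outer=50, n_inner=2,
    avg_inner_size=100, msg_size=1000, probe_inner=3,
)
print(round(cost, 6))  # 850.0
```

The second term grows with the number of outer tuples, which is why this strategy pays off mainly when few outer tuples qualify.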
2-Step Optimization
1. At compile time, generate a static plan with operation
ordering and access methods only
2. At startup time, carry out site and copy selection and
allocate operations to sites
Static plan Runtime plan
2-Step – Problem Definition
Given
A set of sites S = {s1, s2, …,sn} with the load of each site
A query Q ={q1, q2, q3, q4} such that each subquery qi is the
maximum processing unit that accesses one relation and
communicates with its neighboring queries
For each qi in Q, a feasible allocation set of sites Sq={s1, s2,
…,sk} where each site stores a copy of the relation in qi
The objective is to find an optimal allocation of Q to S
such that
The load imbalance of S is minimized
The total communication cost is minimized
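To make the problem statement concrete, here is a deliberately simplified greedy sketch. It is not the 2-step algorithm itself: it only balances load, ignores communication cost, and uses hypothetical site loads and feasible allocation sets. Subqueries with the fewest feasible sites are placed first, each on its least-loaded feasible site.

```python
def allocate(subqueries, feasible_sites, load):
    """Greedy load-balancing allocation (simplified; ignores communication)."""
    load = dict(load)  # work on a copy of the site loads
    allocation = {}
    # Place the least flexible subqueries first.
    for q in sorted(subqueries, key=lambda q: len(feasible_sites[q])):
        site = min(feasible_sites[q], key=lambda s: load[s])
        allocation[q] = site
        load[site] += 1  # assumed: each subquery adds one unit of load
    return allocation, load


# Hypothetical input: site s1 is already busy.
sites_load = {"s1": 2, "s2": 0, "s3": 0, "s4": 0}
feasible = {
    "q1": ["s1", "s3"],
    "q2": ["s2"],
    "q3": ["s1", "s2", "s3"],
    "q4": ["s1", "s2", "s3", "s4"],
}

allocation, final_load = allocate(["q1", "q2", "q3", "q4"], feasible, sites_load)
print(allocation)  # {'q2': 's2', 'q1': 's3', 'q3': 's2', 'q4': 's4'}
print(final_load)  # {'s1': 2, 's2': 2, 's3': 1, 's4': 1}
```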
2-Step Algorithm
2-Step Algorithm Example
Example: QEP with Blocked Operator
Adaptive Query Processing – Definition
Query processing is adaptive if it receives information
from the execution environment and determines its
behavior accordingly
Feedback loop between optimizer and runtime environment
Communication of runtime information between DDBMS
components
Additional components
Monitoring, assessment, reaction
Embedded in control operators of QEP
Tradeoff between reactiveness and overhead of
adaptation
Adaptive Components
Eddy Approach
QEP with Eddy
D= {R, S, T}
P = {σp(R), R ⋈1 S, S ⋈2 T}
C = {S < T} where < requires S tuples to probe T tuples using an index on the join
attribute
Access to T is wrapped by ⋈