4-2-Query_Processing

The document outlines the principles of distributed database systems, focusing on distributed query processing, optimization strategies, and the importance of join ordering. It discusses various optimization techniques including static, dynamic, and hybrid approaches, as well as the use of semijoins to improve query performance. Additionally, it highlights the complexities involved in query optimization, such as decision sites and the search space for join trees.

Principles of Distributed Database Systems
M. Tamer Özsu
Patrick Valduriez

© 2020, M.T. Özsu & P. Valduriez 1


Outline
◼ Distributed Query Processing
❑ Overview
❑ Query Modification
❑ Query Decomposition
❑ Data Localization
❑ Query Translation
❑ Distributed Cost Model
❑ Distributed Query Optimization
◼ Join Ordering
◼ Dynamic Approach
◼ Static Approach
◼ Hybrid Approach
◼ Adaptive Query Processing



Global Query Optimization
Input: Global query
◼ Find the best global schedule
❑ Minimize a cost function
❑ Distributed join processing
◼ Bushy vs. linear trees
◼ Which relation to ship where?
◼ Ship-whole vs. ship-as-needed
❑ Decide on the use of semijoins
◼ Semijoin saves on communication at the expense of more local processing
❑ Join methods
◼ Nested loop, merge join, or hash join


Components of Query Optimization
◼ Search space
❑ The set of equivalent algebra expressions (query trees)

◼ Search Strategy
❑ Explores the search space and selects the best plan, using the
cost model
❑ Exhaustive search, heuristic algorithms

◼ Cost model
❑ I/O cost + CPU cost + communication cost
❑ These might have different weights in different distributed
environments (LAN vs WAN)
❑ Can also maximize throughput
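The weighted combination of cost components can be sketched as follows; the weight values below are illustrative assumptions, not figures from the slides.

```python
# A minimal sketch of a weighted total-cost function. The weights are
# illustrative assumptions: in a WAN, communication terms dominate,
# while in a LAN the weights are much closer to each other.
def total_cost(io_ops, cpu_ops, n_msgs, n_bytes,
               w_io=1.0, w_cpu=0.1, w_msg=5.0, w_byte=0.01):
    """Total time = weighted I/O cost + CPU cost + communication cost."""
    return (w_io * io_ops + w_cpu * cpu_ops
            + w_msg * n_msgs + w_byte * n_bytes)

lan = total_cost(100, 1000, 10, 50_000)                          # ~750
wan = total_cost(100, 1000, 10, 50_000, w_msg=50.0, w_byte=0.1)  # ~5700
```

The same plan can thus be cheap in a LAN and expensive in a WAN, which is why the weights differ per environment.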



Query Optimization Process

Input Query

Search Space Transformation


Generation Rules

Equivalent QEP

Search Cost Model


Strategy

Best QEP



Search Space

Linear join tree: ((R1 ⋈ R2) ⋈ R3) ⋈ R4
Bushy join tree: (R1 ⋈ R2) ⋈ (R3 ⋈ R4)
Search Strategy

◼ How to “move” in the search space


◼ Deterministic
 Start from base relations and build plans by adding one relation
at each step
 Dynamic programming
 Greedy
◼ Randomized
 Search for optima around a particular starting point
 Better when > 10 relations
 Iterative improvement
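A deterministic greedy strategy of the kind listed above can be sketched as follows; the cardinalities and join selectivity factors are assumed catalog inputs, and the helper name `greedy_order` is hypothetical.

```python
# Greedy deterministic strategy sketch: start from the smallest base
# relation and repeatedly add the relation yielding the smallest
# estimated intermediate result.
def greedy_order(cards, sel):
    """cards: {relation: cardinality}; sel: {(r1, r2): join selectivity}."""
    remaining = dict(cards)
    first = min(remaining, key=remaining.get)
    order, size = [first], remaining.pop(first)

    while remaining:
        def estimate(rel, size=size):
            # look up the selectivity in either orientation; default 1.0
            f = sel.get((order[-1], rel)) or sel.get((rel, order[-1])) or 1.0
            return size * remaining[rel] * f
        nxt = min(remaining, key=estimate)
        size = estimate(nxt)
        remaining.pop(nxt)
        order.append(nxt)
    return order

cards = {"EMP": 400, "ASG": 1000, "PROJ": 100}
sel = {("PROJ", "ASG"): 0.01, ("ASG", "EMP"): 0.001}
# greedy_order(cards, sel) -> ["PROJ", "ASG", "PROJ"-adjacent EMP last]
```

With these assumed statistics the greedy walk produces the ordering (PROJ, ASG, EMP), which matches the size-based heuristic discussed later in the join-ordering example.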



Search Strategies

◼ Deterministic: build left-deep plans bottom-up, one relation at a time
  R1 ⋈ R2  →  (R1 ⋈ R2) ⋈ R3  →  ((R1 ⋈ R2) ⋈ R3) ⋈ R4
◼ Randomized: move between complete plans, e.g., by exchanging operands
  (R1 ⋈ R2) ⋈ R3  →  (R1 ⋈ R3) ⋈ R2


Optimization Granularity

◼ Single query at a time


❑ Cannot use common intermediate results

◼ Multiple queries at a time


❑ Efficient if many similar queries

❑ Decision space is much larger



Optimization Timing

◼ Static
❑ Compilation ➔ optimize prior to the execution
❑ Difficult to estimate the size of the intermediate results
❑ Can amortize (divide) over many executions
◼ Dynamic
❑ Run time optimization
❑ Exact information on the intermediate relation sizes
❑ Have to re-optimize for multiple executions
◼ Hybrid
❑ Compile using a static algorithm
❑ If the error in estimate sizes > threshold, re-optimize at run time



Query Optimization Issues –
Optimization Timing
◼ Static
❑ Static query optimization is done at query
compilation time.
❑ This timing is appropriate for use with the exhaustive
search method.
❑ Since the sizes of the intermediate relations of a
strategy are not known until run time, they must be
estimated using database statistics.
❑ Errors in these estimates can lead to the choice of
suboptimal strategies.
Query Optimization Issues –
Optimization Timing
◼ Dynamic
❑ Dynamic query optimization proceeds at query execution
time.
❑ At any point of execution, the choice of the best next
operator can be based on accurate knowledge of the results
of the operators executed previously.
❑ The main advantage is that the sizes of intermediate
relations are available to the query processor, thereby
minimizing the probability of a bad choice.
❑ The main shortcoming is that query optimization, an
expensive task, must be repeated for each execution of the
query.
❑ Therefore, this approach is best for ad-hoc queries.
Query Optimization Issues –
Optimization Timing
◼ Hybrid
❑ Hybrid query optimization attempts to
provide the advantages of static query
optimization while avoiding the issues
generated by inaccurate estimates.
❑ The approach is basically static, but
dynamic query optimization may take
place at run time when a high difference
between predicted sizes and actual size of
intermediate relations is detected.
Optimization Decision Sites

◼ Centralized
❑ Single site determines the “best” schedule
❑ Simple
❑ Need knowledge about the entire distributed database
◼ Distributed
❑ Cooperation among sites to determine the schedule
❑ Need only local information
❑ Cost of cooperation
◼ Hybrid
❑ One site determines the global schedule
❑ Each site optimizes the local subqueries



Query Optimization Issues – Decision
Sites
◼ A single site or several sites may participate in the selection of
the strategy to be applied for answering the query.
◼ Most systems use the centralized decision approach, in which
a single site generates the strategy.
◼ The centralized approach is simpler but requires knowledge of
the entire distributed database, while the distributed approach
requires only local information.
◼ The decision process could be distributed among various sites
participating in the elaboration of the best strategy.
◼ Hybrid approaches where one site makes the major decisions
and other sites can make local decisions are also frequent.
Outline
◼ Distributed Query Processing
❑ Overview
❑ Query Modification
❑ Query Decomposition
❑ Data Localization
❑ Query Translation
❑ Distributed Cost Model
❑ Distributed Query Optimization
◼ Join Ordering
◼ Dynamic Approach
◼ Static Approach
◼ Hybrid Approach
◼ Adaptive Query Processing



Join Trees
◼ Characterize the search space for
optimization
◼ For N relations, there are O(N!)
equivalent join trees that can be
obtained by applying commutativity
and associativity rules

SELECT ENAME,RESP
FROM EMP
NATURAL JOIN ASG
NATURAL JOIN PROJ
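The O(N!) size of this search space is easy to see by enumerating the permutations of the relations; each permutation corresponds to one linear join order obtainable via commutativity and associativity.

```python
from itertools import permutations

# Sketch of the O(N!) search space: every permutation of the N relations
# is one candidate linear join order.
relations = ["EMP", "ASG", "PROJ"]
linear_orders = list(permutations(relations))
# 3 relations -> 3! = 6 candidate linear join orders
```

For the three-relation query above this gives 6 orders; for 10 relations it is already over 3.6 million, which is why optimizers prune the space.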



Join Trees

◼ Two major shapes: linear versus bushy trees
❑ Linear join tree: at least one operand of each join is a base relation
❑ Bushy join tree: both operands of a join may be intermediate results



Join Ordering

• Ordering joins is an important aspect of centralized query optimization.
• Join ordering in a distributed context is even more important, since joins between fragments may increase the communication time.
Join Ordering

• Consider two relations only:
  ship R from Site 1 to Site 2 if size(R) < size(S)
  ship S from Site 2 to Site 1 if size(R) > size(S)
◼ The obvious choice is to send the smaller relation to the site of the larger one, which gives rise to the two possibilities above.
◼ To make this choice we need to evaluate the sizes of R and S.
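The two-relation rule above can be sketched directly; the sizes are assumed byte counts taken from catalog statistics, and the function name is hypothetical.

```python
# Sketch of the two-relation transfer rule: ship the smaller operand
# to the site of the larger one.
def transfer_choice(size_r, size_s):
    """size_r, size_s: estimated sizes (e.g., bytes) of R and S."""
    if size_r < size_s:
        return "ship R to the site of S"
    return "ship S to the site of R"

# transfer_choice(1_000, 50_000) -> "ship R to the site of S"
```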
Join Ordering
• Consider two relations only: as above, ship the smaller operand.
◼ As in the case of a single join, the objective of the join-ordering algorithm is to transmit smaller operands.
◼ Thus, estimating the size of relations and/or join results is mandatory, but also difficult.
◼ Multiple relations are more difficult because of too many alternatives.
❑ Compute the cost of all alternatives and select the best one.
❑ Necessary to compute the size of intermediate relations which is
difficult.
❑ Use heuristics
Join Ordering – Example

Consider
PROJ ⋈PNO ASG ⋈ENO EMP



Join Ordering – Example
Execution alternatives:
1. EMP → Site 2
   Site 2 computes ASG' = EMP ⋈ ASG
   ASG' → Site 3
   Site 3 computes ASG' ⋈ PROJ
2. ASG → Site 1
   Site 1 computes ASG' = EMP ⋈ ASG
   ASG' → Site 3
   Site 3 computes ASG' ⋈ PROJ
3. ASG → Site 3
   Site 3 computes ASG' = ASG ⋈ PROJ
   ASG' → Site 1
   Site 1 computes ASG' ⋈ EMP
4. PROJ → Site 2
   Site 2 computes ASG' = PROJ ⋈ ASG
   ASG' → Site 1
   Site 1 computes ASG' ⋈ EMP
5. EMP → Site 2
   PROJ → Site 2
   Site 2 computes EMP ⋈ PROJ ⋈ ASG

Join graph: EMP (Site 1) joins ASG (Site 2) on ENO; ASG joins PROJ (Site 3) on PNO
Join Ordering – Example
◼ To select one of these, the following sizes must be known or
predicted:
size(EMP), size(ASG), size(PROJ),
size(EMP ⋈ ASG), and size(ASG ⋈ PROJ)
◼ Furthermore, if the response time is considered, the optimization
must consider that transfers can be done in parallel with strategy
5.
◼ An alternative is to use heuristics that consider only the sizes of
the operand relations by assuming, for example, that the
cardinality of the resulting join is the product of operand
cardinalities.
◼ In this case, relations are ordered by increasing sizes and the
order of execution is given by this ordering and the join graph.
◼ For instance, the order (EMP, ASG, PROJ) could use strategy 1,
while the order (PROJ, ASG, EMP) could use strategy 4.
Semijoin based Algorithms
◼ Semijoin operation can be used to decrease the total time of join queries
◼ Consider the join of two relations:
❑ R[A] (located at site 1)
❑ S[A] (located at site 2)
◼ Alternatives:
1. Do the join R ⋈A S
2. Perform one of the semijoin equivalents:
   R ⋈A S ⇔ (R ⋉A S) ⋈A S
          ⇔ R ⋈A (S ⋉A R)
          ⇔ (R ⋉A S) ⋈A (S ⋉A R)

R:  A B C        S:  A D E
    1 1 2            1 3 6
    2 1 2            2 3 6
    3 2 2            5 5 7
    6 2 3            8 1 3
    7 3 3
Semijoin based Algorithms
1. Do the join R ⋈A S
2. Perform one of the semijoin equivalents
   R ⋈A S ⇔ (R ⋉A S) ⋈A S ⇔ R ⋈A (S ⋉A R) ⇔ (R ⋉A S) ⋈A (S ⋉A R)

R ⋈A S:   A B C D E
          1 1 2 3 6
          2 1 2 3 6

R ⋉A S:   A B C
          1 1 2
          2 1 2

S ⋉A R:   A D E
          1 3 6
          2 3 6

(R and S are as on the previous slide)
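The semijoin identity can be checked mechanically on the example tables above; tuples are (A, B, C) for R and (A, D, E) for S, and the helper names are hypothetical.

```python
# Verify R ⋈A S = (R ⋉A S) ⋈A S on the example tables, where A is the
# first attribute of each tuple.
R = [(1, 1, 2), (2, 1, 2), (3, 2, 2), (6, 2, 3), (7, 3, 3)]
S = [(1, 3, 6), (2, 3, 6), (5, 5, 7), (8, 1, 3)]

def join(r, s):
    """Equi-join on the first attribute A."""
    return [rt + st[1:] for rt in r for st in s if rt[0] == st[0]]

def semijoin(r, s):
    """R semijoin S: the tuples of R that have a match in S."""
    keys = {st[0] for st in s}
    return [rt for rt in r if rt[0] in keys]

assert join(R, S) == join(semijoin(R, S), S)   # the identity holds
# join(R, S) -> [(1, 1, 2, 3, 6), (2, 1, 2, 3, 6)]
```

Only the two reduced tuples of R ever need to travel to the site of S, which is the whole point of the semijoin program.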
Semijoin based Algorithms
◼ Perform the join (R ⋈A S), with R at Site 1 and S at Site 2, assuming that size(R) > size(S):
❑ Send S to Site 1
❑ Site 1 computes R ⋈A S
◼ Consider the semijoin (R ⋉A S) ⋈A S:
❑ S' = ΠA(S)
❑ S' → Site 1
❑ Site 1 computes R' = R ⋉A S'
❑ R' → Site 2
❑ Site 2 computes R' ⋈A S
◼ Semijoin is better if size(ΠA(S)) + size(R ⋉A S) < size(S)
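The profitability test above compares what the semijoin program transfers (the projected join column of S plus the reduced R) against shipping S whole; all sizes are assumed byte counts.

```python
# Sketch of the semijoin profitability condition:
#   size(ΠA(S)) + size(R ⋉A S) < size(S)
def semijoin_pays_off(size_proj_a_s, size_r_semijoin_s, size_s):
    return size_proj_a_s + size_r_semijoin_s < size_s

# Highly reducing semijoin: worth it.
assert semijoin_pays_off(1_000, 20_000, 50_000)
# R barely reduced: shipping S whole is cheaper.
assert not semijoin_pays_off(1_000, 60_000, 50_000)
```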
Join versus Semijoin

◼ Compared with the join, the semijoin induces more operations but possibly on smaller operands.
◼ If the join attribute length is much smaller than the length of an entire tuple, then the semijoin approach can result in significant savings in communication time.
◼ Using semijoins may well increase the local processing time, since one of the two joined relations must be accessed twice.
◼ Using semijoins might not be a good idea if the communication time is not the dominant factor, as is the case with local area networks.
Join versus Semijoin-based Ordering

◼ Semijoin-based ordering induces more operators, but possibly on smaller operands



Full Reducer

◼ The full reducer of a relation is the semijoin program that reduces it more than any other
◼ How to find the full reducer?
❑ Enumerate all possible semijoin programs and select the one that yields the best size reduction
◼ Problem
❑ For cyclic queries, no full reducers can be found
❑ For tree queries, full reducers exist but the number of candidate
semijoin programs is exponential in the number of relations



Full Reducer – Example

Consider
ET (ENO, ENAME, TITLE, CITY)
AT (ENO, PNO, RESP, DUR, CITY)
PT (PNO, PNAME, BUDGET, CITY)

And the cyclic query


SELECT ENAME, PNAME
FROM ET NATURAL JOIN AT
     NATURAL JOIN PT
(the natural joins match CITY across all three relations, which closes the cycle)



Full Reducer – example

◼ Solution: transform the cyclic


query into a tree
❑ Remove one arc of the
cyclic graph
❑ Add appropriate
predicates to other arcs
such that the removed
predicate is preserved by
transitivity



Outline
◼ Distributed Query Processing
❑ Overview
❑ Query Modification
❑ Query Decomposition
❑ Data Localization
❑ Query Translation
❑ Distributed Cost Model
❑ Distributed Query Optimization
◼ Join Ordering
◼ Dynamic Approach
◼ Static Approach
◼ Hybrid Approach
◼ Adaptive Query Processing



Distributed Query Optimization

◼ Distributed Dynamic approach


❑ Distributed INGRES
❑ No static cost estimation, only runtime cost information
◼ Distributed Static approach
❑ System R*
❑ Static cost model
◼ Hybrid approach
❑ 2-step
◼ But, we need to see Centralized Query Optimization first



Centralized Query Optimization

◼ Understanding the main query optimization techniques for centralized systems is a prerequisite to understanding distributed query optimization, for three reasons.
◼ First, a distributed query is translated into local queries, each of
which is processed in a centralized way.
◼ Second, distributed query optimization techniques are often
extensions of the techniques for centralized systems.
◼ Finally, centralized query optimization is a simpler problem; the
minimization of communication costs makes distributed query
optimization more complex.
Centralized Query Optimization

◼ Dynamic (Ingres project at University of California,


Berkeley)
❑ Interpretive
◼ Static (System R project at IBM)
❑ Exhaustive search
◼ Hybrid (Volcano project at Oregon Graduate Institute)
❑ Choose node within plan
Dynamic Algorithm
 Decompose each multi-variable query into a
sequence of mono-variable queries with a common
variable
 Process each by a one variable query processor
❑ Choose an initial execution plan (heuristics)
❑ Order the rest by considering intermediate relation
sizes

◼ No need for a cost model, so no statistical information is maintained.
Heuristic Optimization

◼ Cost-based optimization is expensive, even with dynamic programming.
◼ Systems may use heuristics to reduce the number of choices
that must be made in a cost-based fashion.
◼ Heuristic optimization transforms the query-tree by using a set
of rules that typically (but not in all cases) improve execution
performance:
❑ Perform selection early (reduces the number of tuples)

❑ Perform projection early (reduces the number of attributes)

❑ Perform the most restrictive selection and join operations (i.e., those with the smallest result size) before other similar operations.
❑ Some systems use only heuristics, others combine
heuristics with partial cost-based optimization.
Dynamic Algorithm–Decomposition

◼ Replace an n-variable query q by a series of queries
  q1 → q2 → … → qn
  where qi uses the result of qi-1.
◼ Detachment
❑ Query q decomposed into q' → q" where q' and q"
have a common variable which is the result of q'
◼ Tuple substitution
❑ Replace one variable with the actual tuple values and simplify the query
q(V1, V2, …, Vn) → {q'(t1, V2, …, Vn), t1 ∈ R1}
Detachment
q:  SELECT V1.A1, V2.A2, …, Vn.An
    FROM R1 V1, …, Rn Vn
    WHERE P1(V1.A1') AND P2(V1.A1, V2.A2, …, Vn.An)

q': SELECT V1.A1 INTO R1'
    FROM R1 V1
    WHERE P1(V1.A1')

q": SELECT V2.A2, …, Vn.An
    FROM R1' V1, R2 V2, …, Rn Vn
    WHERE P2(V1.A1, V2.A2, …, Vn.An)
Detachment Example
Names of employees working on CAD/CAM project
q1 : SELECT EMP.ENAME
FROM EMP, ASG, PROJ
WHERE EMP.ENO=ASG.ENO AND
ASG.PNO=PROJ.PNO AND
PROJ.PNAME="CAD/CAM"

q11: SELECT PROJ.PNO INTO JVAR
FROM PROJ
WHERE PROJ.PNAME="CAD/CAM"

q': SELECT EMP.ENAME


FROM EMP,ASG,JVAR
WHERE EMP.ENO=ASG.ENO
AND ASG.PNO=JVAR.PNO
Detachment Example (cont’d)

q': SELECT EMP.ENAME


FROM EMP,ASG,JVAR
WHERE EMP.ENO=ASG.ENO
AND ASG.PNO=JVAR.PNO


q12: SELECT ASG.ENO INTO GVAR
FROM ASG,JVAR
WHERE ASG.PNO=JVAR.PNO

q13: SELECT EMP.ENAME


FROM EMP,GVAR
WHERE EMP.ENO=GVAR.ENO
Detachment Example (cont’d)
q11: SELECT PROJ.PNO INTO JVAR
FROM PROJ
WHERE PROJ.PNAME="CAD/CAM“
q12: SELECT ASG.ENO INTO GVAR
FROM ASG,JVAR
WHERE ASG.PNO=JVAR.PNO
q13: SELECT EMP.ENAME
FROM EMP,GVAR
WHERE EMP.ENO=GVAR.ENO
◼ Thus query q1 has been replaced by the sequence of queries q11 → q12 → q13.
◼ Query q11 is mono-relation and can be executed. However, q12
and q13 are not mono-relation and cannot be reduced by
detachment.
Detachment Example (cont’d)

◼ Multi-relation queries, which cannot be further


detached (e.g., q12 and q13), are irreducible.

◼ Irreducible queries are converted into mono-relation


queries by tuple substitution.

◼ Given an n-relation query q, the tuples of one relation


are substituted by their values, thereby producing a
set of (n−1)-relation queries.
Tuple Substitution

◼ First, one relation in q is chosen for tuple substitution. Let R1 be that relation.
◼ Then, for each tuple t1i in R1, the attributes referred to by q are replaced by their actual values, thereby generating a query q' with n−1 relations.
◼ Therefore, the total number of queries q' produced by tuple substitution is card(R1).
Tuple Substitution
◼ q11 is a mono-variable query
◼ q12 and q13 are subject to tuple substitution
◼ Assume GVAR has two tuples only: 〈E1〉 and 〈E2〉
◼ Then q13 becomes:

q131: SELECT EMP.ENAME
      FROM EMP
      WHERE EMP.ENO="E1"

q132: SELECT EMP.ENAME
      FROM EMP
      WHERE EMP.ENO="E2"
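Tuple substitution can be sketched as string generation over the assumed two-tuple GVAR; the template and variable names are hypothetical.

```python
# Sketch of tuple substitution: q13 over (EMP, GVAR) is replaced by
# card(GVAR) mono-relation queries, one per tuple of GVAR.
gvar = ["E1", "E2"]   # assumed two-tuple content of GVAR
q13_template = 'SELECT EMP.ENAME FROM EMP WHERE EMP.ENO="{eno}"'
queries = [q13_template.format(eno=t) for t in gvar]
# card(GVAR) = 2 mono-relation queries are generated
```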
Distributed Dynamic Approach
1. Execute all monorelation queries (e.g., selection,
projection)
2. Reduce the multi-relation query (MRQ) to produce
irreducible subqueries q1→ q2 → … → qn such that
there is only one relation between qi and qi+1
3. Choose qi involving the smallest fragments to execute
(call MRQ')
4. Find the best execution strategy for MRQ'
1. Determine processing site
2. Determine fragments to move
5. Repeat 3 and 4



Distributed Dynamic Algorithm

◼ Example. Assume that relations EMP, ASG, and PROJ of the


query are stored as follows, where relation EMP is fragmented.
❑ Site 1: EMP1 and ASG

❑ Site 2: EMP2 and PROJ

◼ There are several possible strategies, including the following:


1. Execute the entire query (EMP ⋈ ASG ⋈ PROJ) by moving
EMP1 and ASG to site 2.
2. Execute (EMP ⋈ ASG) ⋈ PROJ by moving (EMP1 ⋈ ASG)
and ASG to site 2, and so on.
◼ The choice between the possible strategies requires
an estimate of the size of the intermediate results.
Centralized Query Optimization

◼ Dynamic (Ingres project at UCB)


❑ Interpretive

◼ Static (System R project at IBM)


❑ Exhaustive search
◼ Hybrid (Volcano project at OGI)
❑ Choose node within plan
Static Approach

◼ Cost function includes local processing as well as


transmission
◼ Considers only joins
◼ “Exhaustive” search
◼ Compilation



Static Algorithm

◼ With static query optimization, there is a clear


separation between the generation of the QEP at
compile-time and its execution by the DBMS execution
engine.
◼ Thus, an accurate cost model is key to predict the costs
of candidate QEPs.
◼ The input to the optimizer is a relational algebra tree
resulting from the decomposition of an SQL query.
◼ The output is a QEP that implements the “optimal”
relational algebra tree.
Static Algorithm

◼ The optimizer assigns a cost (in terms of time) to every


candidate tree and retains the one with the smallest
cost.
◼ The candidate trees are obtained by a permutation of
the join orders of the n relations of the query using the
commutativity and associativity rules.
◼ To limit the overhead of optimization, the number of
alternative trees is reduced using dynamic
programming.
◼ The set of alternative strategies is constructed
dynamically so that, when two joins are equivalent by
commutativity, only the cheapest one is kept.
Static Algorithm

◼ The optimization algorithm consists of two major steps.


◼ First, the best access method to each individual relation based
on a select predicate is predicted.
◼ Second, for each relation R, the best join ordering is estimated,
where R is first accessed using its best single-relation access
method. The cheapest ordering becomes the basis for the best
execution plan.
◼ Execute joins

❑ Determine the possible ordering of joins

❑ Determine the cost of each ordering

❑ Choose the join ordering with minimal cost


Static Algorithm – Example

Names of employees working on CAD/CAM project

q1 : SELECT EMP.ENAME
FROM EMP, ASG, PROJ
WHERE EMP.ENO=ASG.ENO AND
ASG.PNO=PROJ.PNO AND
PROJ.PNAME="CAD/CAM"

Join graph: EMP joins ASG on ENO (EMP.ENO=ASG.ENO); ASG joins PROJ on PNO (ASG.PNO=PROJ.PNO)
Block Nested-Loop Join

◼ Variant of nested-loop join in which every block of inner


relation is paired with every block of outer relation.
for each block Br of r do begin
    for each block Bs of s do begin
        for each tuple tr in Br do begin
            for each tuple ts in Bs do begin
                check if (tr, ts) satisfy the join condition;
                if they do, add tr ⋅ ts to the result
            end
        end
    end
end
Block Nested-Loop Join (Cont.)

◼ Worst case estimate: br  bs + br block transfers + 2 * br seeks


❑ Each block in the inner relation s is read once for each block in the
outer relation
◼ Best case: bs + br block transfers + 2 seeks.
❑ If the smaller relation fits entirely in memory, use that as the inner
relation.
◼ Examples use the following information
❑ Number of records of student: 5,000
❑ Number of blocks of student: 100
❑ Number of records of takes: 10,000
❑ Number of blocks of takes: 400
Block Nested-Loop Join (Cont.)

◼ In the worst case, if there is enough memory only to hold one block of each relation, the estimated cost is
❑ br * bs + br block transfers,
❑ 2 * br seeks
◼ Assuming worst-case memory availability, the cost estimate is
❑ with student as the outer relation:
   100 * 400 + 100 = 40,100 block transfers,
   2 * 100 = 200 seeks
❑ with takes as the outer relation:
   400 * 100 + 400 = 40,400 block transfers,
   2 * 400 = 800 seeks


Indexed Nested-Loop Join
◼ Index lookups can replace file scans if
❑ join is an equi-join or natural join and
❑ an index is available on the inner relation’s join attribute
◼ For each tuple tr in the outer relation r, use the index to look
up tuples in s that satisfy the join condition with tuple tr.
◼ Worst case: buffer has space for only one block of r, and one
block of index for each tuple in r, we perform an index lookup
on s.
◼ Cost of the join: br + nr  c block transfers and seeks
❑ Where c is the cost of traversing index and fetching all
matching s tuples for one tuple of r
❑ c can be estimated as cost of a single selection on s using
the join condition. (i.e., height +1)
◼ If indices are available on join attributes of both r and s, use
the relation with fewer tuples as the outer relation.
Example of Indexed Nested-Loop Join
Costs
◼ Compute student ⋈ takes, with student as the outer relation.
◼ Let takes have a primary B+-tree index on the attribute ID, which contains 20 entries in each index node.
◼ Since takes has 10,000 tuples, the height of the tree is 4, and one more access is needed to find the actual data.
❑ If there are K search-key values in the file, the height of the tree is no more than ⌈log⌈n/2⌉(K)⌉ = ⌈log10(10,000)⌉ = 4
◼ Cost of indexed nested-loop join (br + nr * c):
❑ 100 + 5,000 * 5 = 25,100 block transfers and seeks
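The indexed nested-loop cost formula with the example statistics can be checked in a few lines; the function name is hypothetical.

```python
# Indexed nested-loop cost: b_r + n_r * c, where c is the cost of one
# index traversal plus fetching the matching tuples (here height 4 + 1).
def indexed_nl_cost(b_outer, n_outer, c):
    return b_outer + n_outer * c

assert indexed_nl_cost(100, 5_000, 5) == 25_100
```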
Merge-Join
1. Sort both relations on their join attribute (if not already sorted on the join attributes).
2. Merge the sorted relations to join them:
   1) The join step is similar to the merge stage of the merge-sort algorithm.
   2) The main difference is the handling of duplicate values in the join attribute: every pair with the same value on the join attribute must be matched.
   3) Detailed algorithm in the book.
Merge-Join (Cont.)

◼ Can be used only for equi-joins and natural joins


◼ Each block needs to be read only once (assuming all tuples for any given value of the join attributes fit in memory)
◼ Thus the cost of merge join is:
❑ br + bs block transfers + ⌈br / bb⌉ + ⌈bs / bb⌉ seeks
❑ plus the cost of sorting if relations are unsorted
❑ Buffer size: bb blocks for each relation
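The merge-join cost formula for already-sorted inputs is straightforward to evaluate; the buffer size of 10 blocks below is an illustrative assumption.

```python
from math import ceil

# Merge-join cost for sorted inputs:
# b_r + b_s block transfers, ceil(b_r/b_b) + ceil(b_s/b_b) seeks.
def merge_join_cost(b_r, b_s, b_b):
    transfers = b_r + b_s
    seeks = ceil(b_r / b_b) + ceil(b_s / b_b)
    return transfers, seeks

# With b_r = 100, b_s = 400 and b_b = 10 buffer blocks per relation:
assert merge_join_cost(100, 400, 10) == (500, 50)
```

Note how merge join reads each block once, so its transfer count (500) is far below the 40,100 of the worst-case block nested-loop join on the same relations.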
Static Algorithm – Example

Names of employees working on CAD/CAM project


q1 : SELECT EMP.ENAME
FROM EMP, ASG, PROJ
WHERE EMP.ENO=ASG.ENO AND
ASG.PNO=PROJ.PNO AND
PROJ.PNAME="CAD/CAM"
Assume
❑ EMP has an index on ENO,
❑ ASG has an index on PNO,
❑ PROJ has an index on PNO and an index on PNAME
Example (cont’d)

• Choose the best access paths to each relation


❑ EMP has an index on ENO,
❑ ASG has an index on PNO,
❑ PROJ has an index on PNO and an index on PNAME
◼ Determine the best join ordering:
❑ EMP ⋈ ASG ⋈ PROJ
❑ ASG ⋈ EMP ⋈ PROJ
❑ ASG ⋈ PROJ ⋈ EMP
❑ PROJ ⋈ ASG ⋈ EMP
❑ EMP ⋈ PROJ ⋈ ASG
❑ PROJ ⋈ EMP ⋈ ASG
❑ Select the best ordering based on the join costs
Example (cont’d)

◼ Determine the best join ordering:
❑ (EMP ⋈ ASG) ⋈ PROJ
❑ (ASG ⋈ EMP) ⋈ PROJ
❑ (ASG ⋈ PROJ) ⋈ EMP
❑ (PROJ ⋈ ASG) ⋈ EMP
❑ (EMP ⋈ PROJ) ⋈ ASG (×) (pruned: EMP and PROJ share no join attribute, so this is a Cartesian product)
❑ (PROJ ⋈ EMP) ⋈ ASG (×)
❑ Select the best ordering based on the join costs


Static Algorithm

❑ (EMP ⋈ ASG) ⋈ PROJ (×)
❑ (ASG ⋈ EMP) ⋈ PROJ (×)
   (pruned: the number of tuples of PROJ is typically smaller than that of ASG or EMP)
❑ (ASG ⋈ PROJ) ⋈ EMP
❑ (PROJ ⋈ ASG) ⋈ EMP

Recall the access paths:
❑ EMP has an index on ENO
❑ ASG has an index on PNO
❑ PROJ has an index on PNO and an index on PNAME

◼ ((PROJ ⋈ ASG) ⋈ EMP) is selected. Why?
❑ The number of tuples of PROJ is typically smaller than that of ASG
❑ (br + nr * c block transfers and seeks for indexed nested-loop join)


Distributed Static Approach

◼ This algorithm performs an exhaustive search of all alternative strategies in order to choose the one with the least cost.
◼ Although predicting and enumerating these
strategies may be costly, the overhead of
exhaustive search is rapidly amortized if the query is
executed frequently.
◼ Query compilation is a distributed task, coordinated
by a master site, where the query is initiated.
Distributed Static Approach

◼ The optimizer of the master site makes all intersite


decisions, such as the selection of the execution sites and
the fragments as well as the method for transferring data.
◼ The other sites that have relations involved in the query,
make the remaining local decisions (such as the ordering of
joins at a site) and generate local access plans for the
query.
◼ The objective function of the optimizer is the general total
time function, including local processing and
communications costs.
Distributed Static Approach

◼ The input to the algorithm of the static approach is a


localized query expressed as a relational algebra tree (the
query tree), the location of relations, and their statistics.
◼ As in the centralized case, the optimizer must select the
join ordering, the join algorithm (block nested-loop or
merge-join), and the access path for each fragment (e.g.,
primary index, sequential scan, etc.).
◼ These decisions are based on statistics and formulas used
to estimate the size of intermediate results and access path
information.
Distributed Static Approach

◼ In addition, the optimizer must select the sites


of join results and the method of transferring
data between sites.
◼ To join two relations, there are three candidate
sites: the site of the first relation, the site of the
second relation, or a third site (e.g., the site
that needs results of the join for other
operations).
Distributed Static Approach –
Performing Joins
◼ Ship whole

◼ Fetch as needed



Distributed Static Approach –
Performing Joins
Two methods are supported for intersite data transfers.
◼ Ship whole: The entire relation is shipped to the join
site and stored in a temporary relation before being
joined.
❑ Larger data transfer

❑ Smaller number of messages

❑ Better if relations are small


Distributed Static Approach –
Performing Joins
Two methods are supported for intersite data transfers.
◼ Fetch as needed : The external relation is sequentially scanned,
and for each tuple the join value is sent to the site of the internal
relation, which selects the internal tuples matching the value and
sends the selected tuples to the site of the external relation.
◼ This method is equivalent to the semijoin of the internal relation
with each external tuple
❑ Smaller data transfer

❑ Larger number of messages

◼ Number of messages = O(cardinality of external relation)

◼ Data transfer per message is minimal

❑ Better if relations are large and the selectivity is good (only a few matching tuples)
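The trade-off between the two transfer methods can be sketched with a simple communication-time model; the per-message and per-byte constants are illustrative assumptions, and local processing is ignored.

```python
# Rough communication-time (CT) comparison of the two transfer methods.
MSG = 1.0      # assumed fixed cost per message
BYTE = 0.001   # assumed cost per byte transferred

def ship_whole_ct(card_s, tuple_len):
    # one (logical) transfer of the entire relation
    return MSG + BYTE * card_s * tuple_len

def fetch_as_needed_ct(card_r, attr_len, s, tuple_len):
    # per outer tuple: one message carrying the join value,
    # one reply carrying the s matching inner tuples
    return card_r * (2 * MSG + BYTE * (attr_len + s * tuple_len))
```

With few outer tuples and good selectivity, fetch-as-needed wins despite its many messages; as card(R) grows, the per-message overhead makes ship-whole cheaper.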
Distributed Static Approach –
Performing Joins
◼ Given the join of an external (outer) relation R with an internal (inner) relation S on attribute A, there are five join strategies.
◼ In what follows, we describe each strategy in detail and provide a simplified cost formula for each, where
❑ LT denotes local processing time (I/O + CPU time), and
❑ CT denotes communication time.
◼ For simplicity, we ignore the cost of producing the result.
◼ For convenience, we denote by s the average number of tuples of S that match one tuple of R:
  s = card(S ⋉A R) / card(R)
Distributed Static Approach –
Performing Joins
1. Move outer relation (R) tuples to the site of the inner
relation (S)
(a) Retrieve outer tuples
(b) Send them to the inner relation site
(c) Join them as they arrive
❑ LT = Local Processing Time (I/O time + CPU time)
❑ CT = Communication Time

Total Cost = LT (retrieving card(R) tuples from R)

+ CT (size (R))

+ LT (retrieve card(S) tuples from S) * card(R)


Distributed Static Approach –
Performing Joins
2. Move inner relation (S) to the site of outer relation (R)
Cannot join as they arrive; they need to be stored
Total Cost = LT (retrieving card(S) tuples from S)
+ CT (size (S))
+ LT (store card(S) tuples into T)
+ LT (retrieving card(R) tuples from R)
+ LT (retrieve card(S) tuples from T) * card(R)
Distributed Static Approach –
Performing Joins
3. Move both inner and outer relations to another site
Total Cost = LT (retrieving card(S) tuples from S)
+ CT (size (S))
+ LT (store card(S) tuples into T)
+ LT (retrieving card(R) tuples from R)
+ CT (size (R))
+ LT (retrieve card(S) tuples from T) * card(R)
Distributed Static Approach –
Performing Joins
4. Fetch inner tuples as needed
(a) For each tuple in R, the join attribute (A) value is sent to the site of S.
(b) The s tuples of S that match that value are retrieved and sent to the site of R, to be joined as they arrive.

Total Cost = LT (retrieve card(R) tuples from R)
           + CT (length(A)) * card(R)
           + LT (retrieve s tuples from S) * card(R)
           + CT (s * length(S)) * card(R)
Distributed Static Approach –
Performing Joins
5. Fetch outer tuples as needed
(a) For each tuple in S, the join attribute (A) value is sent to the site of R.
(b) The r tuples of R that match that value are retrieved and sent to the site of S, to be joined as they arrive.

Total Cost = LT (retrieve card(S) tuples from S)
           + CT (length(A)) * card(S)
           + LT (retrieve r tuples from R) * card(S)
           + CT (r * length(R)) * card(S)
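The communication components of a few of the strategies above can be compared with a small model; CT(x bytes) is modeled as a fixed message cost plus a per-byte cost, and all constants and function names are illustrative assumptions.

```python
# Comparing the communication component (CT) of three join strategies.
MSG, BYTE = 1.0, 0.001   # assumed per-message and per-byte costs

def ct(n_bytes):
    return MSG + BYTE * n_bytes

def strategy1_ct(card_r, len_r):            # move outer R to S's site
    return ct(card_r * len_r)

def strategy2_ct(card_s, len_s):            # move inner S to R's site
    return ct(card_s * len_s)

def strategy4_ct(card_r, len_a, s, len_s):  # fetch inner tuples as needed
    # per outer tuple: one message with the A value, one reply with
    # the s matching S tuples
    return card_r * (ct(len_a) + ct(s * len_s))
```

Fetch-as-needed sends card(R) small messages: cheap per byte but expensive per message, so it pays off only when card(R) is small relative to the relation sizes.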
Centralized Query Optimization (recap)

◼ Dynamic (Ingres project at UCB)


❑ Interpretive

◼ Static (System R project at IBM)


❑ Exhaustive search
◼ Hybrid (Volcano project at OGI)
❑ Choose node within plan
Hybrid optimization
◼ In general, static optimization is more efficient than
dynamic optimization
❑ Adopted by all commercial DBMS
◼ But even with a sophisticated cost model (with
histograms), accurate cost prediction is difficult
◼ Example
❑ Consider a parametric query with predicate
WHERE R.A = $a /* $a is a parameter */
❑ The only possible assumption at compile time is uniform
distribution of values
◼ Solution: Hybrid optimization
❑ Choose-plan done at runtime, based on the actual parameter
binding
Hybrid optimization

◼ Hybrid query optimization attempts to provide the


advantages of static query optimization while avoiding
the issues generated by inaccurate estimates.
◼ The approach is basically static, but further optimization
decisions may take place at run time by adding a
conditional runtime reoptimization phase
◼ Thus, plans that have become infeasible or suboptimal
are reoptimized.
◼ However, detecting suboptimal plans is hard and this
approach tends to perform much more reoptimization
than necessary.
Hybrid optimization

◼ A more general solution is to produce dynamic


QEPs, which include carefully selected optimization
decisions to be made at runtime using “choose-plan”
operators.
◼ The choose-plan operator links two or more
equivalent subplans of a QEP that are incomparable
at compile-time
◼ The execution of a choose-plan operator yields the
comparison of the subplans based on actual costs
and the selection of the best one.
◼ Choose-plan nodes can be inserted anywhere in a
QEP.
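The choose-plan mechanism can be sketched as an operator that stores equivalent subplans and defers the choice until the parameter is bound at runtime. The subplans, their cost functions, and the selectivity threshold below are all hypothetical:

```python
# Sketch: a "choose-plan" node linking two equivalent subplans whose
# costs are incomparable at compile time. Cost models are assumptions.

class Subplan:
    def __init__(self, name, cost_fn, run_fn):
        self.name, self._cost, self._run = name, cost_fn, run_fn
    def cost(self, binding): return self._cost(binding)
    def run(self, binding):  return self._run(binding)

class ChoosePlan:
    def __init__(self, left, right):
        self.left, self.right = left, right   # equivalent subplans

    def execute(self, binding):
        # At runtime the parameter is bound, so actual costs are comparable:
        # evaluate both cost functions and run the cheaper subplan.
        chosen = min((self.left, self.right), key=lambda p: p.cost(binding))
        return chosen.run(binding)

# Hypothetical subplans for joining sigma_{A<=$a}(R1) with R2: a
# nested-loop join wins when the selection keeps few tuples, a hash
# join wins otherwise.
nl  = Subplan("NL join",   cost_fn=lambda a: 10 * a, run_fn=lambda a: "NL")
hsh = Subplan("hash join", cost_fn=lambda a: 500,    run_fn=lambda a: "hash")

cp = ChoosePlan(nl, hsh)
print(cp.execute(binding=3))     # selective predicate -> nested loop
print(cp.execute(binding=900))   # unselective predicate -> hash join
```

The point of the operator is that both subplans are kept in the static plan; only the comparison of their costs, not their construction, happens at runtime.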
Hybrid Optimization Example
(Figure: dynamic QEP for σA<=$a(R1) ⋈ R2 ⋈ R3 with two choose-plan operators)
• The bottom choose-plan operator compares the cost of two alternative
subplans for joining R1 and R2, the left subplan being better than the
right one
• As stated above, since there is a runtime binding of the parameter $a,
the accurate selectivity of σA<=$a(R1) cannot be estimated until runtime.
Hybrid Optimization Example
(Figure: dynamic QEP for σA<=$a(R1) ⋈ R2 ⋈ R3 with two choose-plan operators)
• The top choose-plan operator compares the cost of two alternative
subplans for joining the result of the bottom choose-plan operation
with R3.
• Depending on the estimated size of the join of R1 and R2, which
indirectly depends on the selectivity of the selection on R1, it may be
better to use R3 as external or internal relation.
Distributed Hybrid optimization

• The static and dynamic distributed optimization


approaches have some advantages and disadvantages
as in centralized systems
• However, the problems of accurate cost estimation and
comparison of QEPs at compile-time are much more
severe in distributed systems.
• In addition to unknown bindings of parameter values in
embedded queries, sites may become unavailable or
overloaded at runtime.
• In addition, relations (or relation fragments) may be
replicated at several sites.
Distributed Hybrid optimization

• The hybrid query optimization technique using


dynamic QEPs is general enough to incorporate site
and other decisions.
• However, the search space of alternative subplans
linked by choose-plan operators becomes much
larger and may result in heavy static plans and much
higher startup time.
• Therefore, several hybrid techniques have been
proposed to optimize queries in distributed systems
• They essentially rely on the following two-step
approach:
2-Step Optimization

1. At compile time, generate a static plan that


specifies the ordering of operations and the
access methods, without considering where
relations are stored.
2. At startup time, generate an execution plan
by
❑ carrying out site and copy selection, and
❑ allocating the operations to the sites.
2-Step Optimization
(Figure: static plan for σ(R1) ⋈ R2 ⋈ R3 and the corresponding run-time plan)
◼ The static plan shows the relational operation ordering as
produced by a centralized query optimizer.
◼ The run-time plan extends the static plan with site and copy
selection and communication between sites.
◼ For instance, the first selection is allocated at site S1 on copy R11 of
relation R1 and sends its result to site S3 to be joined with R23
2-Step Optimization

◼ The first step can be done by a centralized query


optimizer. It may also include choose-plan operators
so that runtime bindings can be used to make
accurate cost estimations.
◼ The second step carries out site and copy selection,
possibly in addition to choose-plan operator
execution. Furthermore, it can optimize the load
balancing of the system.
2-Step – Problem Definition

◼ Given
❑ A set of sites S = {s1, s2, …, sn},
❑ A query Q = {q1, q2, …, qm} such that each subquery qi is the
processing unit that accesses one relation and communicates with its
neighboring queries
❑ For each qi in Q, a feasible allocation set of sites Sq={s1, s2, …,sk}
where each site stores a copy of the relation in qi
❑ Each site si has a load, denoted by load(si), which reflects the
number of queries currently submitted.
◼ The objective is to find an optimal allocation of Q to S such
that
❑ the unbalanced load of S is minimized
❑ the total communication cost is minimized
2-Step Algorithm

◼ For each q in Q compute load (Sq)


◼ While Q not empty do
1. Select subquery a with least allocation
flexibility
2. Select best site b for a (with least load
and best benefit)
3. Remove a from Q and recompute loads if
needed
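The loop above can be sketched in a few lines. A minimal version, assuming the "best benefit" tie-breaker reduces to picking the least-loaded feasible site (the communication-benefit refinement is omitted), with initial loads and feasible sets chosen to mirror the example that follows:

```python
# Sketch of the 2-step allocation loop. Assumption: "least load and
# best benefit" is simplified to picking the least-loaded feasible site.
def allocate(queries, feasible, load):
    """Greedily allocate each subquery to a site holding a copy of its relation.

    queries  : subquery names
    feasible : subquery -> candidate sites (sites storing a copy)
    load     : mutable dict, site -> current load
    """
    placement = {}
    # Least allocation flexibility first, i.e. fewest candidate sites.
    for q in sorted(queries, key=lambda q: len(feasible[q])):
        best = min(feasible[q], key=lambda s: load[s])   # least-loaded site
        placement[q] = best
        load[best] += 1            # recompute loads as subqueries are placed
    return placement

# Hypothetical inputs mirroring the example (initial loads are assumed):
load = {"s1": 1, "s2": 2, "s3": 2, "s4": 2}
feasible = {"q1": ["s1"], "q2": ["s2", "s4"],
            "q3": ["s1", "s3"], "q4": ["s1"]}
placement = allocate(["q1", "q2", "q3", "q4"], feasible, load)
print(placement, load)
```

Run on these inputs the sketch ends with every site except s4 at load 3, one of the balanced outcomes the example below arrives at.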
2-Step Algorithm Example
◼ Consider the following query: σ(R1) ⋈ R2 ⋈ R3 ⋈ R4
◼ Let Q = {q1, q2, q3, q4}
❑ where q1 is associated with R1,
❑ q2 is associated with R2 joined with the result of q1,
❑ q3 is associated with R3 joined with the result of q2, and
❑ q4 is associated with R4 joined with the result of q3.
2-Step Algorithm Example
◼ Consider the following query: σ(R1) ⋈ R2 ⋈ R3 ⋈ R4
◼ It performs 4 iterations
◼ Iteration 1:
❑ select q1 with the least allocation flexibility, allocate to s1
❑ set load(s1) = 2

Site   New Load
s1     2 (q1)
s2     2
s3     2
s4     2
2-Step Algorithm Example
◼ Consider the following query: σ(R1) ⋈ R2 ⋈ R3 ⋈ R4
◼ It performs 4 iterations
◼ Iteration 2: the next subquery to be selected is q2
❑ select q2, allocate to s2 (it could also be allocated to s4, which has
the same load as s2)
❑ set load(s2) = 3 or set load(s4) = 3

Site   New Load   New Load
s1     2 (q1)     2 (q1)
s2     3 (q2)     2
s3     2          2
s4     2          3 (q2)
2-Step Algorithm Example
◼ Consider the following query: σ(R1) ⋈ R2 ⋈ R3 ⋈ R4
◼ It performs 4 iterations
◼ Iteration 3:
❑ select q3, allocate to s1 (set load(s1) = 3) or allocate to s3
(set load(s3) = 3)

Site   New Load    New Load   New Load    New Load
s1     3 (q1,q3)   2 (q1)     3 (q1,q3)   2 (q1)
s2     3 (q2)      3 (q2)     2           2
s3     2           3 (q3)     2           3 (q3)
s4     2           2          3 (q2)      3 (q2)
2-Step Algorithm Example
◼ Consider the following query: σ(R1) ⋈ R2 ⋈ R3 ⋈ R4
◼ It performs 4 iterations
◼ Iteration 4:
❑ select q4, allocate to s1
❑ set load(s1) = 4 or set load(s1) = 3
◼ Which one is better? (need to consider load balancing)

Site   New Load       New Load    New Load       New Load
s1     4 (q1,q3,q4)   3 (q1,q4)   4 (q1,q3,q4)   3 (q1,q4)
s2     3 (q2)         3 (q2)      2              2
s3     2              3 (q3)      2              3 (q3)
s4     2              2           3 (q2)         3 (q2)
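The "which one is better?" question can be answered by scoring the candidate allocations on load balance. A small sketch, with the load vectors taken from the iteration tables above; the max-minus-min imbalance measure is an assumption (variance would also work):

```python
# Sketch: choose among the final candidate allocations by load balance.
# Load vectors correspond to the four alternatives of iteration 4.
alternatives = {
    "A": {"s1": 4, "s2": 3, "s3": 2, "s4": 2},
    "B": {"s1": 3, "s2": 3, "s3": 3, "s4": 2},
    "C": {"s1": 4, "s2": 2, "s3": 2, "s4": 3},
    "D": {"s1": 3, "s2": 2, "s3": 3, "s4": 3},
}

def imbalance(loads):
    # Spread between the heaviest and the lightest site.
    return max(loads.values()) - min(loads.values())

best = min(alternatives, key=lambda k: imbalance(alternatives[k]))
print(best, imbalance(alternatives[best]))
```

The alternatives with load(s1) = 3 keep the spread between the heaviest and lightest site at 1 instead of 2, so they are preferred under this measure.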
Adaptive Query Processing -
Motivations
◼ Assumptions underlying query optimization
❑ The optimizer has sufficient knowledge about runtime cost
information
❑ Runtime conditions remain stable during query execution
◼ Appropriate for systems with small data sources in a
controlled environment
◼ Inappropriate for changing environments with large
numbers of data sources and unpredictable runtime
conditions



Example: QEP with Blocked Operator

◼ Assume ASG, EMP,


PROJ and PAY each at a
different site
◼ If ASG site is down, the
entire pipeline is blocked
◼ However, with some
reorganization, the join of
EMP and PAY could be
done while waiting for
ASG

Adaptive Query Processing – Definition
◼ Adaptive query processing is a form of dynamic query
processing, with
❑ a feedback loop between the execution environment
and the query optimizer in order to react to
unforeseen variations of runtime conditions.
◼ A query processing system is defined as adaptive if it
receives information from the execution environment and
determines its behavior according to that information in
an iterative manner.
❑ a general adaptive query processing process
❑ the eddy approach that provides adaptive query processing



Adaptive Query Processing – Definition
◼ Adaptive query processing adds to the traditional query
processing process the following activities: monitoring,
assessing, and reacting.
◼ Monitoring involves measuring some environment
parameters within a time window, and reporting them to
the assessment component.
◼ The assessment component analyzes the reports and checks them
against thresholds to build an adaptive reaction plan.
◼ Finally, the reaction plan is communicated to the
reaction component that applies the reactions to query
execution.



Adaptive Components

◼ Monitoring parameters (collected by sensors in QEP)


❑ Memory size
❑ Data arrival rates
❑ Actual statistics
❑ Operator execution cost
❑ Network throughput
◼ Adaptive reactions
❑ Change schedule
❑ Replace an operator by an equivalent one
❑ Modify the behavior of an operator
❑ Data repartitioning
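The monitoring, assessing, and reacting activities form a feedback loop that can be sketched directly. The parameters, thresholds, and reactions below are illustrative assumptions, not a prescribed interface:

```python
# Sketch of the monitoring -> assessing -> reacting feedback loop.
# Parameters, thresholds, and reactions are illustrative assumptions.

def monitor(env):
    """Measure environment parameters within a time window."""
    return {"arrival_rate": env["arrival_rate"],
            "memory_free": env["memory_free"]}

def assess(report, thresholds):
    """Compare measurements against thresholds; build a reaction plan."""
    plan = []
    if report["arrival_rate"] < thresholds["min_arrival_rate"]:
        plan.append("change schedule")     # e.g., reorder the pipeline
    if report["memory_free"] < thresholds["min_memory"]:
        plan.append("replace operator")    # e.g., swap in a lower-memory join
    return plan

def react(plan, qep):
    """Apply the reaction plan to the running query execution plan."""
    for action in plan:
        qep.setdefault("adaptations", []).append(action)
    return qep

thresholds = {"min_arrival_rate": 100, "min_memory": 64}
env = {"arrival_rate": 20, "memory_free": 512}   # a source has slowed down
qep = {"ops": ["scan", "join", "join"]}
qep = react(assess(monitor(env), thresholds), qep)
print(qep["adaptations"])
```

In a real system the sensors would be embedded in the QEP operators and the loop would run iteratively during execution rather than once.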



Eddy Approach

◼ Eddy is a general framework for adaptive query


processing over distributed relations.
◼ Multiple valid operator trees can be derived from the join graph G
that obey the constraints in C, by exploring the search space with
different predicate orders.
◼ There is no need to find an optimal QEP during query
compilation.
◼ The process of QEP compilation is completed by adding
the eddy operator which is an n-ary operator placed
between the relations in D and query predicates in P.



Eddy Approach

◼ Query compilation: produces a tuple D, P, C, Eddy


❑ D: set of data sources (e.g. relations)
❑ P: set of predicates (operations)
❑ C: ordering constraints to be followed at runtime
❑ Eddy: n-ary operator between D and P
◼ Query execution: operator ordering on a tuple basis
using Eddy
❑ based on cost and selectivity
❑ Change of join ordering during execution



QEP with Eddy
◼ Q = σP(R) ⋈ S ⋈ T
◼ Assume that
❑ the only access method to relation T is through an index on join
attribute T.A,
❑ the second join can only be an index join over T.A.
◼ Assume also that P is an expensive predicate (e.g., a predicate over
the results of running a program over values of R.B).
◼ D = {R, S, T}
◼ P = {σP(R), R ⋈1 S, S ⋈2 T}
◼ C = {S < T} where < imposes S tuples to probe T tuples using the
index on join attribute T.A
QEP with Eddy

◼ The figure shows a QEP produced by the compilation
of query Q with eddy.
◼ An ellipse corresponds to a physical
operator (i.e., either eddy operator or an
algorithm implementing a predicate p ∈ P).
◼ As usual, the bottom of the plan presents
the source relations.
◼ In the absence of a scan access method,
the access to relation T is wrapped by the
join S and T, thus does not appear as a
source relation.
◼ The arrows specify pipeline dataflow
following a producer–consumer relationship.
◼ Finally, an arrow departing from the eddy
models the production of output tuples.



QEP with Eddy

◼ Eddy provides fine-grained adaptiveness by deciding how to


access tuples through predicates according to a scheduling
policy.
◼ During query execution, tuples in source relations are retrieved
and staged into an input buffer managed by the eddy operator.
◼ Eddy responds to relation unavailability by simply reading from
another relation and staging tuples in the buffer pool.
◼ The flexibility of choosing the currently available source relation
is obtained by relaxing the fixed order of predicates in a QEP.
◼ In eddy, there is no fixed QEP and each tuple follows its own
path through predicates according to the constraints in the plan
and its own history of predicate evaluation.
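This per-tuple routing can be sketched as follows. The predicates, their costs, and the cheapest-first routing policy are assumptions; the ordering constraint S < T from the plan is encoded as a prerequisite on the second join:

```python
# Toy sketch of eddy-style per-tuple routing: each tuple tracks which
# predicates it has passed ("done"), and the eddy picks any runnable
# predicate whose ordering constraints are satisfied. The cost values
# and the cheapest-first policy are assumptions.

predicates = {
    "sel_P":  {"cost": 5.0, "requires": set()},        # expensive sigma_P(R)
    "join_1": {"cost": 1.0, "requires": set()},        # R join S
    "join_2": {"cost": 2.0, "requires": {"join_1"}},   # S probes T's index
}

def route(done):
    """Pick the next predicate for a tuple given its evaluation history."""
    runnable = [p for p, meta in predicates.items()
                if p not in done and meta["requires"] <= done]
    return min(runnable, key=lambda p: predicates[p]["cost"]) if runnable else None

# One tuple's own path through the predicates:
done, path = set(), []
while (nxt := route(done)) is not None:
    path.append(nxt)
    done.add(nxt)
print(path)
```

Because the route is recomputed per tuple from current costs and the tuple's history, different tuples may follow different predicate orders, which is exactly the flexibility eddy exploits when a source becomes slow or unavailable.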

