0% found this document useful (0 votes)
2 views

L2 Distributed QueryProcessing

Uploaded by

naziya1531
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
2 views

L2 Distributed QueryProcessing

Uploaded by

naziya1531
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 9

10-08-2024

Query decomposition and data


localization
Dario Della Monica

These slides are a modified version of the slides provided with the book
Özsu and Valduriez, Principles of Distributed Database Systems (3rd Ed.), 2011
The original version of the slides is available at: extras.springer.com

Distributed DBMS © M. T. Özsu & P. Valduriez Ch.7/1

Outline (distributed DB)


• Introduction (Ch. 1) ⋆

• Distributed Database Design (Ch. 3) ⋆

• Distributed Query Processing (Ch. 6-8) ⋆


➡ Overview (Ch. 6) ⋆

➡ Query decomposition and data localization (Ch. 7) ⋆


➡ Distributed query optimization (Ch. 8) ⋆

• Distributed Transaction Management (Ch. 10-12) ⋆

⋆ Özsu and Valduriez, Principles of Distributed Database Systems (3rd Ed.), 2011

Distributed DBMS © M. T. Özsu & P. Valduriez


Ch.7/2

1
10-08-2024

Outline (today)
• Query decomposition and data localization (Ch. 7) ⋆
➡ The problem of distributed data localization
➡ A naïve algorithm
➡ Optimization steps (reductions)
✦ PHF (selection, join)
✦ VF (projection)
✦ DHF (join)
✦ Hybrid Fragmentation (selection/join + projection)

⋆ Özsu and Valduriez, Principles of Distributed Database Systems (3rd Ed.), 2011

Distributed DBMS © M. T. Özsu & P. Valduriez Ch.7/3

Data Localization
Input: Relational algebra expression on global, distributed relations (distributed
query)
Output: Relational algebra expression on fragments (localized query)
• Localization uses global information about distribution of fragments (no use of
quantitative information, e.g., catalog statistics)
• Recall that fragmentation is obtained by several application of rules expressed
by relational algebra …
➡ primary horizontal fragmentation: selection σ
➡ derived horizontal fragmentation: semijoin ⋉
➡ vertical fragmentation: projection 

• … and that reconstruction (reverse fragmentation) rules are also expressed in


relational algebra
➡ horizontal fragmentation: union ∪
➡ vertical fragmentation: join ⋈

Distributed DBMS © M. T. Özsu & P. Valduriez


Ch.7/4

2
10-08-2024

A naïve algorithm to localize


distribute queries
• Localization program: relational algebra expression that reconstructs a
global relation from its fragments, by reverting the rules employed for
fragmentation
• A localized query is obtained from distributed, global query by replacing
leaves (global relations) with (the tree of) its corresponding localization
program
➡ Leaves of localized queries are fragments

• This approach to obtain a localized query from a distributed one is


inefficient and the result can be improved
➡ During data localization there is a first optimization phase
✦ we call it reduction
✦ different from the “proper” global optimization phase (“proper” in the sense of
the centralized case, i.e., finding the “best” strategy for executing the query)

Distributed DBMS © M. T. Özsu & P. Valduriez Ch.7/5

Example
Assume
EMP ⋈ ASG
• EMP is fragmented as follows:
=
(EMP1 ∪ EMP2 ∪ EMP3) ⋈ (ASG1 ∪ ASG2 )
➡ EMP1= ENO≤“E3”(EMP)

➡ EMP2= “E3”<ENO≤“E6”(EMP)

➡ EMP3= ENO≥“E6”(EMP)

• ASG is fragmented as follows: ⋈ENO

➡ ASG1= ENO≤“E3”(ASG)
 
➡ ASG2= ENO>“E3”(ASG)

Replace EMP by (EMP1  EMP2  EMP3) EMP1 EMP3 ASG1 ASG2


and ASG by (ASG1  ASG2) in any query EMP2

Distributed DBMS © M. T. Özsu & P. Valduriez


Ch.7/6

3
10-08-2024

Provides Parallellism
( EMP1 ∪ EMP2 ∪ EMP3) ⋈ (ASG1 ∪ ASG2 )
EMP1= ENO≤“E3”(EMP)
=
EMP2= “E3”<ENO≤“E6”(EMP) ( EMP1 ⋈ ASG1) ∪ ( EMP1 ⋈ ASG2) ∪
EMP3= ENO≥“E6”(EMP) ( EMP2 ⋈ ASG1) ∪ ( EMP2 ⋈ ASG2) ∪
( EMP3 ⋈ ASG1) ∪ ( EMP3 ⋈ ASG2)
ASG1= ENO≤“E3”(ASG)
ASG2= ENO>“E3”(ASG) 
⋈ENO

 

⋈ENO ⋈ENO ⋈ENO ⋈ENO ⋈ ⋈ENO


EMP 1 EMP3 ASG1 ASG2 ENO
EMP2

EMP1 ASG1 EMP1 ASG2 EMP2 ASG1 EMP2 ASG2 EMP3 ASG1 EMP3 ASG2

Distributed DBMS © M. T. Özsu & P. Valduriez


Ch.7/7

Eliminates Unnecessary Work



⋈ENO ⋈ENO ⋈ENO

EMP1 ASG1 EMP2 ASG2 EMP3 ASG2

Identify (pairs of) fragments that can be ignored because they produce
empty relations (e.g., when a selection or a join is applied to them)
Distributed DBMS © M. T. Özsu & P. Valduriez
Ch.7/8

4
10-08-2024

Reduction for PHF – Selection


• Reduction of a selection over a relation fragmented with PHF: ignore a fragment
if selection predicate and fragment predicate are contradictory
➡ Consider p(R)

➡ Horizontal fragmentation on R: FR={R1, R2, …, Rw}, where Rj=p (R)


j

➡ p(Rj)= if x in R: ¬(p(x) pj(x)) i.e., p and pj are contradictory

Distributed DBMS © M. T. Özsu & P. Valduriez Ch.7/9

Reduction for PHF – Selection


(Example)
• Reduction of a selection over a relation fragmented with PHF: ignore a fragment
if selection predicate and fragment predicate are contradictory
➡ Example SELECT * EMP1= ENO≤“E3”(EMP)
FROM EMP EMP2= “E3”<ENO≤“E6”(EMP)
WHERE ENO="E5" EMP3= ENO≥“E6”(EMP)
ASG1= ENO≤“E3”(ASG)
ASG2= ENO>“E3”(ASG)

ENO=“E5” ENO=“E5” ENO=“E5”


EMP EMP1 EMP2 EMP3 EMP2


distributed query localized query reduced local query

Distributed DBMS © M. T. Özsu & P. Valduriez


Ch.7/10

10

5
10-08-2024

Reduction for PHF – Join


• Reduction of a join over relations fragmented with PHF: ignore the join of 2
fragments if their fragment predicates are contradictory over the join attributes

➡ Possible if fragmentation predicates (minterms) involve the join attribute

➡ Distribute join over union

R ⋈ S  (R1 R2) ⋈ (S1  S2)


 (R1 ⋈ S1)  (R1 ⋈ S2)  (R2 ⋈ S1)  (R2 ⋈ S2)
➡ Then, join between 2 fragments can be simplified in some cases

✦ Given Ri =pi(R) and Sj = pj(S) [pi and pj defined over join attributes]

Ri ⋈Sj = if x in R ⋈ S : ¬(pi(x) pj(x)) [there is a mistake in the textbook]


i.e., pi and pj are contradictory

Distributed DBMS © M. T. Özsu & P. Valduriez Ch.7/11

11

Reduction for PHF – Join (Example)


EMP1= ENO≤“E3”(EMP) ⋈ENO
EMP2= “E3”<ENO≤“E6”(EMP)
EMP3= ENO≥“E6”(EMP)
ASG1= ENO≤“E3”(ASG)  
ASG2= ENO>“E3”(ASG)

EMP1 EMP2 EMP3 ASG1 ASG2


• Consider the query
SELECT *
FROM EMP,ASG 
WHERE EMP.ENO=ASG.ENO
• Distribute join over unions ⋈ENO ⋈ENO ⋈ENO
• Apply the reduction rule
Not always useful EMP1 ASG1 EMP2 ASG2 EMP3 ASG2

Distributed DBMS © M. T. Özsu & P. Valduriez Ch.7/12

12

6
10-08-2024

Reduction for VF
• Reduction of a projection over a relation fragmented with VF: ignore the fragment for
which the set of fragmentation attributes intersected with the set of projection
attributes is contained in the primary key
• Recall that the localization program consists in joins over key attributes
• Let R1 be a fragment of R obtained as R1 = A’ (R) where A’  attr(R) :
➡ Reduction of a projection A’’ over R1 is possible when A’’ ∩ A’  key(R)

Ex.: EMP1 = ENO,ENAME (EMP)


ENAME ENAME
EMP2 = ENO,TITLE (EMP)
⋈ENO
SELECT ENAME
FROM EMP
EMP1 EMP2 EMP1

Distributed DBMS © M. T. Özsu & P. Valduriez Ch.7/13

13

Reduction for DHF


• Similar to the case PHF
• DHF: 2 relations S (owner) and R (member) in association one-to-many
➡ S participates with cardinality N , R participates with cardinality 1
➡ Fragmentation propagate from S to R
➡ Localization program: union
➡ Compatible fragments (i.e., fragments that agree on the values of join
attributes) are placed at the same site
• Reduction of a join over relations fragmented with DHF: only join
“corresponding” fragments
➡ Distribute joins over unions
➡ Apply the join reduction for horizontal fragmentation

Distributed DBMS © M. T. Özsu & P. Valduriez


Ch.7/14

14

7
10-08-2024

Reduction for DHF – Example


• Example [EMP is owner , ASG is member]
EMP1: TITLE=“Programmer” (EMP) Always convenient
EMP2: TITLE ≠“Programmer” (EMP) - the number of joins is always equal to the number
ASG1: ASG ⋉ENO EMP1 of fragments
ASG2: ASG ⋉ENO EMP2 - all joins can be performed in parallel (are disjoint)
• Query SELECT *
FROM EMP, ASG
WHERE ASG.ENO = EMP.ENO

⋈ENO 

  ⋈ENO ⋈ENO

ASG1 ASG2 EMP1 EMP2 ASG1 EMP1 ASG2 EMP2

Distributed DBMS © M. T. Özsu & P. Valduriez


Ch.7/15

15

Complex reduction for PHF and DHF


1. Generic query ⋈ENO SELECT *
FROM EMP, ASG
WHERE ASG.ENO = EMP.ENO
TITLE=“Mech. Eng.” AND EMP.TITLE = “Mech. Eng”

 

ASG1 ASG2 EMP1 EMP2

2. Reduction of selection over a 3. Reduction of join over a relation


relation fragmented with HF fragmented with DHF
⋈ENO ⋈ENO

 TITLE=“Mech. Eng.” TITLE=“Mech. Eng.”

ASG1 ASG2 EMP2 ASG2 EMP2

Distributed DBMS © M. T. Özsu & P. Valduriez


Ch.7/16

16

8
10-08-2024

Reduction for Hybrid Fragmentation


• Combine the rules already specified
➡ Remove empty relations generated by contradicting predicates (inside
selections or joins) on horizontal fragments

➡ Remove useless relations generated by projections on vertical fragments

➡ Distribute joins/selections/projections over unions in order to isolate and


remove useless operands

Distributed DBMS © M. T. Özsu & P. Valduriez Ch.7/17

17

Reduction for Hybrid Fragmentation


Example
ENAME
Consider the following hybrid fragmentation:
ENAME
EMP1= ENO≤"E4" (ENO,ENAME (EMP))
EMP2= ENO>"E4" (ENO,ENAME (EMP)) ENO=“E5”
EMP3= ENO,TITLE (EMP)
⋈ENO
 ENO=“E5”
Thus, the localization program for EMP is:
EMP = (EMP1 ∪ EMP2) ⋈ EMP3

Consider also the query: 


EMP2
SELECT ENAME
FROM EMP EMP1 EMP2 EMP3
WHERE ENO="E5"

Distributed DBMS © M. T. Özsu & P. Valduriez


Ch.7/18

18

You might also like