0% found this document useful (0 votes)
56 views

7 Query Localization

Uploaded by

Nida Fatima
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
56 views

7 Query Localization

Uploaded by

Nida Fatima
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 27

Outline

• Introduction
• Background
• Distributed Database Design
• Database Integration
• Semantic Data Control
• Distributed Query Processing
➡ Overview
➡ Query decomposition and localization
➡ Distributed query optimization
• Multidatabase query processing
• Distributed Transaction Management
• Data Replication
• Parallel Database Systems
• Distributed Object DBMS
• Peer-to-Peer Data Management
• Web Data Management
• Current Issues
Distributed DBMS © M. T. Özsu & P. Valduriez
Ch.7/1
Step 1 – Query Decomposition
Input : Calculus query on global relations
• Normalization
➡ manipulate query quantifiers and qualification
• Analysis
➡ detect and reject “incorrect” queries
➡ possible for only a subset of relational calculus
• Simplification
➡ eliminate redundant predicates
• Restructuring
➡ calculus query  algebraic query
➡ more than one translation is possible
➡ use transformation rules

Distributed DBMS © M. T. Özsu & P. Valduriez


Ch.7/2
Normalization
• Lexical and syntactic analysis
➡ check validity (similar to compilers)
➡ check for attributes and relations
➡ type checking on the qualification

• Put into normal form


➡ Conjunctive normal form
(p11 p12  …  p1n)  …  (pm1  pm2  …  pmn)
➡ Disjunctive normal form
(p11  p12  …  p1n)  …  (pm1  pm2  …  pmn)
➡ OR's mapped into union
➡ AND's mapped into join or selection
Distributed DBMS © M. T. Özsu & P. Valduriez
Ch.7/3
Analysis
• Refute incorrect queries
• Type incorrect
➡ If any of its attribute or relation names are not defined in the global schema
➡ If operations are applied to attributes of the wrong type
• Semantically incorrect
➡ Components do not contribute in any way to the generation of the result
➡ Only a subset of relational calculus queries can be tested for correctness
➡ Those that do not contain disjunction and negation
➡ To detect
✦ connection graph (query graph)
✦ join graph

Distributed DBMS © M. T. Özsu & P. Valduriez


Ch.7/4
Analysis – Example
SELECT ENAME,RESP
FROM EMP, ASG, PROJ
WHERE EMP.ENO = ASG.ENO
AND ASG.PNO = PROJ.PNO
AND PNAME = "CAD/CAM"
AND DUR ≥ 36
AND TITLE = "Programmer"

Query graph Join graph


DUR≥36

ASG ASG
EMP.ENO=ASG.ENO ASG.PNO=PROJ.PNO EMP.ENO=ASG.ENO ASG.PNO=PROJ.PNO

TITLE =
EMP RESP PROJ EMP PROJ
“Programmer”

ENAME
RESULT
PNAME=“CAD/CAM”

Distributed DBMS © M. T. Özsu & P. Valduriez


Ch.7/5
Analysis
If the query graph is not connected, the query may be wrong or
use Cartesian product
SELECT ENAME,RESP
FROM EMP, ASG, PROJ
WHERE EMP.ENO = ASG.ENO
AND PNAME = "CAD/CAM"
AND DUR > 36
AND TITLE = "Programmer"

ASG

EMP RESP PROJ

ENAME
RESULT
PNAME=“CAD/CAM”

Distributed DBMS © M. T. Özsu & P. Valduriez


Ch.7/6
Simplification
• Why simplify?

➡ Remember the example


• How? Use transformation rules

➡ Elimination of redundancy
✦ idempotency rules
p1  ¬( p1)  false
p1  (p1 p2)  p1
p1  false  p1

➡ Application of transitivity
➡ Use of integrity rules

Distributed DBMS © M. T. Özsu & P. Valduriez


Ch.7/7
Simplification – Example
SELECT TITLE
FROM EMP
WHERE EMP.ENAME = "J. Doe"
OR (NOT(EMP.TITLE = "Programmer")
AND (EMP.TITLE = "Programmer"
OR EMP.TITLE = "Elect. Eng.")
AND NOT(EMP.TITLE = "Elect. Eng."))


SELECT TITLE
FROM EMP
WHERE EMP.ENAME = "J. Doe"

Distributed DBMS © M. T. Özsu & P. Valduriez


Ch.7/8
Restructuring
• Convert relational calculus to relational ENAME Project
algebra
• Make use of query trees σDUR=12 OR DUR=24
• Example
Find the names of employees other than
J. Doe who worked on the CAD/CAM
σ PNAME=“CAD/CAM” Select
project for either 1 or 2 years.
SELECT ENAME σENAME≠“J. DOE”
FROM EMP, ASG, PROJ
WHERE EMP.ENO = ASG.ENO ⋈PNO
AND ASG.PNO = PROJ.PNO
AND ENAME≠ "J. Doe" ⋈ENO Join
AND PNAME = "CAD/CAM"
AND (DUR = 12 OR DUR = 24) PROJ ASG EMP
Distributed DBMS © M. T. Özsu & P. Valduriez
Ch.7/9
Restructuring –Transformation
Rules
• Commutativity of binary operations
➡ R×SS×R
➡ R ⋈S  S ⋈R
➡ RSSR
• Associativity of binary operations
➡ ( R × S) × T  R × (S × T)
➡ (R ⋈S) ⋈T  R ⋈ (S ⋈T)
• Idempotence of unary operations
➡  A’( A’(R))   A’(R)
➡ p1(A1)(p2(A2)(R))  p1(A1)p2(A2)(R)
where R[A] and A'  A, A"  A and A'  A"
• Commuting selection with projection
Distributed DBMS © M. T. Özsu & P. Valduriez
Ch.7/10
Restructuring – Transformation
Rules
• Commuting selection with binary operations
➡ p(A)(R × S)  (p(A) (R)) × S

➡ p(A )(R ⋈(A ,B )S)  (p(A ) (R)) ⋈(A ,B )S


i j k i j k

➡ p(A )(R  T)  p(A ) (R)  p(A ) (T)


i i i
where Ai belongs to R and T
• Commuting projection with binary operations
➡  C(R × S)   A’(R) ×  B’(S)

➡  C(R ⋈(A ,B )S)   A’(R) ⋈(A ,B )  B’(S)


j k j k

➡  C(R  S)   C(R)   C(S)

where R[A] and S[B]; C = A'  B' where A'  A, B'  B


Distributed DBMS © M. T. Özsu & P. Valduriez
Ch.7/11
Example
Recall the previous example: ENAME
Project
Find the names of employees other
than J. Doe who worked on the DUR=12  DUR=24
CAD/CAM project for either one or
two years.
PNAME=“CAD/CAM” Select
SELECT ENAME
FROM PROJ, ASG, EMP
ENAME≠“J. DOE”
WHERE ASG.ENO=EMP.ENO
AND ASG.PNO=PROJ.PNO
⋈PNO
ENAME ≠ "J. Doe"
⋈ENO
AND
Join
AND PROJ.PNAME="CAD/CAM"
AND (DUR=12 OR DUR=24) PROJ ASG EMP
Distributed DBMS © M. T. Özsu & P. Valduriez
Ch.7/12
Equivalent Query
ENAME

PNAME=“CAD/CAM”  (DUR=12  DUR=24) ENAME≠“J. Doe”

⋈PNO,ENO

EMP PROJ ASG


Distributed DBMS © M. T. Özsu & P. Valduriez
Ch.7/13
Restructuring
 ENAME

⋈PNO
 PNO,ENAME

⋈ENO

 PNO  PNO,ENO  PNO,ENAME

PNAME = "CAD/CAM" DUR =12DUR=24 ENAME ≠ "J. Doe"

PROJ ASG EMP


Distributed DBMS © M. T. Özsu & P. Valduriez
Ch.7/14
Step 2 – Data Localization
Input: Algebraic query on distributed relations
• Determine which fragments are involved
• Localization program
➡ substitute for each global query its materialization program
➡ optimize

Distributed DBMS © M. T. Özsu & P. Valduriez


Ch.7/15
Example
Assume  ENAME

➡ EMP is fragmented into EMP1, EMP2, DUR=12 DUR=24


EMP3 as follows:
✦ EMP1= ENO≤“E3”(EMP) PNAME=“CAD/CAM”
✦ EMP2= “E3”<ENO≤“E6”(EMP)
ENAME≠“J. DOE”
✦ EMP3= ENO≥“E6”(EMP)
➡ ASG fragmented into ASG1 and ASG2 ⋈PNO
as follows:
✦ ASG1= ENO≤“E3”(ASG) ⋈ENO

✦ ASG2= ENO>“E3”(ASG) PROJ  


Replace EMP by (EMP1  EMP2  EMP3)
and ASG by (ASG1  ASG2) in any query EMP1EMP2 EMP3 ASG1 ASG2
Distributed DBMS © M. T. Özsu & P. Valduriez
Ch.7/16
Provides Parallellism

⋈ENO ⋈ENO ⋈ENO ⋈ENO

EMP1 ASG1 EMP2 ASG2 EMP3 ASG1 EMP3 ASG2

Distributed DBMS © M. T. Özsu & P. Valduriez


Ch.7/17
Eliminates Unnecessary Work

⋈ENO ⋈ENO ⋈ENO

EMP1 ASG1 EMP2 ASG2 EMP3 ASG2

Distributed DBMS © M. T. Özsu & P. Valduriez


Ch.7/18
Reduction for PHF
• Reduction with selection
➡ Relation R and FR={R1, R2, …, Rw} where Rj=pj(R)

pi(Rj)= if x in R: ¬(pi(x) pj(x))


➡ Example
SELECT *
FROM EMP
WHERE ENO="E5"
ENO=“E5” ENO=“E5”

EMP1 EMP2 EMP3 EMP2


Distributed DBMS © M. T. Özsu & P. Valduriez
Ch.7/19
Reduction for PHF
• Reduction with join

➡ Possible if fragmentation is done on join attribute

➡ Distribute join over union

(R1 R2)⋈S  (R1⋈S)  (R2⋈S)

➡ Given Ri =p (R) and Rj = p (R)


i j

Ri ⋈Rj = if x in Ri, y in Rj: ¬(pi(x) pj(y))

Distributed DBMS © M. T. Özsu & P. Valduriez


Ch.7/20
Reduction for PHF
• Assume EMP is fragmented as ⋈ENO
before and
➡ ASG1: ENO ≤ "E3"(ASG)  
➡ ASG2: ENO > "E3"(ASG)
• Consider the query EMP1 EMP2 EMP3 ASG1 ASG2
SELECT *
FROM EMP,ASG
WHERE EMP.ENO=ASG.ENO 
• Distribute join over unions
• Apply the reduction rule ⋈ENO ⋈ENO ⋈ENO

EMP1 ASG1 EMP2 ASG2 EMP3 ASG2

Distributed DBMS © M. T. Özsu & P. Valduriez


Ch.7/21
Reduction for VF
• Find useless (not empty) intermediate relations

Relation R defined over attributes A = {A1, ..., An} vertically fragmented


as Ri = A'(R) where A' A:
 D,K(Ri) is useless if the set of projection attributes D is not in A'
Example: EMP1= ENO,ENAME (EMP); EMP2= ENO,TITLE (EMP)

SELECT ENAME
FROM EMP
ENAME ENAME

⋈ENO

EMP1 EMP2 EMP1


Distributed DBMS © M. T. Özsu & P. Valduriez
Ch.7/22
Reduction for DHF
• Rule :
➡ Distribute joins over unions
➡ Apply the join reduction for horizontal fragmentation
• Example
ASG1: ASG ⋉ENO EMP1
ASG2: ASG ⋉ENO EMP2
EMP1: TITLE=“Programmer” (EMP)
EMP2: TITLE=“Programmer” (EMP)
• Query
SELECT *
FROM EMP, ASG
WHEREASG.ENO = EMP.ENO
AND EMP.TITLE = "Mech. Eng."
Distributed DBMS © M. T. Özsu & P. Valduriez
Ch.7/23
Reduction for DHF
Generic query ⋈ENO
TITLE=“Mech. Eng.”

 

ASG1 ASG2 EMP1 EMP2

Selections first ⋈ENO

 TITLE=“Mech. Eng.”

ASG1 ASG2 EMP2


Distributed DBMS © M. T. Özsu & P. Valduriez
Ch.7/24
Reduction for DHF

Joins over unions

⋈ENO ⋈ENO

TITLE=“Mech. Eng.” TITLE=“Mech. Eng.”

ASG1 EMP2 ASG2 EMP2


Elimination of the empty intermediate relations
(left sub-tree) ⋈ENO

TITLE=“Mech. Eng.”

ASG2 EMP2
Distributed DBMS © M. T. Özsu & P. Valduriez
Ch.7/25
Reduction for Hybrid
Fragmentation
• Combine the rules already specified:
➡ Remove empty relations generated by contradicting selections on horizontal
fragments;
➡ Remove useless relations generated by projections on vertical fragments;

➡ Distribute joins over unions in order to isolate and remove useless joins.

Distributed DBMS © M. T. Özsu & P. Valduriez


Ch.7/26
Reduction for HF
Example
ENAME
Consider the following hybrid
fragmentation: ENAME
ENO=“E5”
EMP1= ENO≤"E4" ( ENO,ENAME (EMP))

EMP2= ENO>"E4" ( ENO,ENAME (EMP))


⋈ENO
 ENO=“E5”
EMP3= ENO,TITLE (EMP)

and the query 


EMP2
SELECT ENAME
FROM EMP
WHERE ENO="E5" EMP1 EMP2 EMP3

Distributed DBMS © M. T. Özsu & P. Valduriez


Ch.7/27

You might also like