Distributed Database Systems: Decomposition & Data Localization

This document discusses techniques for query decomposition and data localization in distributed database systems. It covers query normalization, analysis to eliminate semantically incorrect queries, eliminating redundant predicates, rewriting queries in relational algebra, and reducing queries by localizing data access based on the fragmentation scheme, including rules for horizontal, vertical, derived, and hybrid fragmentation. Examples are provided to illustrate each technique.

Uploaded by

Mustafa Güney
Copyright
© Attribution Non-Commercial (BY-NC)

Distributed Database Systems

Chapter 8

Decomposition & Data Localization

Distributed Database System


Query Decomposition


Query Decomposition

1. Normalization
2. Analysis
3. Elimination of Redundancy
4. Rewriting


1. Normalization
For an SQL statement, normalization is applied to the predicate in the
WHERE clause. That predicate may be an arbitrarily complex,
quantifier-free predicate, preceded by all necessary quantifiers (∀, ∃).
Conjunctive normal form (more practical)

Disjunctive normal form


1. Normalization
Transformation rules
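
A minimal sketch of how such rules can be applied mechanically, assuming the rules meant here are the standard logical equivalences (double negation, De Morgan, distributivity of OR over AND). The tuple encoding and the names `nnf`, `cnf` and `_distribute` are illustrative assumptions, not taken from the slides.

```python
# Predicates are atoms (strings) or tuples: ("not", p), ("and", p, q), ("or", p, q).
# This encoding is an assumption made for illustration only.

def nnf(p):
    """Push negations down to the atoms (double negation, De Morgan)."""
    if isinstance(p, str):
        return p
    if p[0] == "not":
        q = p[1]
        if isinstance(q, str):
            return p                                  # negated atom: nothing to do
        if q[0] == "not":
            return nnf(q[1])                          # not not p  <=>  p
        flipped = "or" if q[0] == "and" else "and"    # De Morgan
        return (flipped, nnf(("not", q[1])), nnf(("not", q[2])))
    return (p[0], nnf(p[1]), nnf(p[2]))


def _is_and(p):
    return isinstance(p, tuple) and p[0] == "and"


def _distribute(p):
    """Distribute OR over AND: (a and b) or c  <=>  (a or c) and (b or c)."""
    if isinstance(p, str) or p[0] == "not":
        return p
    a, b = _distribute(p[1]), _distribute(p[2])
    if p[0] == "and":
        return ("and", a, b)
    if _is_and(a):
        return ("and", _distribute(("or", a[1], b)), _distribute(("or", a[2], b)))
    if _is_and(b):
        return ("and", _distribute(("or", a, b[1])), _distribute(("or", a, b[2])))
    return ("or", a, b)


def cnf(p):
    """Return an equivalent predicate in conjunctive normal form."""
    return _distribute(nnf(p))


# (p1 and p2) or not (p3 or p4)
example = ("or", ("and", "p1", "p2"), ("not", ("or", "p3", "p4")))
print(cnf(example))
```

Distribution can multiply the number of clauses, which is one reason a real optimizer would not apply it blindly to large predicates.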


1. Normalization
Example

The conjunctive normal form


2. Analysis
Objective
◦ Reject type-incorrect or semantically incorrect
queries

Type incorrect
◦ An undefined relation or attribute, a wrong type
mapping, etc.


2. Analysis
Example


2. Analysis
Semantically incorrect
◦ Some components of the query do not
contribute to the query result
Fact
◦ It is impossible to determine the semantic
correctness of a general query, but it is
possible for queries that contain neither
disjunction (∨) nor negation (¬)


2. Analysis – Tool of Analysis
Query Graph
◦ one node representing the result relation
◦ other nodes representing the operand relations, and
◦ edges of two classes
   ▪ an edge represents a join if neither of its two nodes
     is the result
   ▪ an edge represents a projection if one of its nodes is
     the result node
Nodes and edges may be labeled by predicates for
selection, projection or join.


2. Analysis – Tool of Analysis
Join Graph
◦ the subgraph of the query graph that contains only the join edges


2. Analysis – Tool of Analysis
Example 1


2. Analysis – Tool of Analysis
Example 1 – Query Graph


2. Analysis – Tool of Analysis
Example 1 – Join Graph


2. Analysis – Tool of Analysis
A conjunctive query without negation is
semantically incorrect if its query graph is
NOT connected!
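
The connectivity test can be sketched as a plain graph traversal. The relation PROJ and the edge lists below are assumptions chosen for illustration; only EMP and ASG are named in these slides.

```python
from collections import defaultdict


def is_connected(nodes, edges):
    """Return True if the undirected graph (nodes, edges) is connected."""
    nodes = list(nodes)
    if not nodes:
        return True
    adjacent = defaultdict(set)
    for a, b in edges:
        adjacent[a].add(b)
        adjacent[b].add(a)
    seen, stack = {nodes[0]}, [nodes[0]]
    while stack:                          # depth-first traversal from the first node
        for neighbour in adjacent[stack.pop()]:
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    return seen == set(nodes)


nodes = ["RESULT", "EMP", "ASG", "PROJ"]

# Hypothetical query graph in which every relation is joined or projected.
complete = [("EMP", "ASG"), ("ASG", "PROJ"), ("EMP", "RESULT"), ("ASG", "RESULT")]

# Same query with the ASG-PROJ join predicate forgotten: PROJ is isolated.
missing_join = [("EMP", "ASG"), ("EMP", "RESULT"), ("ASG", "RESULT")]

print(is_connected(nodes, complete))      # True
print(is_connected(nodes, missing_join))  # False
```

A disconnected graph usually signals a forgotten join predicate, so a system can reject the query or fall back to a Cartesian product.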


2. Analysis – Tool of Analysis
Example 2


2. Analysis – Tool of Analysis
Example 2 – Query Graph


3. Elimination of Redundancy
Use idempotency rules to eliminate redundant
predicates from the WHERE clause.
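
A small sketch of such a simplifier, assuming the usual idempotency rules (p ∧ p ⇔ p, p ∨ false ⇔ p, p ∧ ¬p ⇔ false, p ∧ (p ∨ q) ⇔ p, and their duals). The tuple encoding and the function names are hypothetical, and the sketch only inspects one level of nesting at a time.

```python
TRUE, FALSE = "true", "false"


def negate(p):
    """Return the complement of p, cancelling a double negation."""
    return p[1] if isinstance(p, tuple) and p[0] == "not" else ("not", p)


def _contains(p, op, q):
    """True if p is the binary predicate (op, x, y) and q is x or y."""
    return isinstance(p, tuple) and p[0] == op and q in p[1:]


def simplify(p):
    """Apply idempotency rules bottom-up to ("and"|"or", a, b) predicates."""
    if isinstance(p, str) or p[0] == "not":
        return p
    op, a, b = p[0], simplify(p[1]), simplify(p[2])
    # AND: identity is true, absorbing element is false. OR: the reverse.
    if op == "and":
        identity, absorbing, dual = TRUE, FALSE, "or"
    else:
        identity, absorbing, dual = FALSE, TRUE, "and"
    if a == b:
        return a                          # p op p  <=>  p
    if absorbing in (a, b) or a == negate(b):
        return absorbing                  # e.g. p and not p  <=>  false
    if a == identity:
        return b                          # e.g. true and p  <=>  p
    if b == identity:
        return a
    if _contains(b, dual, a):
        return a                          # e.g. p and (p or q)  <=>  p
    if _contains(a, dual, b):
        return b
    return (op, a, b)


# (p and not p) or q  simplifies to  q
print(simplify(("or", ("and", "p", ("not", "p")), "q")))
```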


3. Elimination of Redundancy
Example


4. Rewriting
Rewrite a calculus query in relational
algebra:
◦ translation, and
◦ restructuring of the algebra query to improve
performance


4. Rewriting
Relational algebra tree
a tree defined by:
◦ a root node representing the query result
◦ leaves representing database relations
◦ non-leaf nodes representing relations
produced by operations, and
◦ edges from leaves to root representing the
sequences of operations


4. Rewriting
How to translate an SQL query into an
algebra tree
1. create a leaf for every relation in the FROM
clause
2. create the root as a project operation
on the attributes in the SELECT clause
3. create the sequence of operations from the
predicates and operators in the WHERE
clause
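
The three steps can be sketched as follows. The `Node` class, the `build_tree` signature, the left-deep join order and the `DUR > 12` predicate are all assumptions made for illustration; the slides do not prescribe a data structure.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    op: str                        # "relation", "join", "select" or "project"
    arg: str = ""
    children: list = field(default_factory=list)


def build_tree(select_attrs, from_relations, join_preds, selection_preds):
    """Naive translation: leaves, then joins and selections, then the root."""
    # Step 1: one leaf per relation in the FROM clause.
    tree, *rest = [Node("relation", name) for name in from_relations]
    # Step 3: WHERE predicates become joins (left-deep) and selections.
    for leaf, pred in zip(rest, join_preds):
        tree = Node("join", pred, [tree, leaf])
    for pred in selection_preds:
        tree = Node("select", pred, [tree])
    # Step 2: the root projects the attributes of the SELECT clause.
    return Node("project", ", ".join(select_attrs), [tree])


def render(node, depth=0):
    """Return the tree as indented lines, root first."""
    lines = ["  " * depth + f"{node.op} {node.arg}".rstrip()]
    for child in node.children:
        lines.extend(render(child, depth + 1))
    return lines


root = build_tree(
    ["ENAME"],
    ["EMP", "ASG"],
    ["EMP.ENO = ASG.ENO"],
    ["DUR > 12"],                  # hypothetical selection predicate
)
print("\n".join(render(root)))
```

The tree produced this way is correct but rarely efficient, which is what the restructuring step addresses.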


4. Rewriting
Example 1


4. Rewriting
Example 1 – Query tree


4. Rewriting
How to use transformation rules to
optimize
◦ separate unary operations to simplify the
query expression
◦ group unary operations on the same relation
so that the relation is accessed only once
◦ commute unary operations with binary
operations so that the unary ones are performed
first, reducing the size of intermediate
relations
◦ reorder binary operations


4. Rewriting
Example 2 – optimization of the previous
query tree
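
The effect of performing a selection before a join can be illustrated on toy data. The tuples below and the helper names are invented for this sketch; both plans return the same answer, but the second builds a smaller intermediate relation.

```python
# Invented sample data, for illustration only.
EMP = [
    {"ENO": "E1", "ENAME": "A", "TITLE": "Programmer"},
    {"ENO": "E2", "ENAME": "B", "TITLE": "Analyst"},
    {"ENO": "E3", "ENAME": "C", "TITLE": "Engineer"},
]
ASG = [
    {"ENO": "E1", "PNO": "P1"},
    {"ENO": "E2", "PNO": "P1"},
    {"ENO": "E2", "PNO": "P2"},
    {"ENO": "E3", "PNO": "P2"},
]


def select(relation, predicate):
    return [t for t in relation if predicate(t)]


def join(r, s, attr):
    """Nested-loop equijoin of r and s on one shared attribute."""
    return [{**a, **b} for a in r for b in s if a[attr] == b[attr]]


def is_programmer(t):
    return t["TITLE"] == "Programmer"


# Plan A: join first, select afterwards.
joined = join(EMP, ASG, "ENO")
plan_a = select(joined, is_programmer)

# Plan B: the selection is commuted below the join.
filtered = select(EMP, is_programmer)
plan_b = join(filtered, ASG, "ENO")

print(plan_a == plan_b)            # True
print(len(joined), len(filtered))  # 4 1
```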


Localization of Distributed Data


Localization of Distributed Data
Task:
Translate a query on global relations into algebraic
queries on physical fragments, and optimize the
query by reduction.


Reduction for Primary Horizontal
Fragmentation
Example:
EMP(ENO, ENAME, TITLE) is fragmented


Reduction for Primary Horizontal
Fragmentation
Reduction with selection
Rule 1


Reduction for Primary Horizontal
Fragmentation
Example 1

For the fragmented EMP we have


Reduction for Primary Horizontal
Fragmentation
Example 1- Step 1

Generate a global query tree


Reduction for Primary Horizontal
Fragmentation
Example 1- Step 2

Substitute fragments for EMP


Reduction for Primary Horizontal
Fragmentation
Example 1- Step 3

Eliminate the fragments whose predicates contradict the selection

ENO = "E5" contradicts
ENO <= "E3" and ENO > "E6"
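
A sketch of this reduction, assuming EMP is split into three fragments on ENO. The bounds "E3" and "E6" come from the contradiction quoted above; the middle range and the names EMP1 to EMP3 are assumptions. Plain string comparison is used, which is only adequate for single-digit employee numbers.

```python
# Assumed fragmentation predicates on ENO (the middle range is inferred).
FRAGMENTS = {
    "EMP1": lambda eno: eno <= "E3",
    "EMP2": lambda eno: "E3" < eno <= "E6",
    "EMP3": lambda eno: eno > "E6",
}


def relevant_fragments(eno):
    """Fragments whose predicate does not contradict the selection ENO = eno."""
    return [name for name, predicate in FRAGMENTS.items() if predicate(eno)]


print(relevant_fragments("E5"))   # ['EMP2']
```

Only the surviving fragment needs to be accessed, so the union over all fragments disappears from the reduced query.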


Reduction for Primary Horizontal
Fragmentation
Reduction with join
Rule 2


Reduction for Primary Horizontal
Fragmentation
Note: the following transformation is often
used to eliminate useless joins during
reduction:


Reduction for Primary Horizontal
Fragmentation
Example
Assume EMP is fragmented as before, and ASG
is fragmented as


Reduction for Primary Horizontal
Fragmentation
Example
EMP and ASG are fragmented using predicates
on the same attribute ENO
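
Because both relations are fragmented on ENO, the join can be distributed over the unions and every fragment pair whose ENO ranges cannot overlap can be dropped. The sketch below assumes half-open ranges (low, high] with `None` for an open bound; the concrete EMP and ASG ranges are assumptions, since the fragment definitions are not reproduced in the text.

```python
# Assumed fragment ranges on ENO, written as (low, high]; None means unbounded.
EMP = {"EMP1": (None, "E3"), "EMP2": ("E3", "E6"), "EMP3": ("E6", None)}
ASG = {"ASG1": (None, "E3"), "ASG2": ("E3", None)}


def overlaps(r1, r2):
    """True if two (low, high] ranges may share an ENO value."""
    lows = [low for low in (r1[0], r2[0]) if low is not None]
    highs = [high for high in (r1[1], r2[1]) if high is not None]
    # With an open side there is always room; otherwise compare the bounds.
    return not lows or not highs or max(lows) < min(highs)


def useful_joins(left, right):
    """Fragment pairs that survive after distributing the join over the unions."""
    return [
        (l_name, r_name)
        for l_name, l_range in left.items()
        for r_name, r_range in right.items()
        if overlaps(l_range, r_range)
    ]


print(useful_joins(EMP, ASG))
# [('EMP1', 'ASG1'), ('EMP2', 'ASG2'), ('EMP3', 'ASG2')]
```

Under these assumptions three of the six partial joins remain, and they can be executed in parallel at the sites holding the fragments.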


Reduction for Primary Horizontal
Fragmentation
Example
Generic query tree


Reduction for Primary Horizontal
Fragmentation
Example
Reduced query tree


Reduction for Vertical Fragmentation
• The reconstruction operation for a
vertically fragmented relation is the join.

• Every fragment must contain the key of
the relation.


Reduction for Vertical Fragmentation
• Example


Reduction for Vertical Fragmentation
• Let R(A1, A2, ..., An) be a relation,

• Rule 3
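
One way to read this rule: a vertical fragment is useless for a query when it shares no attribute other than the key with the attributes the query projects. A minimal sketch, assuming EMP(ENO, ENAME, TITLE) is split into the two hypothetical fragments below, each carrying the key ENO:

```python
KEY = {"ENO"}

# Assumed vertical fragments of EMP; every fragment contains the key.
FRAGMENTS = {
    "EMP1": {"ENO", "ENAME"},
    "EMP2": {"ENO", "TITLE"},
}


def useful_fragments(projected):
    """Fragments that contribute a non-key attribute to the projection."""
    wanted = set(projected) - KEY
    kept = [name for name, attrs in FRAGMENTS.items() if attrs & wanted]
    # A projection on key attributes alone can be answered from any one fragment.
    return kept or [next(iter(FRAGMENTS))]


print(useful_fragments(["ENAME"]))            # ['EMP1']
print(useful_fragments(["ENAME", "TITLE"]))   # ['EMP1', 'EMP2']
```

When a single fragment survives, the reconstruction join disappears from the query altogether.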


Reduction for Vertical Fragmentation
• Example


Reduction for Vertical Fragmentation
• Example
Generic query tree


Reduction for Vertical Fragmentation
• Example
Reduced query tree


Reduction for Derived Fragmentation
• S is primary horizontally fragmented, and R is
fragmented by R_i = R ⋉_A S_i, where A is the
set of common attributes and a foreign key
of R referring to S.


Reduction for Derived Fragmentation
• Example
◦ In the Engineering database, ASG is
fragmented based on EMP as


Reduction for Derived Fragmentation
• Query optimization method: distribute
joins over unions and eliminate the
useless joins caused by predicate conflicts.
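
A toy illustration of this method, assuming EMP is split on TITLE and ASG is derived from it by semijoin on ENO. The tuples, the titles and the split predicate are invented for the sketch. Joins between non-matching fragments are empty, so only one partial join is left for the sample query.

```python
# Invented sample data, for illustration only.
EMP = [
    {"ENO": "E1", "TITLE": "Programmer"},
    {"ENO": "E2", "TITLE": "Mech. Eng."},
    {"ENO": "E3", "TITLE": "Programmer"},
]
ASG = [
    {"ENO": "E1", "PNO": "P1"},
    {"ENO": "E2", "PNO": "P1"},
    {"ENO": "E2", "PNO": "P2"},
    {"ENO": "E3", "PNO": "P3"},
]


def semijoin(r, s, attr):
    """Tuples of r that have a join partner in s."""
    keys = {t[attr] for t in s}
    return [t for t in r if t[attr] in keys]


def join(r, s, attr):
    return [{**a, **b} for a in r for b in s if a[attr] == b[attr]]


# Primary horizontal fragmentation of EMP (assumed predicate on TITLE).
EMP1 = [e for e in EMP if e["TITLE"] == "Programmer"]
EMP2 = [e for e in EMP if e["TITLE"] != "Programmer"]

# Derived fragmentation of ASG.
ASG1 = semijoin(ASG, EMP1, "ENO")
ASG2 = semijoin(ASG, EMP2, "ENO")

# Joins across non-matching fragments are useless.
print(join(ASG1, EMP2, "ENO"), join(ASG2, EMP1, "ENO"))   # [] []

# Query: join EMP and ASG where TITLE = "Mech. Eng."
selected = [e for e in EMP2 if e["TITLE"] == "Mech. Eng."]
reduced = join(ASG2, selected, "ENO")
unreduced = [t for t in join(ASG, EMP, "ENO") if t["TITLE"] == "Mech. Eng."]
print(reduced == unreduced)                               # True
```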

• Example


Reduction for Derived Fragmentation
• Example
◦ Generic query (ignoring the final
projection)


Reduction for Derived Fragmentation
• Example


Reduction for Derived Fragmentation
• Example
Distribute join over union


Reduction for Derived Fragmentation
• Example
Remove the useless join (left branch of the tree)
to get the best result.


Reduction for Hybrid Fragmentation
• Hybrid fragmentation

The combination of horizontal and
vertical fragmentation


Reduction for Hybrid Fragmentation
• Example
EMP is first fragmented vertically, and then
horizontally.


Reduction for Hybrid Fragmentation
• Combine the three rules discussed above to
reduce queries on hybrid fragments.
• Example
A query on EMP, fragmented as in the example above.


Reduction for Hybrid Fragmentation
• Example
By rule 3, E3 is eliminated, and by rule 1, E1 is
eliminated. The reduced query is
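
A sketch of how the two rules combine. The fragment definitions below are assumptions, chosen so that the outcome matches the statement above: E1 and E2 hold (ENO, ENAME) split at ENO = "E4", and E3 holds (ENO, TITLE). The sample query, which projects ENAME for ENO = "E5", is also assumed.

```python
# Assumed hybrid fragmentation of EMP(ENO, ENAME, TITLE).
FRAGMENTS = {
    "E1": {"attrs": {"ENO", "ENAME"}, "pred": lambda eno: eno <= "E4"},
    "E2": {"attrs": {"ENO", "ENAME"}, "pred": lambda eno: eno > "E4"},
    "E3": {"attrs": {"ENO", "TITLE"}, "pred": lambda eno: True},
}


def reduce_hybrid(projected, eno):
    """Fragments left for: project `projected` from EMP where ENO = eno."""
    wanted = set(projected) - {"ENO"}
    kept = []
    for name, fragment in FRAGMENTS.items():
        if not fragment["attrs"] & wanted:
            continue               # rule 3: no useful non-key attribute
        if not fragment["pred"](eno):
            continue               # rule 1: selection contradicts the fragment
        kept.append(name)
    return kept


print(reduce_hybrid(["ENAME"], "E5"))   # ['E2']
```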


Conclusions


Conclusions
• Decomposition generates algebraic
queries from calculus queries.
Localization expresses algebraic queries on
fragments. An algebraic query can be
optimized by transformation, heuristics,
and elimination of useless operations.
