Fragmentation
Why fragment?
Usage:
- Applications work with views rather than entire relations.
Efficiency:
- Data stored close to where most frequently used.
- Data not needed by local applications is not stored.
Security:
- Data not needed by local applications is not stored, and so is not available to unauthorized users.
Parallelism:
- With fragments as the unit of distribution, a transaction can be divided into several subqueries that operate on fragments.
Three correctness rules of fragmentation:
Completeness: If relation R is decomposed into fragments R1, R2, ..., Rn, each data item that can be found in R must appear in at least one fragment.
Reconstruction: It must be possible to define a relational operation that will reconstruct R from the fragments.
- For horizontal fragmentation: Union operation
- For vertical fragmentation: Join operation
Disjointness: If data item di appears in fragment Ri, then it should not appear in any other fragment.
- Exception: vertical fragmentation, where primary key attributes must be repeated in each fragment to allow reconstruction.
- For horizontal fragmentation, data item is a tuple.
- For vertical fragmentation, data item is an attribute.
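The three rules can be checked mechanically for a horizontal fragmentation. The following is a minimal sketch, assuming a relation modelled as a Python set of tuples; the relation contents, the branch predicate, and the function names are hypothetical and chosen only for illustration.

```python
# Hypothetical relation R: each tuple is (staff_no, branch, salary).
R = {
    ("S1", "London", 30000),
    ("S2", "Glasgow", 12000),
    ("S3", "London", 18000),
}

# Horizontal fragments defined by selection predicates on the branch column.
R1 = {t for t in R if t[1] == "London"}
R2 = {t for t in R if t[1] != "London"}
fragments = [R1, R2]


def is_complete(relation, frags):
    """Completeness: every tuple of the relation is in at least one fragment."""
    return all(any(t in f for f in frags) for t in relation)


def reconstructs(relation, frags):
    """Reconstruction: the union of the fragments equals the relation."""
    return set().union(*frags) == relation


def is_disjoint(frags):
    """Disjointness: no tuple appears in more than one fragment."""
    total = sum(len(f) for f in frags)
    return total == len(set().union(*frags))


print(is_complete(R, fragments), reconstructs(R, fragments), is_disjoint(fragments))
```

For vertical fragments the same idea applies with attributes as the data items and a join in place of the union, with the key attributes exempt from the disjointness check.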
Four types of fragmentation:
1. Horizontal: Consists of a subset of the tuples of a relation.
- Defined using Selection operation
- Determined by looking at the predicates used by transactions.
- Involves finding a set of minimal (complete and relevant) predicates.
- A set of predicates is complete if and only if any two tuples in the same fragment are referenced with the same probability by any application.
- A predicate is relevant if at least one application accesses the resulting fragments differently.
2. Vertical: Consists of a subset of the attributes of a relation.
- Defined using Projection operation
- Determined by establishing affinity of one attribute to another.
3. Mixed: horizontal fragment that is vertically fragmented, or a vertical fragment that is horizontally fragmented.
- Defined using Selection and Projection operations
4. Derived: horizontal fragment that is based on horizontal fragmentation of a parent relation.
- Ensures fragments frequently joined together are at same site.
- Defined using Semijoin operation
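The four types above map onto the Selection, Projection, and Semijoin operations. The following is a minimal sketch, assuming relations modelled as lists of dictionaries; the `staff` and `property_for_rent` relations, their attributes, and the helper functions are hypothetical and not taken from any particular DBMS.

```python
# Hypothetical parent relation and a child relation that references it.
staff = [
    {"staff_no": "S1", "name": "Ann", "branch_no": "B1", "salary": 30000},
    {"staff_no": "S2", "name": "Bob", "branch_no": "B2", "salary": 12000},
    {"staff_no": "S3", "name": "Cy", "branch_no": "B1", "salary": 18000},
]
property_for_rent = [
    {"property_no": "P1", "staff_no": "S1"},
    {"property_no": "P2", "staff_no": "S2"},
    {"property_no": "P3", "staff_no": "S3"},
]


def select(rows, predicate):
    """Selection: keep the tuples that satisfy the predicate."""
    return [r for r in rows if predicate(r)]


def project(rows, attrs):
    """Projection: keep only the named attributes, dropping duplicates."""
    seen, out = set(), []
    for r in rows:
        key = tuple(r[a] for a in attrs)
        if key not in seen:
            seen.add(key)
            out.append(dict(zip(attrs, key)))
    return out


def semijoin(left, right, attr):
    """Semijoin: tuples of left whose attr value also occurs in right."""
    keys = {r[attr] for r in right}
    return [l for l in left if l[attr] in keys]


# 1. Horizontal: staff working at branch B1.
staff_b1 = select(staff, lambda r: r["branch_no"] == "B1")
# 2. Vertical: payroll attributes, with the key kept for later joins.
staff_pay = project(staff, ["staff_no", "salary"])
# 3. Mixed: a horizontal fragment that is then vertically fragmented.
staff_b1_pay = project(staff_b1, ["staff_no", "salary"])
# 4. Derived: properties managed by staff in the B1 fragment.
property_b1 = semijoin(property_for_rent, staff_b1, "staff_no")
```

Placing `property_b1` at the same site as `staff_b1` is what keeps the frequently joined fragments together.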
Advantages of Fragmentation
Horizontal:
- Allows parallel processing on fragments of a relation.
- Allows a relation to be split so that tuples are located where they are most frequently accessed.
Vertical:
- Allows tuples to be split so that each part of the tuple is stored where it is most frequently accessed.
- Tuple-id attribute allows efficient joining of vertical fragments.
- Allows parallel processing on a relation.
Vertical and horizontal fragmentation can be mixed.
- Fragments may be successively fragmented to an arbitrary depth.
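Successive fragmentation and the role of the tuple-id can be sketched as follows: a relation is first split horizontally, one of the resulting fragments is then split vertically, and the original is rebuilt by joining on the tuple-id and taking the union. This is an illustrative sketch only; the `tid` attribute, the sample rows, and the helper names are assumptions.

```python
# Hypothetical relation with an explicit tuple-id attribute "tid".
rows = [
    {"tid": 1, "name": "Ann", "branch": "B1", "salary": 30000},
    {"tid": 2, "name": "Bob", "branch": "B2", "salary": 12000},
    {"tid": 3, "name": "Cy", "branch": "B1", "salary": 18000},
]


def split_vertical(rows, attr_groups, key="tid"):
    """Return one vertical fragment per attribute group, each carrying the key."""
    return [
        [{key: r[key], **{a: r[a] for a in group}} for r in rows]
        for group in attr_groups
    ]


def join_on_key(left, right, key="tid"):
    """Join two vertical fragments on the tuple-id using a hash index."""
    index = {r[key]: r for r in right}
    return [{**l, **index[l[key]]} for l in left if l[key] in index]


# Level 1: horizontal fragmentation by branch.
b1 = [r for r in rows if r["branch"] == "B1"]
other = [r for r in rows if r["branch"] != "B1"]

# Level 2: vertical fragmentation of the B1 fragment only.
b1_info, b1_pay = split_vertical(b1, [["name", "branch"], ["salary"]])

# Reconstruction: join the vertical fragments, then union with the rest.
rebuilt = sorted(join_on_key(b1_info, b1_pay) + other, key=lambda r: r["tid"])
```

Because every vertical fragment repeats `tid`, the join needs only a single lookup per tuple, which is the efficiency the tuple-id is meant to provide.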