Chapter 3: Distributed Database Design

The document discusses distributed database design. It covers fragmentation, which involves dividing relations into subsets called fragments that are distributed across sites. There are different types of fragmentation, including horizontal (by tuples), vertical (by attributes), and mixed. Fragmentation aims to improve reliability, performance, storage usage, communication costs, and security. The fragments must satisfy rules of completeness, reconstruction, and disjointness. Horizontal fragmentation uses selection operations to partition relations, with the goal of placing frequently accessed data at each site.

Uploaded by

Muhammad Mujtaba

Chapter 3: Distributed Database Design

• Design problem
• Design strategies (top-down, bottom-up)
• Fragmentation
• Allocation and replication of fragments, optimality, heuristics

Acknowledgements: I am indebted to Arturas Mazeika for providing me his slides of this course.
Design Problem

• Design problem of distributed systems: Making decisions about the placement of data and programs across the sites of a computer network, as well as possibly designing the network itself.

• In a DDBMS, the distribution of applications involves

– Distribution of the DDBMS software
– Distribution of applications that run on the database

• Distribution of applications will not be considered in the following; instead, the distribution of data is studied.
Framework of Distribution

• Dimensions for the analysis of distributed systems

– Level of sharing: no sharing, data sharing, data + program sharing
– Behavior of access patterns: static, dynamic
– Level of knowledge on access pattern behavior: no information, partial information, complete information

• Distributed database design should be considered within this general framework.

Design Strategies

• Top-down approach
– Designing systems from scratch
– Homogeneous systems

• Bottom-up approach
– The databases already exist at a number of sites
– The databases should be connected to solve common tasks
Design Strategies . . .

• Top-down design strategy

Design Strategies . . .

• Distribution design is the central part of the design in DDBMSs (the other tasks are similar to those in traditional databases)
– Objective: Design the local conceptual schemas (LCSs) by distributing the entities (relations) over the sites
– Two main aspects have to be designed carefully
∗ Fragmentation
· A relation may be divided into a number of sub-relations, which are then distributed
∗ Allocation and replication
· Each fragment is stored at the site that gives an "optimal" distribution
· Copies of a fragment may be maintained at several sites
• In this chapter we mainly concentrate on these two aspects
• Distribution design issues
– Why fragment at all?
– How to fragment?
– How much to fragment?
– How to test correctness?
– How to allocate?
Design Strategies . . .

• Bottom-up design strategy

Fragmentation

• What is a reasonable unit of distribution? Relation or fragment of relation?

• Relations as unit of distribution:
– If the relation is not replicated, we get a high volume of remote data accesses.
– If the relation is replicated, we get unnecessary replication, which causes problems in executing updates and wastes disk space.
– Might be an acceptable solution if queries need all the data in the relation and the data stays at the only sites that use it.

• Fragments of relations as unit of distribution:

– Application views are usually subsets of relations
– Thus, locality of accesses of applications is defined on subsets of relations
– Permits a number of transactions to execute concurrently, since they will access different portions of a relation
– Parallel execution of a single query (intra-query concurrency)
– However, semantic data control (especially integrity enforcement) is more difficult

⇒ Fragments of relations are (usually) the appropriate unit of distribution.

Fragmentation . . .

• Fragmentation aims to improve:

– Reliability
– Performance
– Balanced storage capacity and costs
– Communication costs
– Security

• The following information is used to decide fragmentation:

– Quantitative information: frequency of queries, the site where each query is run, selectivity of the queries, etc.
– Qualitative information: types of data access (read/write), etc.
Fragmentation . . .

• Types of Fragmentation
– Horizontal: partitions a relation along its tuples
– Vertical: partitions a relation along its attributes
– Mixed/hybrid: a combination of horizontal and vertical fragmentation

[Figure: (a) horizontal fragmentation, (b) vertical fragmentation, (c) mixed fragmentation]

Fragmentation . . .

• Example

[Figure: example data and E-R diagram]

Fragmentation . . .

• Example (contd.): Horizontal fragmentation of PROJ relation

– PROJ1: projects with budgets less than 200,000
– PROJ2: projects with budgets greater than or equal to 200,000
Fragmentation . . .

• Example (contd.): Vertical fragmentation of PROJ relation

– PROJ1: information about project budgets
– PROJ2: information about project names and locations
Correctness Rules of Fragmentation

• Completeness
– Decomposition of relation $R$ into fragments $R_1, R_2, \ldots, R_n$ is complete iff each data item in $R$ can also be found in some $R_i$.

• Reconstruction
– If relation $R$ is decomposed into fragments $R_1, R_2, \ldots, R_n$, then there should exist some relational operator $\nabla$ that reconstructs $R$ from its fragments, i.e., $R = R_1 \nabla \ldots \nabla R_n$
∗ Union to combine horizontal fragments
∗ Join to combine vertical fragments
• Disjointness
– If relation $R$ is decomposed into fragments $R_1, R_2, \ldots, R_n$ and data item $d_i$ appears in fragment $R_j$, then $d_i$ should not appear in any other fragment $R_k$, $k \neq j$ (exception: primary key attributes for vertical fragmentation)
∗ For horizontal fragmentation, a data item is a tuple
∗ For vertical fragmentation, a data item is an attribute
Horizontal Fragmentation

• Intuition behind horizontal fragmentation

– Every site should hold all the information needed by the queries issued at that site
– The information at the site should be fragmented so that the queries of the site run faster

• Horizontal fragmentation is defined as a selection operation, $\sigma_p(R)$

• Example:
$\sigma_{BUDGET < 200000}(PROJ)$
$\sigma_{BUDGET \geq 200000}(PROJ)$
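The selection-based definition and the three correctness rules can be checked mechanically on a small instance. The following is a minimal sketch, assuming the four PROJ tuples of this chapter's running example; the helper names `select` and `check_horizontal` are hypothetical and exist only for illustration.

```python
from typing import NamedTuple


class Proj(NamedTuple):
    pno: str
    pname: str
    budget: int
    loc: str


# Sample PROJ tuples taken from the chapter's running example.
PROJ = {
    Proj("P1", "Instrumentation", 150000, "Montreal"),
    Proj("P2", "Database Develop.", 135000, "New York"),
    Proj("P3", "CAD/CAM", 250000, "New York"),
    Proj("P4", "Maintenance", 310000, "Paris"),
}


def select(relation, predicate):
    """Relational selection: keep the tuples that satisfy the predicate."""
    return {t for t in relation if predicate(t)}


def check_horizontal(relation, fragments):
    """Return (complete, reconstructible, disjoint) for horizontal fragments."""
    union = set().union(*fragments)
    complete = relation <= union            # every tuple is in some fragment
    reconstructible = union == relation     # the union rebuilds R exactly
    disjoint = sum(len(f) for f in fragments) == len(union)
    return complete, reconstructible, disjoint


proj1 = select(PROJ, lambda t: t.budget < 200000)
proj2 = select(PROJ, lambda t: t.budget >= 200000)
print(check_horizontal(PROJ, [proj1, proj2]))  # (True, True, True)
```

Because the two predicates are complementary, every tuple lands in exactly one fragment; a pair of overlapping predicates would still be complete but would fail the disjointness check.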
Horizontal Fragmentation . . .

• Computing horizontal fragmentation (idea)

– Compute the frequency of the individual queries of the site $q_1, \ldots, q_Q$
– Rewrite the queries of the site in disjunctive normal form (a disjunction of conjunctions); the conjunctions are called minterms.
– Compute the selectivity of the minterms
– Find the minimal and complete set of minterms (predicates)
∗ The set of predicates is complete if and only if any two tuples in the same fragment are referenced with the same probability by any application
∗ The set of predicates is minimal if and only if every predicate is relevant: whenever a predicate splits a fragment in two, at least one query accesses the two resulting fragments differently
– These fragments can be found algorithmically (algorithms COM_MIN and PHORIZONTAL, pp. 120-122 of the textbook)
Horizontal Fragmentation . . .
• Example: Fragmentation of the PROJ relation
– Consider the following query: Find the name and budget of projects given their PNO.
– The query is issued at all three sites
– Fragmentation based on LOC, using the set of predicates/minterms
{LOC = 'Montreal', LOC = 'New York', LOC = 'Paris'}

$PROJ_1 = \sigma_{LOC = 'Montreal'}(PROJ)$
PNO  PNAME              BUDGET  LOC
P1   Instrumentation    150000  Montreal

$PROJ_2 = \sigma_{LOC = 'New York'}(PROJ)$
PNO  PNAME              BUDGET  LOC
P2   Database Develop.  135000  New York
P3   CAD/CAM            250000  New York

$PROJ_3 = \sigma_{LOC = 'Paris'}(PROJ)$
PNO  PNAME              BUDGET  LOC
P4   Maintenance        310000  Paris

• If access is only according to the location, the above set of predicates is complete
– i.e., each tuple of each fragment $PROJ_i$ has the same probability of being accessed

• If there is a second query/application that accesses only those project tuples where the budget is less than $200000, the set of predicates is not complete.
– P2 in $PROJ_2$ has a higher probability of being accessed
DDB 2008/09 J. Gamper Page 17
Horizontal Fragmentation . . .

• Example (contd.):
– Add BUDGET < 200000 and BUDGET ≥ 200000 to the set of predicates to make it complete.
⇒ {LOC = 'Montreal', LOC = 'New York', LOC = 'Paris', BUDGET ≥ 200000, BUDGET < 200000} is a complete set
– Minterms to fragment the relation are given as follows:

(LOC = 'Montreal') ∧ (BUDGET < 200000)
(LOC = 'Montreal') ∧ (BUDGET ≥ 200000)
(LOC = 'New York') ∧ (BUDGET < 200000)
(LOC = 'New York') ∧ (BUDGET ≥ 200000)
(LOC = 'Paris') ∧ (BUDGET < 200000)
(LOC = 'Paris') ∧ (BUDGET ≥ 200000)
Horizontal Fragmentation . . .

• Example (contd.): Now, $PROJ_2$ will be split into two fragments

$PROJ_1 = \sigma_{LOC = 'Montreal'}(PROJ)$
PNO  PNAME              BUDGET  LOC
P1   Instrumentation    150000  Montreal

$PROJ_2 = \sigma_{LOC = 'New York' \wedge BUDGET < 200000}(PROJ)$
PNO  PNAME              BUDGET  LOC
P2   Database Develop.  135000  New York

$PROJ_2' = \sigma_{LOC = 'New York' \wedge BUDGET \geq 200000}(PROJ)$
PNO  PNAME              BUDGET  LOC
P3   CAD/CAM            250000  New York

$PROJ_3 = \sigma_{LOC = 'Paris'}(PROJ)$
PNO  PNAME              BUDGET  LOC
P4   Maintenance        310000  Paris

– $PROJ_1$ and $PROJ_3$ would have been split in a similar way if they stored tuples with budgets both below and at or above 200,000
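The step from simple predicates to minterm fragments can be sketched in a few lines. This is an illustrative sketch, not the textbook algorithm: it assumes LOC takes only the three listed values, so the only satisfiable minterms are the conjunctions of one LOC predicate with one BUDGET predicate, and the function name `minterm_fragments` is hypothetical.

```python
from itertools import product

# PROJ tuples of the running example: (PNO, PNAME, BUDGET, LOC).
PROJ = [
    ("P1", "Instrumentation", 150000, "Montreal"),
    ("P2", "Database Develop.", 135000, "New York"),
    ("P3", "CAD/CAM", 250000, "New York"),
    ("P4", "Maintenance", 310000, "Paris"),
]

# Simple predicates, grouped by the attribute they test.
loc_predicates = {
    "LOC='Montreal'": lambda t: t[3] == "Montreal",
    "LOC='New York'": lambda t: t[3] == "New York",
    "LOC='Paris'": lambda t: t[3] == "Paris",
}
budget_predicates = {
    "BUDGET<200000": lambda t: t[2] < 200000,
    "BUDGET>=200000": lambda t: t[2] >= 200000,
}


def minterm_fragments(relation, *predicate_groups):
    """Build one fragment per conjunction of predicates; drop empty fragments."""
    fragments = {}
    for combo in product(*(group.items() for group in predicate_groups)):
        name = " AND ".join(label for label, _ in combo)
        rows = [t for t in relation if all(pred(t) for _, pred in combo)]
        if rows:
            fragments[name] = rows
    return fragments


for name, rows in minterm_fragments(PROJ, loc_predicates, budget_predicates).items():
    print(name, [t[0] for t in rows])
```

Of the six minterms, only four select any tuple on this instance, which is why only the New York fragment is actually split.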
Horizontal Fragmentation . . .

• In most cases intuition can be used to build horizontal partitions. Let {t1, t2, t3}, {t4, t5}, and {t2, t3, t4, t5} be query results. Then the tuples would be fragmented in the following way, grouping tuples that are returned by exactly the same queries:

{t1}   {t2, t3}   {t4, t5}
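The intuition can be made explicit: two tuples belong to the same fragment when they appear in exactly the same query results. A minimal sketch, with a hypothetical helper `fragment_by_access`:

```python
from collections import defaultdict


def fragment_by_access(query_results):
    """Group tuples that are returned by exactly the same set of queries."""
    signature = defaultdict(set)
    for q, result in enumerate(query_results):
        for t in result:
            signature[t].add(q)
    groups = defaultdict(list)
    for t, queries in signature.items():
        groups[frozenset(queries)].append(t)
    return sorted(sorted(group) for group in groups.values())


results = [{"t1", "t2", "t3"}, {"t4", "t5"}, {"t2", "t3", "t4", "t5"}]
print(fragment_by_access(results))
# [['t1'], ['t2', 't3'], ['t4', 't5']]
```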
Vertical Fragmentation

• Objective of vertical fragmentation is to partition a relation into a set of smaller relations so that many of the applications will run on only one fragment.

• Vertical fragmentation of a relation $R$ produces fragments $R_1, R_2, \ldots$, each of which contains a subset of $R$'s attributes.

• Vertical fragmentation is defined using the projection operation of the relational algebra:
$\Pi_{A_1, A_2, \ldots, A_n}(R)$
• Example:
$PROJ_1 = \Pi_{PNO, BUDGET}(PROJ)$
$PROJ_2 = \Pi_{PNO, PNAME, LOC}(PROJ)$

• Vertical fragmentation has also been studied for (centralized) DBMSs

– Smaller relations, and hence fewer page accesses
– e.g., MONET system
Vertical Fragmentation . . .

• Vertical fragmentation is inherently more complicated than horizontal fragmentation

– In horizontal partitioning: for n simple predicates, the number of possible minterms is $2^n$; some of them can be ruled out by existing implications/constraints.
– In vertical partitioning: for m non-primary key attributes, the number of possible fragmentations is equal to $B(m)$ (the m-th Bell number), i.e., the number of partitions of a set with m members.
∗ For large numbers, $B(m) \approx m^m$ (e.g., $B(15) \approx 10^9$)
• Optimal solutions are not feasible, and heuristics need to be applied.
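To get a feeling for how quickly the search space grows, the Bell numbers can be computed with the Bell triangle. This is a small illustrative sketch; the function name `bell` is hypothetical.

```python
def bell(m: int) -> int:
    """Number of partitions of a set with m members, via the Bell triangle."""
    row = [1]
    for _ in range(m):
        nxt = [row[-1]]              # each row starts with the previous row's last entry
        for value in row:
            nxt.append(nxt[-1] + value)
        row = nxt
    return row[0]


for m in (4, 10, 15):
    print(m, 2 ** m, bell(m))
# bell(4) = 15, bell(10) = 115975, bell(15) = 1382958545 (about 1.4e9)
```

For comparison, 15 simple predicates give at most $2^{15} = 32768$ minterms, while 15 non-key attributes already admit more than a billion candidate vertical fragmentations.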
Vertical Fragmentation . . .

• Two types of heuristics for vertical fragmentation exist:

– Grouping: assign each attribute to one fragment, and at each step, join some of the fragments until some criterion is satisfied.
∗ Bottom-up approach
– Splitting: starts with a relation and decides on beneficial partitionings based on the access behaviour of applications to the attributes.
∗ Top-down approach
∗ Results in non-overlapping fragments
∗ "Optimal" solution is probably closer to the full relation than to a set of small relations with only one attribute
∗ Only vertical fragmentation is considered here
Vertical Fragmentation . . .

• Application information: The major information required as input for vertical fragmentation is related to applications
– Since vertical fragmentation places in one fragment those attributes usually accessed together, there is a need for some measure that would define more precisely the notion of "togetherness", i.e., how closely related the attributes are.
– This information is obtained from queries and collected in the Attribute Usage Matrix and Attribute Affinity Matrix.
Vertical Fragmentation . . .

• Given are the user queries/applications $Q = (q_1, \ldots, q_q)$ that will run on relation $R(A_1, \ldots, A_n)$
• Attribute Usage Matrix: Denotes which query uses which attribute:

$$use(q_i, A_j) = \begin{cases} 1 & \text{iff } q_i \text{ uses } A_j \\ 0 & \text{otherwise} \end{cases}$$

– The $use(q_i, \bullet)$ vectors for each application are easy to define if the designer knows the applications that will run on the DB (consider also the 80-20 rule)
Vertical Fragmentation . . .
• Example: Consider the following relation:
PROJ(PNO, PNAME, BUDGET, LOC)
and the following queries:

q1 = SELECT BUDGET FROM PROJ WHERE PNO=Value
q2 = SELECT PNAME,BUDGET FROM PROJ
q3 = SELECT PNAME FROM PROJ WHERE LOC=Value
q4 = SELECT SUM(BUDGET) FROM PROJ WHERE LOC=Value

• Let us abbreviate A1 = PNO, A2 = PNAME, A3 = BUDGET, A4 = LOC

• Attribute Usage Matrix (entries follow from the attributes each query references):

      A1  A2  A3  A4
q1     1   0   1   0
q2     0   1   1   0
q3     0   1   0   1
q4     0   0   1   1
Vertical Fragmentation . . .

• Attribute Affinity Matrix: Denotes how frequently two attributes $A_i$ and $A_j$ are accessed together by a set of queries $Q = (q_1, \ldots, q_q)$:

$$\mathit{aff}(A_i, A_j) = \sum_{k:\ use(q_k, A_i) = 1 \,\wedge\, use(q_k, A_j) = 1} \ \sum_{\forall \text{ sites } l} \mathit{ref}_l(q_k)\, \mathit{acc}_l(q_k)$$

where
– $\mathit{ref}_l(q_k)$ is the cost (= number of accesses to $(A_i, A_j)$) of query $q_k$ at site $l$
– $\mathit{acc}_l(q_k)$ is the frequency of query $q_k$ at site $l$
Vertical Fragmentation . . .

• Example (contd.): Let the cost of each query be $\mathit{ref}_l(q_k) = 1$, and the frequency $\mathit{acc}_l(q_k)$ of the queries be as follows:

      Site 1  Site 2  Site 3
q1      15      20      10
q2       5       0       0
q3      25      25      25
q4       3       0       0

• Attribute affinity matrix $\mathit{aff}(A_i, A_j)$, obtained by applying the definition to these frequencies:

      A1  A2  A3  A4
A1    45   0  45   0
A2     0  80   5  75
A3    45   5  53   3
A4     0  75   3  78

– e.g., $\mathit{aff}(A_1, A_3) = \sum_{k=1}^{1} \sum_{l=1}^{3} \mathit{acc}_l(q_k) = \mathit{acc}_1(q_1) + \mathit{acc}_2(q_1) + \mathit{acc}_3(q_1) = 45$ ($q_1$ is the only query to access both $A_1$ and $A_3$)
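The usage and affinity matrices of the example can be derived programmatically. The sketch below assumes that a query "uses" every attribute named in its SELECT list or WHERE clause, and that $\mathit{ref}_l(q_k) = 1$ as stated; the names `QUERY_ATTRS`, `ACC` and `affinity` are illustrative only.

```python
ATTRS = ["PNO", "PNAME", "BUDGET", "LOC"]  # A1..A4

# Attributes referenced by q1..q4 (SELECT list plus WHERE clause).
QUERY_ATTRS = [
    {"BUDGET", "PNO"},    # q1
    {"PNAME", "BUDGET"},  # q2
    {"PNAME", "LOC"},     # q3
    {"BUDGET", "LOC"},    # q4
]

# ACC[k][l]: frequency of query q(k+1) at site l+1, from the example.
ACC = [
    [15, 20, 10],
    [5, 0, 0],
    [25, 25, 25],
    [3, 0, 0],
]

# Attribute usage matrix: use[k][j] = 1 iff query k uses attribute j.
use = [[1 if a in attrs else 0 for a in ATTRS] for attrs in QUERY_ATTRS]


def affinity(use, acc, ref=1):
    """aff[i][j] = sum over queries using both Ai and Aj of sum_l ref * acc_l."""
    n = len(use[0])
    aff = [[0] * n for _ in range(n)]
    for k, row in enumerate(use):
        weight = sum(ref * a for a in acc[k])
        for i in range(n):
            for j in range(n):
                if row[i] and row[j]:
                    aff[i][j] += weight
    return aff


AA = affinity(use, ACC)
for row in AA:
    print(row)
# [45, 0, 45, 0]
# [0, 80, 5, 75]
# [45, 5, 53, 3]
# [0, 75, 3, 78]
```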
Vertical Fragmentation . . .
• Take the attribute affinity matrix (AA) and reorganize the attribute order to form clusters in which the attributes demonstrate high affinity to one another.
• The bond energy algorithm (BEA) has been suggested to be useful for that purpose for several reasons:
– It is designed specifically to determine groups of similar items as opposed to a linear ordering of the items.
– The final groupings are insensitive to the order in which items are presented.
– The computation time is reasonable ($O(n^2)$, where n is the number of attributes)

• BEA:
– Input: AA matrix
– Output: Clustered AA matrix (CA)
– Permutation is done in such a way as to maximize the following global affinity measure (affinity of $A_i$ and $A_j$ with their neighbors):

$$AM = \sum_{i=1}^{n} \sum_{j=1}^{n} \mathit{aff}(A_i, A_j)\,[\mathit{aff}(A_i, A_{j-1}) + \mathit{aff}(A_i, A_{j+1}) + \mathit{aff}(A_{i-1}, A_j) + \mathit{aff}(A_{i+1}, A_j)]$$
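A compact sketch of the greedy column placement commonly used to describe the BEA is shown below. It is an illustration under assumptions, not a reference implementation: the affinity values are those that follow from the example's query frequencies, missing neighbours at the matrix border count as 0, and the helper names (`bond`, `contribution`, `bea_order`, `global_affinity`) are hypothetical.

```python
# Affinity matrix of the running example (rows/columns A1..A4).
AA = [
    [45, 0, 45, 0],
    [0, 80, 5, 75],
    [45, 5, 53, 3],
    [0, 75, 3, 78],
]


def bond(aa, x, y):
    """Sum over z of aff(Az, Ax) * aff(Az, Ay); None marks an empty border column."""
    if x is None or y is None:
        return 0
    return sum(row[x] * row[y] for row in aa)


def contribution(aa, left, new, right):
    """Net gain of placing column `new` between columns `left` and `right`."""
    return 2 * bond(aa, left, new) + 2 * bond(aa, new, right) - 2 * bond(aa, left, right)


def bea_order(aa):
    """Greedy placement: fix the first two columns, insert each other one where it gains most."""
    order = [0, 1]
    for new in range(2, len(aa)):
        best_pos, best_gain = 0, None
        for pos in range(len(order) + 1):
            left = order[pos - 1] if pos > 0 else None
            right = order[pos] if pos < len(order) else None
            gain = contribution(aa, left, new, right)
            if best_gain is None or gain > best_gain:
                best_pos, best_gain = pos, gain
        order.insert(best_pos, new)
    return order


def global_affinity(aa, order):
    """AM for the matrix with rows and columns permuted by `order`."""
    n = len(order)
    ca = [[aa[r][c] for c in order] for r in order]

    def at(i, j):
        return ca[i][j] if 0 <= i < n and 0 <= j < n else 0

    return sum(
        ca[i][j] * (at(i, j - 1) + at(i, j + 1) + at(i - 1, j) + at(i + 1, j))
        for i in range(n)
        for j in range(n)
    )


order = bea_order(AA)
print(order)  # [0, 2, 1, 3], i.e. A1, A3, A2, A4
print(global_affinity(AA, [0, 1, 2, 3]), global_affinity(AA, order))
```

The resulting order places PNO next to BUDGET and PNAME next to LOC, and the global affinity measure of the permuted matrix is larger than that of the original ordering.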
Vertical Fragmentation . . .

• Example (contd.): Clustered affinity matrix CA after running the BEA

– Elements with similar values are grouped together, and two clusters can be identified
– An additional partitioning algorithm is needed to identify the clusters in CA
∗ Usually there are more clusters and more than one candidate partitioning, thus additional steps are needed to select the best clustering.
– The resulting fragmentation after partitioning (PNO is added to PROJ2 explicitly as key):

PROJ1 = {PNO, BUDGET}
PROJ2 = {PNO, PNAME, LOC}
Correctness of Vertical Fragmentation

• Relation $R$ is decomposed into fragments $R_1, R_2, \ldots, R_n$

– e.g., PROJ = {PNO, BUDGET, PNAME, LOC} into PROJ1 = {PNO, BUDGET} and PROJ2 = {PNO, PNAME, LOC}
• Completeness
– Guaranteed by the partitioning algorithm, which assigns each attribute in A to one partition

• Reconstruction
– Join to reconstruct vertical fragments
– $R = R_1 \bowtie \cdots \bowtie R_n = PROJ_1 \bowtie PROJ_2$
• Disjointness
– Attributes have to be disjoint in VF. Two cases are distinguished:
∗ If tuple IDs are used, the fragments are really disjoint
∗ Otherwise, key attributes are replicated automatically by the system
∗ e.g., PNO in the above example
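The three rules can be verified on the example fragments with a toy projection and natural join. This is a sketch under assumptions: relations are modelled as lists of dictionaries, the PROJ tuples are those of the running example, and `project`, `natural_join` and `same_relation` are hypothetical helpers.

```python
ATTRS = ("PNO", "PNAME", "BUDGET", "LOC")
ROWS = [
    ("P1", "Instrumentation", 150000, "Montreal"),
    ("P2", "Database Develop.", 135000, "New York"),
    ("P3", "CAD/CAM", 250000, "New York"),
    ("P4", "Maintenance", 310000, "Paris"),
]
PROJ = [dict(zip(ATTRS, row)) for row in ROWS]


def project(relation, attrs):
    """Relational projection with duplicate elimination."""
    result = []
    for t in relation:
        row = {a: t[a] for a in attrs}
        if row not in result:
            result.append(row)
    return result


def natural_join(left, right):
    """Join tuples that agree on all attributes the two relations share."""
    joined = []
    for l in left:
        for r in right:
            if all(l[a] == r[a] for a in l.keys() & r.keys()):
                joined.append({**l, **r})
    return joined


def same_relation(a, b):
    """Compare two relations as sets of tuples, ignoring row and column order."""
    canon = lambda rel: sorted(sorted(t.items()) for t in rel)
    return canon(a) == canon(b)


frag1_attrs = ["PNO", "BUDGET"]
frag2_attrs = ["PNO", "PNAME", "LOC"]
proj1 = project(PROJ, frag1_attrs)
proj2 = project(PROJ, frag2_attrs)

complete = set(frag1_attrs) | set(frag2_attrs) == set(ATTRS)
reconstructs = same_relation(natural_join(proj1, proj2), PROJ)
shared = set(frag1_attrs) & set(frag2_attrs)
print(complete, reconstructs, shared)  # True True {'PNO'}
```

Only the key PNO is shared between the two fragments, which is exactly the exception the disjointness rule allows, and it is what makes the join lossless.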
Mixed Fragmentation

• In most cases simple horizontal or vertical fragmentation of a DB schema will not be sufficient to satisfy the requirements of the applications.

• Mixed fragmentation (hybrid fragmentation): Consists of a horizontal fragmentation followed by a vertical fragmentation, or a vertical fragmentation followed by a horizontal fragmentation

• Fragmentation is defined using the selection and projection operations of relational algebra:

$\sigma_p(\Pi_{A_1, \ldots, A_n}(R))$

$\Pi_{A_1, \ldots, A_n}(\sigma_p(R))$
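As a small illustration, the two compositions can be applied to the PROJ example. The sketch assumes the selection predicate only refers to attributes that the projection keeps (here BUDGET), in which case both orders yield the same fragment; `select` and `project` are hypothetical helpers.

```python
ATTRS = ("PNO", "PNAME", "BUDGET", "LOC")
PROJ = [
    dict(zip(ATTRS, row))
    for row in [
        ("P1", "Instrumentation", 150000, "Montreal"),
        ("P2", "Database Develop.", 135000, "New York"),
        ("P3", "CAD/CAM", 250000, "New York"),
        ("P4", "Maintenance", 310000, "Paris"),
    ]
]


def select(relation, predicate):
    """Relational selection."""
    return [t for t in relation if predicate(t)]


def project(relation, attrs):
    """Relational projection with duplicate elimination."""
    result = []
    for t in relation:
        row = {a: t[a] for a in attrs}
        if row not in result:
            result.append(row)
    return result


low_budget = lambda t: t["BUDGET"] < 200000
kept = ["PNO", "BUDGET"]

horizontal_then_vertical = project(select(PROJ, low_budget), kept)
vertical_then_horizontal = select(project(PROJ, kept), low_budget)
print(horizontal_then_vertical)
print(horizontal_then_vertical == vertical_then_horizontal)  # True
```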
Replication and Allocation

• Replication: Which fragments shall be stored as multiple copies?

– Complete Replication
∗ Complete copy of the database is maintained in each site
– Selective Replication
∗ Selected fragments are replicated in some sites
• Allocation: On which sites to store the various fragments?
– Centralized
∗ Consists of a single DB and DBMS stored at one site with users distributed across
the network
– Partitioned
∗ Database is partitioned into disjoint fragments, each fragment assigned to one site
Replication . . .

• Replicated DB
– fully replicated: each fragment at each site
– partially replicated: each fragment at some of the sites

• Non-replicated DB (= partitioned DB)

– partitioned: each fragment resides at only one site

• Rule of thumb:
– If (read-only queries) / (update queries) ≥ 1, then replication is advantageous; otherwise replication may cause problems
Replication . . .

• Comparison of replication alternatives

Fragment Allocation

• Fragment allocation problem

– Given are:
∗ fragments F = {F1, F2, ..., Fn}
∗ network sites S = {S1, S2, ..., Sm}
∗ applications Q = {q1, q2, ..., ql}
– Find: the "optimal" distribution of F to S

• Optimality
– Minimal cost
∗ Communication + storage + processing (read and update)
∗ Cost in terms of time (usually)
– Performance
∗ Response time and/or throughput
– Constraints
∗ Per site constraints (storage and processing)
Fragment Allocation . . .

• Required information
– Database Information
∗ selectivity of fragments
∗ size of a fragment
– Application Information
∗ $RR_{ij}$: number of read accesses of a query $q_i$ to a fragment $F_j$
∗ $UR_{ij}$: number of update accesses of a query $q_i$ to a fragment $F_j$
∗ $u_{ij}$: a matrix indicating which queries update which fragments
∗ $r_{ij}$: a similar matrix for retrievals
∗ originating site of each query
– Site Information
∗ $USC_k$: unit cost of storing data at a site $S_k$
∗ $LPC_k$: cost of processing one unit of data at a site $S_k$
– Network Information
∗ communication cost/frame between two sites
∗ frame size
Fragment Allocation . . .
• We present an allocation model which attempts to
– minimize the total cost of processing and storage
– meet certain response time restrictions

• General Form:
min(Total Cost)
– subject to
∗ response time constraint
∗ storage constraint
∗ processing constraint

• Functions for the total cost and the constraints are presented in the next slides.
• Decision variable $x_{jk}$:

$$x_{jk} = \begin{cases} 1 & \text{if fragment } F_j \text{ is stored at site } S_k \\ 0 & \text{otherwise} \end{cases}$$
Fragment Allocation . . .

• The total cost function has two components: storage and query processing.

$$TOC = \sum_{S_k \in S} \sum_{F_j \in F} STC_{jk} + \sum_{q_i \in Q} QPC_i$$

– Storage cost of fragment $F_j$ at site $S_k$:

$$STC_{jk} = USC_k \cdot size(F_j) \cdot x_{jk}$$

where $USC_k$ is the unit storage cost at site k and $x_{jk} = 1$ if fragment $F_j$ is stored at site $S_k$ (0 otherwise)

– Query processing cost for a query $q_i$ is composed of two components, processing cost (PC) and transmission cost (TC):

$$QPC_i = PC_i + TC_i$$
Fragment Allocation . . .
• Processing cost is a sum of three components:
– access cost (AC), integrity constraint enforcement cost (IE), concurrency control cost (CC)

$$PC_i = AC_i + IE_i + CC_i$$

– Access cost:

$$AC_i = \sum_{S_k \in S} \sum_{F_j \in F} (UR_{ij} + RR_{ij}) \cdot x_{jk} \cdot LPC_k$$

where $LPC_k$ is the unit processing cost at site k

– Integrity and concurrency costs:
∗ Can be computed similarly, though they depend on the specific constraints

• Note: $AC_i$ assumes that processing a query involves decomposing it into a set of subqueries, each of which works on a fragment, ...
– This is a very simplistic model
– It does not take into consideration different query costs depending on the operator or the different algorithms that are applied
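The storage and access terms of the cost model are easy to evaluate for a given allocation. The sketch below covers only those two terms (IE, CC and the transmission cost are left out), and all numbers are made-up illustration values rather than data from the slides; `storage_cost` and `access_cost` are hypothetical names.

```python
def storage_cost(x, size, usc):
    """Sum over sites k and fragments j of USC_k * size(F_j) * x_jk."""
    return sum(
        usc[k] * size[j] * x[j][k]
        for j in range(len(size))
        for k in range(len(usc))
    )


def access_cost(i, x, ur, rr, lpc):
    """AC_i: sum over sites k and fragments j of (UR_ij + RR_ij) * x_jk * LPC_k."""
    return sum(
        (ur[i][j] + rr[i][j]) * x[j][k] * lpc[k]
        for k in range(len(lpc))
        for j in range(len(x))
    )


# Hypothetical instance: 2 fragments, 2 sites, 2 queries.
size = [100, 50]           # size(F_j)
usc = [1, 2]               # unit storage cost per site
lpc = [3, 1]               # unit processing cost per site
x = [[1, 0], [0, 1]]       # x[j][k] = 1 iff F_j is stored at S_k
rr = [[10, 0], [0, 4]]     # read accesses of q_i to F_j
ur = [[2, 0], [0, 1]]      # update accesses of q_i to F_j

stc = storage_cost(x, size, usc)
ac = [access_cost(i, x, ur, rr, lpc) for i in range(2)]
print(stc, ac, stc + sum(ac))  # 200 [36, 5] 241
```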
Fragment Allocation . . .

• The transmission cost is composed of two components:

– Cost of processing updates (TCU) and cost of processing retrievals (TCR)

$$TC_i = TCU_i + TCR_i$$

– Cost of updates:
∗ Inform all the sites that have replicas + a short confirmation message back

$$TCU_i = \sum_{S_k \in S} \sum_{F_j \in F} u_{ij} \cdot (\text{update message cost} + \text{acknowledgment cost})$$

– Retrieval cost:
∗ Send the retrieval request to all sites that have a copy of the fragments that are needed + send back the results from these sites to the originating site.

$$TCR_i = \sum_{F_j \in F} \min_{S_k \in S} (\text{cost of retrieval request} + \text{cost of sending back the result})$$
Fragment Allocation . . .

• Modeling the constraints

– Response time constraint for a query $q_i$

execution time of $q_i$ ≤ max. allowable response time for $q_i$

– Storage constraints for a site $S_k$

$$\sum_{F_j \in F} \text{storage requirement of } F_j \text{ at } S_k \leq \text{storage capacity of } S_k$$

– Processing constraints for a site $S_k$

$$\sum_{q_i \in Q} \text{processing load of } q_i \text{ at site } S_k \leq \text{processing capacity of } S_k$$
Fragment Allocation . . .

• Solution Methods
– The allocation problem is NP-complete
– Correspondence between the allocation problem and similar problems in other areas
∗ Plant location problem in operations research
∗ Knapsack problem
∗ Network flow problem
– Hence, solutions from these areas can be re-used
– Use different heuristics to reduce the search space
∗ Assume that all candidate partitionings have been determined together with their associated costs and benefits in terms of query processing.
· The problem is then reduced to finding the optimal partitioning and placement for each relation
∗ Ignore replication at the first step and find an optimal non-replicated solution
· Replication is then handled in a second step on top of the previous non-replicated solution.
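The first step of the last heuristic (find the best non-replicated allocation) can be illustrated by exhaustive search on a tiny instance. This is a sketch under strong assumptions: the per-placement costs are taken as precomputed, only the storage constraint is modelled, and all numbers and the name `best_nonreplicated` are invented for illustration. The search visits $m^n$ assignments, which is why real instances need heuristics.

```python
from itertools import product


def best_nonreplicated(size, capacity, cost):
    """Cheapest assignment of each fragment to exactly one site within storage limits.

    cost[j][k] is the (assumed precomputed) total cost of placing fragment j at site k.
    Returns (total_cost, assignment) or None if no feasible assignment exists.
    """
    n, m = len(size), len(capacity)
    best = None
    for assignment in product(range(m), repeat=n):
        load = [0] * m
        for j, k in enumerate(assignment):
            load[k] += size[j]
        if any(load[k] > capacity[k] for k in range(m)):
            continue  # violates the storage constraint of some site
        total = sum(cost[j][k] for j, k in enumerate(assignment))
        if best is None or total < best[0]:
            best = (total, assignment)
    return best


# Hypothetical instance: 3 fragments, 2 sites.
size = [100, 50, 80]
capacity = [150, 150]
cost = [[10, 30], [20, 5], [15, 25]]

print(best_nonreplicated(size, capacity, cost))  # (40, (0, 1, 1))
```

Without the storage constraint the cheapest choice would put the first and third fragments on the first site, but together they exceed its capacity, so the search settles on the next-best feasible placement.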
Conclusion

• Distributed design decides on the placement of (parts of the) data and programs across the sites of a computer network
• On the abstract level there are two patterns: top-down and bottom-up
• On the detail level, design answers two key questions: fragmentation and allocation/replication of data
– Horizontal fragmentation is defined via the selection operation $\sigma_p(R)$
∗ Rewrites the queries of each site in disjunctive normal form and finds a minimal and complete set of conjunctions to determine the fragmentation
– Vertical fragmentation is defined via the projection operation $\Pi_A(R)$
∗ Computes the attribute affinity matrix and groups "similar" attributes together
– Mixed fragmentation is a combination of both approaches
• Allocation/Replication of data
– Type of replication: no replication, partial replication, full replication
– Optimal allocation/replication is modelled as a cost function under a set of constraints
– The problem is NP-complete
– Different heuristics are used to reduce the complexity
