Distributed Database Design Concepts

The document discusses distributed database design. It covers topics such as distributed database design concepts, data distribution objectives, data fragmentation, allocation of fragments, and transparencies in distributed database design. It describes issues in distributed database design like placement of data, programs and applications across computer network sites. It also discusses dimensions of the distributed design problem like access patterns, sharing levels and knowledge levels. Finally, it outlines the distributed design process and covers issues like why and how to fragment data, degree of fragmentation, allocation alternatives, and information requirements.

Uploaded by

sheenam_bhatia

Distributed Database Design

TOPICS
Distributed database design concepts
Objectives of data distribution
Data fragmentation
Allocation of fragments
Transparencies in distributed database design

Design Problem
In the general setting:
Making decisions about the placement of data and programs across the sites of a computer network, as well as possibly designing the network itself.
In a distributed DBMS, the placement of applications entails
placement of the distributed DBMS software; and
placement of the applications that run on the database

Dimensions of the Problem

Access pattern behavior: static, dynamic
Level of sharing: data, data + program
Level of knowledge: partial information, complete information

Distribution Design
Top-down
mostly in designing systems from scratch
mostly in homogeneous systems

Bottom-up
when the databases already exist at a number of sites

Top-Down Design
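[Figure: top-down design flow. Requirements Analysis produces Objectives; with User Input, Conceptual Design and View Design (connected by View Integration) produce the GCS (global conceptual schema), Access Information, and the ESs (external schemas); Distribution Design, again with User Input, produces the LCSs (local conceptual schemas); Physical Design produces the LISs (local internal schemas).]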

Distribution Design Issues

Why fragment at all?
How to fragment?
How much to fragment?
How to test correctness?
How to allocate?
Information requirements?

Fragmentation
Can't we just distribute relations?
What is a reasonable unit of distribution?
relation
views are subsets of relations ⇒ locality
extra communication
fragments of relations (sub-relations)
concurrent execution of a number of transactions that access different portions of a relation
views that cannot be defined on a single fragment will require extra processing
semantic data control (especially integrity enforcement) more difficult

Fragmentation Alternatives
Horizontal
PROJ1: projects with budgets less than $200,000
PROJ2: projects with budgets greater than or equal to $200,000

PROJ
PNO  PNAME              BUDGET  LOC
P1   Instrumentation    150000  Montreal
P2   Database Develop.  135000  New York
P3   CAD/CAM            250000  New York
P4   Maintenance        310000  Paris
P5   CAD/CAM            500000  Boston

PROJ1
PNO  PNAME              BUDGET  LOC
P1   Instrumentation    150000  Montreal
P2   Database Develop.  135000  New York

PROJ2
PNO  PNAME              BUDGET  LOC
P3   CAD/CAM            250000  New York
P4   Maintenance        310000  Paris
P5   CAD/CAM            500000  Boston

Fragmentation Alternatives
Vertical
PROJ1: information about project budgets
PROJ2: information about project names and locations

PROJ
PNO  PNAME              BUDGET  LOC
P1   Instrumentation    150000  Montreal
P2   Database Develop.  135000  New York
P3   CAD/CAM            250000  New York
P4   Maintenance        310000  Paris
P5   CAD/CAM            500000  Boston

PROJ1
PNO  BUDGET
P1   150000
P2   135000
P3   250000
P4   310000
P5   500000

PROJ2
PNO  PNAME              LOC
P1   Instrumentation    Montreal
P2   Database Develop.  New York
P3   CAD/CAM            New York
P4   Maintenance        Paris
P5   CAD/CAM            Boston

Degree of Fragmentation
finite number of alternatives, ranging from whole relations (no fragmentation) down to individual tuples or attributes (maximum fragmentation)
Finding the suitable level of partitioning within this range

Correctness of Fragmentation
Completeness
Decomposition of relation R into fragments R1, R2, ..., Rn is complete if and only if each data item in R can also be found in some Ri
Reconstruction
If relation R is decomposed into fragments R1, R2, ..., Rn, then there should exist some relational operator $\nabla$ such that
$R = \nabla_{1 \le i \le n} R_i$
Disjointness
If relation R is decomposed into fragments R1, R2, ..., Rn, and data item di is in Rj, then di should not be in any other fragment Rk ($k \ne j$).
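The three rules can be checked mechanically for a horizontal fragmentation. The following is a minimal sketch, assuming relations are modelled as Python sets of tuples and that union is the reconstruction operator; the helper name `check_horizontal` is illustrative and not part of the slides. The data is the PROJ relation from the horizontal fragmentation example.

```python
# PROJ tuples as (PNO, PNAME, BUDGET, LOC), taken from the horizontal example.
PROJ = {
    ("P1", "Instrumentation", 150000, "Montreal"),
    ("P2", "Database Develop.", 135000, "New York"),
    ("P3", "CAD/CAM", 250000, "New York"),
    ("P4", "Maintenance", 310000, "Paris"),
    ("P5", "CAD/CAM", 500000, "Boston"),
}


def check_horizontal(relation, fragments):
    """Return (complete, reconstructible, disjoint) for horizontal fragments."""
    union = set().union(*fragments)
    complete = relation <= union          # every tuple appears in some fragment
    reconstructible = union == relation   # union rebuilds exactly the relation
    # Disjoint when no tuple is stored in more than one fragment.
    disjoint = sum(len(f) for f in fragments) == len(union)
    return complete, reconstructible, disjoint


PROJ1 = {t for t in PROJ if t[2] < 200000}
PROJ2 = {t for t in PROJ if t[2] >= 200000}
print(check_horizontal(PROJ, [PROJ1, PROJ2]))
```

Passing overlapping fragments, or dropping one fragment, makes the corresponding flag come back `False`, which is a quick way to see what each rule guards against.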

Allocation Alternatives
Non-replicated
partitioned: each fragment resides at only one site
Replicated
fully replicated: each fragment at each site
partially replicated: each fragment at some of the sites
Rule of thumb:
If $\frac{\text{read-only queries}}{\text{update queries}} \ge 1$, replication is advantageous; otherwise replication may cause problems

Comparison of Replication Alternatives

                      Full replication      Partial replication  Partitioning
QUERY PROCESSING      Easy                  Same difficulty      Same difficulty
DIRECTORY MANAGEMENT  Easy or nonexistent   Same difficulty      Same difficulty
CONCURRENCY CONTROL   Moderate              Difficult            Easy
RELIABILITY           Very high             High                 Low
REALITY               Possible application  Realistic            Possible application

Information Requirements
Four categories:

Database information

Application information

Communication network information

Computer system information

PHF Information Requirements

Database Information
relationship
SKILL(TITLE, SAL)
EMP(ENO, ENAME, TITLE)
PROJ(PNO, PNAME, BUDGET, LOC)
ASG(ENO, PNO, RESP, DUR)
link L1: owner SKILL, member EMP
cardinality of each relation: card(R)

Application Information
1. Qualitative Information
The fundamental qualitative information consists of the predicates used in user queries.
Analyze user queries based on the 80/20 rule: 20% of user queries account for 80% of the total data access.
One should investigate the more important queries.
2. Quantitative Information
Minterm selectivity sel(mi): number of tuples that would be accessed by a query specified according to a given minterm predicate.
Access frequency acc(mi): the access frequency of a given minterm predicate in a given period.

Fragmentation
Horizontal Fragmentation (HF)
Primary Horizontal Fragmentation (PHF)
Derived Horizontal Fragmentation (DHF)
Vertical Fragmentation (VF)
Hybrid Fragmentation (HF)

Primary Horizontal Fragmentation
EMP table
Three branch offices, with each employee working at only one office

Create table MPLS_EMPS as
Select * From EMP Where Loc = 'Minneapolis';
Create table LA_EMPS as
Select * From EMP Where Loc = 'LA';
Create table NY_EMPS as
Select * From EMP Where Loc = 'New York';

After fragmentation, EMP is reconstructed by
Select * from MPLS_EMPS
Union
Select * from LA_EMPS
Union
Select * from NY_EMPS;
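The fragmentation and reconstruction statements above can be tried end to end. This is a minimal sketch using Python's built-in `sqlite3` module with an in-memory database; the EMP columns other than Loc and the four sample rows are assumptions made for illustration, not data from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical EMP table: only the Loc column comes from the slides.
cur.execute("CREATE TABLE EMP (ENO TEXT PRIMARY KEY, ENAME TEXT, Loc TEXT)")
cur.executemany(
    "INSERT INTO EMP VALUES (?, ?, ?)",
    [("E1", "Ann", "Minneapolis"), ("E2", "Bob", "LA"),
     ("E3", "Cy", "New York"), ("E4", "Dee", "LA")],
)

# One fragment per branch office, each defined by a selection on Loc.
cur.executescript("""
CREATE TABLE MPLS_EMPS AS SELECT * FROM EMP WHERE Loc = 'Minneapolis';
CREATE TABLE LA_EMPS   AS SELECT * FROM EMP WHERE Loc = 'LA';
CREATE TABLE NY_EMPS   AS SELECT * FROM EMP WHERE Loc = 'New York';
""")

# Reconstruction: the union of the fragments should equal the original table.
rebuilt = cur.execute("""
SELECT * FROM MPLS_EMPS
UNION SELECT * FROM LA_EMPS
UNION SELECT * FROM NY_EMPS
ORDER BY ENO
""").fetchall()
original = cur.execute("SELECT * FROM EMP ORDER BY ENO").fetchall()
la_count = len(cur.execute("SELECT * FROM LA_EMPS").fetchall())
print(rebuilt == original, la_count)
```

Because every employee works at exactly one office, the three selections are mutually exclusive, so the fragments are disjoint as well as complete.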

Example 1
AP1: looking for those employees who work in Los Angeles (LA).
Pr = {p1: Loc = "LA"}
M = {m1: Loc = "LA", m2: Loc <> "LA"}
M is a minimal and complete set of minterm predicates for AP1

Fragment F1: Create table LA_EMPS as
Select * from EMP Where Loc = 'LA';
Fragment F2: Create table NON_LA_EMPS as
Select * from EMP Where Loc <> 'LA';
Minimal: the rows of different fragments are accessed differently by at least one application.
Complete: the rows within a fragment have the same probability of being accessed by any application.

AP1: exclude any employee whose salary is less than or equal to 30000
Pr = {p1: Loc = "LA", p2: Sal > 30000}
M = {m1: Loc = "LA" AND Sal > 30000,
m2: Loc = "LA" AND Sal <= 30000,
m3: Loc <> "LA" AND Sal > 30000,
m4: Loc <> "LA" AND Sal <= 30000}
Any fragmentation must satisfy the following rules:
Rule 1: Completeness. Decomposition of R into R1, R2, ..., Rn is complete if and only if each data item in R can also be found in some Ri.
Rule 2: Reconstruction. If R is decomposed into R1, R2, ..., Rn, then there should exist some relational operator $\nabla$ such that $R = \nabla_{1 \le i \le n} R_i$.
Rule 3: Disjointness. If R is decomposed into R1, R2, ..., Rn, and di is a tuple in Rj, then di should not be in any other fragment Rk, where $k \ne j$.

NOTE: With N simple predicates in Pr, M will have $2^N$ minterm predicates.
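The count of $2^N$ follows from choosing, for each simple predicate, either the predicate or its negation. A minimal sketch of that enumeration, assuming predicates are kept as plain strings (the function name `minterms` is illustrative):

```python
from itertools import product


def minterms(simple_predicates):
    """Enumerate all 2^N conjunctions of each predicate or its negation."""
    result = []
    for signs in product([True, False], repeat=len(simple_predicates)):
        literals = [p if keep else f"NOT({p})"
                    for p, keep in zip(simple_predicates, signs)]
        result.append(" AND ".join(literals))
    return result


Pr = ['Loc = "LA"', "Sal > 30000"]
M = minterms(Pr)
for i, m in enumerate(M, start=1):
    print(f"m{i}: {m}")
```

With two predicates this yields four minterms in the same order as m1 to m4 above, with `NOT(Sal > 30000)` standing for `Sal <= 30000`.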

PHF - Information Requirements
Application Information
simple predicates: Given R[A1, A2, ..., An], a simple predicate pj is
$p_j : A_i \; \theta \; Value$
where $\theta \in \{=, <, \le, >, \ge, \ne\}$, $Value \in D_i$ and $D_i$ is the domain of $A_i$.
For relation R we define Pr = {p1, p2, ..., pm}
Example:
PNAME = "Maintenance"
BUDGET ≤ 200000

minterm predicates: Given R and Pr = {p1, p2, ..., pm}, define M = {m1, m2, ..., mz} as
$M = \{ m_i \mid m_i = \bigwedge_{p_j \in Pr} p_j^* \}, \quad 1 \le j \le m, \; 1 \le i \le z$
where $p_j^* = p_j$ or $p_j^* = \neg(p_j)$.

PHF Information Requirements

Application Information
minterm selectivities: sel(mi)
The number of tuples of the relation that would be accessed by a user query which is specified according to a given minterm predicate mi.

access frequencies: acc(qi)
The frequency with which a user application qi accesses data.
Access frequency for a minterm predicate can also be defined.

Primary Horizontal Fragmentation

Definition:
$R_j = \sigma_{F_j}(R), \quad 1 \le j \le w$
where Fj is a selection formula, which is (preferably) a minterm predicate.
Therefore,
A horizontal fragment Ri of relation R consists of all the tuples of R which satisfy a minterm predicate mi.

Given a set of minterm predicates M, there are as many horizontal fragments of relation R as there are minterm predicates.
The set of horizontal fragments is also referred to as minterm fragments.

PHF Algorithm
Given: A relation R, the set of simple predicates Pr
Output: The set of fragments of R = {R1, R2, ..., Rw} which obey the fragmentation rules.
Preliminaries:
Pr should be complete
Pr should be minimal

Completeness of Simple Predicates
A set of simple predicates Pr is said to be complete if and only if any two tuples of the same minterm fragment defined on Pr have the same probability of being accessed by any application.
Example 2:
Assume PROJ[PNO, PNAME, BUDGET, LOC] has two applications defined on it.
Find the budgets of projects at each location. (1)
Find projects with budgets less than $200000. (2)

Completeness of Simple Predicates

According to (1),
Pr = {LOC = "Montreal", LOC = "New York", LOC = "Paris"}
which is not complete with respect to (2).

Modify
Pr = {LOC = "Montreal", LOC = "New York", LOC = "Paris", BUDGET ≤ 200000, BUDGET > 200000}
which is complete.

Minimality of Simple Predicates
If a predicate influences how fragmentation is performed (i.e., causes a fragment f to be further fragmented into, say, fi and fj), then there should be at least one application that accesses fi and fj differently.
In other words, the simple predicate should be relevant in determining a fragmentation.
If all the predicates of a set Pr are relevant, then Pr is minimal.
A predicate is relevant if and only if
$\frac{acc(m_i)}{card(f_i)} \ne \frac{acc(m_j)}{card(f_j)}$

Minimality of Simple Predicates
Example:
Pr = {LOC = "Montreal", LOC = "New York", LOC = "Paris", BUDGET ≤ 200000, BUDGET > 200000}
is minimal (in addition to being complete).

However, if we add
PNAME = "Instrumentation"
then Pr is not minimal.

COM_MIN Algorithm
Given: a relation R and a set of simple
predicates Pr
Output: a complete and minimal set of
simple predicates Pr' for Pr
Rule 1: a relation or fragment is partitioned
into at least two parts which are
accessed differently by at least one
application.

COM_MIN Algorithm
Initialization:
find a $p_i \in Pr$ such that $p_i$ partitions R according to Rule 1
set $Pr' = \{p_i\}$; $Pr \leftarrow Pr - \{p_i\}$; $F \leftarrow \{f_i\}$
Iteratively add predicates to Pr' until it is complete
find a $p_j \in Pr$ such that $p_j$ partitions some $f_k$ defined according to a minterm predicate over Pr' according to Rule 1
set $Pr' = Pr' \cup \{p_j\}$; $Pr \leftarrow Pr - \{p_j\}$; $F \leftarrow F \cup \{f_j\}$
if $\exists p_k \in Pr'$ which is nonrelevant then
$Pr' \leftarrow Pr' - \{p_k\}$
$F \leftarrow F - \{f_k\}$

PHORIZONTAL Algorithm
Makes use of COM_MIN to perform fragmentation.
Input: a relation R and a set of simple predicates Pr
Output: a set of minterm predicates M according to which relation R is to be fragmented
$Pr' \leftarrow$ COM_MIN(R, Pr)
determine the set M of minterm predicates
determine the set I of implications among $p_i \in Pr'$
eliminate the contradictory minterms from M
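The last two steps can be illustrated without writing the implications out by hand. The sketch below is not the PHORIZONTAL algorithm itself: it assumes the predicates are Python functions and that a small set of representative attribute values covers every region the predicates distinguish, and it keeps only the minterms that some candidate tuple satisfies. The predicate set is the one used for PROJ in these slides; the names `LOCS`, `BUDGETS` and `satisfiable_minterms` are illustrative, and the LOC domain is assumed to be limited to the three cities named in the predicates.

```python
from itertools import product

# Simple predicates on PROJ, written as functions of a tuple (a dict here).
Pr = {
    "p1": lambda t: t["LOC"] == "Montreal",
    "p2": lambda t: t["LOC"] == "New York",
    "p3": lambda t: t["LOC"] == "Paris",
    "p4": lambda t: t["BUDGET"] <= 200000,
    "p5": lambda t: t["BUDGET"] > 200000,
}

# Assumed domains: one representative value per region cut out by the predicates.
LOCS = ["Montreal", "New York", "Paris"]
BUDGETS = [200000, 200001]


def satisfiable_minterms(predicates, candidates):
    """Return the truth vectors (one per minterm) that some candidate satisfies."""
    kept = set()
    for t in candidates:
        kept.add(tuple(bool(p(t)) for p in predicates.values()))
    return kept


candidates = [{"LOC": loc, "BUDGET": b} for loc, b in product(LOCS, BUDGETS)]
M = satisfiable_minterms(Pr, candidates)
print(len(M), "of", 2 ** len(Pr), "minterms are not contradictory")
```

Each surviving truth vector has exactly one LOC predicate true and exactly one BUDGET predicate true, which is why only 6 of the 32 possible minterms remain.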

PHF Example 3
Two candidate relations: PAY and PROJ.
Fragmentation of relation PAY
Application: Check the salary info and determine raise.
Employee records kept at two sites ⇒ application run at two sites
Simple predicates
p1: SAL ≤ 30000
p2: SAL > 30000
Pr = {p1, p2} which is complete and minimal ⇒ Pr' = Pr

Minterm predicates
m1: (SAL ≤ 30000)
m2: NOT(SAL ≤ 30000) = (SAL > 30000)

PHF Example 3

PAY1
TITLE       SAL
Mech. Eng.  27000
Programmer  24000

PAY2
TITLE        SAL
Elect. Eng.  40000
Syst. Anal.  34000

PHF Example 3
Fragmentation of relation PROJ

Applications:
Find the name and budget of projects given their no. Issued at three sites
Access project information according to budget: one site accesses ≤ 200000, the other accesses > 200000

Simple predicates
For application (1)
p1: LOC = "Montreal"
p2: LOC = "New York"
p3: LOC = "Paris"

For application (2)
p4: BUDGET ≤ 200000
p5: BUDGET > 200000
Pr = Pr' = {p1, p2, p3, p4, p5}

PHF Example 3
Fragmentation of relation PROJ continued
Minterm fragments left

m1: (LOC = "Montreal") ∧ (BUDGET ≤ 200000)
m2: (LOC = "Montreal") ∧ (BUDGET > 200000)
m3: (LOC = "New York") ∧ (BUDGET ≤ 200000)
m4: (LOC = "New York") ∧ (BUDGET > 200000)
m5: (LOC = "Paris") ∧ (BUDGET ≤ 200000)
m6: (LOC = "Paris") ∧ (BUDGET > 200000)

PHF Example
PROJ1
PNO  PNAME            BUDGET  LOC
P1   Instrumentation  150000  Montreal

PROJ2
PNO  PNAME              BUDGET  LOC
P2   Database Develop.  135000  New York

PROJ4
PNO  PNAME    BUDGET  LOC
P3   CAD/CAM  250000  New York

PROJ6
PNO  PNAME        BUDGET  LOC
P4   Maintenance  310000  Paris

PHF Correctness
Completeness
Since Pr' is complete and minimal, the selection predicates are complete

Reconstruction
If relation R is fragmented into FR = {R1, R2, ..., Rr}
$R = \bigcup_{\forall R_i \in F_R} R_i$

Disjointness
Minterm predicates that form the basis of fragmentation should be mutually exclusive.

Derived Horizontal Fragmentation
Defined on a member relation of a link according to a selection operation specified on its owner.
Each link is an equijoin.
Equijoin can be implemented by means of semijoins.
SKILL(TITLE, SAL)
EMP(ENO, ENAME, TITLE)
PROJ(PNO, PNAME, BUDGET, LOC)
ASG(ENO, PNO, RESP, DUR)
link L1: owner SKILL, member EMP

DHF Definition
Given a link L where owner(L) = S and member(L) = R, the derived horizontal fragments of R are defined as
$R_i = R \ltimes S_i, \quad 1 \le i \le w$
where w is the maximum number of fragments that will be defined on R and
$S_i = \sigma_{F_i}(S)$
where Fi is the formula according to which the primary horizontal fragment Si is defined.

DHF Example
Given link L1 where owner(L1) = SKILL/PAY and member(L1) = EMP
Group engineers into two groups according to their salary: those making less than or equal to $30,000, and those making more than $30,000.
$EMP_1 = EMP \ltimes SKILL/PAY_1$
$EMP_2 = EMP \ltimes SKILL/PAY_2$
where
$SKILL/PAY_1 = \sigma_{SAL \le 30000}(SKILL/PAY)$
$SKILL/PAY_2 = \sigma_{SAL > 30000}(SKILL/PAY)$

EMP1
ENO  ENAME      TITLE
E3   A. Lee     Mech. Eng.
E4   J. Miller  Programmer
E7   R. Davis   Mech. Eng.

EMP2
ENO  ENAME     TITLE
E1   J. Doe    Elect. Eng.
E2   M. Smith  Syst. Anal.
E5   B. Casey  Syst. Anal.
E6   L. Chu    Elect. Eng.
E8   J. Jones  Syst. Anal.
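The derivation of EMP1 and EMP2 can be reproduced with a few lines of code. This is a minimal sketch, assuming relations are lists of tuples and that the semijoin is taken on TITLE; the rows come from the PAY and EMP tables shown in these slides, while the helper name `semijoin` is illustrative.

```python
# Owner relation SKILL/PAY(TITLE, SAL) and member relation EMP(ENO, ENAME, TITLE).
PAY = [("Elect. Eng.", 40000), ("Syst. Anal.", 34000),
       ("Mech. Eng.", 27000), ("Programmer", 24000)]
EMP = [("E1", "J. Doe", "Elect. Eng."), ("E2", "M. Smith", "Syst. Anal."),
       ("E3", "A. Lee", "Mech. Eng."), ("E4", "J. Miller", "Programmer"),
       ("E5", "B. Casey", "Syst. Anal."), ("E6", "L. Chu", "Elect. Eng."),
       ("E7", "R. Davis", "Mech. Eng."), ("E8", "J. Jones", "Syst. Anal.")]


def semijoin(member, owner_fragment):
    """Keep the member tuples whose TITLE appears in the owner fragment."""
    titles = {title for title, _ in owner_fragment}
    return [t for t in member if t[2] in titles]


# Primary horizontal fragments of the owner, defined by selections on SAL.
PAY1 = [p for p in PAY if p[1] <= 30000]
PAY2 = [p for p in PAY if p[1] > 30000]

# Derived horizontal fragments of the member.
EMP1 = semijoin(EMP, PAY1)
EMP2 = semijoin(EMP, PAY2)
print([e[0] for e in EMP1])
print([e[0] for e in EMP2])
```

Every EMP tuple lands in exactly one fragment here because each TITLE occurs in exactly one PAY fragment, which is the situation the DHF correctness rules describe.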

DHF Correctness

Completeness
Referential integrity
Let R be the member relation of a link whose owner is relation S, which is fragmented as FS = {S1, S2, ..., Sn}. Furthermore, let A be the join attribute between R and S. Then, for each tuple t of R, there should be a tuple t' of S such that
t[A] = t'[A]

Reconstruction
Same as primary horizontal fragmentation.

Disjointness
Simple join graphs between the owner and the member fragments.

Vertical Fragmentation
Group the columns of a table into fragments.
Because each fragment contains a subset of the total set of columns in the table, VF can be used to enforce security and/or privacy of data.
More difficult than horizontal, because more alternatives exist.
In the case of vertical partitioning, if a relation has m non-primary key attributes, the number of possible fragments is equal to B(m), which is the mth Bell number
Two approaches:
grouping: attributes to fragments
The first step creates as many vertical fragments as the number of non-key columns in the table. The grouping approach then uses joins across the primary key to group some of these fragments together, and continues as needed
Not usually considered a valid approach
splitting: relation to fragments
placing each non-key column in one and only one fragment

Need to define affinity or closeness between attributes
If a table has 15 columns, the number of possible vertical fragments is approximately $10^9$, and for a table with 30 columns it is approximately $10^{23}$.
Evaluating all of them is not practical.
Solution: find the closeness/affinity between the attributes to decide whether to group them into the same fragment or not
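The growth of B(m) is easy to check numerically. The sketch below computes Bell numbers with the Bell triangle, a standard construction that the slides do not mention; the function name `bell` is illustrative.

```python
def bell(m):
    """Return the m-th Bell number B(m) using the Bell triangle."""
    row = [1]  # row 0 of the triangle; its first entry is B(0)
    for _ in range(m):
        # Each new row starts with the last entry of the previous row,
        # and every further entry adds the entry above-left of it.
        new_row = [row[-1]]
        for x in row:
            new_row.append(new_row[-1] + x)
        row = new_row
    return row[0]


for m in (5, 15, 30):
    print(m, bell(m))
```

B(15) already exceeds one billion, and B(30) has 24 digits, which is why exhaustive evaluation of all vertical partitionings is impractical.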

VF Information Requirements
Application Information
Attribute affinities
a measure that indicates how closely related the attributes are
This is obtained from: access frequency + usage pattern
Access frequency: how many times an application/query runs in a given period of time at different sites
Usage pattern: indicates whether a column is used by an application/query.
Attribute usage values
Given a set of queries Q = {q1, q2, ..., qq} that will run on the relation R[A1, A2, ..., An],
$use(q_i, A_j) = \begin{cases} 1 & \text{if attribute } A_j \text{ is referenced by query } q_i \\ 0 & \text{otherwise} \end{cases}$

use(qi, •) can be defined accordingly

VF Definition of use(qi,Aj)
Consider the following 4 queries for relation PROJ
q1: SELECT BUDGET FROM PROJ WHERE PNO=Value
q2: SELECT PNAME, BUDGET FROM PROJ
q3: SELECT PNAME FROM PROJ WHERE LOC=Value
q4: SELECT SUM(BUDGET) FROM PROJ WHERE LOC=Value

Let A1 = PNO, A2 = PNAME, A3 = BUDGET, A4 = LOC

     A1  A2  A3  A4
q1    1   0   1   0
q2    0   1   1   0
q3    0   1   0   1
q4    0   0   1   1

VF Affinity Measure af(Ai, Aj)
The attribute affinity measure between two attributes Ai and Aj of a relation R[A1, A2, ..., An] with respect to the set of applications Q = (q1, q2, ..., qq) is defined as follows:

$af(A_i, A_j) = \sum_{\text{all queries that access } A_i \text{ and } A_j} (\text{query access})$

$\text{query access} = \sum_{\text{all sites}} \text{access frequency of a query} \times \frac{\text{access}}{\text{execution}}$

VF Calculation of af(Ai, Aj)

Example: Assume each query in the previous example accesses the attributes once during each execution.
Also assume the access frequencies

     S1  S2  S3
q1   15  20  10
q2    5   0   0
q3   25  25  25
q4    3   0   0

Then
af(A1, A3) = 15*1 + 20*1 + 10*1 = 45

and the attribute affinity matrix AA is

     A1  A2  A3  A4
A1   45   0  45   0
A2    0  80   5  75
A3   45   5  53   3
A4    0  75   3  78
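The AA matrix can be recomputed from the attribute usage values and the access frequencies. The following is a minimal sketch under the stated assumption of one access per execution. The usage rows follow directly from the four queries on PROJ; for the frequencies, only the q1 row and the per-query totals (45, 5, 75, 3) are fixed by the matrix, so the per-site split shown for q2 and q4 should be read as an assumption. The function name `affinity_matrix` is illustrative.

```python
# use[i][j] = 1 if query q(i+1) references attribute A(j+1)
# with A1 = PNO, A2 = PNAME, A3 = BUDGET, A4 = LOC.
use = [
    [1, 0, 1, 0],  # q1: BUDGET, PNO
    [0, 1, 1, 0],  # q2: PNAME, BUDGET
    [0, 1, 0, 1],  # q3: PNAME, LOC
    [0, 0, 1, 1],  # q4: BUDGET, LOC
]

# acc[i][s] = access frequency of query q(i+1) at site S(s+1).
acc = [
    [15, 20, 10],
    [5, 0, 0],     # per-site split assumed; only the total of 5 matters here
    [25, 25, 25],
    [3, 0, 0],     # per-site split assumed; only the total of 3 matters here
]


def affinity_matrix(use, acc, accesses_per_execution=1):
    """af(Ai, Aj) = sum, over queries using both Ai and Aj, of their total access."""
    n = len(use[0])
    totals = [sum(site_freqs) * accesses_per_execution for site_freqs in acc]
    return [
        [sum(totals[q] for q in range(len(use)) if use[q][i] and use[q][j])
         for j in range(n)]
        for i in range(n)
    ]


AA = affinity_matrix(use, acc)
for row in AA:
    print(row)
```

The diagonal entry af(Ai, Ai) is simply the total access of all queries that reference Ai, which is why af(A3, A3) = 45 + 5 + 3 = 53.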

VF Clustering Algorithm
Take the attribute affinity matrix AA and reorganize the attribute orders to form clusters where the attributes in each cluster demonstrate high affinity to one another.
Bond Energy Algorithm (BEA) has been used for clustering of entities. BEA finds an ordering of entities (in our case attributes) such that the global affinity measure is maximized.

Bond Energy Algorithm

Input: The AA affinity matrix
Output: The clustered affinity matrix CA, which is a permutation of AA
Initialization: Place and fix one of the columns of AA in CA.
Iteration: Place the remaining n-i columns in the remaining i+1 positions in the CA matrix. For each column, choose the placement that makes the most contribution to the global affinity measure.
Row order: Order the rows according to the column ordering.

Bond Energy Algorithm

Best placement? Define contribution of a placement:
$cont(A_i, A_k, A_j) = 2\,bond(A_i, A_k) + 2\,bond(A_k, A_j) - 2\,bond(A_i, A_j)$
where
$bond(A_x, A_y) = \sum_{z=1}^{n} af(A_z, A_x)\, af(A_z, A_y)$

BEA Example
Consider the following AA matrix and the corresponding CA matrix where A1 and A2 have been placed. Place A3:

Ordering (0-3-1):
cont(A0, A3, A1) = 2bond(A0, A3) + 2bond(A3, A1) − 2bond(A0, A1)
= 2*0 + 2*4410 − 2*0 = 8820

Ordering (1-3-2):
cont(A1, A3, A2) = 2bond(A1, A3) + 2bond(A3, A2) − 2bond(A1, A2)
= 2*4410 + 2*890 − 2*225 = 10150

Ordering (2-3-4):
cont(A2, A3, A4) = 1780
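The three contributions can be recomputed from the AA matrix. This is a minimal sketch, assuming the empty boundary positions (A0 on the left, and the not-yet-filled slot on the right that the slide labels A4) are passed as `None` and have a bond of 0 with everything; the function names mirror the slide's `bond` and `cont`.

```python
# Attribute affinity matrix AA for A1..A4 (rows and columns in that order).
AA = [
    [45, 0, 45, 0],
    [0, 80, 5, 75],
    [45, 5, 53, 3],
    [0, 75, 3, 78],
]


def bond(AA, x, y):
    """bond(Ax, Ay) = sum over z of af(Az, Ax) * af(Az, Ay); 0 at a boundary."""
    if x is None or y is None:
        return 0
    return sum(row[x - 1] * row[y - 1] for row in AA)


def cont(AA, i, k, j):
    """Contribution of placing Ak between Ai and Aj."""
    return 2 * bond(AA, i, k) + 2 * bond(AA, k, j) - 2 * bond(AA, i, j)


placements = {
    "0-3-1": cont(AA, None, 3, 1),  # A3 to the left of A1
    "1-3-2": cont(AA, 1, 3, 2),     # A3 between A1 and A2
    "2-3-4": cont(AA, 2, 3, None),  # A3 to the right of A2
}
best = max(placements, key=placements.get)
print(placements, best)
```

The largest contribution comes from the ordering 1-3-2, so A3 is placed between A1 and A2.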

BEA Example
Therefore, the CA matrix has the form

     A1  A3  A2
A1   45  45   0
A2    0   5  80
A3   45  53   5
A4    0   3  75

When A4 is placed, the final form of the CA matrix (after row organization) is

     A1  A3  A2  A4
A1   45  45   0   0
A3   45  53   5   3
A2    0   5  80  75
A4    0   3  75  78

VF Algorithm
How can you divide a set of clustered attributes {A1, A2, ..., An} into two (or more) sets {A1, A2, ..., Ai} and {Ai+1, ..., An} such that there are no (or minimal) applications that access both (or more than one) of the sets?
The function will produce fragments that are balanced.

For the best partitioning, split the columns into a one-column BC and an (n − 1)-column TC first, and then repeatedly add columns from TC to BC until TC is left with only one column.
Choose the splitting that has the highest Z value.
Z can be positive if total accesses to only one fragment are maximized while the total accesses to both fragments are minimized.

Disadvantage: not able to carve out an embedded or inner block of columns as a partition.
Solution: overcome by adding a shift operation (moves the topmost row of the matrix to the bottom and then moves the leftmost column of the matrix to the extreme right).

[Figure: clustered affinity matrix with rows and columns A1, A2, A3, ..., Ai, Ai+1, ..., Am, split along the diagonal between Ai and Ai+1; illustrates the shift operation.]

VF Algorithm
Define
TQ = set of applications that access only TA (top corner attributes)
BQ = set of applications that access only BA
OQ = set of applications that access both TA and BA

and
CTQ = total number of accesses to attributes by applications that access only TA
CBQ = total number of accesses to attributes by applications that access only BA
COQ = total number of accesses to attributes by applications that access both TA and BA

Then find the point along the diagonal that maximizes

Goal Function: $Z = CTQ \times CBQ - COQ^2$

Example
1)
TC-only: all the applications that access one of the TC columns (C4, C1, or C3) but do not access any BC column [i.e., AP1, AP2, and AP4]
No application is BC-only.
AP3 accesses both TC and BC columns.
TCW = AFF(AP1) + AFF(AP2) + AFF(AP4) = 3 + 7 + 3 = 13
BCW = none = 0
BOCW = AFF(AP3) = 4
Z = 13*0 − 4^2 = −16

2)
TCW = AFF(AP1) + AFF(AP2) = 3 + 7 = 10
BCW = AFF(AP3) = 4
BOCW = AFF(AP4) = 3
Z = 10*4 − 3^2 = 40 − 9 = 31

3)
TCW = AFF(AP2) = 7
BCW = AFF(AP3) + AFF(AP4) = 4 + 3 = 7
BOCW = AFF(AP1) = 3
Z = 7*7 − 3^2 = 49 − 9 = 40

4)
TCW = AFF(AP4) = 3
BCW = AFF(AP2) = 7
BOCW = AFF(AP1) + AFF(AP3) = 3 + 4 = 7
Z = 3*7 − 7^2 = 21 − 49 = −28

Result: The two vertical fragments will be defined as VF1(C, C4) and VF2(C, C1, C2, C3).
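The four Z values can be recomputed from the application affinities. A minimal sketch, assuming Z = TCW × BCW − BOCW² with the AFF values used in the example (AP1 = 3, AP2 = 7, AP3 = 4, AP4 = 3); the names `AFF`, `z_value` and `splits` are illustrative.

```python
# Access frequencies of the four applications, as used in the example.
AFF = {"AP1": 3, "AP2": 7, "AP3": 4, "AP4": 3}


def z_value(tc_only, bc_only, both):
    """Z = TCW * BCW - BOCW^2 for one candidate split."""
    tcw = sum(AFF[a] for a in tc_only)
    bcw = sum(AFF[a] for a in bc_only)
    bocw = sum(AFF[a] for a in both)
    return tcw * bcw - bocw ** 2


# For each candidate split: (TC-only apps, BC-only apps, apps touching both).
splits = {
    1: (["AP1", "AP2", "AP4"], [], ["AP3"]),
    2: (["AP1", "AP2"], ["AP3"], ["AP4"]),
    3: (["AP2"], ["AP3", "AP4"], ["AP1"]),
    4: (["AP4"], ["AP2"], ["AP1", "AP3"]),
}
z = {case: z_value(*groups) for case, groups in splits.items()}
best = max(z, key=z.get)
print(z, best)
```

Z is negative whenever one side has no exclusive applications or the shared accesses dominate, which matches the remark that a good split keeps applications on one fragment each.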

VF Algorithm
Two problems :
Cluster forming in the middle of the CA matrix
Shift a row up and a column left and apply the algorithm to find the best partitioning point
Do this for all possible shifts
Cost $O(m^2)$

More than two clusters
m-way partitioning
try 1, 2, ..., m−1 split points along the diagonal and try to find the best point for each of these
Cost $O(2^m)$

VF Correctness
A relation R, defined over attribute set A and key K, generates the vertical partitioning FR = {R1, R2, ..., Rr}.
Completeness
The following should be true for A:
$A = \bigcup A_{R_i}$

Reconstruction
Reconstruction can be achieved by
$R = \bowtie_K R_i, \quad \forall R_i \in F_R$

Disjointness
TIDs are not considered to be overlapping since they are maintained by the system
Duplicated keys are not considered to be overlapping
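These conditions can be checked on the vertical fragmentation of PROJ into PROJ1(PNO, BUDGET) and PROJ2(PNO, PNAME, LOC). The sketch below assumes tuples are Python dicts and that PNO is the key K; `project` and `join_on_key` are illustrative helpers, not operators defined in the slides.

```python
PROJ = [
    {"PNO": "P1", "PNAME": "Instrumentation", "BUDGET": 150000, "LOC": "Montreal"},
    {"PNO": "P2", "PNAME": "Database Develop.", "BUDGET": 135000, "LOC": "New York"},
    {"PNO": "P3", "PNAME": "CAD/CAM", "BUDGET": 250000, "LOC": "New York"},
    {"PNO": "P4", "PNAME": "Maintenance", "BUDGET": 310000, "LOC": "Paris"},
    {"PNO": "P5", "PNAME": "CAD/CAM", "BUDGET": 500000, "LOC": "Boston"},
]


def project(relation, attrs):
    """Projection: keep only the listed attributes of every tuple."""
    return [{a: t[a] for a in attrs} for t in relation]


def join_on_key(r1, r2, key):
    """Join two fragments on the shared key attribute."""
    index = {t[key]: t for t in r2}
    return [{**t, **index[t[key]]} for t in r1 if t[key] in index]


A1, A2 = ["PNO", "BUDGET"], ["PNO", "PNAME", "LOC"]
PROJ1, PROJ2 = project(PROJ, A1), project(PROJ, A2)

# Completeness: the fragment attribute sets together cover all attributes.
complete = set(A1) | set(A2) == set(PROJ[0])
# Reconstruction: joining the fragments on the key gives back PROJ.
rebuilt = join_on_key(PROJ1, PROJ2, "PNO")
reconstructed = rebuilt == PROJ
# Disjointness: only the key is shared between the fragments.
overlap = set(A1) & set(A2)
print(complete, reconstructed, overlap)
```

The only overlapping attribute is the key PNO, which the rule above explicitly does not count as a violation of disjointness.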

Hybrid Fragmentation
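[Figure: fragmentation tree for relation R. Horizontal fragmentation (HF) is applied to R, and the resulting fragments are fragmented further into R11, R12, R21, R22, and R23.]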

Fragment Allocation
Problem Statement

Given
F = {F1, F2, ..., Fn} fragments
S = {S1, S2, ..., Sm} network sites
Q = {q1, q2, ..., qq} applications

Find the "optimal" distribution of F to S.

Optimality

Minimal cost
Communication + storage + processing (read & update)
Cost in terms of time (usually)

Performance

Response time and/or throughput

Constraints

Per site constraints (storage & processing)

Information Requirements
Database information
selectivity of fragments
size of a fragment

Application information
access types and numbers
access localities

Communication network information
bandwidth
latency
communication overhead

Computer system information
unit cost of storing data at a site
unit cost of processing at a site

Allocation
File Allocation (FAP) vs Database Allocation (DAP):
Fragments are not individual files
relationships have to be maintained

Access to databases is more complicated
remote file access model not applicable
relationship between allocation and query processing

Cost of integrity enforcement should be considered
Cost of concurrency control should be considered

Allocation Information
Requirements
Database Information
selectivity of fragments
size of a fragment

Application Information
number of read accesses of a query to a fragment
number of update accesses of query to a fragment
A matrix indicating which queries update which fragments
A similar matrix for retrievals
originating site of each query

Site Information
unit cost of storing data at a site
unit cost of processing at a site

Network Information
communication cost/frame between two sites
frame size

Allocation Model
General Form
min(Total Cost)
subject to
response time constraint
storage constraint
processing constraint
Decision Variable
$x_{ij} = \begin{cases} 1 & \text{if fragment } F_i \text{ is stored at site } S_j \\ 0 & \text{otherwise} \end{cases}$

Allocation Model
Total Cost
$\sum_{\text{all queries}} \text{query processing cost} + \sum_{\text{all sites}} \sum_{\text{all fragments}} \text{cost of storing a fragment at a site}$

Storage Cost (of fragment Fj at Sk)
(unit storage cost at Sk) × (size of Fj) × xjk

Query Processing Cost (for one query)
processing component + transmission component

Allocation Model
Query Processing Cost

Processing component
access cost + integrity enforcement cost + concurrency control cost

Access cost
$\sum_{\text{all sites}} \sum_{\text{all fragments}} (\text{no. of update accesses} + \text{no. of read accesses}) \times x_{ij} \times \text{local processing cost at a site}$

Integrity enforcement and concurrency control costs
Can be similarly calculated

Allocation Model
Query Processing Cost

Transmission component
cost of processing updates + cost of processing retrievals

Cost of updates
$\sum_{\text{all sites}} \sum_{\text{all fragments}} \text{update message cost} + \sum_{\text{all sites}} \sum_{\text{all fragments}} \text{acknowledgment cost}$

Retrieval Cost
$\sum_{\text{all fragments}} \min_{\text{all sites}} (\text{cost of retrieval command} + \text{cost of sending back the result})$

Allocation Model
Constraints
Response Time
execution time of query ≤ max. allowable response time for that query

Storage Constraint (for a site)
$\sum_{\text{all fragments}} \text{storage requirement of a fragment at that site} \le \text{storage capacity at that site}$

Processing constraint (for a site)
$\sum_{\text{all queries}} \text{processing load of a query at that site} \le \text{processing capacity of that site}$
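For a very small instance the model can be solved by exhaustive search. The sketch below is a simplification rather than the full model: it assumes a non-replicated allocation (each fragment at exactly one site), includes only the storage cost and the access cost terms, enforces only the storage constraint, and ignores transmission costs and response-time limits. Every number in it is hypothetical.

```python
from itertools import product

# Hypothetical instance: 2 fragments (F1, F2) and 2 sites (S1, S2).
sizes = [10, 20]           # size of each fragment
unit_storage = [1, 2]      # unit cost of storing data at each site
unit_processing = [3, 1]   # local processing cost per access at each site
accesses = [5, 2]          # read + update accesses to each fragment
capacity = [25, 25]        # storage capacity of each site


def total_cost(assign):
    """Storage cost plus access cost; assign[f] is the site index holding fragment f."""
    storage = sum(unit_storage[s] * sizes[f] for f, s in enumerate(assign))
    access = sum(accesses[f] * unit_processing[s] for f, s in enumerate(assign))
    return storage + access


def feasible(assign):
    """Storage constraint: fragments placed at a site must fit its capacity."""
    used = [0] * len(capacity)
    for f, s in enumerate(assign):
        used[s] += sizes[f]
    return all(u <= c for u, c in zip(used, capacity))


# Enumerate every assignment of fragments to sites and keep the feasible ones.
candidates = [a for a in product(range(len(capacity)), repeat=len(sizes))
              if feasible(a)]
best = min(candidates, key=total_cost)
print(best, total_cost(best))
```

Exhaustive search like this grows exponentially with the number of fragments, which is why heuristics are used for realistic problem sizes.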

Allocation Model
Solution Methods
FAP is NP-complete
DAP also NP-complete

Heuristics based on
single commodity warehouse location (for FAP)
knapsack problem
branch and bound techniques
network flow

Allocation Model
Attempts to reduce the solution space
assume all candidate partitionings known; select the best partitioning

ignore replication at first
sliding window on fragments
