2 Distribution Design

The document outlines the principles of distributed database systems, covering topics such as design, data control, query processing, transaction processing, and data replication. It emphasizes the importance of fragmentation for performance, detailing horizontal and vertical fragmentation types, along with their correctness criteria. Additionally, it discusses the allocation of data fragments and the implications of replication strategies in distributed environments.

Principles of Distributed Database Systems
M. Tamer Özsu
Patrick Valduriez

© 2020, M.T. Özsu & P. Valduriez


Outline
• Introduction
• Distributed and Parallel Database Design
• Distributed Data Control
• Distributed Query Processing
• Distributed Transaction Processing
• Data Replication
• Database Integration – Multidatabase Systems
• Parallel Database Systems
• Peer-to-Peer Data Management
• Big Data Processing
• NoSQL, NewSQL and Polystores
• Web Data Management
Outline
• Distributed and Parallel Database Design
  o Fragmentation
  o Data distribution
  o Combined approaches


Distribution Design
• Uses some auxiliary information as input
• The design of the global conceptual schema (GCS) is followed by two tasks: partitioning (fragmentation) and allocation

Fig. 2.1 Distribution design process




Fragmentation
The main reasons and objectives for fragmentation differ slightly between distributed and parallel DBMSs.
• Distributed DBMSs
  o Main reason: data locality. Queries should access data at a single site as far as possible, which reduces costly remote data access.
  o Second reason: fragmentation enables
    - interquery parallelism
    - intraquery parallelism
• Parallel DBMSs
  o The main concern is load balancing: when nodes do uneven amounts of work, the latency of queries and transactions increases.


Fragmentation
Fragmentation-related difficulties:
• First problem: fragmentation is important for system performance, but it also raises difficulties in distributed DBMSs.
  => Queries and transactions that cannot be localized incur a performance penalty, due to the need to perform distributed joins and the cost of distributed transaction commitment (COMMIT).
• Second problem: semantic data control, specifically integrity checking.
  Attributes participating in a constraint may be decomposed into different fragments that are allocated to different sites.
  => Integrity checking itself then involves distributed execution, which is costly.


Design Problem

The challenge is to partition and allocate the data in such a way that
• most user queries and transactions are local to one site
• distributed queries and transactions are minimized

The unit of distribution/allocation is a fragment.


Example Database



Fragmentation Alternatives – Horizontal

PROJ1: projects with budgets less than $200,000
PROJ2: projects with budgets greater than or equal to $200,000


Fragmentation Alternatives – Vertical

PROJ1: information about project budgets
PROJ2: information about project names and locations


Data Fragmentation – Fragmentation Types
Relational tables can be partitioned either horizontally or vertically.
• Horizontal fragmentation (HF): based on the select operator, so selection predicates determine the fragmentation
  o More prevalent in most systems, in particular in parallel DBMSs
    => the reason is the intraquery parallelism that most recent big data platforms advocate
  o Primary Horizontal Fragmentation (PHF)
  o Derived Horizontal Fragmentation (DHF)
• Vertical fragmentation (VF): based on the project operator
  o Used in column-store parallel DBMSs for analytical applications, which typically require fast access to a few attributes
• Hybrid fragmentation


Correctness of Fragmentation
• Completeness
  o Decomposition of relation R into fragments R1, R2, ..., Rn is complete if and only if each data item in R can also be found in some Ri
• Reconstruction
  o If relation R is decomposed into fragments R1, R2, ..., Rn, then there should exist some relational operator ∇ such that
    $R = \nabla_{1 \le i \le n} R_i$
• Disjointness
  o If relation R is decomposed into fragments R1, R2, ..., Rn, and data item di is in Rj, then di should not be in any other fragment Rk (k ≠ j).
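
The three rules can be checked mechanically for a horizontal fragmentation. The following is a minimal sketch, assuming a handful of illustrative PROJ rows (the values are hypothetical) and the budget-based fragments PROJ1 and PROJ2 from the horizontal example; the helper names are not from the slides.

```python
# Illustrative PROJ rows (hypothetical values): (PNO, PNAME, BUDGET, LOC)
PROJ = [
    ("P1", "Instrumentation", 150000, "Montreal"),
    ("P2", "Database Develop.", 135000, "New York"),
    ("P3", "CAD/CAM", 250000, "New York"),
    ("P4", "Maintenance", 310000, "Paris"),
]

# Horizontal fragments defined by selection predicates on BUDGET.
PROJ1 = [t for t in PROJ if t[2] < 200000]
PROJ2 = [t for t in PROJ if t[2] >= 200000]
fragments = [PROJ1, PROJ2]


def is_complete(relation, frags):
    """Every tuple of the relation appears in at least one fragment."""
    return all(any(t in f for f in frags) for t in relation)


def is_reconstructible(relation, frags):
    """For horizontal fragments the reconstruction operator is union."""
    return set().union(*map(set, frags)) == set(relation)


def is_disjoint(frags):
    """No tuple appears in more than one fragment."""
    seen = set()
    for f in frags:
        for t in f:
            if t in seen:
                return False
            seen.add(t)
    return True


print(is_complete(PROJ, fragments),
      is_reconstructible(PROJ, fragments),
      is_disjoint(fragments))
```

Because the two predicates (BUDGET < 200000 and BUDGET ≥ 200000) are mutually exclusive and cover every value, all three checks succeed for this data.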



Allocation Alternatives

• Non-replicated
  o partitioned: each fragment resides at only one site
• Replicated
  o fully replicated: each fragment at each site
  o partially replicated: each fragment at some of the sites
• Rule of thumb:


Comparison of Replication Alternatives



PHF – Information Requirements
• Database information
  o relationship, e.g., source(L1) = PAY and target(L1) = EMP
  o cardinality of each relation: card(R)
• Workload information: queries that are run on the database


PHF – Information Requirements
• Application information
  o simple predicates: Given R[A1, A2, …, An], a simple predicate pj is
      pj : Ai θ Value
    where θ ∈ {=, <, ≤, >, ≥, ≠}, Value ∈ Di, and Di is the domain of Ai.
    For relation R we define Pr = {p1, p2, …, pm}
    Example:
      PNAME = "Maintenance"
      BUDGET ≤ 200000
  o minterm predicates: Given R and Pr = {p1, p2, …, pm}, define M = {m1, m2, …, mz} as
      $M = \{\, m_i \mid m_i = \bigwedge_{p_j \in Pr} p_j^* \,\},\quad 1 \le j \le m,\ 1 \le i \le z$
    where pj* = pj or pj* = ¬(pj).
PHF – Information Requirements

Example
m1: PNAME="Maintenance" ∧ BUDGET≤200000

m2: NOT(PNAME="Maintenance") ∧ BUDGET≤200000

m3: PNAME="Maintenance" ∧ NOT(BUDGET≤200000)

m4: NOT(PNAME="Maintenance") ∧ NOT(BUDGET≤200000)
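
The four minterms above are simply every combination of each simple predicate in its natural or negated form. A minimal sketch of that enumeration, with predicates kept as plain strings (the function name `minterms` and the ASCII operators are choices made for this illustration):

```python
from itertools import product

# The two simple predicates of the example, written as plain strings.
simple_predicates = ['PNAME="Maintenance"', "BUDGET<=200000"]


def minterms(preds):
    """Return all 2^m conjunctions of the predicates or their negations."""
    result = []
    for signs in product([True, False], repeat=len(preds)):
        literals = [p if keep else f"NOT({p})" for p, keep in zip(preds, signs)]
        result.append(" AND ".join(literals))
    return result


for i, m in enumerate(minterms(simple_predicates), start=1):
    print(f"m{i}: {m}")
```

The enumeration order produced here differs from the slide numbering (the second and third minterms are swapped), but the set of minterms is the same.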



PHF – Information Requirements

• Application information
  o Quantitative information about the workload
    - minterm selectivities: sel(mi)
      The number of tuples of the relation that would be accessed by a user query specified according to a given minterm predicate mi.
    - access frequencies: acc(qi)
      The frequency with which a user application qi accesses data.
      Access frequency for a minterm predicate can also be defined.


Primary Horizontal Fragmentation
Relations PAY and PROJ are subject to primary horizontal fragmentation.
Definition:
$R_j = \sigma_{F_j}(R),\quad 1 \le j \le w$
where Fj is a selection formula, which is (preferably) a minterm predicate.
Therefore, a horizontal fragment Ri of relation R consists of all the tuples of R that satisfy a minterm predicate mi.

Given a set of minterm predicates M, there are as many horizontal fragments of relation R as there are minterm predicates.
The set of horizontal fragments is also referred to as minterm fragments.


PHF – Algorithm

Given: a relation R and the set of simple predicates Pr
Output: the set of fragments of R = {R1, R2, …, Rw} which obey the fragmentation rules

Preliminaries:
• Pr should be complete
• Pr should be minimal


Completeness of Simple Predicates

• A set of simple predicates Pr is said to be complete if and only if any two tuples belonging to the same minterm fragment defined on Pr have the same probability of being accessed by any application.

• Example:
  o Assume PROJ[PNO, PNAME, BUDGET, LOC] has two applications defined on it.
    - Find the budgets of projects at each location. (1)
    - Find projects with budgets less than $200000. (2)


Completeness of Simple Predicates

According to (1),
Pr = {LOC=“Montreal”, LOC=“New York”, LOC=“Paris”}
which is not complete with respect to (2).

Modify Pr to
Pr = {LOC=“Montreal”, LOC=“New York”, LOC=“Paris”, BUDGET≤200000, BUDGET>200000}
which is complete.


Minimality of Simple Predicates

• If a predicate influences how fragmentation is performed (i.e., causes a fragment f to be further fragmented into, say, fi and fj), then there should be at least one application that accesses fi and fj differently.
• In other words, the simple predicate should be relevant in determining a fragmentation.
• If all the predicates of a set Pr are relevant, then Pr is minimal.


Minimality of Simple Predicates

Example:
Pr = {LOC=“Montreal”, LOC=“New York”, LOC=“Paris”, BUDGET≤200000, BUDGET>200000}
is minimal (in addition to being complete). However, if we add
PNAME = “Instrumentation”
then Pr is not minimal.


COM_MIN Algorithm

Given: a relation R and a set of simple predicates Pr
Output: a complete and minimal set of simple predicates Pr' for Pr

Rule 1: a relation or fragment is partitioned into at least two parts which are accessed differently by at least one application.


COM_MIN Algorithm



PHORIZONTAL Algorithm

Makes use of COM_MIN to perform fragmentation.

Input: a relation R and a set of simple predicates Pr
Output: a set of minterm predicates M according to which relation R is to be fragmented

• Pr' ← COM_MIN(R, Pr)
• determine the set M of minterm predicates
• determine the set I of implications among pi ∈ Pr'
• eliminate the contradictory minterms from M
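
The last two steps can be illustrated with the PROJ predicates on LOC and BUDGET. The sketch below is not the algorithm from the slides: instead of building the implication set I explicitly, it assumes a small finite sample of the attribute domains and keeps a minterm only if some sample point satisfies it, which has the same effect of discarding contradictory minterms. COM_MIN is not implemented, and the predicate names and sample values are assumptions.

```python
from itertools import product

# Simple predicates on PROJ, written as functions of a tuple (a dict).
preds = {
    "p1": lambda t: t["LOC"] == "Montreal",
    "p2": lambda t: t["LOC"] == "New York",
    "p3": lambda t: t["LOC"] == "Paris",
    "p4": lambda t: t["BUDGET"] <= 200000,
    "p5": lambda t: t["BUDGET"] > 200000,
}

# Assumed sample of the domains: three locations, one budget value on
# each side of the 200000 threshold.
domain = [
    {"LOC": loc, "BUDGET": budget}
    for loc in ("Montreal", "New York", "Paris")
    for budget in (200000, 200001)
]


def surviving_minterms(preds, domain):
    """Enumerate all minterms and drop those no sample point can satisfy."""
    names = list(preds)
    kept = []
    for signs in product([True, False], repeat=len(names)):
        satisfiable = any(
            all(preds[n](t) == s for n, s in zip(names, signs)) for t in domain
        )
        if satisfiable:
            kept.append(dict(zip(names, signs)))
    return kept


M = surviving_minterms(preds, domain)
print(len(M), "of", 2 ** len(preds), "minterms are not contradictory")
```

Of the 32 candidate minterms only 6 survive, one per combination of location and budget range.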



PHF – Example

• Two candidate relations: PAY and PROJ.
• Fragmentation of relation PAY
  o Application: check the salary info and determine raise.
  o Employee records kept at two sites => application run at two sites
  o Simple predicates
      p1 : SAL ≤ 30000
      p2 : SAL > 30000
    Pr = {p1, p2}, which is complete and minimal, so Pr' = Pr
  o Minterm predicates
      m1 : (SAL ≤ 30000)
      m2 : NOT(SAL ≤ 30000) = (SAL > 30000)


PHF – Example



PHF – Example 2.10
• Fragmentation of relation PROJ
  o Applications:
    (1) Find the name and budget of projects given their number
        - issued at three sites
    (2) Access project information according to budget
        - one site accesses ≤200000, the other accesses >200000
  o Simple predicates
    - For application (1)
        p1 : LOC = “Montreal”
        p2 : LOC = “New York”
        p3 : LOC = “Paris”
    - For application (2)
        p4 : BUDGET ≤ 200000
        p5 : BUDGET > 200000
  o Pr = Pr' = {p1,p2,p4}


PHF – Example

• Fragmentation of relation PROJ continued
  o Minterm fragments left after elimination
      m1 : (LOC = “Montreal”) ∧ (BUDGET ≤ 200000)
      m2 : (LOC = “Montreal”) ∧ (BUDGET > 200000)
      m3 : (LOC = “New York”) ∧ (BUDGET ≤ 200000)
      m4 : (LOC = “New York”) ∧ (BUDGET > 200000)
      m5 : (LOC = “Paris”) ∧ (BUDGET ≤ 200000)
      m6 : (LOC = “Paris”) ∧ (BUDGET > 200000)


PHF – Example 2.10



PHF – Correctness

• Completeness
  o Since Pr' is complete and minimal, the selection predicates are complete
• Reconstruction
  o If relation R is fragmented into FR = {R1, R2, …, Rr}, then
    $R = \bigcup_{\forall R_i \in F_R} R_i$
• Disjointness
  o Minterm predicates that form the basis of fragmentation should be mutually exclusive.


Derived Horizontal Fragmentation

• Defined on a member relation of a link according to a selection operation specified on its owner.
  o Each link is an equijoin.
  o Equijoin can be implemented by means of semijoins.


DHF – Definition

Given a link L where owner(L) = S and member(L) = R, the derived horizontal fragments of R are defined as
$R_i = R \ltimes S_i,\quad 1 \le i \le w$
where w is the maximum number of fragments that will be defined on R and
$S_i = \sigma_{F_i}(S)$
where Fi is the formula according to which the primary horizontal fragment Si is defined.


DHF – Example 2.11

Given link L1 where owner(L1) = PAY and member(L1) = EMP

Semijoin:
EMP1 = EMP ⋉ PAY1
EMP2 = EMP ⋉ PAY2
where
$PAY_1 = \sigma_{SAL \le 30000}(PAY)$
$PAY_2 = \sigma_{SAL > 30000}(PAY)$

The fragmentation algorithm, then, is quite trivial.
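
To see why the algorithm is trivial, the semijoins can be written out directly. A minimal sketch, assuming PAY(TITLE, SAL) and EMP(ENO, ENAME, TITLE) with TITLE as the join attribute; the rows are hypothetical and `semijoin` is a helper defined only for this illustration.

```python
# Hypothetical rows: PAY(TITLE, SAL) and EMP(ENO, ENAME, TITLE).
PAY = [
    ("Elect. Eng.", 40000),
    ("Syst. Anal.", 34000),
    ("Mech. Eng.", 27000),
    ("Programmer", 24000),
]
EMP = [
    ("E1", "J. Doe", "Elect. Eng."),
    ("E2", "M. Smith", "Syst. Anal."),
    ("E3", "A. Lee", "Mech. Eng."),
    ("E4", "J. Miller", "Programmer"),
]

# Primary horizontal fragments of the owner relation PAY.
PAY1 = [p for p in PAY if p[1] <= 30000]
PAY2 = [p for p in PAY if p[1] > 30000]


def semijoin(member, owner_fragment, member_key, owner_key):
    """Keep the member tuples whose join attribute matches some owner tuple."""
    keys = {o[owner_key] for o in owner_fragment}
    return [m for m in member if m[member_key] in keys]


# Derived horizontal fragments of the member relation EMP.
EMP1 = semijoin(EMP, PAY1, member_key=2, owner_key=0)
EMP2 = semijoin(EMP, PAY2, member_key=2, owner_key=0)

print([e[0] for e in EMP1], [e[0] for e in EMP2])
```

Each employee lands in the fragment of the PAY fragment that holds their title, so EMP1 and EMP2 inherit the partitioning of PAY.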



Derived Horizontal Fragmentation
There is more than one possible derived horizontal
fragmentation of R.



Derived Horizontal Fragmentation
The choice of candidate fragmentation is based on two criteria:
• The fragmentation with better join characteristics (not that straightforward)
  (1) performing the join on smaller relations
  (2) potentially performing joins in parallel
• The fragmentation used in more queries (quite straightforward)
  => consider the frequency with which the data is accessed by the workload


DHF – Example 2.12
Fragmentation of relation PROJ in Example 2.10
Fragmentation of relation EMP in Example 2.11

• Fragmentation of relation ASG
  o Applications:
    - Find the names of engineers; those who work on local projects are accessed with higher probability than those who work on projects at other locations. Issued at all sites.
      PAY → EMP → ASG
    - At each employee administrative site, access the responsibilities on the projects and how long the employees will work on those projects. (ASG)


DHF – Example 2.12
The derived fragmentation of ASG according to {PROJ1, PROJ3, PROJ4, PROJ6}.
QUERY 1:
ASG1 = ASG ⋉ PROJ1
ASG2 = ASG ⋉ PROJ3
ASG3 = ASG ⋉ PROJ4
ASG4 = ASG ⋉ PROJ6

QUERY 2:
ASG1 = ASG ⋉ EMP1
ASG2 = ASG ⋉ EMP2


DHF – Example 2.12

Derived fragmentation of ASG with respect to PROJ


DHF – Example 2.12

Derived fragmentation of ASG with respect to EMP


DHF – Correctness

• Completeness
  o Referential integrity
  o Let R be the member relation of a link whose owner is relation S, which is fragmented as FS = {S1, S2, ..., Sn}. Furthermore, let A be the join attribute between R and S. Then, for each tuple t of R, there should be a tuple t' of S such that
      t[A] = t'[A]
• Reconstruction
  o Same as primary horizontal fragmentation.
• Disjointness
  o Holds when the join graphs between the owner and the member fragments are simple.


Vertical Fragmentation
• More difficult than horizontal fragmentation, because more alternatives exist.
  => it is futile to attempt to obtain optimal solutions to the vertical partitioning problem
• Two types of heuristic approaches
  o Grouping
    - starts by assigning each attribute to one fragment and, at each step, joins some of the fragments until some criteria are satisfied
      => generates overlapping fragments
  o Splitting
    - starts with a relation and decides on beneficial partitionings based on the access behavior of applications to the attributes
      => generates nonoverlapping fragments (disjointness)


VF – Information Requirements
• Application information
  o Attribute affinities
    - a measure that indicates how closely related the attributes are
    - obtained from more primitive usage data
  o Attribute usage values
    - Given a set of queries Q = {q1, q2, …, qq} that will run on the relation R[A1, A2, …, An],
      $use(q_i, A_j) = \begin{cases} 1 & \text{if attribute } A_j \text{ is referenced by query } q_i \\ 0 & \text{otherwise} \end{cases}$
      use(qi, •) can be defined accordingly


VF – Definition of use(qi,Aj)
Consider the following 4 queries for relation PROJ

q1: SELECT BUDGET
    FROM PROJ
    WHERE PNO=Value

q2: SELECT PNAME, BUDGET
    FROM PROJ

q3: SELECT PNAME
    FROM PROJ
    WHERE LOC=Value

q4: SELECT SUM(BUDGET)
    FROM PROJ
    WHERE LOC=Value

Fig. 2.12 Example attribute usage matrix
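
The usage matrix follows directly from which attributes each query mentions in its SELECT and WHERE clauses. A minimal sketch that derives it, assuming the attribute order A1 = PNO, A2 = PNAME, A3 = BUDGET, A4 = LOC; the dictionary `QUERY_ATTRS` is a hand-written reading of the four queries, not parsed SQL.

```python
# Attribute order assumed here: A1=PNO, A2=PNAME, A3=BUDGET, A4=LOC.
ATTRS = ["PNO", "PNAME", "BUDGET", "LOC"]

# Attributes referenced by each query (SELECT list plus WHERE clause).
QUERY_ATTRS = {
    "q1": {"BUDGET", "PNO"},
    "q2": {"PNAME", "BUDGET"},
    "q3": {"PNAME", "LOC"},
    "q4": {"BUDGET", "LOC"},
}


def use(query, attr):
    """use(qi, Aj) = 1 if the query references the attribute, else 0."""
    return 1 if attr in QUERY_ATTRS[query] else 0


usage = [[use(q, a) for a in ATTRS] for q in QUERY_ATTRS]

print("    " + " ".join(f"{a:>6}" for a in ATTRS))
for q, row in zip(QUERY_ATTRS, usage):
    print(f"{q:<4}" + " ".join(f"{v:>6}" for v in row))
```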



VF – Affinity Measure aff(Ai,Aj)
Attribute usage values are not sufficiently general to form the basis of attribute splitting and fragmentation.

The attribute affinity measure between two attributes Ai and Aj of a relation R[A1, A2, …, An] with respect to the set of applications Q = {q1, q2, …, qq} is defined as follows:

$aff(A_i, A_j) = \sum_{k \mid use(q_k, A_i) = 1 \wedge use(q_k, A_j) = 1} \; \sum_{\forall S_l} ref_l(q_k) \, acc_l(q_k)$

$ref_l(q_k)$: the number of accesses to attributes (Ai, Aj) for each execution of application $q_k$ at site $S_l$
$acc_l(q_k)$: the application access frequency measure previously defined and modified to include frequencies at different sites


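VF – Calculation of aff(Ai, Aj)

Assume each query in the previous example accesses the attributes once during each execution, i.e., $ref_l(q_k) = 1$.
Also assume the access frequencies

        S1    S2    S3
q1      15    20    10
q2       5     0     0
q3      25    25    25
q4       3     0     0

Then, since q1 is the only query that accesses both A1 and A3,
$aff(A_1, A_3) = \sum_{k=1}^{1} \sum_{l=1}^{3} acc_l(q_k) = acc_1(q_1) + acc_2(q_1) + acc_3(q_1) = 15 + 20 + 10 = 45$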
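
The same calculation extends to every attribute pair, which yields the attribute affinity matrix AA. A minimal sketch, assuming the usage values implied by the four example queries (attribute order PNO, PNAME, BUDGET, LOC) and one access per execution; the variable names are illustrative.

```python
# use(qk, Aj) for q1..q4 over (PNO, PNAME, BUDGET, LOC), read off the queries.
USE = [
    [1, 0, 1, 0],  # q1
    [0, 1, 1, 0],  # q2
    [0, 1, 0, 1],  # q3
    [0, 0, 1, 1],  # q4
]

# acc_l(qk): access frequency of each query at sites S1, S2, S3.
ACC = [
    [15, 20, 10],  # q1
    [5, 0, 0],     # q2
    [25, 25, 25],  # q3
    [3, 0, 0],     # q4
]

REF = 1  # each query accesses the attributes once per execution


def aff(i, j):
    """Sum ref * acc over all sites for every query that uses both attributes."""
    return sum(
        REF * sum(ACC[k])
        for k in range(len(USE))
        if USE[k][i] == 1 and USE[k][j] == 1
    )


n = len(USE[0])
AA = [[aff(i, j) for j in range(n)] for i in range(n)]

for row in AA:
    print(row)
```

The entry for (A1, A3) reproduces the value 45 computed above; the diagonal entries give the total access frequency of the queries that use each attribute.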



VF – Clustering Algorithm

The bond energy algorithm (BEA) finds a way of grouping the attributes of a relation based on the attribute affinity values in AA.
• Takes as input the attribute affinity matrix for relation R(A1, ..., An)
• Permutes its rows and columns, and generates a clustered affinity matrix (CA)


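VF – Clustering Algorithm
The permutation is done in such a way as to maximize the following global affinity measure (AM):

$AM = \sum_{i=1}^{n} \sum_{j=1}^{n} aff(A_i, A_j) \, [\, aff(A_i, A_{j-1}) + aff(A_i, A_{j+1}) + aff(A_{i-1}, A_j) + aff(A_{i+1}, A_j) \,]$

where

$aff(A_0, A_j) = aff(A_i, A_0) = aff(A_{n+1}, A_j) = aff(A_i, A_{n+1}) = 0$

This results in the grouping of large values with large ones, and small values with small ones.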


Bond Energy Algorithm



Bond Energy Algorithm

Input: the AA matrix
Output: the clustered affinity matrix CA, which is a permutation of AA
• Initialization: place and fix one of the columns of AA in CA.
• Iteration: place the remaining n-i columns in the remaining i+1 positions in the CA matrix. For each column, choose the placement that makes the most contribution to the global affinity measure.
• Row order: order the rows according to the column ordering.


Bond Energy Algorithm

“Best” placement? Define the contribution of a placement:

$cont(A_i, A_k, A_j) = 2\,bond(A_i, A_k) + 2\,bond(A_k, A_j) - 2\,bond(A_i, A_j)$

where

$bond(A_x, A_y) = \sum_{z=1}^{n} aff(A_z, A_x) \, aff(A_z, A_y)$


BEA – Example
Consider the following AA matrix and the corresponding CA matrix where
PNO and PNAME have been placed. Place BUDGET:

Ordering (0-3-1) :
cont(A0,BUDGET,PNO) = 2bond(A0, BUDGET)+2bond(BUDGET, PNO)
–2bond(A0 , PNO)
= 8820
Ordering (1-3-2) :
cont(PNO,BUDGET,PNAME) = 10150
Ordering (2-3-4) :
cont (PNAME,BUDGET,LOC) = 1780
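
These contributions can be reproduced with a few lines of code. The sketch below assumes the AA matrix that follows from the access frequencies given earlier (attribute order PNO, PNAME, BUDGET, LOC) and treats an empty boundary position as `None` with bond 0; the `place` helper and the choice to start from the columns PNO and PNAME are assumptions for illustration.

```python
# Affinity matrix over (PNO, PNAME, BUDGET, LOC), derived from the
# access frequencies of the earlier example (an assumption of this sketch).
AA = [
    [45, 0, 45, 0],
    [0, 80, 5, 75],
    [45, 5, 53, 3],
    [0, 75, 3, 78],
]
PNO, PNAME, BUDGET, LOC = 0, 1, 2, 3


def bond(x, y):
    """Sum over z of aff(Az, Ax) * aff(Az, Ay); boundary columns bond to 0."""
    if x is None or y is None:
        return 0
    return sum(row[x] * row[y] for row in AA)


def cont(left, new, right):
    """Net contribution of placing column `new` between `left` and `right`."""
    return 2 * bond(left, new) + 2 * bond(new, right) - 2 * bond(left, right)


def place(order, new):
    """Insert `new` at the position with the largest contribution."""
    best_pos, best_gain = 0, None
    for pos in range(len(order) + 1):
        left = order[pos - 1] if pos > 0 else None
        right = order[pos] if pos < len(order) else None
        gain = cont(left, new, right)
        if best_gain is None or gain > best_gain:
            best_pos, best_gain = pos, gain
    return order[:best_pos] + [new] + order[best_pos:]


print(cont(None, BUDGET, PNO))    # ordering (0-3-1)
print(cont(PNO, BUDGET, PNAME))   # ordering (1-3-2)
print(cont(PNAME, BUDGET, None))  # ordering (2-3-4), LOC not yet placed

order = place([PNO, PNAME], BUDGET)
order = place(order, LOC)
print(order)
```

Under these assumptions the three contributions match the values on the slide, and placing LOC last gives the column order PNO, BUDGET, PNAME, LOC.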



BEA – Example
• Therefore, the CA matrix has the form given by ordering (1-3-2)
• When LOC is placed, the final form of the CA matrix (after row organization) is


VF – Algorithm

How can you divide a set of clustered attributes {A1, A2, …, An} into two (or more) sets {A1, A2, …, Ai} and {Ai+1, …, An} such that there are no (or minimal) applications that access both (or more than one) of the sets?


VF – Algorithm

Let TA = {A1, …, Ai} be the top attribute set and BA = {Ai+1, …, An} the bottom attribute set of the clustered matrix.
Define
TQ = set of applications that access only TA
BQ = set of applications that access only BA
OQ = set of applications that access both TA and BA
and
CTQ = total number of accesses to attributes by applications that access only TA
CBQ = total number of accesses to attributes by applications that access only BA
COQ = total number of accesses to attributes by applications that access both TA and BA
Then find the point along the diagonal that maximizes

$z = CTQ \times CBQ - COQ^2$
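
For a small relation, every split point along the diagonal can simply be tried. A minimal sketch for the PROJ example, assuming the clustered attribute order PNO, BUDGET, PNAME, LOC and per-query total access counts obtained by summing the site frequencies given earlier; both are assumptions of this illustration rather than values stated on this slide.

```python
# Assumed clustered attribute order (from the BEA example).
ORDER = ["PNO", "BUDGET", "PNAME", "LOC"]

# Attributes referenced by each query, and total accesses over all sites.
QUERY_ATTRS = {
    "q1": {"PNO", "BUDGET"},
    "q2": {"PNAME", "BUDGET"},
    "q3": {"PNAME", "LOC"},
    "q4": {"BUDGET", "LOC"},
}
TOTAL_ACC = {"q1": 45, "q2": 5, "q3": 75, "q4": 3}


def z_value(split):
    """z = CTQ * CBQ - COQ^2 for TA = ORDER[:split], BA = ORDER[split:]."""
    ta, ba = set(ORDER[:split]), set(ORDER[split:])
    ctq = sum(TOTAL_ACC[q] for q, a in QUERY_ATTRS.items() if a <= ta)
    cbq = sum(TOTAL_ACC[q] for q, a in QUERY_ATTRS.items() if a <= ba)
    coq = sum(TOTAL_ACC[q] for q, a in QUERY_ATTRS.items() if a & ta and a & ba)
    return ctq * cbq - coq ** 2


best_split = max(range(1, len(ORDER)), key=z_value)

for s in range(1, len(ORDER)):
    print(ORDER[:s], ORDER[s:], z_value(s))
print("best:", ORDER[:best_split], ORDER[best_split:])
```

With these numbers the best split separates {PNO, BUDGET} from {PNAME, LOC}; the other two split points give negative z because most accesses would span both fragments.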



VF – Algorithm


VF – Algorithm
Two problems:
• Cluster forming in the middle of the CA matrix
  o Shift a row up and a column left and apply the algorithm to find the “best” partitioning point
  o Do this for all possible shifts
  o Cost O(m^2)
• More than two clusters
  o m-way partitioning
  o try 1, 2, …, m–1 split points along the diagonal and try to find the best point for each of these
  o Cost O(2^m)


VF – Correctness

A relation R, defined over attribute set A and key K, generates the vertical partitioning FR = {R1, R2, …, Rr}.
• Completeness
  o The following should be true for A:
    $A = \bigcup A_{R_i}$
• Reconstruction
  o Reconstruction can be achieved by
    $R = \bowtie_K R_i,\quad \forall R_i \in F_R$
• Disjointness
  o TIDs are not considered to be overlapping since they are maintained by the system
  o Duplicated keys are not considered to be overlapping
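
These rules can be checked on a small vertical fragmentation of PROJ. A minimal sketch, assuming hypothetical rows, the key PNO repeated in both fragments, and two helper functions (`project`, `join_on_key`) defined only for this illustration.

```python
# Hypothetical PROJ rows; PNO is the key K.
PROJ = [
    {"PNO": "P1", "PNAME": "Instrumentation", "BUDGET": 150000, "LOC": "Montreal"},
    {"PNO": "P2", "PNAME": "Database Develop.", "BUDGET": 135000, "LOC": "New York"},
    {"PNO": "P3", "PNAME": "CAD/CAM", "BUDGET": 250000, "LOC": "New York"},
]
KEY = "PNO"


def project(relation, attrs):
    """Vertical fragment: keep only the listed attributes of every tuple."""
    return [{a: t[a] for a in attrs} for t in relation]


def join_on_key(left, right, key):
    """Reconstruct by joining two fragments on the shared key."""
    index = {t[key]: t for t in right}
    return [{**l, **index[l[key]]} for l in left if l[key] in index]


PROJ1 = project(PROJ, ["PNO", "BUDGET"])
PROJ2 = project(PROJ, ["PNO", "PNAME", "LOC"])

# Completeness: the fragments' attribute sets together cover A.
complete = set(PROJ1[0]) | set(PROJ2[0]) == set(PROJ[0])
# Reconstruction: joining on the key gives back the original relation.
reconstructed = join_on_key(PROJ1, PROJ2, KEY)
# Disjointness: apart from the duplicated key, no attribute is shared.
overlap = (set(PROJ1[0]) & set(PROJ2[0])) - {KEY}

print(complete, reconstructed == PROJ, overlap == set())
```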
Hybrid Fragmentation



Reconstruction of HF





Fragment Allocation
Following fragmentation, the next decision problem is to allocate fragments to the sites of the distributed DBMS.

This can be done either by placing each fragment at a single site or by replicating it on a number of sites.
The reasons for replication are reliability and efficiency of read-only queries.
• multiple copies of a fragment => good chance that some copy of the data will be accessible somewhere even when system failures occur
• read-only queries that access the same data items can be executed in parallel, since copies exist on multiple sites
On the other hand, the execution of update queries causes trouble, because every copy of the data has to be updated.


Fragment Allocation
• Problem statement
  Given
    F = {F1, F2, …, Fn} fragments
    S = {S1, S2, …, Sm} network sites
    Q = {q1, q2, …, qq} applications
  find the "optimal" distribution of F to S.
• Optimality
  o Minimal cost
    - communication + storage + processing (read & update)
    - cost in terms of time (usually)
  o Performance
    - response time and/or throughput (the number of transactions per second)
  o Constraints
    - per-site constraints (storage & processing)


Auxiliary Requirements
• Database information
  o selectivity of fragments (extends the selectivity defined for minterms)
  o size of a fragment
• Application information
  o access types (read or update) and access numbers
  o access localities
• Computer system information
  o unit cost of storing data at a site
  o unit cost of processing at a site
• Communication network information
  o bandwidth
  o latency
  o communication cost per message between sites Si and Sj


Allocation

File Allocation Problem (FAP) vs Database Allocation Problem (DAP):
• Fragments are not individual files
  o relationships have to be maintained
• Access to databases is more complicated
  o remote file access model not applicable
  o relationship between allocation and query processing
• Cost of integrity enforcement should be considered
• Cost of concurrency control should be considered


Allocation Model

General form
  min(Total Cost)
  subject to
    response time constraint
    storage constraint
    processing constraint

Decision variable

$x_{ij} = \begin{cases} 1 & \text{if fragment } F_i \text{ is stored at site } S_j \\ 0 & \text{otherwise} \end{cases}$


Allocation Model

• Total cost

$\sum_{\text{all queries}} \text{query processing cost} + \sum_{\text{all sites}} \sum_{\text{all fragments}} \text{cost of storing a fragment at a site}$

• Storage cost (of fragment Fj at Sk)
  (unit storage cost at Sk) × (size of Fj) × xjk
• Query processing cost (for one query)
  processing component + transmission component
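
The storage term of the total cost is a plain sum over the decision variables. A minimal sketch with made-up numbers, assuming two sites and two fragments; all names and values are hypothetical, and query processing costs are passed in as given numbers rather than modeled.

```python
# Hypothetical inputs for a tiny instance.
unit_storage_cost = {"S1": 2.0, "S2": 1.0}   # cost per unit of data at each site
fragment_size = {"F1": 100, "F2": 50}        # size of each fragment

# Decision variables x[(fragment, site)] = 1 if the fragment is stored there.
x = {
    ("F1", "S1"): 1, ("F1", "S2"): 0,
    ("F2", "S1"): 1, ("F2", "S2"): 1,        # F2 is replicated at both sites
}


def storage_cost(x, unit_cost, size):
    """Sum over all sites and fragments of unit cost * size * x."""
    return sum(
        unit_cost[site] * size[frag] * x[(frag, site)]
        for frag in size
        for site in unit_cost
    )


def total_cost(query_processing_costs, x, unit_cost, size):
    """Total cost = sum of query processing costs + total storage cost."""
    return sum(query_processing_costs) + storage_cost(x, unit_cost, size)


print(storage_cost(x, unit_storage_cost, fragment_size))
print(total_cost([10.0, 4.5], x, unit_storage_cost, fragment_size))
```

Replicating F2 at both sites adds its storage cost twice, which is the trade-off the model weighs against cheaper read-only query processing.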



Allocation Model

• Query processing cost

  Processing component
    access cost + integrity enforcement cost + concurrency control cost
  o Access cost

$\sum_{\text{all sites}} \sum_{\text{all fragments}} (\text{no. of update accesses} + \text{no. of read accesses}) \times x_{ij} \times \text{local processing cost at a site}$

  o Integrity enforcement and concurrency control costs
    - can be similarly calculated


Allocation Model

• Query processing cost

  Transmission component
    cost of processing updates + cost of processing retrievals
  o Cost of updates

$\sum_{\text{all sites}} \sum_{\text{all fragments}} \text{update message cost} + \sum_{\text{all sites}} \sum_{\text{all fragments}} \text{acknowledgment cost}$

  o Retrieval cost

$\sum_{\text{all fragments}} \min_{\text{all sites}} (\text{cost of retrieval command} + \text{cost of sending back the result})$


Allocation Model

• Constraints
  o Response time
    execution time of query ≤ max. allowable response time for that query
  o Storage constraint (for a site)

$\sum_{\text{all fragments}} \text{storage requirement of a fragment at that site} \le \text{storage capacity at that site}$

  o Processing constraint (for a site)

$\sum_{\text{all queries}} \text{processing load of a query at that site} \le \text{processing capacity of that site}$
