UNIT-1-Distributed Database
Introduction
Distributed Database System
▪ A system involving multiple sites connected via a communication
network.
▪ A user at any site can access data stored at any site.
▪ Each site is a database system in its own right: its own local database,
local users, local DBMS, and local DC (data communications) manager.
[Figure: a site (user, communication manager, DBMS, database) attached to the communication network]
The Twelve Objectives
1. Local Autonomy
• all operations at a given site are controlled by that site and should not
depend on other sites.
• local data is locally owned and managed.
• Not wholly achievable => sites should be autonomous to the
maximum extent possible.
2. No Reliance on a Central Site
• all sites must be treated as equals.
• a central site may become a bottleneck.
3. Continuous Operation
• Reliability
• Availability
• Never require the system to be shut down to perform some function,
e.g. adding a new site.
Wei-Pang Yang, Information Management, NDHU 16-5
The Twelve Objectives (cont.)
4. Location Independence ( Location Transparency )
• user should not need to know at which site the data is stored, but should be
able to behave as if the entire database were stored at their own local site.
• a request for some remote data => system should find the data
automatically.
• Advantages
<1> Simplify user programs and activities
<e.g.> SELECT S#
FROM S
AT SITE A
WHERE SNAME = 'John'
<2> allow data to be moved from one site to another at any time without
invalidating any program or activities.
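Both advantages can be illustrated with a minimal sketch, assuming the system keeps a catalog that maps each relation to the site storing it. The names `CATALOG`, `SITES`, `select`, `move`, and `supplier_numbers`, and the sample rows, are hypothetical and are not part of the slides; a real distributed DBMS would do far more work here.

```python
# Hypothetical catalog: relation name -> site that currently stores it.
CATALOG = {"S": "A", "SP": "B"}

# Hypothetical per-site storage: site -> relation -> list of rows.
SITES = {
    "A": {"S": [{"S#": "S1", "SNAME": "John"}, {"S#": "S2", "SNAME": "Mary"}]},
    "B": {"SP": [{"S#": "S1", "P#": "P1"}]},
}


def select(relation, predicate):
    """Return matching rows; the caller never names a site."""
    site = CATALOG[relation]  # the system, not the user, locates the data
    return [row for row in SITES[site][relation] if predicate(row)]


def move(relation, new_site):
    """Relocate a relation and update the catalog; select() callers are unaffected."""
    old_site = CATALOG[relation]
    SITES.setdefault(new_site, {})[relation] = SITES[old_site].pop(relation)
    CATALOG[relation] = new_site


def supplier_numbers(name):
    """User program: the same query text works wherever S is stored."""
    return [row["S#"] for row in select("S", lambda r: r["SNAME"] == name)]


before = supplier_numbers("John")
move("S", "B")  # data moves from site A to site B
after = supplier_numbers("John")
print(before == after)
```

The user program `supplier_numbers` contains no site name, so it is shorter (advantage <1>) and keeps working after `move` relocates S (advantage <2>).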
[Figure: fragmentation and replication of the EMP relation]

EMP#  DEPT#  SALARY
E1    DX     45K
E2    DY     40K
E3    DZ     50K
E4    DY     63K
E5    DZ     40K

Fragments:
EMP#  DEPT#  SALARY        EMP#  DEPT#  SALARY
E1    DX     45K           E2    DY     40K
E3    DZ     50K           E4    DY     63K
                           E5    DZ     40K

replica of London fragment     replica of New York fragment
EMP#  DEPT#  SALARY            EMP#  DEPT#  SALARY
E2    DY     40K               E1    DX     45K
E4    DY     63K               E3    DZ     50K
E5    DZ     40K
Basic Point: Networks are slow!
[Figure: site A holds S; site B holds SP]
<e.g.>
• Database:
S: 1,000 tuples, at site A
SP: 2,000 tuples, at site B
# of tuples in S where S.S# = SP.S#: 100
length of an S tuple: 100 bits
length of an SP tuple: 100 bits
length of the S# field: 10 bits
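• Regular Join:
<1> Ship S to site B ( 1000 * 100 bits )
<2> Join S and SP at site B
communication time = access delay + data volume / data rate
= 1 + 1000*100/10000 = 11 sec
(the formula implies an access delay of 1 sec per message and a data rate of 10,000 bits/sec)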
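The arithmetic above can be checked with a short sketch. It assumes the cost model the formula implies: a fixed access delay of 1 second per message plus the shipped volume divided by a data rate of 10,000 bits/sec. The function name `communication_time` and its parameter names are illustrative, not taken from the slides.

```python
def communication_time(bits, messages=1, access_delay=1.0, data_rate=10_000):
    """Seconds spent on the network: per-message delay plus transfer time.

    access_delay (sec per message) and data_rate (bits/sec) are assumed
    values inferred from the formula on the slide.
    """
    return messages * access_delay + bits / data_rate


S_TUPLES = 1000      # tuples in S at site A
S_TUPLE_BITS = 100   # length of one S tuple

# Regular join: ship all of S to site B in one message.
regular_join_time = communication_time(S_TUPLES * S_TUPLE_BITS)
print(regular_join_time)  # 11.0
```

Almost all of the 11 seconds is transfer time for S tuples that will never match an SP tuple.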
Query Processing: Semijoin (cont.)
• Semijoin
<1> site B: step 1. Project SP on S# (get SP')
            step 2. Ship SP' to site A
<2> site A: step 3. Join SP' with S (get S')
            step 4. Ship S' to site B
<3> site B: step 5. Join S' with SP
communication time = 1 + 10*2000/10000 + 1 + 100*100/10000
= 1 + 2 + 1 + 1 = 5 sec
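The five steps can be sketched on toy data as follows. This is an illustration only: the sample tuples, the function names `semijoin_strategy` and `semijoin_time`, and the in-memory "shipping" are assumptions. The cost function uses the same implied model as the slide (1 sec access delay per message, 10,000 bits/sec) and, like the slide, counts all 2,000 S# values as an upper bound on the size of SP'.

```python
def semijoin_strategy(s_rows, sp_rows):
    """S rows are (S#, SNAME) at site A; SP rows are (S#, P#) at site B."""
    # Step 1 (site B): project SP on S# to get SP'.
    sp_keys = {s_no for (s_no, _p_no) in sp_rows}
    # Step 2: ship SP' to site A (first message).
    # Step 3 (site A): keep only the S tuples whose S# appears in SP', giving S'.
    s_reduced = [row for row in s_rows if row[0] in sp_keys]
    # Step 4: ship S' to site B (second message).
    # Step 5 (site B): join S' with SP on S#.
    result = [
        (s_no, sname, p_no)
        for (s_no, sname) in s_reduced
        for (sp_s_no, p_no) in sp_rows
        if sp_s_no == s_no
    ]
    return sp_keys, s_reduced, result


def semijoin_time(sp_tuples, key_bits, matching_s_tuples, s_tuple_bits,
                  access_delay=1.0, data_rate=10_000):
    """Communication time for the two shipments (steps 2 and 4)."""
    ship_keys = access_delay + sp_tuples * key_bits / data_rate
    ship_reduced = access_delay + matching_s_tuples * s_tuple_bits / data_rate
    return ship_keys + ship_reduced


# Toy data (hypothetical): S2 supplies nothing, so it is never shipped.
S = [("S1", "John"), ("S2", "Mary"), ("S3", "Ann")]
SP = [("S1", "P1"), ("S1", "P2"), ("S3", "P1")]
keys, reduced, joined = semijoin_strategy(S, SP)

# Slide figures: 2,000 S# values of 10 bits, then 100 matching S tuples of 100 bits.
slide_time = semijoin_time(2000, 10, 100, 100)
print(slide_time)  # 5.0
```

The saving comes from step 3: only the 100 matching S tuples cross the network instead of all 1,000, at the price of one extra message carrying the short S# values.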
[Figure: semijoin between site A and site B. Site A: S (1,000 tuples of 100 bits) and S' (100 tuples of 100 bits). Site B: SP (2,000 tuples of 100 bits) and SP' (at most 2,000 S# values of 10 bits).]