RDBMS - Module5 - Distributed and Parallel DB
7 Databases
single computer. The user has no idea that the data being accessed may be
located at some other site. A distributed database management system is a set
of programs that uses a client-server architecture to process information
requests.
follows:
[Figure: Horizontal fragmentation of the Student relation (Roll No, Name, Branch, Marks).
Student_Frag1 = σ Branch = 'CSE' (Student) holds the CSE tuples (Ashu, Binoy, Naina).
Student_Frag2 = σ Branch = 'IT' (Student) holds the IT tuples (Himanshu, Rashmi).]
Student_Vfrag1
Roll No   Name       Marks   Tuple_Id
1         Ashu       95      1
2         Binoy      84      2
3         Himanshu   79      3
4         Naina      78      4
5         Rashmi     65      5

Student_Vfrag2
Branch   Tuple_Id
CSE      1
CSE      2
IT       3
CSE      4
IT       5

Fig. 7.3: Vertical Fragmentation
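The two kinds of fragmentation shown in the figures can be sketched in a few lines of Python. This is only an illustration, not part of the original text: the helper names (`horizontal_fragment`, `vertical_fragment`, `rejoin`) are hypothetical, and the Student rows are the values as read from the figures. Horizontal fragmentation is treated as a selection on Branch, vertical fragmentation as a projection that carries a Tuple_Id so the fragments can be joined back.

```python
# Student relation, values as read from the figures (illustrative only).
STUDENT = [
    {"roll_no": 1, "name": "Ashu", "branch": "CSE", "marks": 95},
    {"roll_no": 2, "name": "Binoy", "branch": "CSE", "marks": 84},
    {"roll_no": 3, "name": "Himanshu", "branch": "IT", "marks": 79},
    {"roll_no": 4, "name": "Naina", "branch": "CSE", "marks": 78},
    {"roll_no": 5, "name": "Rashmi", "branch": "IT", "marks": 65},
]


def horizontal_fragment(relation, predicate):
    """Selection: keep whole tuples that satisfy the predicate."""
    return [row for row in relation if predicate(row)]


def vertical_fragment(relation, attributes):
    """Projection: keep some attributes plus a tuple id used for rejoining."""
    return [
        {"tuple_id": i, **{a: row[a] for a in attributes}}
        for i, row in enumerate(relation, start=1)
    ]


def rejoin(frag_a, frag_b):
    """Join two vertical fragments on tuple_id and drop the id again."""
    by_id = {row["tuple_id"]: row for row in frag_b}
    joined = []
    for row in frag_a:
        merged = {**row, **by_id[row["tuple_id"]]}
        merged.pop("tuple_id")
        joined.append(merged)
    return joined


# Horizontal fragments: one per branch.
frag1 = horizontal_fragment(STUDENT, lambda r: r["branch"] == "CSE")
frag2 = horizontal_fragment(STUDENT, lambda r: r["branch"] == "IT")

# Vertical fragments: each carries tuple_id so the relation can be rebuilt.
vfrag1 = vertical_fragment(STUDENT, ["roll_no", "name", "marks"])
vfrag2 = vertical_fragment(STUDENT, ["branch"])
```

The union of the horizontal fragments, or the join of the vertical fragments on the tuple id, gives back the original relation, which is the property a fragmentation scheme needs in order to be lossless.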
(c) Mixed Fragmentation: In this type of fragmentation a relation is first
fragmented horizontally or vertically, and each fragment obtained is further
fragmented in the other direction.
RollNo Name
1 Ashu
2 Binoy
4 Naina
Fig. 7.4: Mixed Fragmentation
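As a rough sketch of how the fragment in the figure could be produced, the following Python applies a horizontal step followed by a vertical step. It assumes, since the figure does not state it, that the fragment comes from selecting the CSE tuples and then projecting RollNo and Name. The function name `mixed_fragment` is hypothetical and the rows are the values as read from the earlier figures.

```python
# (roll_no, name, branch, marks), values as read from the figures.
STUDENT_ROWS = [
    (1, "Ashu", "CSE", 95),
    (2, "Binoy", "CSE", 84),
    (3, "Himanshu", "IT", 79),
    (4, "Naina", "CSE", 78),
    (5, "Rashmi", "IT", 65),
]


def mixed_fragment(relation, branch):
    """Horizontal step (select one branch), then vertical step (project)."""
    selected = [row for row in relation if row[2] == branch]
    return [(roll_no, name) for roll_no, name, _branch, _marks in selected]


cse_fragment = mixed_fragment(STUDENT_ROWS, "CSE")
```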
[Figures: distributed database architectures linking Sites 1 to 4 with Databases 1 to 3 or with a Centralised Database]
Fig. 7.7: Truly Distributed Architecture
Distributed and Parallel Databases
Advantages
(a) Data access is fast because processors communicate through writes to
shared memory.
(b) Low communication overhead.
Disadvantages
(a) Cache coherency: If an update is made to shared memory, it must also be
reflected in every local cache that holds the same data.
(b) The architecture is not scalable beyond 32 or 64 processors.
2. Shared Disk Architecture: In this architecture there are multiple processors
and each processor has its own private memory, but they all share
common disks via an interconnection network.
Advantages:
(a) Since each processor has its own memory, the bus is not a bottleneck.
(b) If one processor or memory fails, then another can take over.
(c) Load balancing is easy.
Disadvantages:
(a) Problems of scalability, as with an increase in the number of processors
the number of disk accesses also increases and the interconnection to the
disks becomes a bottleneck.
(b) As processors are added, existing processors slow down because of
increased contention for memory access and network bandwidth.
3. Shared Nothing Architecture: Every processor connected to the
interconnection network has its own memory and disk. All communication is
done through a high-speed communication network.
Advantages:
(a) Better scalability. No sharing of resources minimises contention among
processors.
(b) High speed. Queries are executed at individual nodes, so only queries
requiring access to a non-local disk, and their results, pass through the
network.
(c) Supports a large number of processors.
Disadvantages:
(a) Communication costs are higher.
(b) Difficulty in load balancing.
(c) The cost of non-local disk access is higher than in a shared architecture.
(d) Since there is no sharing of disk and data, if one processor fails
its data becomes inaccessible to the other processors.
4. Hierarchical Architecture
[Figure: nodes made up of processors, memory and disks, joined by an interconnection network]
Fig. 7.11: Hierarchical Architecture of Parallel Database
It is a combination of shared memory, shared disk and shared nothing
architectures. At the top level the system can be seen as a shared nothing
system. Each node, in turn, is a shared memory system. Within each node the
system is a shared disk system.
Advantages:
1. Input/output parallelism: A relation is partitioned and kept on multiple
disks to reduce the retrieval time. Each partition is processed in parallel
and the results are finally combined. Various strategies to partition a
relation are:
(a) Hash partitioning: Every tuple of a relation is hashed on some
partitioning attribute of the relation. If the hash function returns value i,
then this tuple is kept on disk i.
(b) Round robin partitioning: The i-th tuple of the relation is kept on disk
number D_(i mod n), where n is the number of disks. So all tuples are evenly
distributed across the disks.
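Hash and round robin partitioning can be sketched as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: disks are modelled as Python lists, the function names `hash_partition` and `round_robin_partition` are hypothetical, tuples are counted from 0, and Python's built-in `hash` reduced modulo the number of disks stands in for the hash function.

```python
def hash_partition(rows, attr_index, n_disks):
    """Place each tuple on disk hash(partitioning attribute) mod n_disks."""
    disks = [[] for _ in range(n_disks)]
    for row in rows:
        # hash() is stable for small integers; for strings it is randomised
        # per process, so a real system would use a fixed hash function.
        i = hash(row[attr_index]) % n_disks
        disks[i].append(row)
    return disks


def round_robin_partition(rows, n_disks):
    """Place the i-th tuple (counting from 0) on disk i mod n_disks."""
    disks = [[] for _ in range(n_disks)]
    for i, row in enumerate(rows):
        disks[i % n_disks].append(row)
    return disks


# (roll_no, name) tuples partitioned across three disks.
rows = [(1, "Ashu"), (2, "Binoy"), (3, "Himanshu"), (4, "Naina"), (5, "Rashmi")]
by_hash = hash_partition(rows, attr_index=0, n_disks=3)
by_turn = round_robin_partition(rows, n_disks=3)
```

Round robin keeps the disk sizes within one tuple of each other regardless of the data, while hash partitioning sends all tuples with the same attribute value to the same disk, which helps lookups on that attribute.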
each disk. For example range partitioning with three disks numbered