ADS Chapter 7 Distributed Database
Distributed Database System
A distributed database is a database that consists of two or more files located at
different sites, either on the same network or on entirely different networks. Portions of
the database are stored in multiple physical locations, and processing is distributed
among multiple database nodes.
Centralized database vs. distributed database
•Centralized: A type of database in which all data is stored on one central device, such as
a mobile device or a computer. Distributed: A type of database that consists of two or more
database files located at different places over the network.
•Centralized: Managing, modifying and backing up data is easy because all data is stored in
one place. Distributed: Because there are multiple database files, managing the data takes
more time.
•Centralized: Accessing data takes time because multiple users access the same database
files. Distributed: Data access is fast because data files are retrieved from the nearest
database.
•Centralized: If the central server fails, users are not able to access the database.
Distributed: If one database fails, users can still access another database.
•Centralized: Has more data consistency and provides the complete user view. Distributed:
May have data replication, and there can be some data inconsistency.
Features of distributed databases
•Databases in the collection are logically interrelated with each other. Often
they represent a single logical database.
•Data is physically stored across multiple sites. Data in each site can be
managed by a DBMS independent of the other sites.
Distributed database architecture/Type
Distributed databases can be homogeneous or heterogeneous.
In a homogeneous distributed database system, all the physical locations have the same
underlying hardware and run the same operating systems and database applications.
Homogeneous distributed database systems appear to the user as a single system, and
they can be much easier to design and manage.
For a distributed database system to be homogeneous, the data structures at each location
must be either identical or compatible.
The database application used at each location must also be either identical or
compatible.
In a heterogeneous distributed database system, different sites may use different schemas
and software, although a difference in schema can make query and transaction processing
difficult.
Different nodes may have different hardware, software and data structures, or they may be in
locations that are not compatible.
Heterogeneous distributed databases are often difficult to use, making them economically
infeasible for many businesses.
Advantages of distributed databases
There are many advantages to using distributed databases.
More Reliable − When a centralized database fails, the whole system comes to a halt.
In a distributed system, when a component fails, the system continues to function,
possibly at reduced performance. Hence a distributed database management system (DDBMS)
is more reliable.
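The reliability argument above can be illustrated with a minimal sketch, assuming each site keeps a copy of the requested data. The `Site` class, the `SiteDown` error and the `read_with_failover` function are hypothetical names invented for this example and do not belong to any real DBMS.

```python
class SiteDown(Exception):
    """Raised when a site cannot be reached (hypothetical error type)."""


class Site:
    """A toy database node holding a copy of some key/value data."""

    def __init__(self, name, data, up=True):
        self.name = name
        self.data = dict(data)
        self.up = up

    def read(self, key):
        if not self.up:
            raise SiteDown(self.name)
        return self.data[key]


def read_with_failover(sites, key):
    """Try each site in order; return (site name, value) from the first that answers."""
    for site in sites:
        try:
            return site.name, site.read(key)
        except SiteDown:
            continue  # this site failed; try the next one
    raise SiteDown("all sites are down")


sites = [
    Site("site_a", {"emp_1": "Asha"}, up=False),  # simulated failure
    Site("site_b", {"emp_1": "Asha"}),
]
result = read_with_failover(sites, "emp_1")
print(result)
```

With a single centralized site, the same failure would leave no site to fall back on, and the read would fail outright.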
Disadvantages of distributed databases
There are also disadvantages to using distributed databases.
•Large Overhead. Operations that span multiple sites require numerous calculations and,
when database replication is used, constant synchronization, causing a lot of processing
overhead.
•Data Integrity. A possible issue when using database replication is data integrity, which
can be compromised when the same data is updated at multiple sites.
Distributed Data Storage :
There are 2 ways in which data can be stored on different sites. These are:
1. Replication –
In this approach, the entire relation is stored redundantly at 2 or more
sites. If the entire database is available at all sites, it is a fully
redundant (fully replicated) database. If only some of the files are
replicated, it is called partial replication.
Replication is advantageous because it increases the availability of data at different
sites, and query requests can be processed in parallel. However, it has certain
disadvantages as well. Data needs to be constantly updated.
Any change made at one site needs to be recorded at every site where that relation is
stored, or else it may lead to inconsistency.
This is a lot of overhead. Also, concurrency control becomes much more complex, as
concurrent access now needs to be checked across a number of sites.
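The update rule for replication can be sketched as follows. This is a toy model under stated assumptions, not a real replication protocol: each site is a plain dictionary, and `ReplicatedRelation`, `write_all` and `write_one` are hypothetical names chosen for illustration.

```python
class ReplicatedRelation:
    """A relation stored redundantly at several sites (each site is a dict here)."""

    def __init__(self, site_names):
        self.replicas = {name: {} for name in site_names}

    def write_all(self, key, value):
        """Record the change at every site that stores the relation."""
        for store in self.replicas.values():
            store[key] = value

    def write_one(self, site, key, value):
        """Update a single site only; the other copies become stale."""
        self.replicas[site][key] = value

    def read(self, site, key):
        """Any site can answer a read, so reads can be served in parallel."""
        return self.replicas[site].get(key)

    def is_consistent(self):
        """True when every replica holds exactly the same data."""
        copies = list(self.replicas.values())
        return all(copy == copies[0] for copy in copies)


rel = ReplicatedRelation(["site_a", "site_b", "site_c"])
rel.write_all("acct_7", 100)
consistent_after_write_all = rel.is_consistent()

rel.write_one("site_a", "acct_7", 250)  # change not propagated to other sites
consistent_after_write_one = rel.is_consistent()

print(consistent_after_write_all, consistent_after_write_one)
print(rel.read("site_a", "acct_7"), rel.read("site_b", "acct_7"))
```

The loop in `write_all` touches every replica on every update, which is the overhead described above; skipping it, as `write_one` does, leaves the sites disagreeing about the same value.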
2. Fragmentation –
In this approach, the relations are fragmented (i.e., divided into smaller parts), and each
fragment is stored at the site where it is required.
Fragmentation requires that the fragments can later be reconstructed into the original
relation without losing data.
The advantage of fragmentation is that no copies of the data are created, which avoids
inconsistency between copies.
Fragmentation of relations can be done in two ways:
Horizontal fragmentation – Splitting by rows –
The relation is fragmented into groups of tuples so that each tuple is assigned to at least one fragment.
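Horizontal fragmentation and the reconstruction requirement can be sketched together. This is an illustrative example only: the `employees` relation, its `branch` column and the helper names `horizontal_fragment` and `reconstruct` are assumptions made for the sketch, and each tuple here goes to exactly one fragment, which is the simplest way to satisfy "at least one".

```python
def horizontal_fragment(rows, site_of):
    """Split rows into fragments keyed by site; site_of(row) picks the site for a row."""
    fragments = {}
    for row in rows:
        fragments.setdefault(site_of(row), []).append(row)
    return fragments


def reconstruct(fragments):
    """Rebuild the relation as the union of all fragments."""
    merged = []
    for rows in fragments.values():
        merged.extend(rows)
    return merged


# Hypothetical relation: one dict per tuple.
employees = [
    {"id": 1, "name": "Asha", "branch": "Pune"},
    {"id": 2, "name": "Ravi", "branch": "Delhi"},
    {"id": 3, "name": "Meera", "branch": "Pune"},
]

# Store each tuple at the site of its own branch.
fragments = horizontal_fragment(employees, lambda row: row["branch"])
rebuilt = reconstruct(fragments)


def by_id(row):
    return row["id"]


lossless = sorted(rebuilt, key=by_id) == sorted(employees, key=by_id)
print(sorted(fragments), lossless)
```

Because every tuple lands in some fragment, the union of the fragments gives back the original relation, which is the reconstruction prerequisite stated for fragmentation.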