1 Distributed DB
Mysuru Campus
School of Computing
Advanced Databases
Distributed Databases
Dr.Thilagaraj.T
Assistant Professor
Distributed Database System (DDBS)
• To form a distributed database system (DDBS), the files must
be
• structured,
• logically interrelated, and
• physically distributed across multiple sites.
• Distribution of control
• The distribution dimension of the taxonomy deals with data:
the physical distribution of data over multiple sites
Two classes:
• Client/Server distribution
• Peer-to-peer distribution (or full distribution)
Client/Server distribution
• Concentrates data management duties at servers
• Clients focus on providing the application environment
including user interface
• Communication duties shared between clients and servers
• Client/server DBMSs – distributing functionality
• Client/server architecture
Peer-to-peer systems
• No distinction of client machines versus servers
• Each machine has full DBMS functionality and communicates
with the others to execute queries and transactions
• Also called fully distributed systems
Heterogeneity
• Heterogeneity applies to the networks, computer hardware,
operating systems, and implementations by different developers.
• A key component of a heterogeneous distributed
client/server environment is middleware.
• Middleware is a set of services that enables applications and
end users to interact with each other across a heterogeneous
distributed system.
Fragmentation in Distributed DBMS
• Fragmentation is the process of dividing a database into
subtables (sub-relations) so that the data can be stored in
different systems.
• These subtables or sub-relations are called fragments.
• These fragments are called logical data units and are stored at
various sites.
• The fragments must allow the original relation to be
reconstructed (i.e., there must be no loss of data).
• Suppose a table T is divided into a number of fragments
T1, T2, T3, …, TN.
• The fragments contain sufficient information to allow the
restoration of the original table T.
• This restoration can be done by applying UNION or JOIN
operations to the fragments. The overall process of dividing T
in this way is called data fragmentation.
• All of these fragments are independent, which means that no
fragment can be derived from the others.
• Users need not be concerned with how the data is fragmented;
this is called fragmentation independence or
fragmentation transparency.
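The reconstruction requirement above can be sketched in a few lines of Python. This is only an illustration: the table T, its column names, and its rows are hypothetical, and plain lists of dictionaries stand in for relations. Row-wise fragments are restored with a UNION, column-wise fragments with a JOIN on a shared key.

```python
# Hypothetical table T; the column names and rows are illustrative only.
T = [
    {"id": 1, "name": "Asha", "city": "Mysuru", "salary": 50000},
    {"id": 2, "name": "Ravi", "city": "Bengaluru", "salary": 60000},
    {"id": 3, "name": "Meena", "city": "Mysuru", "salary": 55000},
]

def as_set(rows):
    """Order-independent view of a relation, used to compare relations."""
    return {tuple(sorted(r.items())) for r in rows}

# Row-wise fragments: each fragment keeps whole rows of T.
T1 = [r for r in T if r["city"] == "Mysuru"]
T2 = [r for r in T if r["city"] != "Mysuru"]
restored_by_union = T1 + T2  # UNION of disjoint fragments

# Column-wise fragments: each keeps the key "id" plus some columns.
V1 = [{"id": r["id"], "name": r["name"]} for r in T]
V2 = [{"id": r["id"], "city": r["city"], "salary": r["salary"]} for r in T]
v2_by_id = {r["id"]: r for r in V2}
restored_by_join = [{**r, **v2_by_id[r["id"]]} for r in V1]  # JOIN on id

print(as_set(restored_by_union) == as_set(T))  # True
print(as_set(restored_by_join) == as_set(T))   # True
```

Both checks pass because no row and no column is lost when the table is split, which is the "no loss of data" condition stated above.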
Advantages:
• As the data is stored close to the site where it is used, the
efficiency of the database system increases
• Local query optimization methods are sufficient for some
queries, as the data is available locally
• Fragmentation helps maintain the security and privacy of the
database system
Disadvantages:
• Access times may be very high if data from different
fragments is needed
• Recursive fragmentation, if used, is very expensive
There are three methods for fragmenting a table:
• Horizontal fragmentation
• Vertical fragmentation
• Mixed or Hybrid fragmentation
Horizontal fragmentation
• Horizontal fragmentation refers to the process of dividing a
table horizontally by assigning each row (or group of rows)
of a relation to one or more fragments.
• These fragments are then assigned to different sites in the
distributed system.
• Some of the rows or tuples of the table are placed in one
system and the rest are placed in other systems.
• The rows that belong to the horizontal fragments are
specified by a condition on one or more attributes of the
relation.
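As a minimal sketch of condition-based row assignment, the snippet below sends each row to every fragment whose condition it satisfies. The function name `horizontal_fragments`, the EMPLOYEE-style rows, the site names, and the salary threshold are all hypothetical and chosen only for illustration.

```python
def horizontal_fragments(rows, predicates):
    """Assign each row to every fragment whose predicate it satisfies."""
    fragments = {site: [] for site in predicates}
    for row in rows:
        for site, predicate in predicates.items():
            if predicate(row):
                fragments[site].append(row)
    return fragments

# Hypothetical relation; the attribute names are assumptions.
employees = [
    {"emp_id": 1, "dept": "Sales", "salary": 30000},
    {"emp_id": 2, "dept": "HR", "salary": 45000},
    {"emp_id": 3, "dept": "Sales", "salary": 52000},
]

# One condition per site, defined on an attribute of the relation.
predicates = {
    "site_a": lambda r: r["salary"] < 40000,
    "site_b": lambda r: r["salary"] >= 40000,
}

fragments = horizontal_fragments(employees, predicates)
for site, rows in fragments.items():
    print(site, [r["emp_id"] for r in rows])
# site_a [1]
# site_b [2, 3]
```

Here the two conditions are disjoint and together cover every row, so each tuple lands in exactly one fragment. Overlapping conditions would place a row in more than one fragment, which the definition above allows.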
• In relational algebra, horizontal fragmentation of a table T can
be represented as follows:
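In the usual relational-algebra notation, a horizontal fragment is a selection on T. The symbols below are an assumption rather than the slide's own: $\sigma$ is the selection operator and $p_i$ is the condition that defines the i-th fragment.

```latex
T_i = \sigma_{p_i}(T), \qquad i = 1, \dots, N
```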
The original relation can be obtained by a combination of JOIN and UNION
operations, which is given as follows:
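For purely horizontal fragments, UNION alone is enough; JOIN is needed only when vertical fragments are involved. A sketch for the horizontal case, assuming fragments $T_1, \dots, T_N$ whose conditions together cover every row of T:

```latex
T = T_1 \cup T_2 \cup \dots \cup T_N
```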
Resource Allocation
Network Information
Allocation Model
The processing component, PC, consists of three cost factors: the access cost (AC), the
integrity enforcement cost (IE) (communication and processing cost), and the
concurrency control cost (CC) (the activity of coordinating concurrent accesses):
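Read additively, the sentence above corresponds to the following sketch. The additive form is an assumption based on the phrase "consists of three cost factors", and any per-site or per-fragment indices are omitted.

```latex
PC = AC + IE + CC
```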