Advanced Database Chapter 7 Assignment PDF

The document is a group assignment from Haramaya University's College of Computing and Informatics, focusing on advanced database concepts, specifically distributed database systems. It covers topics such as data fragmentation, replication, allocation techniques, and types of distributed database systems, along with their advantages and disadvantages. The assignment is submitted to Mr. Bahar and is due on May 26, 2025.

HARAMAYA UNIVERSITY

College of Computing and Informatics


Department of Computer Science
GROUP ASSIGNMENT OF ADVANCED DATABASE CHAPTER SEVEN
SECTION ONE
NAME ID_NO
1. ABDULHAKIM MICHAEL 0466/16

2. AREGAA MELION 0739/16

3. AFENDI MOHAMMED 0597/16

4. GURMESA ADUGNA 1443/16

SUBMITTED TO: MR. BAHAR

SUBMISSION DATE: MAY 26, 2025


Chapter Seven
Distributed Database System
7.1. Distributed Database Concepts
7.2. Data Fragmentation, Replication, and Allocation Techniques for Distributed Database Design

7.3. Types of Distributed Database Systems


7.4. Query Processing in Distributed Databases

Distributed Database Concepts


• A distributed database (DDB) is a collection of multiple logically related databases distributed
over a computer network.

• Put another way, it is a collection of interconnected databases spread across different locations.

• A distributed DBMS (DDBMS) is the software system that manages a distributed database
while making the distribution transparent to the user.

• It allows data to be stored and accessed from multiple sites.

• It acts like one big database for users, even though the data is spread out.

• A distributed database consists of logically related shared data and metadata stored at several
physically independent sites connected via a network.
A DDBS has the following components and types:

1. Local DBMS: Manages data at each individual location.


• Think of it as a branch manager in a large bank.

• The local DBMS manages its own assigned portion of the entire database. It handles tasks like:

• Storing and retrieving data specific to its location.


• Enforcing data integrity and security rules for its data.

• Processing queries for data residing locally.

• Example: A university DDBMS might have a local DBMS on each department server. The
Computer Science department's local DBMS would manage student data related to CS courses
and grades.

Types of Distributed Database Systems


Homogeneous DDBS: All the local DBMSs (LDBMSs) use the same DBMS product (e.g., all Oracle
databases). This is simpler to manage but less flexible.
Heterogeneous DDBS: A mix of different products and platforms at different sites (e.g., one site
uses Oracle, another uses MySQL). This is more flexible but more complex to manage.

There are two main ways these sites ("branches") can share data:
• Strict Sharing (Federated): Imagine libraries that agree on one way to find books (a global schema).
Each library keeps its own system (LDBMS), but everyone searches the same way. There are
strict rules for accessing books (a centralized policy).
• It's like everyone following the same search method in the central library catalog, even
though each branch might have its own way of keeping track of books on the shelves
(differences in data models, constraints, and query languages).

• Flexible Sharing (Multi-database): Imagine researchers gather data from anywhere (local
databases) like libraries, websites, and personal files. There's no one way to organize
everything. Their tool figures out how to access data on the fly (dynamic schema). Each source
keeps its own way of organizing data (local control).

Choosing the Right System:


• Strict Sharing (Federated): Good when everyone needs to access data in the same way and
consistency is important (e.g., banks).
• Flexible Sharing (Multi-database): Useful when combining data from various sources where
adaptability is more important than a strict organization (e.g., research projects).

2. Distributed DDBMS: Oversees the entire distributed database system.


• It may also be called the central controller, master DDBMS, or distributed database manager. It is
responsible for:
• Managing communication between the individual databases (local DBMSs) so that everything
runs smoothly.

• Ensuring data consistency across all locations

• Processing queries that involve data from multiple locations.


• Think of the Distributed DDBMS like a large corporation with several regional offices.
• Each office (local DBMS) manages its own data and tasks.
• But there's also a headquarters (Distributed DDBMS) that oversees the entire operation,
coordinates communication between the offices, and ensures everything is consistent across
the company.
3. Global System Catalog (GSC): The central library, keeping track of where all the data resides.

• It stores information about the entire distributed database. This includes:

• Location information: the GSC acts like a directory, telling the Distributed DDBMS where
specific pieces of data reside on different local DBMSs.
• Information about the data itself, such as data types and constraints.

• Details on user permissions to access different parts of the distributed database.
• Information specific to the distributed design: the fragmentation, replication, and allocation
schemas.
• In simpler terms, the GSC provides a reference book for the distributed database, while the
Distributed DDBMS acts as the conductor managing the overall operations.
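
The directory role of the GSC can be pictured with a small lookup table. This is only a minimal sketch: the fragment names, site names, and the locate function are made up for illustration, and a real catalog would also hold schemas, constraints, and permissions.

```python
# Hypothetical catalog: fragment name -> sites that hold a copy of it.
catalog = {
    "student_cs": ["site_cs"],
    "student_math": ["site_math"],
    "course": ["site_cs", "site_math"],  # a replicated fragment
}


def locate(fragment):
    """Return the sites storing a fragment, the way a DDBMS would consult the GSC."""
    if fragment not in catalog:
        raise KeyError(f"unknown fragment: {fragment}")
    return catalog[fragment]


print(locate("course"))
```

The point of the design is that users never name a site themselves. They ask for data, and the lookup decides where the request goes.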

4. Data Communication (DC): The messengers, allowing data to be exchanged between locations.


• It is the software that enables all sites to communicate with each other and exchange data
between the different parts of a DDBS.
• The DC component contains information about the sites and the links.
• Think of DC as the secure communication network connecting all sites.

Concurrency Control and Recovery


• Distributed Databases encounter a number of concurrency control and recovery problems
which are not present in centralized databases.
• Some of them are listed below.

• Dealing with multiple copies of data items

• Failure of individual sites


• Communication link failure

• Distributed commit
• Distributed deadlock
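
Of the problems listed above, distributed deadlock is the easiest to show in a few lines. The text does not name a detection method, so the sketch below assumes one common approach: merge the wait-for edges reported by every site into one global graph and check it for a cycle. The function name and the input format are hypothetical.

```python
def has_deadlock(local_wait_for):
    """local_wait_for maps site -> list of (waiting_txn, holding_txn) edges."""
    # Merge the per-site edges into one global wait-for graph.
    graph = {}
    for edges in local_wait_for.values():
        for waiter, holder in edges:
            graph.setdefault(waiter, set()).add(holder)
            graph.setdefault(holder, set())

    visiting, done = set(), set()

    def visit(txn):
        # Depth-first search; meeting a node that is still "visiting" means a cycle.
        if txn in done:
            return False
        if txn in visiting:
            return True
        visiting.add(txn)
        if any(visit(nxt) for nxt in graph[txn]):
            return True
        visiting.discard(txn)
        done.add(txn)
        return False

    return any(visit(txn) for txn in graph)


# T1 waits for T2 at site A, and T2 waits for T1 at site B.
print(has_deadlock({"A": [("T1", "T2")], "B": [("T2", "T1")]}))
```

Neither site sees a cycle in its own edges here. The deadlock only appears once the two local graphs are combined, which is why it is a problem centralized databases do not have.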

Data Fragmentation:
• Dividing the data into smaller, manageable chunks is called fragmentation.

• This makes it easier to store and access data on different machines.


• Imagine splitting a giant book into chapters and storing them on separate shelves.

Types of Fragmentation:
• Horizontal: Splitting data based on rows (e.g., customer records for specific regions).

• Vertical: Splitting data based on columns (e.g., customer name and address in one fragment,
order history in another).
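
The two fragmentation types can be sketched on a small in-memory table. The customer rows and function names below are invented for illustration. Repeating the key column in every vertical fragment is an assumption of this sketch; it is what lets the pieces be joined back together.

```python
customers = [
    {"id": 1, "name": "Abdi", "address": "Harar", "region": "east", "orders": 4},
    {"id": 2, "name": "Sara", "address": "Jimma", "region": "west", "orders": 1},
    {"id": 3, "name": "Lensa", "address": "Dire Dawa", "region": "east", "orders": 7},
]


def horizontal_fragment(rows, predicate):
    """Keep whole rows that satisfy a condition (split by rows)."""
    return [row for row in rows if predicate(row)]


def vertical_fragment(rows, columns, key="id"):
    """Keep only some columns of every row, plus the key (split by columns)."""
    keep = [key] + [c for c in columns if c != key]
    return [{c: row[c] for c in keep} for row in rows]


def rejoin(left, right, key="id"):
    """Rebuild full rows from two vertical fragments by matching on the key."""
    by_key = {row[key]: row for row in right}
    return [{**row, **by_key[row[key]]} for row in left]


east = horizontal_fragment(customers, lambda r: r["region"] == "east")
west = horizontal_fragment(customers, lambda r: r["region"] == "west")
contact = vertical_fragment(customers, ["name", "address", "region"])
history = vertical_fragment(customers, ["orders"])
```

Horizontal fragments are recombined by a union of rows, while vertical fragments are recombined by a join on the key.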

Replication
• Replication means creating copies of the same data fragment and storing them on multiple
machines in a DDBS.

Example: A company replicates customer data on servers in different countries for global access
during emergencies.

• In full replication, the entire database is copied to every site. In partial replication, only
selected parts are copied, and only to some of the sites.

Benefits:
• Improved Availability: If one server fails, users can access data from a replica.

• Enhanced Performance: Users can access data from the closest replica, reducing network
traffic.
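
Both benefits can be seen in a short read routine. This is a sketch under simple assumptions: the site names and data are made up, every replica is assumed to hold identical data, and the caller supplies the sites ordered from nearest to farthest.

```python
# Hypothetical replicas: each site holds a full copy of the same fragment.
replicas = {
    "site_a": {"c1": "Abdi"},
    "site_b": {"c1": "Abdi"},
}


def read(key, sites_by_distance, up):
    """Read from the nearest replica that is currently reachable."""
    for site in sites_by_distance:
        if site in up:
            return site, replicas[site][key]
    raise RuntimeError("no replica available")


# site_a is nearest but down, so the read falls through to site_b.
print(read("c1", ["site_a", "site_b"], up={"site_b"}))
```

The sketch covers reads only. Writes are the costly side of replication, because every copy must be kept consistent.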

Allocation
• Allocation is the process of determining the best location for each data fragment in a
DDBS.
Allocation Techniques: Deciding where to store each fragment is crucial. There are two approaches:

• Centralized: A central coordinator decides fragment placement.


• Distributed: Each site manages its own fragments.

Factors to Consider:
• Workload: Where do most queries originate? Fragments containing frequently accessed data
might be placed closer to those users.
• Access Patterns: How is the data typically accessed (by row, by column)? Allocation considers
these patterns for efficient retrieval.
• Network Speed: Fragments are ideally placed on servers with good network connectivity for
quick data transfer.
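
The workload factor alone already suggests a simple placement rule: put each fragment at the site that queries it most. The sketch below is a greedy heuristic assumed for illustration, not a full allocation method. It ignores network speed, storage cost, and replication, and the query counts are invented.

```python
def allocate(access_counts):
    """access_counts maps fragment -> {site: number of queries from that site}.

    Returns fragment -> site, choosing the site with the most queries.
    """
    return {
        fragment: max(counts, key=counts.get)
        for fragment, counts in access_counts.items()
    }


access_counts = {
    "customers_east": {"site_a": 90, "site_b": 10},
    "customers_west": {"site_a": 5, "site_b": 70},
}
placement = allocate(access_counts)
print(placement)
```

A central coordinator could run a rule like this over statistics gathered from every site, which corresponds to the centralized technique.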

Key Differences:
Focus: Fragmentation divides data, replication creates copies, and allocation decides storage
location.
Impact on Data: Fragmentation physically splits the data, replication increases storage needs, and
allocation doesn't alter the data itself.

Query Processing in Distributed Databases


Executing queries on a distributed database requires extra steps:

• Fragment selection: Identifying which fragments hold the data needed for the query.

• Data transfer: Moving data from relevant fragments to a central location or processing it locally.
• Query result assembly: Combining results from different fragments to get the final answer.

Distributed query execution also has to deal with the following:

• Distributed transactions: Ensuring data consistency across multiple databases during updates.
• Concurrency control: Managing access to shared data fragments to avoid conflicts.

• Distributed deadlock: A situation where multiple transactions are waiting on each other,
causing a halt.
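
The three steps of fragment selection, data transfer, and result assembly can be traced in a toy query. This is a sketch under stated assumptions: the site names, the rows, and the region-based horizontal fragmentation are all invented, and each site is simulated as a Python list.

```python
# Hypothetical horizontal fragments, one per site, split by region.
sites = {
    "site_east": [
        {"id": 1, "region": "east", "balance": 120},
        {"id": 3, "region": "east", "balance": 45},
    ],
    "site_west": [
        {"id": 2, "region": "west", "balance": 80},
        {"id": 4, "region": "west", "balance": 300},
    ],
}
fragment_regions = {"site_east": "east", "site_west": "west"}


def select_fragments(region=None):
    """Step 1: pick only the sites whose fragment can contain matching rows."""
    return [s for s, r in fragment_regions.items() if region is None or r == region]


def run_query(min_balance, region=None):
    """Find customers with balance >= min_balance, optionally in one region."""
    results = []
    for site in select_fragments(region):
        # Step 2: filter locally so only matching rows need to be transferred.
        partial = [row for row in sites[site] if row["balance"] >= min_balance]
        # Step 3: assemble the partial results into one answer.
        results.extend(partial)
    return sorted(results, key=lambda row: row["id"])


print(run_query(100))
```

Filtering at each site before transfer keeps network traffic low. Shipping whole fragments to one location and filtering there would give the same answer at a higher transfer cost.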
Multi-User DBMS

• A multi-user DBMS (Database Management System) is designed to let multiple users access and
manipulate the database simultaneously.
File-Server vs. Client-Server Architectures:

In a file-server architecture, the server only stores the database files and shares them over the
network.
The DBMS and the applications run on each client workstation, which requests whole files from the
server and processes them locally. This generates heavy network traffic.

In a client-server architecture, the DBMS functionality is divided between the client and server
components.
Clients are responsible for the user interface and application logic, while the server handles
data storage and processing.

Two-Tier Client-Server Architecture:


In a two-tier client-server architecture, the system consists of two layers: the client layer and
the server layer.

The client layer handles the user interface and application logic, while the server layer manages
data storage and processing.
Three-Tier Client-Server Architecture:

In a three-tier client-server architecture, the system has three layers: the presentation layer,
application layer, and data layer.
The presentation layer handles the user interface, the application layer contains the business
logic, and the data layer manages data storage and retrieval.

ADVANTAGES of DDBS
Reflects organizational structure

Improved shareability and local autonomy

Improved availability

Improved reliability

Improved performance

Economics

Modular growth

Integration

Remaining competitive

DISADVANTAGES of DDBS
1. Complexity

2. Cost

3. Security

4. Integrity control more difficult

5. Lack of standards

6. Lack of experience

7. Database design more complex
