
Distributed Database: GDC Thana Semester 6

A distributed database is a collection of interconnected databases located across multiple physical locations that communicate over a computer network. It allows for location independence by storing data across servers, uses distributed query processing to retrieve data from multiple sites, and provides transaction management through commit protocols and recovery methods. Examples include Apache Cassandra, Apache HBase, Amazon SimpleDB, and FoundationDB. Major companies using distributed databases include Google, Facebook, and LinkedIn. Distributed database systems can be either homogeneous, with identical databases at all sites, or heterogeneous, with varying schemas and software between sites.

Uploaded by

Hamza khan

Distributed Database

GDC Thana Semester 6th


Distributed database

• A distributed database is a collection of multiple interconnected databases that are spread physically across various locations and communicate via a computer network.


Distributed Database

• A distributed database is a database that runs and stores data across multiple computers, as opposed to doing everything on a single machine. Typically, distributed databases operate on two or more interconnected servers on a computer network.


Distributed Database Features
• Location independence - Data is physically stored at multiple sites and managed by an independent DDBMS (distributed database management system).

• Distributed query processing - A distributed database answers queries in an environment where data is managed at multiple sites. High-level queries are transformed into a query execution plan for simpler management.

• Distributed transaction management - Keeps the distributed database consistent through commit protocols, distributed concurrency control techniques, and distributed recovery methods, even when there are many transactions and failures.
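
The slides mention commit protocols without naming one. As an illustration, here is a minimal in-memory sketch of two-phase commit, one well-known commit protocol. The `Participant` class, the `can_commit` flag, and the `two_phase_commit` function are hypothetical names chosen for this example, and a real DDBMS would also need logging, timeouts, and recovery.

```python
from dataclasses import dataclass, field


@dataclass
class Participant:
    """A hypothetical site that holds part of the data."""
    name: str
    can_commit: bool = True  # assumption: simulates whether the site can commit
    data: dict = field(default_factory=dict)
    staged: dict = field(default_factory=dict)

    def prepare(self, updates: dict) -> bool:
        """Phase 1: stage the updates and vote yes (True) or no (False)."""
        if not self.can_commit:
            return False
        self.staged = dict(updates)
        return True

    def commit(self) -> None:
        """Phase 2 on success: apply the staged updates."""
        self.data.update(self.staged)
        self.staged = {}

    def rollback(self) -> None:
        """Phase 2 on failure: discard anything that was staged."""
        self.staged = {}


def two_phase_commit(participants, updates) -> bool:
    """Apply `updates` at every site, or at none of them."""
    # Phase 1: the coordinator collects a vote from every participant.
    votes = [p.prepare(updates) for p in participants]
    # Phase 2: commit only if all voted yes, otherwise roll everyone back.
    if all(votes):
        for p in participants:
            p.commit()
        return True
    for p in participants:
        p.rollback()
    return False
```

Calling `two_phase_commit([Participant("A"), Participant("B")], {"x": 1})` applies the update at both sites, while a single participant created with `can_commit=False` causes every site to roll back.
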
Distributed Database Features
• Seamless integration - The interconnected databases in a collection usually represent a single logical database.

• Network linking - All databases in a collection are linked by a network and communicate with each other.

• Transaction processing - Distributed databases incorporate transaction processing. A transaction is a program consisting of one or more database operations, and it is atomic: it is either executed entirely or not at all.
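
The all-or-nothing behavior of a transaction can be sketched with Python's built-in `sqlite3` module. This is a single local database, so it only illustrates atomicity, not distribution. The `accounts` table, the sample balances, and the `transfer` function are assumptions made up for this example.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move `amount` between two accounts as one atomic transaction."""
    try:
        # Used as a context manager, the connection commits on success
        # and rolls back if an exception escapes the block.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
            row = conn.execute(
                "SELECT balance FROM accounts WHERE name = ?", (src,)
            ).fetchone()
            if row is None or row[0] < 0:
                raise ValueError("insufficient funds")
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
        return True
    except ValueError:
        # The debit above has been rolled back, so nothing changed.
        return False


# Hypothetical sample data in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany(
    "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)]
)
conn.commit()
```

A transfer that would overdraw the source account raises inside the `with` block, so the earlier debit is undone and both balances stay as they were.
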


Examples of distributed databases
• Apache Cassandra supports clusters that span multiple locations and has its own query language, Cassandra Query Language (CQL). Its replication strategies are also configurable.

• Apache HBase runs on top of the Hadoop Distributed File System and provides a fault-tolerant way to store large quantities of sparse data. It also features compression, in-memory operation, and Bloom filters on a per-column basis.

• Amazon SimpleDB is used as a web service with Amazon Elastic Compute Cloud and Amazon.

• FoundationDB is a multi-model database designed around a core database that exposes an ordered key-value store with transactions.
Companies using distributed databases

1. Google: Google utilizes its distributed database system called Spanner to power various services like
Google Search, Google Maps,

2. Facebook: Facebook relies on Apache Cassandra, a distributed NoSQL database system, for storing and
managing its vast amount of user data.

3. LinkedIn: LinkedIn uses Apache Kafka, a distributed event streaming platform that originated there, to handle real-time data streaming and messaging, supporting various use cases such as log aggregation, event sourcing, and data integration.
Types of distributed database systems

Homogeneous Database: In a homogeneous distributed database, all sites store the database identically. The operating system, the database management system, and the data structures used are the same at all sites.

For example, when Store A receives a new shipment of a particular product and
updates its inventory, that update is immediately sent to all the other stores'
database servers. As a result, all the stores have an up-to-date and consistent
view of the inventory across the entire retail chain.
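
The store example can be sketched as a toy in-memory model in which every site has the same structure and forwards each update to its peers. The `StoreSite` class and the `connect` helper are hypothetical names for this illustration. The sketch also assumes updates always reach every peer, which a real system cannot take for granted.

```python
class StoreSite:
    """A hypothetical store's database server; all sites are identical in structure."""

    def __init__(self, name):
        self.name = name
        self.inventory = {}  # product -> quantity across the whole chain
        self.peers = []

    def apply(self, product, quantity):
        """Apply one inventory update to this site's local copy."""
        self.inventory[product] = self.inventory.get(product, 0) + quantity

    def receive_shipment(self, product, quantity):
        """Update the local copy, then send the same update to every peer."""
        self.apply(product, quantity)
        for peer in self.peers:
            peer.apply(product, quantity)


def connect(sites):
    """Make every site aware of all the other sites."""
    for site in sites:
        site.peers = [s for s in sites if s is not site]
```

After `connect([a, b, c])`, a call such as `a.receive_shipment("widget", 10)` leaves all three sites with the same inventory, which mirrors the consistent view described in the example.
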
Heterogeneous Database:

• In a heterogeneous distributed database, different sites can use different schemas and software, which can lead to problems in query processing and transactions. A particular site might also be completely unaware of the other sites. Different computers may use different operating systems and different database applications, and they may even use different data models for the database. Hence, translations are required for different sites to communicate.
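
As a rough illustration of the translation step, the sketch below maps records from two sites with different data models into one common shape. The site layouts, the field names, and the target schema (`product_id`, `name`, `quantity`) are all invented for this example.

```python
# Site 1 (hypothetical): relational-style rows in the order (sku, title, qty).
site1_rows = [("P-1", "Widget", 4)]

# Site 2 (hypothetical): document-style records with different field names.
site2_docs = [{"code": "P-2", "label": "Gadget", "stock": {"count": 7}}]


def from_site1(row):
    """Translate a site 1 row into the common schema."""
    sku, title, qty = row
    return {"product_id": sku, "name": title, "quantity": qty}


def from_site2(doc):
    """Translate a site 2 document into the common schema."""
    return {
        "product_id": doc["code"],
        "name": doc["label"],
        "quantity": doc["stock"]["count"],
    }


def global_view():
    """Combine both sites into one list of records in the common schema."""
    return [from_site1(r) for r in site1_rows] + [from_site2(d) for d in site2_docs]
```

Each site keeps its own format, and only the translator functions need to know how a local record maps onto the shared one.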
