Parallel and Distributed Database Systems

Uploaded by

dwightschrute826

Parallel and Distributed Databases

Syllabus Content
• Parallel Database:
• Architecture, I/O Parallelism, Interquery, Intraquery
• Intraoperation and Interoperation Parallelism
• Distributed Databases
• Types of Distributed Database Systems,
• Distributed Data Storage, Distributed Query Processing
Parallel Database
• A parallel DBMS is a database management system that runs on multiple
processors and disks.
• It combines two or more processors and disk storage units so that operations
execute faster than they would on a single processor.
• It is designed to execute operations concurrently.
• Architectural Models
• There are several architectural models for parallel databases, which are
given below −
• Shared memory architecture.
• Shared disk architecture.
• Shared nothing architecture.
Parallel Database
• Shared Memory System
• Every processor can access a common memory, made up of one
or more memory modules, through an interconnection
channel.
• This architecture is also commonly known as SMP
or Symmetric Multi-processing
• Shared Disk System
• A Shared Disk System is a DBMS architecture in
which every processor can access all disks
through an interconnection network.
• Each processor has its own private memory, which
the other processors cannot access directly.
Parallel Database
• Shared Nothing System
• A Shared Nothing System is a DBMS architecture in
which every processor has its own disk and its own
memory; nothing is shared between processors.
• The processors communicate with one another
over an interconnection network.
• Each processor acts as a server for the data
stored on its own disk.
I/O parallelism in parallel database
• I/O parallelism refers to reducing the time required to retrieve relations from disk by
partitioning the relations on multiple disks.
• The most common form of data partitioning in a parallel database environment is
horizontal partitioning.
• In horizontal partitioning, the tuples of a relation are divided (or declustered) among
many disks, so that each tuple resides on one disk.
•Partitioning Techniques
•There are three basic data-partitioning strategies. Assume that there are n disks,
D0, D1, . . ., Dn−1, across which the data are to be partitioned.
•Round-robin.
•This strategy scans the relation in any order and sends the ith tuple to disk
D(i mod n).
•The round-robin scheme ensures an even distribution of tuples across disks; that is,
each disk has approximately the same number of tuples as the others.
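The round-robin rule can be sketched in a few lines of Python. This is only an illustration: the function name `round_robin_partition` and the sample relation are assumptions, and the disks are modeled as in-memory lists.

```python
def round_robin_partition(tuples, n):
    """Send the i-th tuple (counting from 0) to disk number i mod n."""
    disks = [[] for _ in range(n)]
    for i, t in enumerate(tuples):
        disks[i % n].append(t)
    return disks


# Hypothetical relation of ten one-attribute tuples, spread over three disks.
relation = [("r%d" % k,) for k in range(10)]
disks = round_robin_partition(relation, 3)
sizes = [len(d) for d in disks]  # the sizes differ by at most one tuple
```

Because the disk is chosen from the tuple's position alone, the distribution is even regardless of the attribute values.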
I/O parallelism in parallel database
•Hash partitioning.
•This declustering strategy designates one or more attributes from the given relation’s schema as
the partitioning attributes.
•A hash function is chosen whose range is {0, 1, . . . , n − 1}. Each tuple of the original relation is
hashed on the partitioning attributes. If the hash function returns i, then the tuple is placed on
disk Di.
•Range partitioning.
•This strategy distributes contiguous attribute-value ranges to each disk. It chooses a partitioning
attribute, A, and a partitioning vector.
•The relation is partitioned as follows. Let [v0, v1, . . . , vn−2] denote the partitioning vector, such
that, if i < j, then vi < vj. Consider a tuple t such that t[A] = x. If x < v0, then t goes on disk D0. If x ≥
vn−2, then t goes on disk Dn−1. If vi ≤ x < vi+1, then t goes on disk Di+1.
•For example, range partitioning with three disks numbered 0, 1, and 2 and the partitioning vector
[5, 40] assigns tuples with values less than 5 to disk 0, values from 5 up to but not including 40
to disk 1, and values of 40 or more to disk 2.
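Hash and range partitioning can be sketched in the same style. The helper names are assumptions, `zlib.crc32` is only a stand-in for whatever hash function a real system would choose, and the vector [5, 40] is taken to match the three-disk example above.

```python
import bisect
import zlib


def hash_partition(tuples, key, n):
    """Place each tuple on disk h(partitioning attribute), with h in {0, ..., n-1}."""
    disks = [[] for _ in range(n)]
    for t in tuples:
        # crc32 is deterministic across runs; any function with range 0..n-1 would do.
        i = zlib.crc32(repr(t[key]).encode()) % n
        disks[i].append(t)
    return disks


def range_disk(x, vector):
    """Disk index for value x under a sorted partitioning vector [v0, ..., vn-2]."""
    # bisect_right counts the vector entries <= x, which is exactly the rule:
    # x < v0 -> 0, vi <= x < vi+1 -> i + 1, x >= vn-2 -> n - 1.
    return bisect.bisect_right(vector, x)


def range_partition(tuples, key, vector):
    disks = [[] for _ in range(len(vector) + 1)]
    for t in tuples:
        disks[range_disk(t[key], vector)].append(t)
    return disks


rows = [{"id": k, "score": s} for k, s in enumerate([4, 5, 39, 40, 41])]
by_range = range_partition(rows, "score", [5, 40])
by_hash = hash_partition(rows, "id", 3)
```

Range partitioning keeps neighbouring values together, which suits range queries on the partitioning attribute; hash partitioning spreads them out, which suits point lookups.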
Interquery and Intraquery parallelism
• Interquery Parallelism
• In interquery parallelism, different queries or transactions execute in parallel with
one another.
• This form of parallelism can increase transaction throughput. However, the response
times of individual transactions are no faster than they would be if the
transactions were run in isolation.
• Thus, the primary use of interquery parallelism is to scale up a transaction
processing system to support a more significant number of transactions per
second.
• Interquery parallelism is the easiest form of parallelism to support in a database
system—particularly in a shared-memory parallel system.
• Database systems designed for single-processor systems can be used with few or
no changes on a shared-memory parallel architecture, since even sequential
database systems support concurrent processing.
Interquery and Intraquery parallelism
• Intraquery Parallelism
• Intraquery parallelism defines the execution of a single query in parallel on
multiple processors and disks.
• Using intraquery parallelism is essential for speeding up long-running queries.
• This form of parallelism decomposes a serial SQL query into lower-level
operations such as scan, join, sort, and aggregation.
• To illustrate the parallel evaluation of a query, consider a query that requires a
relation to be sorted. Suppose that the relation has been partitioned across
multiple disks by range partitioning on some attribute, and the sort is
requested on the partitioning attribute. The sort operation can be
implemented by sorting each partition in parallel, then concatenating the
sorted partitions to get the final sorted relation.
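A minimal sketch of this parallel sort, assuming the partitions were produced by range partitioning so that every value in one partition is no larger than any value in the next. The function name is hypothetical, and the threads merely stand in for separate processors (in standard CPython they show the structure rather than a real speedup).

```python
from concurrent.futures import ThreadPoolExecutor


def parallel_range_sort(partitions):
    """Sort each range partition independently, then concatenate in disk order."""
    with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
        # pool.map returns results in the same order as the input partitions.
        sorted_parts = list(pool.map(sorted, partitions))
    result = []
    for part in sorted_parts:
        result.extend(part)  # no merge step is needed: the ranges do not overlap
    return result


# Partitions as range partitioning with vector [5, 40] would produce them.
partitions = [[3, 1, 4], [17, 9, 5, 30], [88, 40, 52]]
ordered = parallel_range_sort(partitions)
```

The concatenation is correct only because the sort attribute is also the partitioning attribute; with any other partitioning, the sorted partitions would have to be merged instead.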
Intraoperation and Interoperation Parallelism
• We may be able to pipeline the output of one operation to another operation.
The two operations can then be executed in parallel on separate processors,
one generating output that is consumed by the other even as it is generated.
• In summary, the execution of a single query can be parallelized in two ways:
• Intraoperation parallelism.
• We can speed up processing of a query by parallelizing the execution of each
individual operation, such as sort, select, project, and join.
• Interoperation parallelism.
• We can speed up processing of a query by executing in parallel the different
operations in a query expression.
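Pipelined interoperation parallelism can be sketched with two threads joined by a bounded queue: a scan produces tuples while a selection consumes them. All names here (`scan`, `select`, `run_pipeline`, the sample marks) are assumptions for illustration, and the threads stand in for separate processors.

```python
import queue
import threading

_DONE = object()  # sentinel that marks the end of the stream


def scan(relation, out_q):
    """Producer operation: emit tuples one at a time."""
    for t in relation:
        out_q.put(t)
    out_q.put(_DONE)


def select(in_q, predicate, results):
    """Consumer operation: filter tuples as soon as they arrive."""
    while True:
        t = in_q.get()
        if t is _DONE:
            break
        if predicate(t):
            results.append(t)


def run_pipeline(relation, predicate):
    q = queue.Queue(maxsize=4)  # small buffer, so the two operations overlap
    results = []
    producer = threading.Thread(target=scan, args=(relation, q))
    consumer = threading.Thread(target=select, args=(q, predicate, results))
    producer.start()
    consumer.start()
    producer.join()
    consumer.join()
    return results


marks = [35, 82, 67, 91, 48]
passed = run_pipeline(marks, lambda m: m >= 50)
```

The selection never waits for the full scan to finish, which is the point of pipelining: the intermediate result is never materialized in full.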
Distributed Databases
• What is a distributed database?
• A distributed database system is one in which the
data belonging to a single logical database is
distributed across two or more physical databases
to improve reliability and availability.
• A distributed database is a database in which not all
storage devices are attached to a common
CPU. Data may be stored at multiple sites
separate from each other.
• In a distributed database, the data is spread or
replicated among several databases which are
physically separate from each other. These
databases are connected through a network so
that they appear as a single database to the
user.
Types of Distributed Databases
• Distributed databases can be broadly
classified into homogeneous and
heterogeneous distributed database
environments
• Homogeneous Distributed Databases
• In a homogeneous distributed database, all
the sites use identical DBMS and operating
systems. Its properties are
• The sites use very similar software.
• The sites use identical DBMS or DBMS from
the same vendor.
• Each site is aware of all other sites and
cooperates with other sites to process user
requests.
• The database is accessed through a single
interface as if it is a single database.
Types of Distributed Databases
• There are two types of homogeneous
distributed databases:
1. Autonomous − Each database is
independent and functions on its
own. The databases are integrated by a
controlling application and use
message passing to share data
updates.
2. Non-autonomous − Data is distributed
across the homogeneous nodes and a
central or master DBMS co-ordinates
data updates across the sites.
Types of Distributed Databases
• Heterogeneous Distributed Databases
• In a heterogeneous distributed database,
different sites have different operating systems,
DBMS products and data models. Its properties
are −
• Different sites use dissimilar schemas and
software.
• The system may be composed of a variety of
DBMSs like relational, network, hierarchical or
object oriented.
• Query processing is complex due to dissimilar
schemas.
• Transaction processing is complex due to
dissimilar software.
• A site may not be aware of other sites and so
there is limited co-operation in processing user
requests.
Types of Distributed Databases
• Types of Heterogeneous Distributed
Databases
1. Federated − The heterogeneous
database systems are independent in
nature and integrated together so
that they function as a single
database system.
2. Un-federated − The database systems
employ a central coordinating module
through which the databases are
accessed.
Distributed Data Storage

• Distributed data storage is the deliberate distribution of pieces of data
(called data fragments) across sites to improve database performance and
data availability for end users.
• It aims to reduce the overall cost of transaction processing while still
providing accurate data quickly in a DDBMS.
• Distributed Data storage is one of the key steps in building your
Distributed Database Systems.
• There are two common strategies used in optimal Data Allocation: Data
Fragmentation and Data Replication.
Distributed Data Storage
• Fragmentation –
In this approach, the relations are fragmented (i.e., divided into smaller parts) and
each fragment is stored at the sites where it is required.
• Fragmentation is the process of dividing relations or tables into several partitions stored at
multiple sites. It divides a database into subtables and subrelations so that data can
be distributed and stored efficiently. Fragmentation of relations can be horizontal, vertical, or a hybrid of the two:
•Horizontal fragmentation– Splitting by rows – The relation is fragmented into groups of
tuples so that each tuple is assigned to at least one fragment.
• For example, in the student schema, if the details of all students of the Computer Science
course need to be maintained at the School of Computer Science, then the designer will
horizontally fragment the database as follows −
• CREATE TABLE COMP_STD AS
• SELECT * FROM STUDENT
• WHERE COURSE = 'Computer Science';
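The horizontal fragment can be tried out with Python's built-in `sqlite3` module. This is a sketch under assumptions: the sample rows, the reduced three-column STUDENT table, and the complementary fragment `OTHER_STD` are invented for illustration, and `CREATE TABLE ... AS SELECT` is written in SQLite's dialect.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE STUDENT (Regd_No INTEGER PRIMARY KEY, Name TEXT, Course TEXT)"
)
conn.executemany(
    "INSERT INTO STUDENT VALUES (?, ?, ?)",
    [
        (1, "Asha", "Computer Science"),
        (2, "Ravi", "Physics"),
        (3, "Meera", "Computer Science"),
    ],
)

# Horizontal fragments: every row lands in exactly one of the two tables.
conn.execute(
    "CREATE TABLE COMP_STD AS SELECT * FROM STUDENT WHERE Course = 'Computer Science'"
)
conn.execute(
    "CREATE TABLE OTHER_STD AS SELECT * FROM STUDENT WHERE Course <> 'Computer Science'"
)

# The original relation is recovered as the union of its fragments.
rebuilt = conn.execute(
    "SELECT * FROM COMP_STD UNION ALL SELECT * FROM OTHER_STD ORDER BY 1"
).fetchall()
original = conn.execute("SELECT * FROM STUDENT ORDER BY 1").fetchall()
comp_names = [r[0] for r in conn.execute("SELECT Name FROM COMP_STD ORDER BY Regd_No")]
```

A horizontal fragmentation is only usable if the union of the fragments gives back the whole relation, which is what the `rebuilt` query checks.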
Distributed Data Storage
•Vertical fragmentation – Splitting by columns –
•The schema of the relation is divided into smaller schemas. Each fragment must contain a
common candidate key so as to ensure a lossless join.
•In certain cases, an approach that is hybrid of fragmentation and replication is used.
•For example, let us consider that a University database keeps records of all registered
students in a Student table having the following schema.
• STUDENT(Regd_No, Name, Course, Address, Semester, Fees, Marks)

• Now, the fees details are maintained in the accounts section. In this case, the designer will
fragment the database as follows −
• CREATE TABLE STD_FEES AS
• SELECT Regd_No, Fees
• FROM STUDENT;
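The lossless-join requirement can be checked with a small `sqlite3` sketch. The sample rows and the second fragment `STD_INFO` are assumptions added for illustration; only `STD_FEES` comes from the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE STUDENT (Regd_No INTEGER PRIMARY KEY, Name TEXT, Course TEXT, "
    "Address TEXT, Semester INTEGER, Fees INTEGER, Marks INTEGER)"
)
conn.executemany(
    "INSERT INTO STUDENT VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        (1, "Asha", "Computer Science", "Pune", 3, 42000, 81),
        (2, "Ravi", "Physics", "Delhi", 5, 38000, 74),
    ],
)

# Vertical fragments: both keep the candidate key Regd_No.
conn.execute("CREATE TABLE STD_FEES AS SELECT Regd_No, Fees FROM STUDENT")
conn.execute(
    "CREATE TABLE STD_INFO AS "
    "SELECT Regd_No, Name, Course, Address, Semester, Marks FROM STUDENT"
)

# Joining the fragments on the shared key reconstructs the original rows.
rebuilt = conn.execute(
    "SELECT i.Regd_No, i.Name, i.Course, i.Address, i.Semester, f.Fees, i.Marks "
    "FROM STD_INFO AS i JOIN STD_FEES AS f ON f.Regd_No = i.Regd_No "
    "ORDER BY i.Regd_No"
).fetchall()
original = conn.execute("SELECT * FROM STUDENT ORDER BY Regd_No").fetchall()
```

If `Regd_No` were left out of either fragment, there would be no way to match a fee to its student and the join could not recover the relation.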
Distributed Data Storage
•Hybrid Fragmentation
•In hybrid fragmentation, a combination of horizontal and vertical
fragmentation techniques is used.
•Hybrid fragmentation can be done in two alternative ways −
•First generate a set of horizontal fragments; then generate vertical
fragments from one or more of the horizontal fragments.
•First generate a set of vertical fragments; then generate horizontal
fragments from one or more of the vertical fragments.
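The first ordering (horizontal, then vertical) can be sketched over plain Python dictionaries. The helper names `horizontal` and `vertical` and the sample rows are assumptions for illustration only.

```python
def horizontal(rows, predicate):
    """Horizontal fragment: the rows that satisfy the predicate."""
    return [r for r in rows if predicate(r)]


def vertical(rows, columns, key="Regd_No"):
    """Vertical fragment: the chosen columns plus the key, kept for rejoining."""
    cols = [key] + [c for c in columns if c != key]
    return [{c: r[c] for c in cols} for r in rows]


students = [
    {"Regd_No": 1, "Name": "Asha", "Course": "Computer Science", "Fees": 42000},
    {"Regd_No": 2, "Name": "Ravi", "Course": "Physics", "Fees": 38000},
    {"Regd_No": 3, "Name": "Meera", "Course": "Computer Science", "Fees": 42000},
]

# Step 1: horizontal fragment for one course.
cs = horizontal(students, lambda r: r["Course"] == "Computer Science")
# Step 2: vertical fragments of that horizontal fragment.
cs_fees = vertical(cs, ["Fees"])
cs_info = vertical(cs, ["Name", "Course"])
```

Swapping the two steps gives the second ordering; the resulting fragments can differ, since the vertical split is then applied to the whole relation first.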
Distributed Data Storage
•Fragmentation Example
[Figure: fragmentation example]
Distributed Data Storage
• Replication –
In this approach, the entire relation is stored redundantly at two or more sites. If the entire database is
available at all sites, it is a fully redundant database. Hence, in replication, systems maintain copies of data.
• This is advantageous as it increases the availability of data at different sites.
• However, it has certain disadvantages as well. Replicas must be kept up to date: any change made at one site
needs to be recorded at every site where that relation is stored, or else the copies become inconsistent. This adds
considerable overhead. Also, concurrency control becomes much more complex, as concurrent access now needs to be checked
across a number of sites.
• Advantages of Data Replication
• Reliability − In case of failure of any site, the database system continues to work since a copy is available at another
site(s).
• Reduction in Network Load − Since local copies of data are available, query processing can be done with reduced
network usage, particularly during prime hours. Data updating can be done at non-prime hours.
• Quicker Response − Availability of local copies of data ensures quick query processing and consequently quick
response time.
• Simpler Transactions − Transactions require fewer joins of tables located at different sites and minimal
coordination across the network. Thus, they become simpler in nature.
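The trade-off between cheap, highly available reads and expensive writes can be illustrated with a read-one/write-all sketch. The class `ReplicatedRelation`, the site names, and the key-value model of a relation are all assumptions; real systems use more elaborate protocols.

```python
class ReplicatedRelation:
    """Read-one/write-all replication of a key-value 'relation' (illustrative only)."""

    def __init__(self, site_names):
        self.copies = {name: {} for name in site_names}
        self.available = set(site_names)

    def fail(self, site):
        """Mark a site as unreachable."""
        self.available.discard(site)

    def write(self, key, value):
        # Every copy must record the change, otherwise the replicas diverge.
        if self.available != set(self.copies):
            raise RuntimeError("write-all needs every site to be reachable")
        for copy in self.copies.values():
            copy[key] = value

    def read(self, key, preferred_site):
        # Use the local copy if its site is up, otherwise any surviving copy.
        if preferred_site in self.available:
            site = preferred_site
        else:
            site = sorted(self.available)[0]
        return site, self.copies[site][key]


rel = ReplicatedRelation(["A", "B", "C"])
rel.write(101, "Asha")
local = rel.read(101, "B")      # served by the local copy at site B
rel.fail("B")
fallback = rel.read(101, "B")   # B is down, so another site answers
```

Reads keep working after a site failure, while a write in this strict scheme is refused until every site is reachable again, which mirrors the update overhead described above.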
Distributed Data Storage
•Types of Data Replication In DBMS
•Transactional Replication
•Snapshot Replication
•Merge Replication
•Transactional Replication
•Transactional Replication makes a complete copy of your database, as well as copies of new data changes. In this type of
Data Replication, changes to your database are synced in real-time and in the same order as they occur. This guarantees
transactional consistency.
•Snapshot Replication
•Snapshot Replication is perhaps the simplest type of Data Replication that copies “snapshots” of your database. It
replicates the current state of your database as is, at a specific point in time, without including any changes/updates to
your data. This kind of replication is helpful when changes made to your databases are infrequent.
•Merge Replication
•Merge Replication combines data from several databases into a single database. This type of Data Replication tracks
subsequent data changes and schema modifications made at publishers and subscribers and synchronizes them to your
database using merge agents. A great advantage of Merge Replication is that it allows publishers and subscribers to
modify the database independently.
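The difference between snapshot and transactional replication can be sketched with a dictionary as the database and a list as the change log. This is a toy model under assumptions (the names `take_snapshot` and `apply_changes` and the log format are invented); it does not reproduce any particular product's mechanism.

```python
import copy


def take_snapshot(source):
    """Snapshot replication: copy the current state as is."""
    return copy.deepcopy(source)


def apply_changes(db, changes):
    """Apply logged changes in the order in which they occurred."""
    for op, key, value in changes:
        if op == "put":
            db[key] = value
        elif op == "delete":
            db.pop(key, None)


source = {1: "Asha", 2: "Ravi"}

snapshot_replica = take_snapshot(source)  # never updated after this point
txn_replica = take_snapshot(source)       # full copy first, then the change stream

log = [("put", 3, "Meera"), ("delete", 2, None), ("put", 1, "Asha K")]
apply_changes(source, log)        # changes happen at the publisher
apply_changes(txn_replica, log)   # and are replayed, in order, at the subscriber
```

After the changes, `txn_replica` matches the source while `snapshot_replica` still shows the state at the time of the copy, which is acceptable only when changes are infrequent.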
