Parallel Database: Architecture For Parallel Databases. Parallel Query Evaluation Parallelizing Individual Operations

A parallel database system seeks to improve performance through parallelizing various database operations like loading data, building indexes, and evaluating queries using multiple CPUs and disks in parallel. There are three main architectures for parallel databases - shared memory, shared disk, and shared nothing. The shared nothing architecture provides linear scale up and speed up but is harder to program. Data partitioning techniques like horizontal partitioning and vertical partitioning enable parallel query evaluation.

Uploaded by

Car

Parallel database

1. Introduction
2. Architectures for Parallel Databases
3. Parallel Query Evaluation
4. Parallelizing Individual Operations

Introduction
• What is a Centralized Database?
– All the data is maintained at a single site, and the processing of individual transactions is assumed to be essentially sequential.

PARALLEL DBMSs
WHY DO WE NEED THEM?

• More and More Data!

We have databases that hold a huge amount of data, on the order of 10^12 bytes:

1,000,000,000,000 bytes!

• Faster and Faster Access!

We have data applications that need to process data at very high speeds:

10,000s of transactions per second!

SINGLE-PROCESSOR DBMSs AREN’T UP TO THE JOB!

Why Parallel Access To Data?
• At 10 MB/s bandwidth: 1.2 days to scan 1 Terabyte.
• 1,000 x parallel: 1.5 minutes to scan 1 Terabyte.

Parallelism:
divide a big problem
into many smaller ones
to be solved in parallel.
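
The scan-time figures on this slide can be checked with a little arithmetic. The sketch below is an idealized model only: it assumes the data is spread evenly over the disks and that the disks do not interfere with each other. The function name `scan_seconds` is made up for illustration. Under those assumptions the result is about 1.2 days on one disk and under two minutes on 1,000 disks, in line with the rounded numbers above.

```python
TERABYTE = 10**12        # bytes
BANDWIDTH = 10 * 10**6   # 10 MB/s per disk, in bytes per second

def scan_seconds(num_bytes, bandwidth, parallel_disks=1):
    """Idealized scan time: even data spread, no interference between disks."""
    return num_bytes / (bandwidth * parallel_disks)

serial = scan_seconds(TERABYTE, BANDWIDTH)
parallel = scan_seconds(TERABYTE, BANDWIDTH, parallel_disks=1000)

print(f"1 disk:      {serial / 86400:.2f} days")
print(f"1,000 disks: {parallel / 60:.2f} minutes")
```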

Parallel DB
• A parallel database system seeks to improve performance through parallelization of various operations, such as loading data, building indexes, and evaluating queries, by using multiple CPUs and disks in parallel.

• Motivation for Parallel DB

– Parallel machines are becoming quite common and affordable
   – Prices of microprocessors, memory and disks have dropped sharply
– Databases are growing increasingly large
   – Large volumes of transaction data are collected and stored for later analysis
   – Multimedia objects like images are increasingly stored in databases

PARALLEL DBMSs
BENEFITS OF A PARALLEL DBMS

• Improves Throughput.

INTERQUERY PARALLELISM
It is possible to process a number of transactions in parallel with each other.

• Improves Response Time.

INTRAQUERY PARALLELISM
It is possible to process ‘sub-tasks’ of a transaction in parallel with each other.

PARALLEL DBMSs
HOW TO MEASURE THE BENEFITS

• Speed-Up
– Adding more resources results in proportionally less running time for a fixed amount of data.
– 10 seconds to scan a DB of 10,000 records using 1 CPU
– 1 second to scan a DB of 10,000 records using 10 CPUs

• Scale-Up
– If resources are increased in proportion to an increase in data/problem size, the overall time should remain constant.
– 1 second to scan a DB of 1,000 records using 1 CPU
– 1 second to scan a DB of 10,000 records using 10 CPUs
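
The two measures can be written as simple ratios. This is a minimal sketch using one common formulation (time on the small system divided by time on the large system). The function names `speed_up` and `scale_up` are illustrative, and the inputs are the example timings from this slide.

```python
def speed_up(time_small_system, time_large_system):
    """Speed-up: same problem size, more resources. Linear if equal to the resource factor."""
    return time_small_system / time_large_system

def scale_up(time_small_problem_small_system, time_large_problem_large_system):
    """Scale-up: problem and resources grow together. Linear if the ratio stays at 1."""
    return time_small_problem_small_system / time_large_problem_large_system

cpu_factor = 10  # going from 1 CPU to 10 CPUs

s = speed_up(10.0, 1.0)  # 10,000 records: 10 s on 1 CPU, 1 s on 10 CPUs
c = scale_up(1.0, 1.0)   # 1,000 records on 1 CPU vs 10,000 records on 10 CPUs

print(f"speed-up = {s}, linear: {s == cpu_factor}")
print(f"scale-up = {c}, linear: {c == 1.0}")
```

A speed-up below the resource factor, or a scale-up below 1, indicates sub-linear behaviour.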

Architectures for Parallel Databases
• The basic idea behind a parallel DB is to carry out evaluation steps in parallel whenever possible.

• There are many opportunities for parallelism in an RDBMS.

• Three main architectures have been proposed for building parallel DBMSs:

1. Shared Memory
2. Shared Disk
3. Shared Nothing

Shared Memory
• Advantages:
1. It is closer to a conventional machine and easy to program.
2. OS services are leveraged to utilize the additional CPUs.
3. Overhead is low.
• Disadvantages:
1. It leads to a bottleneck problem as more CPUs contend for the shared memory.
2. It is expensive to build.
3. It is less sensitive to partitioning.

Shared Disk
• Advantages:
1. Almost the same as shared memory
• Disadvantages:
1. More interference
2. Increased network bandwidth requirements
3. Shared disk is less sensitive to partitioning

Shared Nothing
• Advantages:
1. It provides linear scale-up and linear speed-up
2. Shared nothing benefits from "good" partitioning
3. Cheap to build
• Disadvantages:
1. Hard to program
2. Addition of new nodes requires reorganizing the data

PARALLEL DBMSs
SPEED-UP

[Figure: number of transactions/second versus number of CPUs (5, 10 and 16 CPUs), comparing linear speed-up (ideal) with sub-linear speed-up. Throughput values marked: 1000/Sec, 1600/Sec, 2000/Sec.]

Parallel DB / D.S.Jagli, 09/01/20

PARALLEL DBMSs
SCALE-UP

[Figure: number of transactions/second versus number of CPUs and database size (5 CPUs with a 1 GB database, 10 CPUs with a 2 GB database). Linear scale-up (ideal) stays at 1000/Sec; sub-linear scale-up drops to 900/Sec.]

PARALLEL QUERY EVALUATION

A relational query execution plan is a graph/tree of relational algebra operators; based on this structure, operators can execute in parallel.

Different Types of DBMS Parallelism
• Parallel evaluation of a relational query in a DBMS with a shared-nothing architecture
1. Inter-query parallelism
   • Multiple queries run on different sites.
2. Intra-query parallelism
   • Parallel execution of a single query on different sites.
   a) Intra-operator parallelism
      • Get all machines working together to compute a given operation (scan, sort, join).
   b) Inter-operator parallelism
      • Each operator may run concurrently on a different site (exploits pipelining).
• In addition to evaluating different operators in parallel, we can evaluate each operator in the query plan in parallel.

Data Partitioning
• Types of Partitioning
1. Horizontal Partitioning: the tuples of a relation are divided among many disks such that each tuple resides on one disk.
   It exploits the I/O bandwidth of the disks by reading and writing them in parallel.
   It reduces the time required to retrieve relations from disk by partitioning the relations across multiple disks.
   a) Range Partitioning
   b) Hash Partitioning
   c) Round Robin Partitioning

2. Vertical Partitioning

1. Range Partitioning
• Tuples are sorted (conceptually), and n ranges are chosen for the sort key values so that each range contains roughly the same number of tuples.
• Tuples in range i are assigned to processor i.
• E.g.:
   sailor_id 1-10 assigned to disk 1
   sailor_id 11-20 assigned to disk 2
   sailor_id 21-30 assigned to disk 3

• Range partitioning can lead to data skew; that is, partitions with widely varying numbers of tuples across partitions or disks.
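
A minimal sketch of range partitioning for the sailor example, assuming inclusive upper bounds of 10 and 20 so that the three ranges do not overlap. The names `UPPER_BOUNDS` and `range_partition` and the sample ids are illustrative. The skewed input shows how one disk can end up with most of the tuples.

```python
from bisect import bisect_left
from collections import Counter

# Assumed inclusive upper bounds: disk 1 holds ids <= 10, disk 2 holds 11..20, disk 3 the rest.
UPPER_BOUNDS = [10, 20]

def range_partition(sailor_id, upper_bounds=UPPER_BOUNDS):
    """Return the 1-based disk number whose range contains sailor_id."""
    return bisect_left(upper_bounds, sailor_id) + 1

# A skewed set of ids: most values fall into the first range.
skewed_ids = [1, 2, 3, 4, 5, 6, 7, 8, 12, 25]
load = Counter(range_partition(i) for i in skewed_ids)

for disk in sorted(load):
    print(f"disk {disk}: {load[disk]} tuples")
```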

2. Hash Partitioning
• A hash function is applied to selected fields of a tuple to determine its processor.

• Hash partitioning has the additional virtue that it keeps data evenly distributed even if the data grows and shrinks over time.
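
A minimal sketch of hash partitioning, assuming CRC32 from Python's standard `zlib` module as the hash function (the slides do not name one) and four processors. The helper name `hash_partition` is illustrative.

```python
import zlib
from collections import Counter

def hash_partition(key, n_processors):
    """Map a partitioning-field value to a processor number in 0..n_processors-1."""
    # CRC32 is used because it gives the same result on every run and every machine.
    return zlib.crc32(str(key).encode("utf-8")) % n_processors

N = 4
counts = Counter(hash_partition(sailor_id, N) for sailor_id in range(1, 10001))

for processor in sorted(counts):
    print(f"processor {processor}: {counts[processor]} tuples")
```

Because a tuple's processor depends only on its key, an exact-match lookup on the partitioning field needs to visit a single processor.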

3. Round Robin Partitioning

• If there are n processors, the i-th tuple is assigned to processor i mod n in round-robin partitioning.

• Round-robin partitioning is suitable for efficiently evaluating queries that access the entire relation.

• If only a subset of the tuples (e.g., those that satisfy the selection condition age = 20) is required, hash partitioning and range partitioning are better than round-robin partitioning.
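
A minimal sketch of the i mod n rule, with tuples numbered from 0. The function name `round_robin_partition` is illustrative.

```python
def round_robin_partition(tuples, n):
    """Assign the i-th tuple (counting from 0) to processor i mod n."""
    partitions = [[] for _ in range(n)]
    for i, t in enumerate(tuples):
        partitions[i % n].append(t)
    return partitions

parts = round_robin_partition(list(range(10)), 3)
for processor, part in enumerate(parts):
    print(f"processor {processor}: {part}")
```

The partitions differ in size by at most one tuple, which is why a full scan is well balanced. A tuple's location says nothing about its field values, so a selection such as age = 20 still has to visit every partition.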

Comparison of partitioning schemes (each illustrated over five partitions: A...E, F...J, K...N, O...S, T...Z)

Range: good for equijoins, exact-match queries, and range queries
Hash: good for equijoins and exact-match queries
Round Robin: good to spread load

Parallelizing Sequential Operator Evaluation Code
1. An elegant software architecture for parallel DBMSs enables us to readily parallelize existing code for sequentially evaluating a relational operator.

2. The basic idea is to use parallel data streams.

3. Streams are merged as needed to provide the inputs for a relational operator.

4. The output of an operator is split as needed to parallelize subsequent processing.

5. A parallel evaluation plan consists of a dataflow network of relational, merge, and split operators.
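
The split/merge idea can be sketched as follows. The sequential operator (`select_adults`, a hypothetical selection) is left unchanged, and generic `split` and `merge` helpers are wrapped around it. All names are illustrative, and Python threads merely stand in for separate processors here: the sketch shows the shape of the dataflow and is not meant to show a real performance gain.

```python
from concurrent.futures import ThreadPoolExecutor

def select_adults(tuples):
    """Unmodified sequential operator: a selection over one input stream."""
    return [t for t in tuples if t["age"] >= 18]

def split(tuples, n):
    """Split operator: deal one stream out into n sub-streams (round robin here)."""
    return [tuples[i::n] for i in range(n)]

def merge(streams):
    """Merge operator: combine several streams into a single stream."""
    return [t for stream in streams for t in stream]

def run_parallel(operator, tuples, n=4):
    """Dataflow network: split -> n copies of the sequential operator -> merge."""
    with ThreadPoolExecutor(max_workers=n) as pool:
        return merge(pool.map(operator, split(tuples, n)))

people = [{"name": f"p{i}", "age": i} for i in range(40)]
result = run_parallel(select_adults, people)
print(len(result), "tuples selected")
```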

PARALLELIZING INDIVIDUAL OPERATIONS
• How can various operations be implemented in parallel in a shared-nothing architecture?

• Techniques
1. Bulk loading and scanning
2. Sorting
3. Joins

1. Bulk Loading and Scanning
• Scanning a relation: if the relation is partitioned across several disks, pages can be read in parallel while scanning it, and the retrieved tuples can then be merged.

• Bulk loading: if a relation has associated indexes, any sorting of data entries required for building the indexes during bulk loading can also be done in parallel.

2. Parallel Sorting
• Parallel sorting steps:
1. First redistribute all tuples in the relation using range partitioning.
2. Each processor then sorts the tuples assigned to it.
3. The entire sorted relation can be retrieved by visiting the processors in an order corresponding to the ranges assigned to them.

• Problem: data skew

• Solution: “sample” the data at the outset to determine good range partition points.

A particularly important application of parallel sorting is sorting the data entries in tree-structured indexes.
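
The three sorting steps, together with sampling to pick partition points, can be sketched as below. This runs on a single machine: each inner list stands in for one processor's share of the data. The names `choose_splitters` and `parallel_sort`, the sample size, and the test data are assumptions for illustration.

```python
import random
from bisect import bisect_left

def choose_splitters(data, n, sample_size=100, seed=0):
    """Sample the data and pick n-1 range boundaries that split the sample evenly."""
    rng = random.Random(seed)
    sample = sorted(rng.sample(data, min(sample_size, len(data))))
    return [sample[len(sample) * i // n] for i in range(1, n)]

def parallel_sort(data, n=4):
    if not data:
        return []
    splitters = choose_splitters(data, n)
    partitions = [[] for _ in range(n)]
    for x in data:                   # step 1: redistribute by range
        partitions[bisect_left(splitters, x)].append(x)
    for p in partitions:             # step 2: each processor sorts its own tuples
        p.sort()
    # step 3: visit the processors in the order of their ranges
    return [x for p in partitions for x in p]

# Skewed input: many small values and a few large ones.
gen = random.Random(1)
data = [gen.randint(0, 50) for _ in range(900)] + [gen.randint(0, 10**6) for _ in range(100)]
result = parallel_sort(data)
print(result[:5], "...", result[-3:])
```

Picking the boundaries from a sample, rather than dividing the value range evenly, is what keeps the partitions of comparable size on skewed data.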

3. Parallel Join
1. The basic idea for joining A and B in parallel is to decompose the join into a collection of k smaller joins by partitioning both relations.

2. By using the same partitioning function for both A and B, we ensure that the union of the k smaller joins computes the join of A and B.
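
A minimal sketch of this decomposition, assuming hash partitioning on an integer join attribute and a simple in-memory hash join for each of the k smaller joins. The relation contents and the names `partition`, `local_hash_join` and `parallel_join` are illustrative.

```python
from collections import defaultdict

def partition(relation, key, k):
    """Apply the SAME partitioning function (hash mod k) to whichever relation is passed in."""
    parts = [[] for _ in range(k)]
    for t in relation:
        parts[hash(t[key]) % k].append(t)
    return parts

def local_hash_join(a_part, b_part, key):
    """One of the k smaller joins; in a real system each runs on its own processor."""
    index = defaultdict(list)
    for a in a_part:
        index[a[key]].append(a)
    return [{**a, **b} for b in b_part for a in index.get(b[key], [])]

def parallel_join(a, b, key, k):
    a_parts, b_parts = partition(a, key, k), partition(b, key, k)
    result = []
    for a_p, b_p in zip(a_parts, b_parts):  # union of the k smaller joins
        result.extend(local_hash_join(a_p, b_p, key))
    return result

sailors = [{"sid": i, "name": f"s{i}"} for i in range(1, 9)]
reserves = [{"sid": s, "bid": b} for s, b in [(1, 101), (1, 102), (3, 103), (8, 101), (9, 104)]]
result = parallel_join(sailors, reserves, "sid", k=3)
print(len(result), "joined tuples")
```

Matching tuples always land in the same partition pair because both relations go through the same function, so no pair (i, j) with i different from j ever needs to be joined.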

• Hash join
• Sort-merge join

Sort-Merge Join
• Partition A and B by dividing the range of the join attribute into k disjoint subranges and placing A and B tuples into partitions according to the subrange to which their values belong.

• Each processor carries out a local join.

• In this case the number of partitions k is chosen to be equal to the number of processors n.

• The output of the join of A and B may be split into several data streams.

The advantage is that the output is available in sorted order.
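
A minimal single-machine sketch of this scheme: both relations are range partitioned on the join attribute using the same boundaries, each partition pair is joined with an ordinary sort-merge join, and the partial results are concatenated in range order. The boundaries, relation contents and function names are assumptions for illustration.

```python
from bisect import bisect_left

def range_partition(relation, key, bounds):
    """Place each tuple into the subrange its join value belongs to (k = len(bounds) + 1)."""
    parts = [[] for _ in range(len(bounds) + 1)]
    for t in relation:
        parts[bisect_left(bounds, t[key])].append(t)
    return parts

def sort_merge_join(a, b, key):
    """Sequential sort-merge join; handles duplicate join values on both sides."""
    a = sorted(a, key=lambda t: t[key])
    b = sorted(b, key=lambda t: t[key])
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i][key] < b[j][key]:
            i += 1
        elif a[i][key] > b[j][key]:
            j += 1
        else:
            value, j_end = a[i][key], j
            while j_end < len(b) and b[j_end][key] == value:
                j_end += 1                      # find the group of matching B tuples
            while i < len(a) and a[i][key] == value:
                out.extend({**a[i], **bt} for bt in b[j:j_end])
                i += 1
            j = j_end
    return out

def parallel_sort_merge_join(a, b, key, bounds):
    result = []
    pairs = zip(range_partition(a, key, bounds), range_partition(b, key, bounds))
    for a_part, b_part in pairs:                # each pair would run on its own processor
        result.extend(sort_merge_join(a_part, b_part, key))
    return result

A = [{"sid": s, "name": f"s{s}"} for s in [25, 3, 14, 7, 30, 11]]
B = [{"sid": s, "bid": b} for s, b in [(14, 1), (3, 2), (30, 3), (3, 4), (99, 5)]]
joined = parallel_sort_merge_join(A, B, "sid", bounds=[10, 20])
print([t["sid"] for t in joined])
```

Since partitions are visited in range order and each local result is sorted, the concatenated output is sorted on the join attribute.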

Dataflow Network of Operators for Parallel Join

Good use of split/merge makes it easier to build parallel versions of sequential join code.
