Assignment - 10 Parallel Sorting Techniques: Range-Partitioning Sort
Range-partitioning sort works in two steps: first range-partitioning the relation, then sorting each partition
separately. When we sort by range-partitioning the relation, it is not necessary to range-partition the relation on the
same set of processors or disks as those on which the relation is stored. Suppose that we choose processors P0, P1,
..., Pm, where m < n, to sort the relation. There are two steps involved in this operation:
1. Redistribute the tuples in the relation, using a range-partition strategy, so that all tuples that lie within the ith
range are sent to processor Pi, which stores the relation temporarily on disk Di. To implement range partitioning,
every processor in parallel reads the tuples from its disk and sends each tuple to its destination processor. Each
processor P0, P1, ..., Pm also receives the tuples belonging to its partition and stores them locally. This step requires disk
I/O and communication overhead.
2. Each of the processors sorts its partition of the relation locally, without interaction with the other
processors. Each processor executes the same operation, namely sorting, on a different dataset. (Execution of the
same operation in parallel on different sets of data is called data parallelism.) The final merge operation is trivial,
because the range partitioning in the first phase ensures that, for 1 ≤ i < j ≤ m, the key values in processor Pi are all
less than the key values in Pj.
We must do range partitioning with a good range-partition vector, so that each partition will have approximately the
same number of tuples. Virtual processor partitioning can also be used to reduce skew.
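To make the two phases concrete, here is a minimal Python sketch, assuming a hypothetical relation represented as a list of dicts with a "salary" key. In a real system the tuples are shipped across the interconnect and each partition is sorted on its own processor; here the partitions are simply in-memory lists.

from bisect import bisect_left

def range_partition(relation, range_vector, key="salary"):
    # Phase 1: route each tuple to the partition picked by the range vector.
    # bisect_left counts vector entries strictly below the key value, which
    # matches the "range 0 = v0 and less, range 1 = v0+1 to v1, ..." ranges.
    partitions = [[] for _ in range(len(range_vector) + 1)]
    for tup in relation:
        partitions[bisect_left(range_vector, tup[key])].append(tup)
    return partitions

def range_partition_sort(relation, range_vector, key="salary"):
    # Phase 2: each partition is sorted independently (data parallelism);
    # concatenating the sorted partitions gives the globally sorted relation.
    partitions = range_partition(relation, range_vector, key)
    return [sorted(p, key=lambda t: t[key]) for p in partitions]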
Ex:
Step 1:
First, we have to identify a range vector v on the Salary attribute. The range vector is of the form v[v0, v1, ..., vn-2].
For our example, let us assume the range vector v[14000, 24000].
This range vector represents 3 ranges: range 0 (14000 and less), range 1 (14001 to 24000), and range 2 (24001 and
more).
Redistribute the relations Employee0, Employee1, and Employee2 using this range vector into 3 disks temporarily.
After this distribution, disk 0 will have range 0 records (i.e., records with salary less than or equal to 14000), disk
1 will have range 1 records (i.e., records with salary greater than 14000 and less than or equal to 24000), and
disk 2 will have range 2 records (i.e., records with salary greater than 24000).
This redistribution according to range vector v is represented in Figure 2 as links from all the relations to all the disks.
Temp_Employee0, Temp_Employee1, and Temp_Employee2 are the relations after successful redistribution. These
tables are stored temporarily on disks D0, D1, and D2. (They can also be held in main memories M0, M1, and M2 if they
fit into RAM.)
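Continuing the sketch above with the example range vector v[14000, 24000] and a few made-up Employee tuples (the names and salaries below are purely illustrative, not taken from the figures), the redistribution into Temp_Employee0, Temp_Employee1, and Temp_Employee2 can be traced as follows:

v = [14000, 24000]
# Hypothetical sample tuples standing in for Employee0, Employee1, Employee2.
employee0 = [{"name": "A", "salary": 9000},  {"name": "B", "salary": 30000}]
employee1 = [{"name": "C", "salary": 14000}, {"name": "D", "salary": 20000}]
employee2 = [{"name": "E", "salary": 24001}]

# Every source relation is split by the same range vector, and the pieces
# belonging to range i are collected on disk i as Temp_Employee i.
pieces = [range_partition(rel, v) for rel in (employee0, employee1, employee2)]
temp_employee = [sum((p[i] for p in pieces), []) for i in range(3)]
print([t["salary"] for t in temp_employee[0]])  # [9000, 14000]   range 0
print([t["salary"] for t in temp_employee[1]])  # [20000]         range 1
print([t["salary"] for t in temp_employee[2]])  # [30000, 24001]  range 2 (not yet sorted)

Note that Temp_Employee2 is not yet in order; putting each temporary partition into sorted order is exactly the job of Step 2.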
Step 2:
Now we have temporary relations on all the disks after redistribution.
At this point, each processor individually sorts the data assigned to it in ascending order of Salary. The process
of performing the same operation in parallel on different sets of data is called data parallelism.
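As a hedged sketch of this data parallelism, the workers of a multiprocessing pool can stand in for processors P0, P1, and P2, each sorting its own temporary partition without talking to the others. The functions below are illustrative only, not part of any particular database system.

from multiprocessing import Pool

def sort_partition(partition):
    # Runs on one worker with no interaction with the other workers,
    # just as each processor sorts its own temporary partition.
    return sorted(partition, key=lambda t: t["salary"])

def parallel_local_sort(temp_partitions):
    # One worker per partition; call this from under an
    # "if __name__ == '__main__':" guard on platforms that spawn workers.
    with Pool(processes=len(temp_partitions)) as pool:
        return pool.map(sort_partition, temp_partitions)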
Final Result:
After the processors complete the sorting, we can simply collect the data from the different processors and merge
them. This merge is straightforward, as the data for every range is already sorted and the ranges themselves are ordered.
Hence, collecting the sorted records from partition 0, partition 1, and partition 2 and merging them gives us the final sorted output.
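Tying the example together (reusing the hypothetical temp_employee partitions from the earlier sketch, and a plain sorted call in place of the pool for brevity), the final step is just concatenation:

temp_sorted = [sorted(p, key=lambda t: t["salary"]) for p in temp_employee]  # Step 2
final = [t for part in temp_sorted for t in part]                            # final merge
print([t["salary"] for t in final])  # [9000, 14000, 20000, 24001, 30000]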
Parallel external sort-merge is an alternative to range partitioning. Suppose that
a relation has already been partitioned among disks D0, D1, ..., Dn-1 (it does not matter how the relation has been
partitioned). Parallel external sort-merge then works this way:
1. Each processor Pi locally sorts the data on its disk Di.
2. The system then merges the sorted runs on the different processors to get the final sorted output.
The merging of the sorted runs in step 2 can be parallelized by this sequence of actions:
1. The system range-partitions the sorted data at each processor Pi, all using the same partition vector, across the
processors P0, P1, ..., Pm. Each processor sends its tuples in sorted order, so every destination receives its tuples as
sorted streams.
2. Each receiving processor merges the streams as they arrive, producing a single sorted run for its range.
3. The system concatenates the sorted runs from processors P0, P1, ..., Pm to get the final result.
As described, this sequence of actions results in an interesting form of execution skew, since at first every processor
sends all blocks of partition 0 to P0, then every processor sends all blocks of partition 1 to P1, and so on. Thus, while
sending happens in parallel, receiving tuples becomes sequential: first only P0 receives tuples, then only P1 receives
tuples, and so on. To avoid this problem, each processor repeatedly sends a block of data to each partition. In other words,
each processor sends the first block of every partition, then the second block of every partition, and so on. As
a result, all processors receive data in parallel. Some machines, such as the Teradata Purpose-Built Platform Family
machines, use specialized hardware to perform merging. The BYNET interconnection network in the Teradata
machines can merge output from multiple processors to give a single sorted output.
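The difference between the skewed and the skew-avoiding send orders can be sketched as follows. Here blocks_by_partition is a hypothetical list in which blocks_by_partition[i] holds one sender's blocks destined for receiving processor Pi; the generators only yield (destination, block) pairs in the order they would be sent.

from itertools import zip_longest

def skewed_send_order(blocks_by_partition):
    # Naive order: every block for P0, then every block for P1, and so on,
    # so the receivers become active one after another instead of in parallel.
    for dest, blocks in enumerate(blocks_by_partition):
        for block in blocks:
            yield dest, block

def interleaved_send_order(blocks_by_partition):
    # Skew-avoiding order: the first block of every partition, then the
    # second block of every partition, and so on, so that all receiving
    # processors get data at the same time.
    for blocks_in_round in zip_longest(*blocks_by_partition):
        for dest, block in enumerate(blocks_in_round):
            if block is not None:
                yield dest, block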
Ex:
Assume that relation Employee is permanently partitioned using the round-robin technique across 3 disks D0, D1, and
D2, which are associated with processors P0, P1, and P2. At processors P0, P1, and P2, the relations are named
Employee0, Employee1, and Employee2 respectively.
Step 1:
Sort the data stored in every partition (on every disk) on the ordering attribute Salary. (The sorted data in every
partition is held temporarily.) At this stage, every Employeei contains salary values spanning the whole range, from minimum to maximum, but locally sorted.
Step 2:
Next, we have to identify a range vector v on the Salary attribute. The range vector is of the form v[v0, v1, ..., vn-2]. For
our example, let us assume the same range vector v[14000, 24000].
This range vector represents 3 ranges: range 0 (14000 and less), range 1 (14001 to 24000), and range 2 (24001 and
more).
Redistribute every sorted partition (Employee0, Employee1, and Employee2) into the 3 disks temporarily using this range
vector.
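A small sketch of this redistribution at one sending processor, assuming its partition is already sorted on Salary from Step 1: because the input is sorted, the slice for each range is itself a sorted stream for its destination processor. split_sorted_partition is an illustrative helper, not a standard API.

from bisect import bisect_right

def split_sorted_partition(sorted_partition, range_vector, key="salary"):
    # The slice up to the first salary above v0 is range 0, the next slice
    # up to the first salary above v1 is range 1, and the rest is range 2.
    salaries = [t[key] for t in sorted_partition]
    cuts = [0] + [bisect_right(salaries, v) for v in range_vector] + [len(salaries)]
    return [sorted_partition[cuts[i]:cuts[i + 1]] for i in range(len(cuts) - 1)]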
Step 3:
This redistribution is executed at all processors in parallel: processors P0, P1, and P2 each send the range 0 portion of
Employee0, Employee1, and Employee2 to disk 0. As it receives the records from the various partitions, the
receiving processor P0 merges the sorted streams. This is shown in Figure 4.
The same process takes place at all processors for the other ranges.
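At the receiving processor, the merge of the incoming sorted streams can be sketched with heapq.merge, which lazily interleaves already-sorted inputs into a single sorted run. For P0, the streams would be the range 0 slices produced by each sender in the previous sketch.

import heapq

def merge_sorted_streams(streams):
    # Each stream is one sender's slice for this range, already sorted on
    # Salary by Step 1; the result is a single sorted run for the partition.
    return list(heapq.merge(*streams, key=lambda t: t["salary"]))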
Step 4:
The final concatenation of sorted data from all the disks is trivial.
Conclusion: