

Prof. Hasso Plattner

A Course in
In-Memory Data Management
The Inner Mechanics
of In-Memory Databases

August 30, 2013

This learning material is part of the reading material for Prof. Plattner’s
online lecture "In-Memory Data Management" taking place at
www.openHPI.de. If you have any questions or remarks regarding the
online lecture or the reading material, please give us a note at openhpi-
[email protected]. We are glad to further improve the material.
Chapter 9
Partitioning

9.1 Definition and Classification

Partitioning is the process of dividing a logical database into distinct,
independent datasets. Partitions are database objects themselves and can be
managed independently. The main reason to apply data partitioning is to
achieve data-level parallelism. Data-level parallelism enables performance
gains; a classic example is a multi-core CPU that processes several distinct
data areas in parallel, with each core working on a separate partition.
Since partitioning is applied as a technical step to increase query speed,
it should be transparent¹ to the user. To ensure this transparency for the
end user, a view is required that presents the complete table as the union
of the query results from all involved partitions. With data-level
parallelism it is possible to increase the performance, availability, or
manageability of datasets. Which of these sometimes conflicting goals is
favored usually depends on the actual use case. Two short examples are
given in Section 9.4. Because data partitioning is a classical NP-complete²
problem, finding the best partitioning is a complicated task, even if the
desired goal has been clearly outlined [Kar72]. There are two main types of
data partitioning, horizontal and vertical, which are covered in detail in
the following sections.

¹ Transparent in IT means that something is completely invisible to the user, not that
the user can inspect the implementation through the cover. Except for their effects, such as
improvements in speed or usability, transparent components should not be noticeable at
all.
² NP-complete means that the problem is in NP and that every problem in NP can be
reduced to it in polynomial time. No polynomial-time algorithm is known for any
NP-complete problem, and none exists unless P = NP.
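The idea of data-level parallelism behind a transparent view can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three in-memory lists stand in for partitions, the names `scan_partition` and `query_all` are hypothetical, and worker threads stand in for CPU cores (in CPython, threads do not give true CPU parallelism for this kind of work).

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical data: one logical table stored as three independent partitions.
partitions = [
    [{"id": 1, "country": "Germany"}, {"id": 2, "country": "USA"}],
    [{"id": 3, "country": "UK"}, {"id": 4, "country": "Israel"}],
    [{"id": 5, "country": "Japan"}, {"id": 6, "country": "Spain"}],
]


def scan_partition(rows, predicate):
    """Evaluate the predicate on the rows of a single partition."""
    return [row for row in rows if predicate(row)]


def query_all(predicate):
    """Transparent view: scan each partition separately, then union the results."""
    with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
        partial_results = pool.map(
            lambda rows: scan_partition(rows, predicate), partitions
        )
    # The caller sees one result list and never learns how the data was split.
    return [row for part in partial_results for row in part]


result = query_all(lambda row: row["id"] % 2 == 0)
```

The caller of `query_all` only formulates a predicate; the split into partitions stays hidden, which is the sense of "transparent" used above.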


9.2 Vertical Partitioning

Vertical partitioning results in splitting the data into attribute groups with
replicated primary keys. These groups are then distributed across two (or
more) tables. Attributes that are usually accessed together should be in the
same table, in order to increase join and materialization performance. Such
optimizations can only be applied if actual usage data exists, which is one
reason why application development should always be based on real customer
data and workloads.

Original table:
ID | First Name | Last Name | DoB | Gender | City | Country

After vertical partitioning:
ID | First Name | Last Name | DoB | Gender
ID | City | Country

Fig. 9.1: Vertical Partitioning

In row-based databases, vertical partitioning is possible in general, but
is not a common approach. Column-based databases automatically support
vertical partitioning, since each column can be regarded as a possible
partition.
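A minimal sketch of vertical partitioning with a replicated primary key might look as follows. The attribute grouping follows Fig. 9.1; the row values, the dictionary-based table layout, and the function names `vertical_partition` and `reconstruct` are assumptions made for illustration.

```python
# Hypothetical customer table with the attributes of Fig. 9.1, one dict per row.
customers = [
    {"id": 1, "first_name": "John", "last_name": "Dillan", "dob": "1943/05/12",
     "gender": "m", "city": "Berlin", "country": "Germany"},
    {"id": 2, "first_name": "Peter", "last_name": "Black", "dob": "1982/06/02",
     "gender": "m", "city": "Austin", "country": "USA"},
]


def vertical_partition(rows, key, attribute_groups):
    """Split rows into one table per attribute group, replicating the key."""
    return [
        [{key: row[key], **{attr: row[attr] for attr in group}} for row in rows]
        for group in attribute_groups
    ]


person, address = vertical_partition(
    customers,
    "id",
    [["first_name", "last_name", "dob", "gender"], ["city", "country"]],
)


def reconstruct(left, right, key):
    """Join two vertical partitions on the replicated primary key."""
    right_by_key = {row[key]: row for row in right}
    return [{**row, **right_by_key[row[key]]} for row in left]
```

Queries that touch only `city` and `country` can read the smaller `address` table; queries that need all attributes pay for the join in `reconstruct`, which is why attributes that are accessed together should stay in the same group.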

9.3 Horizontal Partitioning

Horizontal partitioning is used more often in classic row-oriented databases.
To apply this partitioning, the table is split into disjoint tuple groups by some
condition. There are several sub-types of horizontal partitioning:
The first partitioning approach we present here is range partitioning, which
separates a table into partitions according to value ranges of a predefined
partition key. The key determines how individual data rows are distributed to
different partitions. The partition key can consist of a single key column or
multiple key columns.
For example, customers could be partitioned based on their date of birth. If
one is aiming for four partitions, each partition would cover a
range of about 25 years³. Because the implications of the chosen partition
key depend on the workload, it is not trivial to find the optimal solution.
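The date-of-birth example can be sketched as a lookup over sorted range boundaries. The age ranges and partition numbers are those listed in Fig. 9.2; the reference date, the helper names `age_on` and `range_partition`, and the choice to reject ages above 100 are assumptions of this sketch.

```python
from bisect import bisect_left
from datetime import date

# Inclusive upper age bounds of the four ranges 0-25, 26-50, 51-75, 76-100.
UPPER_BOUNDS = [25, 50, 75, 100]
# Partition numbers as listed in Fig. 9.2: the youngest range is Partition 4.
PARTITION_FOR_RANGE = [4, 3, 2, 1]

# Assumption: ages are computed relative to the date printed on the title page.
REFERENCE_DATE = date(2013, 8, 30)


def age_on(dob, today):
    """Return the number of full years between dob and today."""
    had_birthday = (today.month, today.day) >= (dob.month, dob.day)
    return today.year - dob.year - (0 if had_birthday else 1)


def range_partition(dob, today=REFERENCE_DATE):
    """Return the partition number whose age range contains the customer."""
    age = age_on(dob, today)
    if not 0 <= age <= UPPER_BOUNDS[-1]:
        raise ValueError(f"age {age} is outside the partitioned range")
    # bisect_left finds the first range whose upper bound is >= age.
    return PARTITION_FOR_RANGE[bisect_left(UPPER_BOUNDS, age)]


lucy_partition = range_partition(date(1990, 1, 20))
```

A query restricted to customers under 26 only has to read Partition 4, which is the main benefit of range partitioning. Note that an age-based key shifts over time, so a real system would more likely define the ranges directly on the date of birth.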
³ Based on the assumption that the company's customers are alive today and are
between 0 and 100 years old.

Partitioning along the age:
Partition 1: 76 – 100
Partition 2: 51 – 75
Partition 3: 26 – 50
Partition 4: 0 – 25

Sample tuples shown in the figure:
ID | First Name | Last Name | DoB        | Gender | City      | Country
1  | John       | Dillan    | 1943/05/12 | m      | Berlin    | Germany
2  | Peter      | Black     | 1982/06/02 | m      | Austin    | USA
3  | Nina       | Burg      | 1952/12/12 | w      | London    | UK
4  | Lucy       | Sehan     | 1990/01/20 | w      | Jerusalem | Israel
5  | Ariel      | Shiva     | 1984/07/18 | w      | Tokio     | Japan
6  | Sharon     | Lokida    | 1982/02/24 | m      | Madrid    | Spain

Fig. 9.2: Range Partitioning

The second horizontal partitioning type is round robin partitioning. With
round robin, a partitioning server does not use any tuple information as a
partitioning criterion, so there is no explicit partition key. The algorithm
simply assigns tuples turn by turn to each partition, which automatically
leads to an even distribution of entries and should support load balancing
to some extent.
However, since specific entries might be accessed far more often than
others, an even workload distribution cannot be guaranteed. Improvements
from intelligent data co-location or appropriate data placement are not
leveraged, because the data distribution does not depend on the data, but
only on the insertion order.

Partition | ID | First Name | Last Name | DoB        | Gender | City      | Country
1         | 1  | John       | Dillan    | 1943/05/12 | m      | Berlin    | Germany
1         | 5  | Ariel      | Shiva     | 1984/07/18 | w      | Tokio     | Japan
2         | 2  | Peter      | Black     | 1982/06/02 | m      | Austin    | USA
2         | 6  | Sharon     | Lokida    | 1982/02/24 | m      | Madrid    | Spain
3         | 3  | Nina       | Burg      | 1952/12/12 | w      | London    | UK
4         | 4  | Lucy       | Sehan     | 1990/01/20 | w      | Jerusalem | Israel

Fig. 9.3: Round Robin Partitioning
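The turn-by-turn assignment shown in Fig. 9.3 can be sketched as follows. The function name `round_robin_partition` is hypothetical, and only the tuple IDs are used because the tuple contents play no role in the assignment.

```python
from itertools import cycle


def round_robin_partition(rows, num_partitions):
    """Assign rows to partitions turn by turn, in insertion order."""
    partitions = [[] for _ in range(num_partitions)]
    # cycle() walks over the partitions again and again; no row attribute is read.
    for row, target in zip(rows, cycle(partitions)):
        target.append(row)
    return partitions


tuple_ids = [1, 2, 3, 4, 5, 6]
parts = round_robin_partition(tuple_ids, 4)
```

With six tuples and four partitions, the partition sizes differ by at most one, but nothing in the assignment helps a query find a particular tuple without scanning all partitions.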

The third horizontal partitioning type is hash-based partitioning. Hash
partitioning uses a hash function⁴ to specify the partition assignment for each
row.
The main challenge for hash-based partitioning is to choose a good hash
function that implicitly achieves locality or access improvements.

⁴ A hash function maps a potentially large amount of data, often of variable length, to
a smaller value of fixed length. In a figurative sense, hash functions generate a digital
fingerprint of the input data.

Four partitions (Partition 1 to Partition 4); each tuple is placed according to hash(Country):

ID | First Name | Last Name | DoB        | Gender | City      | Country | hash(Country)
4  | Lucy       | Sehan     | 1990/01/20 | w      | Jerusalem | Israel  | 0x00
1  | John       | Dillan    | 1943/05/12 | m      | Berlin    | Germany | 0x01
2  | Peter      | Black     | 1982/06/02 | m      | Austin    | USA     | 0x02
5  | Ariel      | Shiva     | 1984/07/18 | w      | Tokio     | Japan   | 0x02
3  | Nina       | Burg      | 1952/12/12 | w      | London    | UK      | 0x03

Fig. 9.4: Hash-Based Partitioning
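A hash-based assignment on the Country attribute can be sketched as below. The use of CRC32 as the hash function is an assumption of this sketch, chosen only because it is deterministic across runs (Python's built-in `hash()` is randomized for strings); it does not reproduce the hash values shown in Fig. 9.4, and the name `hash_partition` is hypothetical.

```python
import zlib


def hash_partition(row, key, num_partitions):
    """Map a row to a partition index using a deterministic hash of its key."""
    digest = zlib.crc32(str(row[key]).encode("utf-8"))
    return digest % num_partitions


rows = [
    {"id": 1, "country": "Germany"},
    {"id": 2, "country": "USA"},
    {"id": 3, "country": "UK"},
    {"id": 4, "country": "Israel"},
    {"id": 5, "country": "Japan"},
    {"id": 7, "country": "USA"},  # hypothetical extra row with a repeated key
]

partitions = [[] for _ in range(4)]
for row in rows:
    partitions[hash_partition(row, "country", 4)].append(row)
```

Rows with the same key value always land in the same partition, so a lookup by country touches exactly one partition. How evenly the rows spread depends entirely on the hash function and on the key distribution.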

The last partitioning type is semantic partitioning. It uses knowledge about
the application to split the data. For example, a database can be partitioned
according to the life cycle of a sales order. All tables required for the sales
order represent one or more different life-cycle steps, such as creation,
purchase, release, delivery, or dunning of a product. One possibility for
suitable partitioning is to put all tables that belong to a certain life-cycle
step into a separate partition.

9.4 Choosing a Suitable Partitioning Strategy

There are a number of different optimization goals to be considered when
choosing a suitable partitioning strategy. For instance, when optimizing for
performance, it makes sense to place tuples of different tables that are likely
to be joined for further processing on one server. This way the join can be
done much faster due to optimal data locality, because there is no delay for
transferring the data across the network. In contrast, for statistical queries
like counts, tuples from one table should be distributed across as many nodes
as possible in order to benefit from parallel processing.
To sum up, the best partitioning strategy depends very much on the
specific use case.
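Both goals can be illustrated with a small sketch. It assumes a hypothetical cluster of three nodes and two hypothetical tables, customers and orders, that are partitioned with the same function on the same key, so that matching tuples end up on the same node. The names `node_for` and `local_join` and the use of CRC32 are assumptions for illustration.

```python
import zlib
from collections import defaultdict

NUM_NODES = 3  # hypothetical cluster size


def node_for(customer_id):
    """Both tables use this function on the same key, so matches co-locate."""
    return zlib.crc32(str(customer_id).encode("utf-8")) % NUM_NODES


customers = [(1, "John"), (2, "Peter"), (3, "Nina")]  # (customer_id, name)
orders = [(100, 1), (101, 3), (102, 1), (103, 2)]     # (order_id, customer_id)

nodes = defaultdict(lambda: {"customers": [], "orders": []})
for customer in customers:
    nodes[node_for(customer[0])]["customers"].append(customer)
for order in orders:
    nodes[node_for(order[1])]["orders"].append(order)


def local_join(node):
    """Join orders with customers using only the data stored on one node."""
    names = dict(node["customers"])
    return [(order_id, names[cid]) for order_id, cid in node["orders"]]


# Join: every node works on its own data, no tuple crosses the network.
joined = sorted(pair for node in nodes.values() for pair in local_join(node))

# Count: every node counts its share, and only the partial counts are combined.
order_count = sum(len(node["orders"]) for node in nodes.values())
```

The join never fails to find a customer locally, because an order and its customer are always mapped to the same node. The count, in contrast, profits from the orders being spread over several nodes, each of which contributes only a partial sum.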

9.5 References

[Kar72] R. Karp. Reducibility among combinatorial problems. In R. Miller
and J. Thatcher, editors, Complexity of Computer Computations, pages
85–103. Plenum Press, 1972.
