
PostgresChina2018, Liu Dongming (刘东明): PostgreSQL Parallel Query

The document discusses parallel query in PostgreSQL. It covers the parallel features introduced in PostgreSQL versions 9.4 to 11, including background workers, dynamic shared memory, and executor nodes for parallel operations. It then discusses how parallel databases improve performance by spreading query work across multiple CPUs. Finally, it explains key parallel operations in PostgreSQL: parallel sequential scan, parallel index scan, parallel bitmap heap scan, and parallel hash join.


2018 PostgreSQL China Technical Conference

PostgreSQL Parallel Query


Liu Dongming
Alibaba Cloud Database Technology

About Me
• Liu Dongming (刘东明)
• DRDS / PostgreSQL
• Alibaba Cloud PostgreSQL Group

Parallel Features
• PostgreSQL 9.4, 9.5 [2014, 2015]
– Background workers
– Dynamic shared memory (DSM)
– Shared memory queues

• PostgreSQL 9.6 [2016]
– Executor nodes: Gather, Parallel Seq Scan, Partial Aggregate, Finalize Aggregate

• PostgreSQL 10 [2017]
– Partitions
– Executor nodes: Gather Merge, Parallel Index Scan, Parallel Bitmap Heap Scan

• PostgreSQL 11 [2018]
– Executor nodes: Parallel Append, Parallel Hash Join
– Planner: Partition-wise joins
– Parallel CREATE INDEX

The Free Lunch Is Over

https://www.karlrupp.net/2018/02/42-years-of-microprocessor-trend-data/
https://en.wikipedia.org/wiki/Herb_Sutter#The_Free_Lunch_Is_Over

Parallel Database System


"Parallel Database Systems: The Future of High Performance Database Systems" (1992)
Authors: David DeWitt and Jim Gray

Why Parallel Databases?

• Relational Data Model – Relational queries are ideal candidates for parallelization
• Multiprocessor systems using inexpensive microprocessors provide more power and scalability than expensive mainframe counterparts

• Shared-memory – All processors have equal access to a global memory and all disks
• Shared-disk – Each processor has its own private memory, but has equal access to all disks
• Shared-nothing – Each processor has its own private memory and disk(s)

For Example

SELECT COUNT(*)
FROM people
WHERE atpgconn2018 = 'Y';

EXPLAIN ANALYZE SELECT COUNT(*) FROM people WHERE atpgconn2018 = 'Y';

max_parallel_workers_per_gather = 0:

Aggregate (cost=169324.73..169324.74 rows=1 width=8) (actual time=983.729..983.730 rows=1 loops=1)
  -> Seq Scan on people (cost=0.00..169307.23 rows=7001 width=0) (actual time=981.723..983.051 rows=9999 loops=1)
       Filter: (atpgconn2018 = 'Y'::bpchar)
       Rows Removed by Filter: 9990001
Planning Time: 0.066 ms
Execution Time: 983.760 ms

max_parallel_workers_per_gather = 2:

Finalize Aggregate (cost=97389.77..97389.78 rows=1 width=8) (actual time=384.848..384.848 rows=1 loops=1)
  -> Gather (cost=97389.55..97389.76 rows=2 width=8) (actual time=384.708..386.486 rows=3 loops=1)
       Workers Planned: 2
       Workers Launched: 2
       -> Partial Aggregate (cost=96389.55..96389.56 rows=1 width=8) (actual time=379.597..379.597 rows=1 loops=3)
            -> Parallel Seq Scan on people (cost=0.00..96382.26 rows=2917 width=0) (actual time=378.831..379.341 rows=3333 loops=3)
                 Filter: (atpgconn2018 = 'Y'::bpchar)
                 Rows Removed by Filter: 3330000
Planning Time: 0.063 ms
Execution Time: 386.532 ms

Parallel Plan
Worker (W): each worker runs a copy of the plan fragment beneath the Gather node.

Leader (L): the leader runs the Gather node and the plan fragment on top of the Gather; it may also run the plan fragment beneath the Gather node.

[Figure: Finalize Aggregate on top of Gather; beneath the Gather, three copies of Partial Aggregate over Parallel Seq Scan, run by W, L, and W.]
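The split into Partial Aggregate and Finalize Aggregate can be mimicked outside PostgreSQL. The Python sketch below is illustrative only: the sample rows, the helper names `partial_count` and `finalize_count`, and the fixed round-robin split are assumptions, not PostgreSQL code. Three participants (W, L, W) each count the matching rows in their own share, and a final step adds up the partial counts.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_count(rows):
    """Partial Aggregate: count matching rows in one participant's share."""
    return sum(1 for flag in rows if flag == "Y")


def finalize_count(partials):
    """Finalize Aggregate: combine the partial counts collected by the leader."""
    return sum(partials)


# Hypothetical table: one row in every 1000 has the flag set.
people = ["Y" if i % 1000 == 0 else "N" for i in range(30_000)]

# Split the rows among 3 participants (W, L, W). A real parallel scan hands
# out blocks dynamically; a fixed round-robin split keeps the sketch short.
shares = [people[i::3] for i in range(3)]

with ThreadPoolExecutor(max_workers=3) as pool:  # stands in for Gather
    partials = list(pool.map(partial_count, shares))

total = finalize_count(partials)
print(partials, total)
```

The result equals a plain serial count because COUNT can be split into per-share counts that are summed afterwards.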

PostgreSQL Query Architecture

Parser → Rewriter → Planner → Executor

Infrastructure for parallelism: Processes, Memory, IPC, IO

Background worker processes

[Figure: the postgres server process forks a postgres backend process for each client; it also forks, and can kill, background workers.]

Dynamic Shared Memory


• Traditionally, PostgreSQL has a fixed-size shared memory segment mapped at the same address in all processes, inherited from the postmaster process.

• For parallel query execution, dynamic shared memory segments are used; they are extra shared memory, mapped at an arbitrary address in each backend and unmapped at the end of the query.

[Figure: Shared Memory containing the Buffer Pool, plus a DSM segment used by leader (L) and worker (W) processes.]

IPC and Message Propagation


Shared memory queues (shm_mq) carry control messages and tuples.

If a background worker generates an ERROR, WARNING, or other message, it can send that message to the leader, and the leader can receive it.

[Figure: the leader (L) reads from a tuple queue and an error queue for each worker (W); the queues live in the DSM segment.]
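As a rough analogy for the two queues per worker, the Python sketch below uses threads and `queue.Queue` in place of background workers and shm_mq. Everything in it (the `worker` and `leader` functions, the `DONE` marker, the sample rows) is a made-up illustration, not PostgreSQL's API.

```python
import queue
import threading

DONE = object()  # hypothetical end-of-stream marker


def worker(worker_id, rows, tuple_q, error_q):
    """Send tuples on one queue and error messages on the other."""
    try:
        for row in rows:
            if row is None:
                raise ValueError(f"worker {worker_id}: bad row")
            tuple_q.put(row)
    except ValueError as exc:
        error_q.put(("ERROR", str(exc)))  # propagate the message to the leader
    finally:
        tuple_q.put(DONE)  # tell the leader this worker is finished


def leader(shares):
    """Start one worker per share, then drain every tuple and error queue."""
    queues = [(queue.Queue(), queue.Queue()) for _ in shares]
    threads = [
        threading.Thread(target=worker, args=(i, rows, tq, eq))
        for i, (rows, (tq, eq)) in enumerate(zip(shares, queues))
    ]
    for t in threads:
        t.start()
    tuples, messages = [], []
    for tq, _ in queues:
        while (item := tq.get()) is not DONE:
            tuples.append(item)
    for t in threads:
        t.join()
    for _, eq in queues:
        while not eq.empty():
            messages.append(eq.get())
    return tuples, messages


tuples, messages = leader([[1, 2, 3], [4, None, 6]])
print(tuples, messages)
```

The second worker stops at its bad row, so the leader receives the tuples produced before the failure plus one error message.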
How are parallel queries executed?

parallel-aware node
A node with the Parallel prefix is a parallel-aware operator.
A parallel-oblivious node is one that is unaware that it is part of a parallel plan.

Parallel-aware: Parallel Seq Scan, Parallel Index Scan, Parallel Bitmap Heap Scan, Parallel Hash Join
Parallel-oblivious: Seq Scan, Index Scan, Bitmap Heap Scan, Hash Join

Parallel Seq Scan

How to allocate work for workers and leader?

Block by block: each process advances a shared next-block pointer to choose a block to scan.

[Figure: L, W, and W each take blocks from a row of 8 kB blocks; a shared "next" pointer marks the next unclaimed block.]
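The block-by-block scheme can be sketched with a shared counter. This Python version is a simplified model, assuming a lock-protected integer stands in for the shared next-block pointer; the class and function names are invented for the example.

```python
import threading


class ParallelScanState:
    """Shared scan state: a next-block pointer protected by a lock."""

    def __init__(self, nblocks):
        self.nblocks = nblocks
        self.next_block = 0
        self.lock = threading.Lock()

    def claim_block(self):
        """Atomically take the next unscanned block, or None when done."""
        with self.lock:
            if self.next_block >= self.nblocks:
                return None
            block = self.next_block
            self.next_block += 1
            return block


def scan(state, name, claimed):
    """Keep claiming blocks until the table is exhausted."""
    while (block := state.claim_block()) is not None:
        claimed[name].append(block)  # a real scan would read the 8 kB page here


state = ParallelScanState(nblocks=100)
claimed = {"L": [], "W1": [], "W2": []}
threads = [threading.Thread(target=scan, args=(state, n, claimed)) for n in claimed]
for t in threads:
    t.start()
for t in threads:
    t.join()

all_blocks = sorted(b for blocks in claimed.values() for b in blocks)
print(all_blocks == list(range(100)))
```

How many blocks each participant ends up with varies from run to run, but every block is scanned exactly once, and a fast participant simply claims more blocks.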

Parallel Index Scan


• Parallel index scans are supported only for B-tree indexes

• Each process advances a shared next-block pointer to choose an index block, then scans and returns all tuples referenced by that block

[Figure: L, W, and W each take index blocks from a row of 8 kB blocks; a shared "next" pointer marks the next unclaimed block.]

Parallel Bitmap Heap Scan

• Similar to Parallel Seq Scan, but scans only the pages that were found to potentially contain interesting tuples

• The bitmap is currently built by a single process; only the actual Parallel Bitmap Heap Scan is parallel-aware

[Figure: Parallel Bitmap Heap Scan on top of Bitmap Index Scan.]

Nested Loop Join

The inner side is always non-parallel. Although it is executed in full, this is efficient if the inner side is an index scan.

[Figure: Gather over three Nested Loop Joins; each joins a Parallel Seq Scan (outer side) with an Index Scan (inner side).]

Merge Join
The inner side is always a non-parallel plan and is therefore executed in full.

[Figure: Gather over three Merge Joins; each joins a Parallel Seq Scan (outer side) with an Index Scan (inner side).]

Merge Join

The merge join may be inefficient, especially if a sort must be performed, because the work and resulting data are duplicated in every cooperating process.

[Figure: Gather over three Merge Joins; each joins a Parallel Seq Scan (outer side) with a Sort over a full Seq Scan (inner side).]

Hash Join
The inner side is executed in full by every cooperating process to build identical copies of the hash table. This may be inefficient if the hash table is large or the plan is expensive.

[Figure: Gather over three Hash Joins; each joins a Parallel Seq Scan (outer side) with its own Hash node, which builds a private hash table from a full Seq Scan (inner side).]

Parallel Hash Join


The parallel hash divides the work of building a shared hash table among the cooperating processes.

[Figure: Gather over three Parallel Hash Joins; each joins a Parallel Seq Scan (outer side) with a Parallel Hash node fed by a Parallel Seq Scan (inner side); all three Parallel Hash nodes fill one shared hash table.]
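A minimal sketch of the shared build, assuming threads in place of processes and a lock-protected dictionary in place of the shared hash table. The sample tables, `NPROC`, and the `participant` function are illustrative assumptions. The barrier makes sure nobody probes until every participant has finished inserting its share of the inner rows.

```python
import threading
from collections import defaultdict

# Hypothetical inner and outer tables as (key, payload) rows.
inner = [(k, f"inner-{k}") for k in range(9)]
outer = [(i % 9, f"outer-{i}") for i in range(18)]

NPROC = 3
shared_table = defaultdict(list)         # the single shared hash table
table_lock = threading.Lock()
build_done = threading.Barrier(NPROC)    # separates build phase from probe phase
results = [[] for _ in range(NPROC)]


def participant(pid):
    # Build phase: insert only this participant's share of the inner rows.
    for key, payload in inner[pid::NPROC]:
        with table_lock:
            shared_table[key].append(payload)
    build_done.wait()  # wait until the shared table is complete
    # Probe phase: probe with this participant's share of the outer rows.
    for key, payload in outer[pid::NPROC]:
        for match in shared_table.get(key, []):
            results[pid].append((key, match, payload))


threads = [threading.Thread(target=participant, args=(p,)) for p in range(NPROC)]
for t in threads:
    t.start()
for t in threads:
    t.join()

joined = sorted(row for part in results for row in part)
print(len(joined))
```

Each inner row is inserted once in total, whereas the non-parallel Hash Join under Gather would insert every inner row once per process.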

Execution time of different hash joins

[Figure: execution timelines for three plans.
Serial Hash Join: one process runs Seq Scan on inner, builds the hash table, then runs Seq Scan on outer.
Hash Join under Gather: each of three processes runs a full Seq Scan on inner and builds its own hash table, then runs its share of the Parallel Seq Scan on outer.
Parallel Hash Join: the three processes share a Parallel Seq Scan on inner to build one hash table, then share the Parallel Seq Scan on outer.]

Partition-wise join
Divide and conquer for joins between partitioned tables.

[Figure: Append over three per-partition joins: Nested Loop Join (ta_p0, tb_p0), Merge Join (ta_p1, tb_p1), and Hash Join (ta_p2, tb_p2).]
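The idea can be shown with two small in-memory tables. This Python sketch assumes both tables are partitioned the same way on the join key (here `key % 3`); the table contents and the helpers `hash_join` and `partition` are invented for the illustration.

```python
def hash_join(left, right):
    """Join two lists of (key, value) rows on key."""
    table = {}
    for key, val in right:
        table.setdefault(key, []).append(val)
    return [(key, lval, rval) for key, lval in left for rval in table.get(key, [])]


def partition(rows, n=3):
    """Split rows into n partitions by key % n."""
    parts = [[] for _ in range(n)]
    for key, val in rows:
        parts[key % n].append((key, val))
    return parts


# Hypothetical tables ta and tb, partitioned identically on the join key.
ta = [(k, f"a{k}") for k in range(12)]
tb = [(k, f"b{k}") for k in range(0, 12, 2)]
ta_parts, tb_parts = partition(ta), partition(tb)

# Partition-wise join: join ta_pN only with tb_pN, then Append the results.
appended = []
for ta_p, tb_p in zip(ta_parts, tb_parts):
    appended.extend(hash_join(ta_p, tb_p))

print(sorted(appended) == sorted(hash_join(ta, tb)))
```

Joining partition by partition gives the same rows as joining the whole tables, because rows with equal keys always land in partitions with the same number. Each small join could use a different join method, as in the figure.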

Parallel Append

[Figure: three processes each run Parallel Append over the per-partition joins: Nested Loop Join (ta_p0, tb_p0), Merge Join (ta_p1, tb_p1), and Hash Join (ta_p2, tb_p2).]

How are parallel queries planned?

Cost-based Planner

① Think of all the ways we could execute a query
② Estimate the runtime of each path, then choose the cheapest path
③ Convert the path into a plan ready for execution

• For parallel query, introduce parallel-aware nodes and partial paths

• For partial paths, generate Gather/Gather Merge on top of them

[Figure: path, path, path, path, ... → cheapest path → plan for execution]

Parallel path

[Figure: a serial path, Nested Loop Join over Seq Scan and Index Scan, versus a parallel path, Gather over a partial Nested Loop Join over a partial Seq Scan and an Index Scan.]

Rule-based parallel degree
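
As a sketch of what such a rule can look like, the Python function below is modeled on how PostgreSQL 10 and 11 derive the worker count for a table scan from the table size: no workers below `min_parallel_table_scan_size`, then one more worker each time the table is three times larger, capped by `max_parallel_workers_per_gather`. The function name and the sample page counts are assumptions; the defaults assume 8 kB pages and an 8 MB minimum scan size.

```python
def parallel_workers(heap_pages, min_scan_pages=1024, max_workers_per_gather=2):
    """Rule-based worker count from table size (a sketch, not planner code).

    min_scan_pages=1024 assumes min_parallel_table_scan_size = 8MB
    with 8 kB pages.
    """
    if heap_pages < min_scan_pages:
        return 0  # table too small: no parallel scan
    workers = 1
    threshold = min_scan_pages
    while heap_pages >= threshold * 3:  # one more worker per tripling in size
        workers += 1
        threshold *= 3
    return min(workers, max_workers_per_gather)


# Hypothetical table sizes in pages, with the cap raised to 8 to show the rule.
for pages in (500, 1024, 3072, 9216, 50_000):
    print(pages, parallel_workers(pages, max_workers_per_gather=8))
```

With the default cap of 2, any sufficiently large table gets two workers, which fits the "Workers Planned: 2" line in the example plan.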



Costs
• SET parallel_setup_cost = 1000
– Cost of setting up shared memory for parallelism and launching workers
– Discourages parallel query for short queries

• SET parallel_tuple_cost = 0.1
– Cost of the CPU time to pass a tuple from a worker to the leader process
– Discourages parallel query if large amounts of results have to be sent back
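A worked check of how the two settings enter the cost of a Gather node, using the numbers from the parallel EXPLAIN output shown earlier. The formula is a sketch of the usual costing (setup cost added to the startup cost, tuple cost charged per row passed to the leader); the function name `gather_cost` is an assumption.

```python
def gather_cost(sub_startup, sub_total, rows,
                parallel_setup_cost=1000.0, parallel_tuple_cost=0.1):
    """Startup and total cost of a Gather placed on top of a partial path."""
    startup = sub_startup + parallel_setup_cost
    run = (sub_total - sub_startup) + parallel_tuple_cost * rows
    return round(startup, 2), round(startup + run, 2)


# Partial Aggregate in the example plan: cost=96389.55..96389.56.
# The Gather above it is estimated to return rows=2.
print(gather_cost(96389.55, 96389.56, rows=2))
```

The result, 97389.55 and 97389.76, matches the Gather line in the plan (cost=97389.55..97389.76). The setup cost of 1000 is negligible next to the serial plan's total of 169324.74, so the parallel plan wins here; for a query whose whole cost is a few hundred units it would not.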

Parallelism cannot be used in the following cases


• The query writes any data or locks any database rows
• CTEs (WITH ...)
• FULL OUTER JOINs
• SERIALIZABLE transaction isolation
• Functions marked PARALLEL UNSAFE are used
• DECLARE CURSOR

Future work
• More operators support parallelism, such as sort
• Dynamic repartitioning
• Cost-based planning of parallel degree?

References
• https://speakerdeck.com/macdice/parallelism-in-postgresql-11
• https://www.postgresql.org/docs/11/parallel-plans.html#PARALLEL-JOINS
• http://rhaas.blogspot.com/2013/10/parallelism-progress.html
• http://ashutoshpg.blogspot.com/2017/12/partition-wise-joins-divide-and-conquer.html
• http://www.gotw.ca/publications/concurrency-ddj.htm
• https://www.enterprisedb.com/blog/parallel-hash-postgresql
• https://write-skew.blogspot.com/2018/01/parallel-hash-for-postgresql.html
• http://amitkapila16.blogspot.com/2015/11/parallel-sequential-scans-in-play.html
• https://blog.2ndquadrant.com/parallel-aggregate/

Welcome to Alibaba Cloud
Database Technology Group.

Thanks
