PostgreSQL query planning and tuning
Federico Campoli
03 Mar 2016
Federico Campoli PostgreSQL query planning and tuning 03 Mar 2016 1 / 47
Table of contents
1 Jargon
2 I find your lack of plan disturbing
3 I love it when a plan comes together
4 EXPLAIN! EXPLAIN!
5 I fight for the users!
6 Wrap up
Jargon
source https://commons.wikimedia.org/wiki/File:Klingon Empire Flag.svg
OID: object ID, a 4-byte integer used to identify the system objects (tables, indices, etc.)
class: any relational object (table, index, view, sequence...)
attribute: a table field
execution plan: the steps required to execute a query
plan nodes: the steps of the execution plan
CBO: cost-based optimizer
cost: an arbitrary value used to score the plan nodes
I find your lack of plan disturbing
source http://apbialek.deviantart.com/art/Darth-Vader-171921375
Query stages
Query execution requires four stages:
1 Syntax validation
2 Query tree generation
3 Plan estimation
4 Execution
Syntax validation
The query parser validates the query syntax using fixed rules.
Any error at this step will cause the execution to stop, returning a syntax error.
This early stage doesn’t require access to the system catalogue.
The parser returns a normalised parse tree for the next step.
The query tree
In the second stage the query parser looks up the system catalogue and translates
the parse tree into the query tree.
The query tree is the query’s logical representation, in which every object involved
is uniquely identified by its object ID (OID).
The planner
The next stage is the query planner. The parser sends the generated query tree to
the planner. The query planner reads the tree and generates the possible
execution plans. The planner, using the internal statistics, determines the
estimated cost of each plan.
The execution plan with the minimum estimated cost is sent to the executor.
Old or missing statistics will result in suboptimal execution plans and therefore
slow queries.
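The selection step can be sketched in a few lines of Python. This is only an illustration, not PostgreSQL's implementation: the Plan class and the cheapest_plan function are hypothetical names, and the two costs are the estimates shown by the EXPLAIN listings in this deck for the filter i_id > 2.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    """A candidate execution plan with its estimated total cost."""
    description: str
    total_cost: float


def cheapest_plan(candidates):
    """Return the candidate with the lowest estimated total cost."""
    return min(candidates, key=lambda plan: plan.total_cost)


# Estimated total costs reported by EXPLAIN for "WHERE i_id > 2" in this deck.
candidates = [
    Plan("Index Scan using idx_i_id on t_test", 49.76),
    Plan("Seq Scan on t_test", 21.50),
]

chosen = cheapest_plan(candidates)
print(chosen.description)
```

Because the choice depends only on the estimates, a wrong estimate (for example from stale statistics) can make the planner prefer a plan that is slower in practice.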
The executor
The executor runs the plan nodes of the execution plan generated by the
planner.
The workflow
Figure: The query stages
I love it when a plan comes together
source http://clipperdata.com/blogpost/i-love-it-when-a-plan-comes-together/i-love-it-when-a-plan-comes-together/
The plan nodes
Scan nodes: used by the executor to retrieve data from the relations
Join nodes: used by the executor to perform joins of the data streams
seq scan
seq scan: reads the table sequentially and discards the rows that don’t match
the filter. The output is a data stream. Returns unsorted rows.
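A minimal Python sketch of the idea, assuming a table modelled as a list of pages, each page a list of (i_id, v_value) rows. The names seq_scan and heap_pages and the sample rows are illustrative only.

```python
def seq_scan(heap_pages, predicate):
    """Yield matching rows in physical order, reading every page once."""
    for page in heap_pages:
        for row in page:
            if predicate(row):
                yield row


# Hypothetical table: two pages of (i_id, v_value) rows in storage order.
heap_pages = [
    [(3, "c"), (1, "a")],
    [(20, "t"), (2, "b")],
]

# Keep the rows with i_id > 1; the result follows storage order, not key order.
matches = list(seq_scan(heap_pages, lambda row: row[0] > 1))
print(matches)
```

Every page is read regardless of how many rows match, which is why the cost of this node grows with the size of the table rather than with the size of the result.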
index scan
index scan: reads the index tree with random disk reads and fetches the heap blocks
pointed to by the index. Returns sorted rows.
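A rough Python sketch of the access pattern. As a simplifying assumption, a sorted list of (key, (page number, offset)) entries stands in for the B-tree; index_scan, index and heap_pages are illustrative names, not PostgreSQL structures.

```python
from bisect import bisect_left, bisect_right


def index_scan(index, heap_pages, low, high):
    """Yield heap rows whose key lies in [low, high], in index (key) order."""
    keys = [key for key, _ in index]
    start = bisect_left(keys, low)
    stop = bisect_right(keys, high)
    for _, (page_no, offset) in index[start:stop]:
        # Consecutive pointers may land on different pages: random heap reads.
        yield heap_pages[page_no][offset]


# Hypothetical table: two pages of (i_id, v_value) rows in storage order.
heap_pages = [
    [(3, "c"), (1, "a")],
    [(20, "t"), (2, "b")],
]
# Index entries sorted by key, each pointing at (page number, offset in page).
index = [(1, (0, 1)), (2, (1, 1)), (3, (0, 0)), (20, (1, 0))]

rows = list(index_scan(index, heap_pages, 2, 20))
print(rows)
```

The rows come back in key order because the index is walked in order, while the heap is visited in whatever order the pointers dictate.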
index only scan
index only scan: reads the index tree with random disk reads and returns the data
without accessing the heap pages. Returns sorted rows.
bitmap index/heap scan
bitmap index/heap scan: reads the index sequentially, generating a bitmap that is
then used to visit and recheck the heap pages. It’s a good compromise between a seq
scan and a full index scan. Returns unsorted rows.
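The two phases can be sketched in Python as follows. This is a simplified model that assumes a page-level bitmap, so every row on a flagged page has to be rechecked; bitmap_heap_scan, index and heap_pages are illustrative names.

```python
def bitmap_heap_scan(index, heap_pages, key_matches):
    """Build a page bitmap from the index, then read flagged pages in order."""
    # Phase 1 (bitmap index scan): note which heap pages hold candidate rows.
    flagged_pages = {
        page_no for key, (page_no, _offset) in index if key_matches(key)
    }
    # Phase 2 (bitmap heap scan): read each flagged page once, in physical
    # order, and recheck the condition on every row found there.
    for page_no in sorted(flagged_pages):
        for row in heap_pages[page_no]:
            if key_matches(row[0]):
                yield row


# Hypothetical table: two pages of (i_id, v_value) rows in storage order.
heap_pages = [
    [(3, "c"), (1, "a")],
    [(20, "t"), (2, "b")],
]
# Index entries sorted by key, each pointing at (page number, offset in page).
index = [(1, (0, 1)), (2, (1, 1)), (3, (0, 0)), (20, (1, 0))]

result = list(bitmap_heap_scan(index, heap_pages, lambda key: key >= 2))
print(result)
```

Each heap page is read at most once and in physical order, which is why the output is not sorted by key.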
sort
sort: reads the rows and returns them in order, as in queries with an
ORDER BY
nested loop
nested loop join: The right relation is scanned once for every row found in the left
relation. This strategy is easy to implement but can be very time consuming.
However, if the right relation can be scanned with an index scan, this can be a
good strategy. It is possible to use values from the current row of the left relation
as keys for the index scan of the right.
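Both variants can be sketched in Python. The sample relations and the function names are hypothetical, and a dict keyed on the join attribute stands in for an index on the right relation.

```python
def nested_loop_join(left_rows, right_rows):
    """Scan the whole right relation once for every left row."""
    for left in left_rows:
        for right in right_rows:
            if left[0] == right[0]:
                yield left + right[1:]


def indexed_nested_loop_join(left_rows, right_index):
    """Use the current left row's key to probe an index on the right side."""
    for left in left_rows:
        for right in right_index.get(left[0], []):
            yield left + right[1:]


# Hypothetical relations; the first field is the join attribute.
orders = [(1, "order-a"), (2, "order-b"), (2, "order-c")]
customers = [(1, "alice"), (2, "bob"), (3, "carol")]

# Build the stand-in index on the right relation.
customer_index = {}
for row in customers:
    customer_index.setdefault(row[0], []).append(row)

plain = list(nested_loop_join(orders, customers))
probed = list(indexed_nested_loop_join(orders, customer_index))
print(plain == probed)
```

The plain version compares every pair of rows, while the probed version only touches the right rows that share the current left key.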
hash join
hash join: the right relation is first scanned and loaded into a hash table, using its
join attributes as hash keys. Next the left relation is scanned and the appropriate
values of every row found are used as hash keys to locate the matching rows in
the table.
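A minimal Python sketch of the build and probe phases described above. The function name and the sample relations are hypothetical; a real hash join also has to cope with hash tables that do not fit in memory, which this sketch ignores.

```python
from collections import defaultdict


def hash_join(left_rows, right_rows):
    """Build a hash table on the right relation, then probe it with the left."""
    hash_table = defaultdict(list)
    for right in right_rows:  # build phase: one pass over the right relation
        hash_table[right[0]].append(right)
    for left in left_rows:  # probe phase: one pass over the left relation
        for right in hash_table.get(left[0], []):
            yield left + right[1:]


# Hypothetical relations; the first field is the join attribute.
left_relation = [(1, "order-a"), (2, "order-b"), (2, "order-c"), (9, "order-z")]
right_relation = [(1, "alice"), (2, "bob"), (3, "carol")]

joined = list(hash_join(left_relation, right_relation))
print(joined)
```

Each relation is read once, at the price of holding the right relation's hash table in memory.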
merge join
merge join: Each relation is sorted on the join attributes before the join starts.
Then the two relations are scanned in parallel, and matching rows are combined to
form join rows. This kind of join is more attractive because each relation has to
be scanned only once. The required sorting might be achieved either by an explicit
sort step, or by scanning the relation in the proper order using an index on the
join key.
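The merge step can be sketched in Python as below. The explicit sorted() calls play the role of the sort step; with an index on the join key the inputs would already arrive in order. Function name and sample data are hypothetical.

```python
def merge_join(left_rows, right_rows):
    """Sort both inputs on the join key, then advance through them together."""
    left = sorted(left_rows, key=lambda row: row[0])
    right = sorted(right_rows, key=lambda row: row[0])
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i][0] < right[j][0]:
            i += 1
        elif left[i][0] > right[j][0]:
            j += 1
        else:
            key = left[i][0]
            # Find the run of right rows sharing this key.
            j_end = j
            while j_end < len(right) and right[j_end][0] == key:
                j_end += 1
            # Combine every left row with that key with the whole right run.
            while i < len(left) and left[i][0] == key:
                for match in right[j:j_end]:
                    yield left[i] + match[1:]
                i += 1
            j = j_end


# Hypothetical unsorted relations; the first field is the join attribute.
left_relation = [(2, "order-b"), (1, "order-a"), (2, "order-c")]
right_relation = [(3, "carol"), (2, "bob"), (1, "alice")]

merged = list(merge_join(left_relation, right_relation))
print(merged)
```

After sorting, neither input is ever rewound past the current key, so each relation is scanned only once.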
EXPLAIN! EXPLAIN!
source http://keepcalmandwatchdoctorwho.tumblr.com/post/5450959631/the-daleks-personally-i-like-it-when-they-shout
EXPLAIN
Prepending EXPLAIN to any query will display the query’s estimated
execution plan.
The ANALYZE clause executes the query, discards the results and returns
the real execution plan.
Using EXPLAIN ANALYZE with DML queries will change the data. To avoid this,
wrap the EXPLAIN ANALYZE between BEGIN; and ROLLBACK;
Explain in action
For our example we’ll create a test table with two fields, a serial and a character
varying.
test=# CREATE TABLE t_test
(
    i_id serial,
    v_value character varying(50)
)
;
CREATE TABLE
Now let’s add some rows to our table
Listing 1: Insert in table
test=# INSERT INTO t_test
    (v_value)
SELECT
    v_value
FROM
    (
        SELECT
            generate_series(1,1000) as i_cnt,
            md5(random()::text) as v_value
    ) t_gen
;
INSERT 0 1000
Let’s generate the estimated plan for a query returning one row
test=# EXPLAIN SELECT * FROM t_test WHERE i_id=20;
                       QUERY PLAN
---------------------------------------------------------
 Seq Scan on t_test  (cost=0.00..16.62 rows=3 width=122)
   Filter: (i_id = 20)
(2 rows)
The cost is just an arbitrary value
The values after cost are the startup cost and the total cost
The startup cost is the cost to deliver the first row to the next step
The total cost is the cost to deliver all the rows to the next step
The value in rows is the planner’s estimate of the total rows returned by
the plan node
The value in width is the estimated average row width in bytes
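As a worked example of where such a number comes from: for a sequential scan with a single filter condition the total cost reduces to (pages * seq_page_cost) + (rows * (cpu_tuple_cost + cpu_operator_cost)). The Python sketch below uses PostgreSQL's default values for the three parameters. The page count of 9 and the row guess of 610 are assumptions inferred from the listings, not figures stated in the slides.

```python
# Planner cost parameters at their PostgreSQL defaults.
SEQ_PAGE_COST = 1.0
CPU_TUPLE_COST = 0.01
CPU_OPERATOR_COST = 0.0025


def seq_scan_total_cost(pages, tuples):
    """Estimated total cost of a sequential scan with one filter operator."""
    io_cost = pages * SEQ_PAGE_COST
    cpu_cost = tuples * (CPU_TUPLE_COST + CPU_OPERATOR_COST)
    return io_cost + cpu_cost


# Assumptions: t_test occupies 9 pages; the planner guesses about 610 rows
# while no statistics exist and knows about 1000 rows once they are gathered.
before_stats = seq_scan_total_cost(pages=9, tuples=610)
after_stats = seq_scan_total_cost(pages=9, tuples=1000)
print(f"{before_stats:.3f} {after_stats:.3f}")
```

Under these assumptions the first value is 16.625, in line with the 16.62 shown above, and the second is 21.5, in line with the 21.50 reported by the later listings once the statistics are in place.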
EXPLAIN ANALYZE
Now let’s generate the real execution plan for the same one-row query
test=# EXPLAIN ANALYZE SELECT * FROM t_test WHERE i_id=20;
                                 QUERY PLAN
-----------------------------------------------------------------------------
 Seq Scan on t_test  (cost=0.00..21.50 rows=1 width=37) (actual time=0.022..0.262 rows=1 loops=1)
   Filter: (i_id = 20)
   Rows Removed by Filter: 999
 Planning time: 0.066 ms
 Execution time: 0.286 ms
(5 rows)
The values in actual time are the times, in milliseconds, taken to return the
first row (startup) and all the rows (total)
The value in rows is the number of rows returned by the step
The value in loops is the number of times the step is executed
At the bottom we have the planning time and the total execution time
Indices
Let’s add an index on the i_id field
test=# CREATE INDEX idx_i_id ON t_test (i_id);
CREATE INDEX
test=# EXPLAIN ANALYZE SELECT * FROM t_test WHERE i_id=20;
                                 QUERY PLAN
-----------------------------------------------------------------------------
 Index Scan using idx_i_id on t_test  (cost=0.28..8.29 rows=1 width=37) (actual time=0.035..0.036 rows=1 loops=1)
   Index Cond: (i_id = 20)
 Planning time: 0.252 ms
 Execution time: 0.058 ms
(4 rows)
The query is several times faster: the execution time drops from 0.286 ms to 0.058 ms.
Controlling the planner
The cost-based optimiser becomes cleverer with each major release. For example, if
the query’s filter returns almost the entire table, the database will choose the
sequential scan, because by default a sequential page read is costed 4 times cheaper
than a random one (seq_page_cost is 1.0, random_page_cost is 4.0).
test=# EXPLAIN ANALYZE SELECT * FROM t_test WHERE i_id>2;
                                 QUERY PLAN
-----------------------------------------------------------------------------
 Seq Scan on t_test  (cost=0.00..21.50 rows=999 width=37) (actual time=0.012..0.467 rows=998 loops=1)
   Filter: (i_id > 2)
   Rows Removed by Filter: 2
 Planning time: 0.142 ms
 Execution time: 0.652 ms
(5 rows)
test=# SET enable_seqscan='off';
SET
test=# EXPLAIN ANALYZE SELECT * FROM t_test WHERE i_id>2;
                                 QUERY PLAN
-----------------------------------------------------------------------------
 Index Scan using idx_i_id on t_test  (cost=0.28..49.76 rows=999 width=37) (actual time=0.029..0.544 rows=998 loops=1)
   Index Cond: (i_id > 2)
 Planning time: 0.145 ms
 Execution time: 0.741 ms
(4 rows)
I fight for the users!
source http://tron.wikia.com/wiki/Tron
ANALYZE
ANALYZE gathers statistics by running random reads on the tables
The statistics are stored in the pg_statistic system table
The view pg_stats presents the statistics in a human-readable format
The parameter default_statistics_target sets the limit for the random read
The statistics target can be fine-tuned per column
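To give a feel for the size of that limit: for the standard column types ANALYZE aims to sample 300 rows per unit of statistics target, taking the largest target among the table's columns. The Python sketch below only reproduces that arithmetic. The function name and the per-column settings are hypothetical, with None meaning "use default_statistics_target".

```python
DEFAULT_STATISTICS_TARGET = 100  # PostgreSQL default


def analyze_sample_rows(column_targets, default_target=DEFAULT_STATISTICS_TARGET):
    """Rows ANALYZE aims to sample: 300 times the largest column target."""
    effective = [
        default_target if target is None else target
        for target in column_targets.values()
    ]
    return 300 * max(effective)


# Hypothetical settings for t_test: every column on the default target...
all_default = analyze_sample_rows({"i_id": None, "v_value": None})
# ...and the same table with the target of v_value raised to 500.
one_raised = analyze_sample_rows({"i_id": None, "v_value": 500})
print(all_default, one_raised)
```

Raising the target of a single column therefore increases the sample taken for the whole table, which makes ANALYZE slower but gives the planner finer-grained statistics.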
Table statistics
Before starting the performance analysis, check that the statistics are up to date
by querying the view pg_stat_all_tables
test=# SELECT * FROM pg_stat_all_tables WHERE relname='t_test';
-[ RECORD 1 ]-------+------------------------------
relid               | 16546
schemaname          | public
relname             | t_test
seq_scan            | 6
seq_tup_read        | 5000
idx_scan            | 7
idx_tup_fetch       | 2980
n_tup_ins           | 1000
n_tup_upd           | 0
n_tup_del           | 0
n_tup_hot_upd       | 0
n_live_tup          | 1000
n_dead_tup          | 0
n_mod_since_analyze | 0
last_vacuum         |
last_autovacuum     |
last_analyze        |
last_autoanalyze    | 2015-09-24 04:51:51.622906+00
vacuum_count        | 0
autovacuum_count    | 0
analyze_count       | 0
autoanalyze_count   | 1
index statistics
Before starting the performance analysis, check that the indices are used by
querying the view pg_stat_all_indexes
test=# SELECT * FROM pg_stat_all_indexes WHERE relname='t_test';
-[ RECORD 1 ]-+---------
relid         | 16546
indexrelid    | 16552
schemaname    | public
relname       | t_test
indexrelname  | idx_i_id
idx_scan      | 7
idx_tup_read  | 2980
idx_tup_fetch | 2980
Controlling the planner
enable_bitmapscan
enable_hashagg
enable_hashjoin
enable_indexscan
enable_indexonlyscan
enable_material
enable_mergejoin
enable_nestloop
enable_seqscan
enable_sort
enable_tidscan
Wrap up
Questions?
Boring legal stuff
The join node descriptions are excerpts from the PostgreSQL 9.4 online manual.
Copyright by PostgreSQL Global Development Group.
http://www.postgresql.org/
The scan node images are derived from the pgadmin3 scan nodes. Copyright by
pgadmin development group. http://www.pgadmin.org/
The copyright of all the images is owned by the respective authors. The source with
the author’s attribution is provided as a link alongside each image.
Contacts and license
Twitter: 4thdoctor_scarf
Blog: http://www.pgdba.co.uk
Brighton PostgreSQL Meetup:
http://www.meetup.com/Brighton-PostgreSQL-Meetup/
This document is distributed under the terms of the Creative Commons
Federico Campoli PostgreSQL query planning and tuning 03 Mar 2016 46 / 47
Wrap up
PostgreSQL query planning and tuning
Federico Campoli
03 Mar 2016
Federico Campoli PostgreSQL query planning and tuning 03 Mar 2016 47 / 47

More Related Content

PPTX
PostGreSQL Performance Tuning
Maven Logix
 
PDF
Postgresql database administration volume 1
Federico Campoli
 
PDF
PostgreSQL Performance Tuning
elliando dias
 
PDF
How the Postgres Query Optimizer Works
EDB
 
PDF
Mastering PostgreSQL Administration
EDB
 
PDF
SQLServer Database Structures
Antonios Chatzipavlis
 
PDF
PostgreSQL WAL for DBAs
PGConf APAC
 
PPTX
Introduction to SQL Server Internals: How to Think Like the Engine
Brent Ozar
 
PostGreSQL Performance Tuning
Maven Logix
 
Postgresql database administration volume 1
Federico Campoli
 
PostgreSQL Performance Tuning
elliando dias
 
How the Postgres Query Optimizer Works
EDB
 
Mastering PostgreSQL Administration
EDB
 
SQLServer Database Structures
Antonios Chatzipavlis
 
PostgreSQL WAL for DBAs
PGConf APAC
 
Introduction to SQL Server Internals: How to Think Like the Engine
Brent Ozar
 

What's hot (20)

PPTX
PostgreSQL- An Introduction
Smita Prasad
 
PDF
Fine Tuning and Enhancing Performance of Apache Spark Jobs
Databricks
 
PDF
5 Steps to PostgreSQL Performance
Command Prompt., Inc
 
PDF
What is new in PostgreSQL 14?
Mydbops
 
PDF
Understanding PostgreSQL LW Locks
Jignesh Shah
 
PDF
Linux tuning to improve PostgreSQL performance
PostgreSQL-Consulting
 
PDF
InnoDB Performance Optimisation
Mydbops
 
ODP
The PostgreSQL Query Planner
Command Prompt., Inc
 
ODP
PostgreSQL Administration for System Administrators
Command Prompt., Inc
 
PDF
Get to know PostgreSQL!
Oddbjørn Steffensen
 
PDF
Top 5 Mistakes When Writing Spark Applications
Spark Summit
 
PDF
PGConf.ASIA 2017 Logical Replication Internals (English)
Noriyoshi Shinoda
 
PDF
Flexible Indexing with Postgres
EDB
 
PDF
Solving PostgreSQL wicked problems
Alexander Korotkov
 
PDF
Introduction to Cassandra
Gokhan Atil
 
PDF
Spark SQL: Another 16x Faster After Tungsten: Spark Summit East talk by Brad ...
Spark Summit
 
PDF
Deep Dive into the New Features of Apache Spark 3.0
Databricks
 
PDF
How to Design Indexes, Really
Karwin Software Solutions LLC
 
PDF
Vacuum in PostgreSQL
Rafia Sabih
 
PPTX
Internal Hive
Recruit Technologies
 
PostgreSQL- An Introduction
Smita Prasad
 
Fine Tuning and Enhancing Performance of Apache Spark Jobs
Databricks
 
5 Steps to PostgreSQL Performance
Command Prompt., Inc
 
What is new in PostgreSQL 14?
Mydbops
 
Understanding PostgreSQL LW Locks
Jignesh Shah
 
Linux tuning to improve PostgreSQL performance
PostgreSQL-Consulting
 
InnoDB Performance Optimisation
Mydbops
 
The PostgreSQL Query Planner
Command Prompt., Inc
 
PostgreSQL Administration for System Administrators
Command Prompt., Inc
 
Get to know PostgreSQL!
Oddbjørn Steffensen
 
Top 5 Mistakes When Writing Spark Applications
Spark Summit
 
PGConf.ASIA 2017 Logical Replication Internals (English)
Noriyoshi Shinoda
 
Flexible Indexing with Postgres
EDB
 
Solving PostgreSQL wicked problems
Alexander Korotkov
 
Introduction to Cassandra
Gokhan Atil
 
Spark SQL: Another 16x Faster After Tungsten: Spark Summit East talk by Brad ...
Spark Summit
 
Deep Dive into the New Features of Apache Spark 3.0
Databricks
 
How to Design Indexes, Really
Karwin Software Solutions LLC
 
Vacuum in PostgreSQL
Rafia Sabih
 
Internal Hive
Recruit Technologies
 
Ad

Viewers also liked (20)

PDF
Pg chameleon MySQL to PostgreSQL replica
Federico Campoli
 
PDF
Backup recovery with PostgreSQL
Federico Campoli
 
PDF
The ninja elephant, scaling the analytics database in Transwerwise
Federico Campoli
 
PDF
a look at the postgresql engine
Federico Campoli
 
PDF
PostgreSQL, performance for queries with grouping
Alexey Bashtanov
 
PDF
Don't panic! - Postgres introduction
Federico Campoli
 
PDF
Streaming replication
Federico Campoli
 
PDF
Pg big fast ugly acid
Federico Campoli
 
PDF
Life on a_rollercoaster
Federico Campoli
 
PDF
PostgreSQL - backup and recovery with large databases
Federico Campoli
 
PDF
The ninja elephant, scaling the analytics database in Transwerwise
Federico Campoli
 
DOCX
Nclex 5
cpate35
 
PDF
Is Display Advertising Still a Healthy Form of Marketing?
Rise Interactive
 
PPTX
MuCEM
Oban_
 
PPS
Su strade final
Sanjeev J. Nair
 
PDF
PostgreSQL, The Big, The Fast and The Ugly
Federico Campoli
 
PPTX
καστορια
Katia Sleazy-Snake
 
PDF
Vacuum precision positioning systems brochure
John Mike
 
PPTX
Menopause nine by sanjana
sanjukpt92
 
Pg chameleon MySQL to PostgreSQL replica
Federico Campoli
 
Backup recovery with PostgreSQL
Federico Campoli
 
The ninja elephant, scaling the analytics database in Transwerwise
Federico Campoli
 
a look at the postgresql engine
Federico Campoli
 
PostgreSQL, performance for queries with grouping
Alexey Bashtanov
 
Don't panic! - Postgres introduction
Federico Campoli
 
Streaming replication
Federico Campoli
 
Pg big fast ugly acid
Federico Campoli
 
Life on a_rollercoaster
Federico Campoli
 
PostgreSQL - backup and recovery with large databases
Federico Campoli
 
The ninja elephant, scaling the analytics database in Transwerwise
Federico Campoli
 
Nclex 5
cpate35
 
Is Display Advertising Still a Healthy Form of Marketing?
Rise Interactive
 
MuCEM
Oban_
 
Su strade final
Sanjeev J. Nair
 
PostgreSQL, The Big, The Fast and The Ugly
Federico Campoli
 
καστορια
Katia Sleazy-Snake
 
Vacuum precision positioning systems brochure
John Mike
 
Menopause nine by sanjana
sanjukpt92
 
Ad

Similar to PostgreSql query planning and tuning (15)

PDF
Hitchikers guide handout
Federico Campoli
 
DOCX
Predictive performance analysis using sql pattern matching
Horia Berca
 
DOCX
Learning ObjectivesGain some experience using dynamic data structu.docx
jesseniasaddler
 
PDF
MachineLearning_MPI_vs_Spark
Xudong Brandon Liang
 
PDF
Understanding the firebird optimizer
Marius Adrian Popa
 
PDF
Ontop: Answering SPARQL Queries over Relational Databases
Guohui Xiao
 
PPTX
Cassandra Summit Sept 2015 - Real Time Advanced Analytics with Spark and Cass...
Chris Fregly
 
DOC
1.2M .pdf
butest
 
PPTX
SPSS statistic basic guide.pptx
Roshina Rabail
 
PDF
readme.pdf
ssuserbf6a4f
 
PDF
C interview-questions-techpreparation
Kushaal Singla
 
DOCX
Parallel programming Comparisions
Muhammad Bilal Khan
 
PPTX
Word extraction
Aditya Jagtap
 
PDF
Relational Theory for Budding Einsteins -- LonestarPHP 2016
Dave Stokes
 
RTF
Informatica faq's
Rajasekhar Cherukuri
 
Hitchikers guide handout
Federico Campoli
 
Predictive performance analysis using sql pattern matching
Horia Berca
 
Learning ObjectivesGain some experience using dynamic data structu.docx
jesseniasaddler
 
MachineLearning_MPI_vs_Spark
Xudong Brandon Liang
 
Understanding the firebird optimizer
Marius Adrian Popa
 
Ontop: Answering SPARQL Queries over Relational Databases
Guohui Xiao
 
Cassandra Summit Sept 2015 - Real Time Advanced Analytics with Spark and Cass...
Chris Fregly
 
1.2M .pdf
butest
 
SPSS statistic basic guide.pptx
Roshina Rabail
 
readme.pdf
ssuserbf6a4f
 
C interview-questions-techpreparation
Kushaal Singla
 
Parallel programming Comparisions
Muhammad Bilal Khan
 
Word extraction
Aditya Jagtap
 
Relational Theory for Budding Einsteins -- LonestarPHP 2016
Dave Stokes
 
Informatica faq's
Rajasekhar Cherukuri
 

More from Federico Campoli (6)

PDF
Pg chameleon, mysql to postgresql replica made easy
Federico Campoli
 
PDF
pg_chameleon MySQL to PostgreSQL replica made easy
Federico Campoli
 
PDF
pg_chameleon a MySQL to PostgreSQL replica
Federico Campoli
 
PDF
The hitchhiker's guide to PostgreSQL
Federico Campoli
 
PDF
PostgreSQL, the big the fast and the (NOSQL on) Acid
Federico Campoli
 
PDF
A couple of things about PostgreSQL...
Federico Campoli
 
Pg chameleon, mysql to postgresql replica made easy
Federico Campoli
 
pg_chameleon MySQL to PostgreSQL replica made easy
Federico Campoli
 
pg_chameleon a MySQL to PostgreSQL replica
Federico Campoli
 
The hitchhiker's guide to PostgreSQL
Federico Campoli
 
PostgreSQL, the big the fast and the (NOSQL on) Acid
Federico Campoli
 
A couple of things about PostgreSQL...
Federico Campoli
 

Recently uploaded (20)

PDF
Tea4chat - another LLM Project by Kerem Atam
a0m0rajab1
 
PDF
Software Development Company | KodekX
KodekX
 
PDF
Accelerating Oracle Database 23ai Troubleshooting with Oracle AHF Fleet Insig...
Sandesh Rao
 
PDF
CIFDAQ's Market Wrap : Bears Back in Control?
CIFDAQ
 
PDF
A Day in the Life of Location Data - Turning Where into How.pdf
Precisely
 
PDF
Trying to figure out MCP by actually building an app from scratch with open s...
Julien SIMON
 
PDF
Unlocking the Future- AI Agents Meet Oracle Database 23ai - AIOUG Yatra 2025.pdf
Sandesh Rao
 
PDF
Using Anchore and DefectDojo to Stand Up Your DevSecOps Function
Anchore
 
PPTX
Comunidade Salesforce São Paulo - Desmistificando o Omnistudio (Vlocity)
Francisco Vieira Júnior
 
PDF
Cloud-Migration-Best-Practices-A-Practical-Guide-to-AWS-Azure-and-Google-Clou...
Artjoker Software Development Company
 
PDF
Data_Analytics_vs_Data_Science_vs_BI_by_CA_Suvidha_Chaplot.pdf
CA Suvidha Chaplot
 
PPTX
The-Ethical-Hackers-Imperative-Safeguarding-the-Digital-Frontier.pptx
sujalchauhan1305
 
PDF
Structs to JSON: How Go Powers REST APIs
Emily Achieng
 
PDF
Doc9.....................................
SofiaCollazos
 
PDF
Google I/O Extended 2025 Baku - all ppts
HusseinMalikMammadli
 
PPTX
New ThousandEyes Product Innovations: Cisco Live June 2025
ThousandEyes
 
PPTX
ChatGPT's Deck on The Enduring Legacy of Fax Machines
Greg Swan
 
PPTX
cloud computing vai.pptx for the project
vaibhavdobariyal79
 
PPTX
IoT Sensor Integration 2025 Powering Smart Tech and Industrial Automation.pptx
Rejig Digital
 
PPTX
How to Build a Scalable Micro-Investing Platform in 2025 - A Founder’s Guide ...
Third Rock Techkno
 
Tea4chat - another LLM Project by Kerem Atam
a0m0rajab1
 
Software Development Company | KodekX
KodekX
 
Accelerating Oracle Database 23ai Troubleshooting with Oracle AHF Fleet Insig...
Sandesh Rao
 
CIFDAQ's Market Wrap : Bears Back in Control?
CIFDAQ
 
A Day in the Life of Location Data - Turning Where into How.pdf
Precisely
 
Trying to figure out MCP by actually building an app from scratch with open s...
Julien SIMON
 
Unlocking the Future- AI Agents Meet Oracle Database 23ai - AIOUG Yatra 2025.pdf
Sandesh Rao
 
Using Anchore and DefectDojo to Stand Up Your DevSecOps Function
Anchore
 
Comunidade Salesforce São Paulo - Desmistificando o Omnistudio (Vlocity)
Francisco Vieira Júnior
 
Cloud-Migration-Best-Practices-A-Practical-Guide-to-AWS-Azure-and-Google-Clou...
Artjoker Software Development Company
 
Data_Analytics_vs_Data_Science_vs_BI_by_CA_Suvidha_Chaplot.pdf
CA Suvidha Chaplot
 
The-Ethical-Hackers-Imperative-Safeguarding-the-Digital-Frontier.pptx
sujalchauhan1305
 
Structs to JSON: How Go Powers REST APIs
Emily Achieng
 
Doc9.....................................
SofiaCollazos
 
Google I/O Extended 2025 Baku - all ppts
HusseinMalikMammadli
 
New ThousandEyes Product Innovations: Cisco Live June 2025
ThousandEyes
 
ChatGPT's Deck on The Enduring Legacy of Fax Machines
Greg Swan
 
cloud computing vai.pptx for the project
vaibhavdobariyal79
 
IoT Sensor Integration 2025 Powering Smart Tech and Industrial Automation.pptx
Rejig Digital
 
How to Build a Scalable Micro-Investing Platform in 2025 - A Founder’s Guide ...
Third Rock Techkno
 

PostgreSql query planning and tuning

  • 1. PostgreSQL query planning and tuning Federico Campoli 03 Mar 2016 Federico Campoli PostgreSQL query planning and tuning 03 Mar 2016 1 / 47
  • 2. Table of contents 1 Jargon 2 I find your lack of plan disturbing 3 I love it when a plan comes together 4 EXPLAIN! EXPLAIN! 5 I fight for the users! 6 Wrap up Federico Campoli PostgreSQL query planning and tuning 03 Mar 2016 2 / 47
  • 3. Jargon Table of contents 1 Jargon 2 I find your lack of plan disturbing 3 I love it when a plan comes together 4 EXPLAIN! EXPLAIN! 5 I fight for the users! 6 Wrap up Federico Campoli PostgreSQL query planning and tuning 03 Mar 2016 3 / 47
  • 4. Jargon Jargon source https://commons.wikimedia.org/wiki/File:Klingon Empire Flag.svg Federico Campoli PostgreSQL query planning and tuning 03 Mar 2016 4 / 47
  • 5. Jargon Jargon Jargon OID Object ID, 4 bytes integer. Used to identify the system objects (tables, indices etc) class any relational object, table, index, view, sequence... attribute the table fields execution plan the steps required for executing a query plan nodes execution plan’s steps CBO cost based optimizer cost arbitrary value used to determine a score for the plan nodes Federico Campoli PostgreSQL query planning and tuning 03 Mar 2016 5 / 47
  • 6. I find your lack of plan disturbing Table of contents 1 Jargon 2 I find your lack of plan disturbing 3 I love it when a plan comes together 4 EXPLAIN! EXPLAIN! 5 I fight for the users! 6 Wrap up Federico Campoli PostgreSQL query planning and tuning 03 Mar 2016 6 / 47
  • 7. I find your lack of plan disturbing I find your lack of plan disturbing source http://apbialek.deviantart.com/art/Darth-Vader-171921375 Federico Campoli PostgreSQL query planning and tuning 03 Mar 2016 7 / 47
  • 8. I find your lack of plan disturbing Query stages The query execution requires four stages Syntax validation Query tree generation Plan estimation Execution Federico Campoli PostgreSQL query planning and tuning 03 Mar 2016 8 / 47
  • 9. I find your lack of plan disturbing Syntax validation Syntax validation The query parser validates the query syntax using fixed rules. Any error at this step will cause the execution to stop, returning a syntax error. This early stage doesn’t requires a system catalogue access. The parser returns a normalised parse tree for the next step. Federico Campoli PostgreSQL query planning and tuning 03 Mar 2016 9 / 47
  • 10. I find your lack of plan disturbing Query tree generation The query tree The query parser in the second stage look up the system catalogue and translates the parse tree into the query tree. The query tree is the query’s logical representation where any object involved is unique and described using the object id mapping. Federico Campoli PostgreSQL query planning and tuning 03 Mar 2016 10 / 47
  • 11. I find your lack of plan disturbing Query tree generation The query tree The query tree Federico Campoli PostgreSQL query planning and tuning 03 Mar 2016 11 / 47
  • 12. I find your lack of plan disturbing Query tree generation The planner The planner stage The next stage is the query planner. The parser sends the generated query tree to the planner. The query planner reads the tree and generates all the possible execution plans. The planner, using the internal statistics,determines the estimated costs for each plan. The execution plan with the minimum estimated cost is sent to the executor. Old or missing statistics will result not-optimal execution plans and therefore slow queries. Federico Campoli PostgreSQL query planning and tuning 03 Mar 2016 12 / 47
  • 13. I find your lack of plan disturbing Query tree generation The executor The executor The executor performs the plan nodes in the execution plan generated by the planner. Federico Campoli PostgreSQL query planning and tuning 03 Mar 2016 13 / 47
  • 14. I find your lack of plan disturbing Query tree generation The workflow Figure : The query stages Federico Campoli PostgreSQL query planning and tuning 03 Mar 2016 14 / 47
  • 16. I love it when a plan comes together. Image source: http://clipperdata.com/blogpost/i-love-it-when-a-plan-comes-together/i-love-it-when-a-plan-comes-together/
  • 17. The plan nodes: Scan nodes are used by the executor to retrieve data from the relations. Join nodes are used by the executor to join the data streams.
  • 18. seq scan: reads the table sequentially and discards the rows that do not match the filter. The output is a data stream of unsorted rows.
  • 19. index scan: reads the index tree with random disk reads and fetches the heap blocks pointed to by the index. Returns sorted rows.
  • 20. index only scan: reads the index tree with random disk reads and returns the data without accessing the heap pages, provided the visibility map marks those pages as all-visible. Returns sorted rows.
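The difference between an index scan and an index only scan can be observed with a small experiment. This is a sketch under stated assumptions: the table t_ios_demo and the index idx_ios_demo_id are hypothetical names introduced here, and the planner is free to pick a different node depending on version and settings.

```sql
-- Hypothetical demo table: an integer key plus a payload column.
CREATE TEMP TABLE t_ios_demo AS
SELECT generate_series(1, 100000) AS id,
       md5(random()::text)        AS payload;

CREATE INDEX idx_ios_demo_id ON t_ios_demo (id);

-- VACUUM sets the visibility map, ANALYZE gathers the statistics.
VACUUM ANALYZE t_ios_demo;

-- Only the indexed column is requested: an Index Only Scan is possible.
EXPLAIN SELECT id FROM t_ios_demo WHERE id = 42;

-- The payload column lives only in the heap: a plain Index Scan is expected.
EXPLAIN SELECT id, payload FROM t_ios_demo WHERE id = 42;
```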
  • 21. bitmap index/heap scan: reads the index sequentially and builds a bitmap, which is then used to fetch and recheck the heap pages. It’s a good compromise between a seq scan and a full index scan. Returns unsorted rows.
  • 22. sort: reads the rows and returns them in order, as in queries with an ORDER BY.
  • 23. nested loop join: The right relation is scanned once for every row found in the left relation. This strategy is easy to implement but can be very time consuming. However, if the right relation can be scanned with an index scan, this can be a good strategy. It is possible to use values from the current row of the left relation as keys for the index scan of the right.
  • 24. hash join: The right relation is first scanned and loaded into a hash table, using its join attributes as hash keys. Next the left relation is scanned and the appropriate values of every row found are used as hash keys to locate the matching rows in the table.
  • 25. merge join: Each relation is sorted on the join attributes before the join starts. Then the two relations are scanned in parallel, and matching rows are combined to form join rows. This kind of join is more attractive because each relation has to be scanned only once. The required sorting might be achieved either by an explicit sort step, or by scanning the relation in the proper order using an index on the join key.
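The three join strategies above can be compared on the same query by switching planner methods off one at a time. The sketch below is illustrative only: the tables t_left and t_right are hypothetical, and which join node appears at each step depends on the PostgreSQL version and configuration.

```sql
-- Two small hypothetical relations sharing a join key.
CREATE TEMP TABLE t_left  AS SELECT generate_series(1, 1000) AS id;
CREATE TEMP TABLE t_right AS SELECT generate_series(1, 1000) AS id;
ANALYZE t_left;
ANALYZE t_right;

-- The planner's own choice.
EXPLAIN SELECT * FROM t_left l JOIN t_right r ON l.id = r.id;

BEGIN;
-- Discourage the hash join and look at the next cheapest plan.
SET LOCAL enable_hashjoin = off;
EXPLAIN SELECT * FROM t_left l JOIN t_right r ON l.id = r.id;

-- Discourage the merge join as well; a nested loop is the remaining strategy.
SET LOCAL enable_mergejoin = off;
EXPLAIN SELECT * FROM t_left l JOIN t_right r ON l.id = r.id;
ROLLBACK;  -- SET LOCAL changes are discarded with the transaction
```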
  • 27. EXPLAIN! EXPLAIN! Image source: http://keepcalmandwatchdoctorwho.tumblr.com/post/5450959631/the-daleks-personally-i-like-it-when-they-shout
  • 30. EXPLAIN: Prepending EXPLAIN to any query displays the query’s estimated execution plan. The ANALYZE clause executes the query, discards the results and returns the real execution plan. Using EXPLAIN ANALYZE with DML queries will change the data; to avoid that, wrap the EXPLAIN ANALYZE between BEGIN; and ROLLBACK;
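The BEGIN/ROLLBACK wrapping described above might look like the following. It is a minimal sketch that assumes the t_test table used in this talk's examples; the UPDATE statement itself is a made-up illustration.

```sql
BEGIN;

-- The UPDATE really runs, so the timings are real...
EXPLAIN ANALYZE
UPDATE t_test
SET v_value = 'changed'
WHERE i_id = 20;

-- ...but the change is discarded when the transaction is rolled back.
ROLLBACK;
```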
  • 31. Explain in action: For our example we’ll create a test table with two fields, a serial and a character varying.
      test=# CREATE TABLE t_test (i_id serial, v_value character varying(50));
      CREATE TABLE
  • 32. Explain in action: Now let’s add some rows to our table. Listing 1: Insert in table
      test=# INSERT INTO t_test (v_value)
             SELECT v_value
             FROM (
                 SELECT generate_series(1,1000) AS i_cnt,
                        md5(random()::text) AS v_value
             ) t_gen;
      INSERT 0 1000
  • 33. Explain in action: Let’s generate the estimated plan for a one-row result.
      test=# EXPLAIN SELECT * FROM t_test WHERE i_id=20;
                             QUERY PLAN
      ---------------------------------------------------------
       Seq Scan on t_test  (cost=0.00..16.62 rows=3 width=122)
         Filter: (i_id = 20)
      (2 rows)
    The cost is just an arbitrary value. The two values after cost are the startup cost and the total cost. The startup cost is the cost to deliver the first row to the next step. The total cost is the cost to deliver all the rows to the next step. The value in rows is the planner’s estimate of the total rows returned by the plan node. The value in width is the estimated average row width in bytes.
  • 34. EXPLAIN ANALYZE: Now let’s generate the real execution plan for a one-row result.
      test=# EXPLAIN ANALYZE SELECT * FROM t_test WHERE i_id=20;
                                        QUERY PLAN
      -----------------------------------------------------------------------------
       Seq Scan on t_test  (cost=0.00..21.50 rows=1 width=37) (actual time=0.022..0.262 rows=1 loops=1)
         Filter: (i_id = 20)
         Rows Removed by Filter: 999
       Planning time: 0.066 ms
       Execution time: 0.286 ms
      (5 rows)
    The values in actual time are the times, in milliseconds, taken to return the first row (startup) and all the rows (total). The value in rows is the number of rows returned by the step. The value in loops is the number of times the step is executed. At the bottom are the planning time and the total execution time.
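The estimated total cost of a filtered sequential scan can be recomputed from the catalogue. The PostgreSQL manual gives the estimate as pages × seq_page_cost + rows × (cpu_tuple_cost + cpu_operator_cost). The query below is a sketch assuming the t_test table from the slides and a single-operator filter; the column alias seq_scan_total_cost is introduced here for illustration.

```sql
-- Recompute the planner's total cost for "Seq Scan ... Filter: (i_id = 20)".
SELECT relpages,
       reltuples,
       relpages  * current_setting('seq_page_cost')::float8
     + reltuples * current_setting('cpu_tuple_cost')::float8
     + reltuples * current_setting('cpu_operator_cost')::float8
       AS seq_scan_total_cost
FROM pg_class
WHERE relname = 't_test';
```

With the default settings (1.0, 0.01 and 0.0025), a 1000-row table stored in 9 pages gives 9 + 10 + 2.5 = 21.50, which is consistent with the estimate shown above. The page count of 9 is inferred from the cost, not stated in the slides.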
  • 35. Indices: Let’s add an index on the i_id field.
      test=# CREATE INDEX idx_i_id ON t_test (i_id);
      CREATE INDEX
  • 36. Indices:
      test=# EXPLAIN ANALYZE SELECT * FROM t_test WHERE i_id=20;
                                        QUERY PLAN
      -----------------------------------------------------------------------------
       Index Scan using idx_i_id on t_test  (cost=0.28..8.29 rows=1 width=37) (actual time=0.035..0.036 rows=1 loops=1)
         Index Cond: (i_id = 20)
       Planning time: 0.252 ms
       Execution time: 0.058 ms
      (4 rows)
    The query is several times faster.
  • 37. Controlling the planner: The cost based optimiser becomes cleverer with each major release. For example, if the query’s filter returns almost the entire table, the database will choose the sequential scan, because by default a sequential page read is costed 4 times cheaper than a random page read (seq_page_cost = 1.0 versus random_page_cost = 4.0).
      test=# EXPLAIN ANALYZE SELECT * FROM t_test WHERE i_id>2;
                                        QUERY PLAN
      -----------------------------------------------------------------------------
       Seq Scan on t_test  (cost=0.00..21.50 rows=999 width=37) (actual time=0.012..0.467 rows=998 loops=1)
         Filter: (i_id > 2)
         Rows Removed by Filter: 2
       Planning time: 0.142 ms
       Execution time: 0.652 ms
      (5 rows)
  • 38. Controlling the planner:
      test=# SET enable_seqscan='off';
      SET
      test=# EXPLAIN ANALYZE SELECT * FROM t_test WHERE i_id>2;
                                        QUERY PLAN
      -----------------------------------------------------------------------------
       Index Scan using idx_i_id on t_test  (cost=0.28..49.76 rows=999 width=37) (actual time=0.029..0.544 rows=998 loops=1)
         Index Cond: (i_id > 2)
       Planning time: 0.145 ms
       Execution time: 0.741 ms
      (4 rows)
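A setting such as enable_seqscan stays in force for the rest of the session unless it is reset. Note that turning it off does not forbid sequential scans; it only makes them look very expensive to the planner. The sketch below, which assumes the t_test table and idx_i_id index from the slides, shows two ways of limiting the scope of the change.

```sql
-- Session level: switch off, test, then restore the configured default.
SET enable_seqscan = off;
EXPLAIN SELECT * FROM t_test WHERE i_id > 2;
RESET enable_seqscan;
SHOW enable_seqscan;

-- Transaction level: SET LOCAL reverts automatically at COMMIT or ROLLBACK.
BEGIN;
SET LOCAL enable_seqscan = off;
EXPLAIN SELECT * FROM t_test WHERE i_id > 2;
ROLLBACK;
```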
  • 40. I fight for the users! Image source: http://tron.wikia.com/wiki/Tron
  • 41. ANALYZE: ANALYZE gathers statistics by running random reads on the tables. The statistics are stored in the pg_statistic system table. The view pg_stats presents the statistics in a human readable format. The parameter default_statistics_target sets the limit for the random reads. The statistics target can be fine tuned per column.
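The points above map onto a handful of commands. This is a minimal sketch assuming the t_test table from the earlier slides; the target value 500 is an arbitrary illustration, not a recommendation.

```sql
-- Gather fresh statistics for one table.
ANALYZE t_test;

-- Read the statistics through the human readable view.
SELECT attname, null_frac, n_distinct, correlation
FROM pg_stats
WHERE tablename = 't_test';

-- Current default for the statistics target.
SHOW default_statistics_target;

-- Raise the target for a single column, then re-analyse just that column.
ALTER TABLE t_test ALTER COLUMN v_value SET STATISTICS 500;
ANALYZE t_test (v_value);
```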
  • 42. Table statistics: Before starting the performance analysis, check that the statistics are up to date by querying the view pg_stat_all_tables.
      test=# SELECT * FROM pg_stat_all_tables WHERE relname='t_test';
      -[ RECORD 1 ]-------+------------------------------
      relid               | 16546
      schemaname          | public
      relname             | t_test
      seq_scan            | 6
      seq_tup_read        | 5000
      idx_scan            | 7
      idx_tup_fetch       | 2980
      n_tup_ins           | 1000
      n_tup_upd           | 0
      n_tup_del           | 0
      n_tup_hot_upd       | 0
      n_live_tup          | 1000
      n_dead_tup          | 0
      n_mod_since_analyze | 0
      last_vacuum         |
      last_autovacuum     |
      last_analyze        |
      last_autoanalyze    | 2015-09-24 04:51:51.622906+00
      vacuum_count        | 0
      autovacuum_count    | 0
      analyze_count       | 0
      autoanalyze_count   | 1
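When only the freshness of the statistics matters, a narrower query is easier to read than SELECT *. The following sketch lists the tables with the most changes since the last ANALYZE; the schema filter and the LIMIT of 20 are arbitrary choices made for this example.

```sql
SELECT schemaname,
       relname,
       n_live_tup,
       n_dead_tup,
       n_mod_since_analyze,
       last_analyze,
       last_autoanalyze
FROM pg_stat_all_tables
WHERE schemaname NOT IN ('pg_catalog', 'pg_toast', 'information_schema')
ORDER BY n_mod_since_analyze DESC
LIMIT 20;
```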
  • 43. Index statistics: Before starting the performance analysis, check that the indices are used by querying the view pg_stat_all_indexes.
      test=# SELECT * FROM pg_stat_all_indexes WHERE relname='t_test';
      -[ RECORD 1 ]-+---------
      relid         | 16546
      indexrelid    | 16552
      schemaname    | public
      relname       | t_test
      indexrelname  | idx_i_id
      idx_scan      | 7
      idx_tup_read  | 2980
      idx_tup_fetch | 2980
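The same view can be used to look for indices that have never been scanned. This is a sketch rather than a complete audit: counters start from the last statistics reset, and an index with idx_scan = 0 may still be needed to enforce a unique or primary key constraint.

```sql
SELECT schemaname,
       relname,
       indexrelname,
       idx_scan,
       pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_all_indexes
WHERE idx_scan = 0
  AND schemaname NOT IN ('pg_catalog', 'pg_toast', 'information_schema')
ORDER BY pg_relation_size(indexrelid) DESC;
```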
  • 44. Controlling the planner: enable_bitmapscan, enable_hashagg, enable_hashjoin, enable_indexscan, enable_indexonlyscan, enable_material, enable_mergejoin, enable_nestloop, enable_seqscan, enable_sort, enable_tidscan
  • 46. Wrap up: Questions?
  • 47. Boring legal stuff: The join node descriptions are excerpts from the PostgreSQL 9.4 online manual. Copyright by PostgreSQL Global Development Group. http://www.postgresql.org/ The scan node images are derived from the pgadmin3 scan nodes. Copyright by pgadmin development group. http://www.pgadmin.org/ The copyright of all the images is owned by the respective authors. The source and the author’s attribution are provided with a link alongside each image.
  • 48. Contacts and license: Twitter: 4thdoctor_scarf Blog: http://www.pgdba.co.uk Brighton PostgreSQL Meetup: http://www.meetup.com/Brighton-PostgreSQL-Meetup/ This document is distributed under the terms of the Creative Commons
  • 49. PostgreSQL query planning and tuning. Federico Campoli, 03 Mar 2016