Zafin Learn Session - PostgreSQL Performance For Application Developers
Zafin, Ottawa
19-Feb-2025
Introducing PostgreSQL
● Object-relational database
● Open source with a liberal license
● SQL Standard compliant
● ACID compliant (atomicity, consistency, isolation, durability)
● Supports structured as well as unstructured data
● Highly extensible
A bit of a history lesson …
● pgAdmin
● pgBadger
● Prometheus with Grafana
● Commercial tools
○ Datadog
○ DBeaver
○ New Relic
EXPLAIN plan is your friend
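A minimal sketch of asking PostgreSQL for an execution plan. The table `orders` and the filter on `customer_id` are hypothetical and stand in for any application query.

```sql
-- Hypothetical table and predicate, for illustration only.
-- ANALYZE actually runs the query and reports real timings and row counts;
-- BUFFERS adds how many pages were found in cache versus read from disk.
EXPLAIN (ANALYZE, BUFFERS)
SELECT *
FROM orders
WHERE customer_id = 42;
```

Because ANALYZE executes the statement, wrap an INSERT, UPDATE or DELETE in BEGIN ... ROLLBACK when you only want the plan.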
Locks
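One way to see which sessions are currently waiting on a lock, and who is blocking them, is sketched below. It uses the built-in `pg_stat_activity` view and `pg_blocking_pids()` function; the 80-character truncation is just for readability.

```sql
-- Sessions that are blocked, with the PIDs of the sessions blocking them.
SELECT a.pid,
       pg_blocking_pids(a.pid)  AS blocked_by,
       a.state,
       now() - a.query_start    AS waiting_for,
       left(a.query, 80)        AS query
FROM pg_stat_activity AS a
WHERE cardinality(pg_blocking_pids(a.pid)) > 0;

-- Optional safety net for the current session: give up instead of
-- queueing behind a lock forever. The 5s value is an arbitrary example.
SET lock_timeout = '5s';
```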
Query & SQL Optimization
Key takeaways:
● Monitor your queries
● Analyze the execution
● Code to avoid locks
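For the "monitor your queries" takeaway, a possible starting point is the `pg_stat_statements` extension. This sketch assumes the extension is listed in `shared_preload_libraries` and has been created in the database; the column names are those used in PostgreSQL 13 and later.

```sql
-- Top 10 statements by total execution time.
-- Assumes: CREATE EXTENSION pg_stat_statements; has been run in this database.
SELECT calls,
       round(total_exec_time::numeric, 1) AS total_ms,
       round(mean_exec_time::numeric, 2)  AS mean_ms,
       rows,
       left(query, 80)                    AS query
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;
```

The statements at the top of this list are usually the ones worth running through EXPLAIN first.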
Scaling PostgreSQL: Some tips for developers
● Query & SQL Optimization
● Architectural Improvements
● Performance Features
● Parameter Tuning
Load Balancing
[Diagram: two applications in front of a cluster that includes Standby 1 and Standby 2; arrows labelled Write, Read and Replicate]
Load balancing: pgbench results

Metric                                     Single node SELECTs     Load-balanced 3-node cluster
transaction type                           <builtin: select only>  <builtin: select only>
scaling factor                             10                      10
query mode                                 simple                  simple
number of clients                          25                      25
number of threads                          1                       1
maximum number of tries                    1                       1
duration                                   60 s                    60 s
number of transactions actually processed  19139                   24885
number of failed transactions              0 (0.000%)              0 (0.000%)
latency average                            67.215 ms               51.449 ms
initial connection time                    8620.897 ms             8896.110 ms
tps (without initial connection time)      371.939402              485.918972

Throughput gain with load balancing: +30%
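When reads are sent to standbys, it helps to know which node a session landed on and how far the standbys lag behind. The sketch below assumes streaming replication, which the slides do not state explicitly; all functions and views used are built in.

```sql
-- On any node: true on a standby, false on the primary.
SELECT pg_is_in_recovery();

-- On a standby: time since the last replayed transaction.
-- This also grows when the primary is simply idle, so read it with care.
SELECT now() - pg_last_xact_replay_timestamp() AS replay_delay;

-- On the primary: one row per connected standby (replay_lag needs PostgreSQL 10+).
SELECT client_addr, state, sync_state, replay_lag
FROM pg_stat_replication;
```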
Partitioning
[Diagram: two applications running the query below, one of them against data split into partitions Q1, Q2, Q3 and Q4]
select * from foo where month = 'Aug'
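A sketch of how the quarterly layout could be declared with PostgreSQL's declarative partitioning. The slides show only the table name `foo`, the `month` column and the Q1 to Q4 labels; the `id` column, the partition names and the choice of LIST partitioning are assumptions made for illustration.

```sql
-- Parent table: rows are routed to a partition by the value of "month".
CREATE TABLE foo (
    id    bigint,
    month text NOT NULL
) PARTITION BY LIST (month);

-- One partition per quarter (hypothetical names).
CREATE TABLE foo_q1 PARTITION OF foo FOR VALUES IN ('Jan', 'Feb', 'Mar');
CREATE TABLE foo_q2 PARTITION OF foo FOR VALUES IN ('Apr', 'May', 'Jun');
CREATE TABLE foo_q3 PARTITION OF foo FOR VALUES IN ('Jul', 'Aug', 'Sep');
CREATE TABLE foo_q4 PARTITION OF foo FOR VALUES IN ('Oct', 'Nov', 'Dec');

-- With partition pruning, the plan should touch only foo_q3.
EXPLAIN SELECT * FROM foo WHERE month = 'Aug';
```

The application keeps querying `foo`; the planner decides which partitions can be skipped.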
Architectural Improvements
Key takeaway:
● Don’t overload a single node!
Indexes
A few examples …
● Parallel queries
○ The query planner decides if it can use multiple CPU cores to execute a single query
○ There are tuning parameters that you can adjust
● Heap-Only Tuples (HOT)
○ Avoids index updates if changes don’t impact an indexed column
● Incremental sort
○ Don’t start from scratch, sort only what is not yet sorted
● Autovacuum
○ Gets rid of dead tuples to clear out the table bloat
○ There are tuning parameters that you can adjust
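Two of these features can be observed directly from SQL. The sketch below uses a hypothetical table `big_table`; the settings and statistics views are built in, and the values chosen are examples rather than recommendations.

```sql
-- Parallel queries: the per-query worker limit (default is 2).
SHOW max_parallel_workers_per_gather;
SET max_parallel_workers_per_gather = 4;   -- session-level change only
-- Look for "Gather" and "Parallel Seq Scan" nodes in the plan.
EXPLAIN SELECT count(*) FROM big_table;

-- HOT: share of updates that avoided index maintenance, per table.
SELECT relname,
       n_tup_upd,
       n_tup_hot_upd,
       round(100.0 * n_tup_hot_upd / nullif(n_tup_upd, 0), 1) AS hot_pct
FROM pg_stat_user_tables
ORDER BY n_tup_upd DESC
LIMIT 10;

-- Leaving free space in each page gives HOT updates room to happen.
ALTER TABLE big_table SET (fillfactor = 90);
```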
Performance Features
Key takeaways:
● Indexes are a powerful ally
● … but you shouldn’t overuse them
● Let PostgreSQL do its job
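To check whether indexes are being overused, one option is to look for indexes that have never been scanned. This is a sketch using the built-in `pg_stat_user_indexes` view; the counters only cover the period since statistics were last reset.

```sql
-- Indexes with zero scans, largest first.
SELECT schemaname,
       relname       AS table_name,
       indexrelname  AS index_name,
       idx_scan,
       pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_user_indexes
WHERE idx_scan = 0
ORDER BY pg_relation_size(indexrelid) DESC;
```

An index with zero scans may still back a primary key or unique constraint, or be used only on a standby, so treat the result as a list of candidates to review rather than a list to drop.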
Parameter tuning
● shared_buffers
○ Cache for frequently accessed data
○ Default is 128MB
○ Recommended is between 25% and 40% of system memory
● wal_buffers
○ Shared memory for WAL data not yet written to disk
○ Default is about 3% (1/32) of shared_buffers
○ A value of up to 16MB can improve performance when many clients commit concurrently
● work_mem
○ Memory available for a query operation
○ Default is 4MB
○ High I/O activity for a query is an indicator that an increase in work_mem can help
○ Each parallel operation is allowed to use memory up to this value
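A sketch of how these three parameters could be set from SQL. It assumes a hypothetical dedicated server with 16GB of RAM and a superuser session; the values are examples derived from the guidance above, not recommendations for any particular system.

```sql
-- 25% of the assumed 16GB; takes effect only after a server restart.
ALTER SYSTEM SET shared_buffers = '4GB';

-- Explicit value instead of the automatic default; also needs a restart.
ALTER SYSTEM SET wal_buffers = '16MB';

-- work_mem can be changed per session, for example before a large sort.
-- Each sort or hash operation may use up to this much memory.
SET work_mem = '64MB';

-- Check current values and whether a restart is still pending.
SELECT name, setting, unit, pending_restart
FROM pg_settings
WHERE name IN ('shared_buffers', 'wal_buffers', 'work_mem');
```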
Easily tuned database parameters - Costs
● cpu_tuple_cost
○ Cost of processing a single row of data, including operations like WHERE and JOIN
○ Default is 0.01
○ Lower values encourage the query planner to process more rows, helpful for I/O-bound operations
○ Higher values encourage the query planner to process fewer rows, helpful for CPU-bound operations
● random_page_cost
○ Cost of non-sequential disk page access
○ Default is 4.0
○ Lower values imply low cost for random access, encouraging index scans
○ Higher values imply high cost for random access, encouraging sequential scans
● effective_cache_size
○ Expected size of database cache, including shared buffers and OS cache
○ Default is 4GB (this is not an allocation)
○ Higher values imply more data in cache, encouraging index scans
○ Lower values imply less data in cache, encouraging sequential scans
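Cost parameters can be tried out per session before touching the server configuration. In this sketch the table `orders` is hypothetical, and the values 1.1 and 12GB are assumptions for a machine with fast storage and plenty of cache.

```sql
-- Plan with the current settings.
EXPLAIN SELECT * FROM orders WHERE customer_id = 42;

-- Tell the planner that random reads are cheap and much data is cached.
SET random_page_cost = 1.1;
SET effective_cache_size = '12GB';

-- Plan again and compare: an index scan may now replace a sequential scan.
EXPLAIN SELECT * FROM orders WHERE customer_id = 42;

-- Return to the configured values.
RESET random_page_cost;
RESET effective_cache_size;
```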
Parameter Tuning
Key takeaways:
● Tweak configuration parameters based on your hardware and workload
● This will require some experimentation until you nail it down
● Makes an ideal candidate for AI-based tuning
Take 2!
PostgreSQL Performance
for Application Developers
Because There is no Magic Button
Zafin, Ottawa
03-Mar-2025
Tips for Zafin
● Vacuum and Dead Tuples
● Best practices for JOINs
● INSERT performance
● Reading EXPLAIN plans
The Vacuum Process in PostgreSQL
Credit: https://www.cs.cmu.edu/~pavlo/blog/2023/04/the-part-of-postgresql-we-hate-the-most.html
Mismanaging the Autovacuum Process
● Table bloat
● Poor performance
● Storage creep
● Inaccurate database statistics
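A quick way to spot these symptoms is to look at dead tuple counts and the last time autovacuum and autoanalyze ran. This sketch uses the built-in `pg_stat_user_tables` view.

```sql
-- Tables with the most dead tuples, and when they were last cleaned and analyzed.
SELECT relname,
       n_live_tup,
       n_dead_tup,
       last_autovacuum,
       last_autoanalyze
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 10;
```

A table with many dead tuples and an old or empty `last_autovacuum` is a candidate for closer inspection.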
Vacuum and Dead Tuples
Key takeaways:
● Never turn autovacuum off
● An ‘idle in transaction’ session keeps its transaction (and any locks) open, which prevents vacuum from removing dead tuples
● Long running transactions are a killer for vacuum
● There is a problem in autovacuum configuration if:
○ Vacuum process is taking too long to complete
○ Vacuum process is taking up too many resources
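The sessions that hold vacuum back can be found in `pg_stat_activity`. In this sketch the five-minute threshold, the role `app_user` and the table `orders` are hypothetical, and the timeout and scale factor values are examples only.

```sql
-- Sessions idle inside a transaction, or with a transaction open for over 5 minutes.
SELECT pid,
       usename,
       state,
       now() - xact_start   AS xact_age,
       now() - state_change AS in_state_for,
       left(query, 80)      AS last_query
FROM pg_stat_activity
WHERE state IN ('idle in transaction', 'idle in transaction (aborted)')
   OR now() - xact_start > interval '5 minutes'
ORDER BY xact_start;

-- Safety net: end sessions of this role that sit idle in a transaction too long.
ALTER ROLE app_user SET idle_in_transaction_session_timeout = '60s';

-- Make autovacuum visit a busy table sooner (the global default is 0.2).
ALTER TABLE orders SET (autovacuum_vacuum_scale_factor = 0.02);
```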
JOINs Explained
Common Pitfalls in JOINs
1. https://explain.depesz.com/
2. https://explain.dalibo.com/
Conclusion
Database performance involves a lot of variables. Optimize how data is accessed before scaling by credit card!
Questions?
pg_umair