PostgreSQL When It's Not Your Job
Christophe Pettus
PostgreSQL Experts, Inc.
PGConf US 2017
Welcome!
• Christophe Pettus
• CEO of PostgreSQL Experts, Inc.
• Based in sunny Alameda, California.
• Technical blog: thebuild.com
• Twitter: @xof
What is this?
If you use packages…
• … use them!
• Provides platform-specific scripting, etc.
• RedHat-flavor and Debian-flavor have their
own repositories.
• Other OSes have a variety of packaging
systems.
• Introduced in 9.3.
• Maintains a checksum for data pages.
• Very small performance hit. Use it.
• initdb option.
• Can add in /etc/postgresql-common/createcluster.conf
for Debian packaging.
Examples
• Using initdb:
• initdb -D /data/9.5/ -k -E UTF8 \
--locale=en_US.UTF-8
• Using pg_createcluster:
• pg_createcluster 9.5 main -D /data/9.5/main \
-E UTF8 --locale=en_US.UTF-8 -- -k
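Once the new cluster is running, the options above can be checked from psql. This is a small sketch, not a required step: the read-only data_checksums setting (available since 9.3) reports whether -k took effect, and pg_database shows the encoding and collation of each database.

```sql
-- Reports "on" if the cluster was initialized with -k (data checksums).
SHOW data_checksums;

-- Encoding and collation of every database in the cluster.
SELECT datname,
       pg_encoding_to_char(encoding) AS encoding,
       datcollate
FROM pg_database
ORDER BY datname;
```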
Other Important Things.
• Logging.
• Memory.
• Checkpoints.
• Planner.
• You’re done.
• No, really, you’re done!
Logging.
log_destination = 'csvlog'
log_directory = 'pg_log'
logging_collector = on
log_filename = 'postgres-%Y-%m-%d_%H%M%S'
log_rotation_age = 1d
log_rotation_size = 1GB
log_min_duration_statement = 250ms
log_checkpoints = on
log_connections = on
log_disconnections = on
log_lock_waits = on
log_temp_files = 0
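After editing postgresql.conf, it can help to confirm which values the server is actually using. The query below is a sketch that checks a few of the settings listed above; the choice of which settings to list is arbitrary.

```sql
-- context = 'postmaster' means a restart is needed (logging_collector);
-- the other values shown here take effect on a reload.
SELECT name, setting, unit, context
FROM pg_settings
WHERE name IN ('logging_collector', 'log_destination',
               'log_min_duration_statement', 'log_temp_files')
ORDER BY name;

-- Re-read postgresql.conf without restarting the server.
SELECT pg_reload_conf();
```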
Memory configuration
• shared_buffers
• work_mem
• maintenance_work_mem
shared_buffers
Checkpoint settings, 9.4 and earlier.
wal_buffers = 16MB
checkpoint_completion_target = 0.9
checkpoint_segments = 32 # To start.
Checkpoint settings, 9.5 and later.
wal_buffers = 16MB
checkpoint_completion_target = 0.9
min_wal_size = 512MB
max_wal_size = 2GB
• fsync = on
• Never change this.
• synchronous_commit = on
• Change this, but only if you understand
the data loss potential.
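synchronous_commit does not have to be changed server-wide; it can be relaxed for a single transaction. A minimal sketch, assuming a hypothetical page_views table holding data you can afford to lose: a crash can drop the most recent commits made this way, but it does not corrupt the database.

```sql
BEGIN;
-- Applies to this transaction only; the server-wide setting stays on.
SET LOCAL synchronous_commit = off;
INSERT INTO page_views (url, viewed_at)   -- hypothetical low-value table
VALUES ('/pricing', now());
COMMIT;  -- returns without waiting for the WAL flush to disk
```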
Changing settings.
• archive_command
• Runs a command each time a WAL
segment is complete.
• This command can do whatever you want.
• What you want is to move the WAL
segment to someplace safe…
• … on a different system.
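As an illustration of the points above, archiving can be switched on from SQL. This sketch assumes PostgreSQL 9.4 or later (for ALTER SYSTEM) and a hypothetical archive host and path; the rsync command is a stand-in, and a packaged tool is usually a better choice. The command must exit with status zero only when the segment has really been stored.

```sql
-- wal_level must be above 'minimal' for archiving to work.
-- Host name and directory below are hypothetical.
ALTER SYSTEM SET archive_mode = 'on';   -- takes effect after a restart
ALTER SYSTEM SET archive_command =
    'rsync -a %p postgres@archive.example.com:/wal_archive/%f';
SELECT pg_reload_conf();                -- archive_command only needs a reload
```

Here %p is the path of the completed WAL segment and %f is its file name.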
Getting started with PITR.
• SELECT pg_start_backup(...);
• Copy the disk image and any WAL files that
are created.
• SELECT pg_stop_backup();
• Make sure you have all the WAL segments.
• The disk image + WAL segments are your
backup.
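The sequence above looks roughly like this in a psql session on the 9.x releases this talk covers. The backup label is a hypothetical example. PostgreSQL 15 renamed these functions to pg_backup_start and pg_backup_stop and removed the exclusive mode used here, so treat this as a sketch for older servers.

```sql
-- The label is hypothetical; it is only recorded for your reference.
SELECT pg_start_backup('base_backup_example');

-- ... copy the data directory with a tool outside SQL
--     (rsync, tar, or a filesystem snapshot) ...

SELECT pg_stop_backup();
```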
WAL-E
• http://github.com/wal-e/wal-e
• Provides a full set of appropriate scripting.
• Automates creating PITR backups into AWS
S3.
• Highly recommended!
PITR Restore
• Point-in-time recovery.
• You don’t have to replay the entire WAL
stream.
• It can be stopped at a particular timestamp,
or transaction ID.
• Very handy for application-level problems!
Disaster recovery.
• WAL-E
• repmgr
• barman
• backrest
• Use a packaged solution; don't roll your
own unless you must.
Replication!
Replication.
• Highly configurable.
• Can push part or all of the tables; don’t
have to replicate everything.
• Multi-master setups possible (Bucardo).
Trigger-based rep: The bad.
BEGIN;
INSERT INTO transactions(account_id, value, offset_id)
VALUES (11, 120.00, 14);
INSERT INTO transactions(account_id, value, offset_id)
VALUES (14, -120.00, 11);
COMMIT;
Transaction Properties.
• PostgreSQL supports:
• READ COMMITTED — The default.
• REPEATABLE READ
• SERIALIZABLE
• It does not support:
• READ UNCOMMITTED (“dirty read”)
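The isolation level is chosen per transaction. A short sketch, reusing the transactions table from the earlier example: PostgreSQL accepts the READ UNCOMMITTED syntax, but it behaves as READ COMMITTED.

```sql
-- The default level for new transactions (normally "read committed").
SHOW default_transaction_isolation;

-- Every statement in this transaction sees the same snapshot.
BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
SELECT sum(value) FROM transactions WHERE account_id = 11;
SELECT sum(value) FROM transactions WHERE account_id = 14;
COMMIT;
```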
Higher isolation modes.
• It probably is.
• The database generally stabilizes at 20% to
50% bloat. That’s acceptable.
• If you see autovacuum workers running,
that’s generally not a problem.
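To check that autovacuum is doing its job, the statistics views are a reasonable first stop. A sketch: n_dead_tup is an estimate, not a direct measurement of bloat, so read it as a rough indicator only.

```sql
-- Tables with the most dead rows, and when autovacuum last processed them.
SELECT relname, n_live_tup, n_dead_tup, last_autovacuum, autovacuum_count
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 10;

-- Autovacuum workers that are running right now.
SELECT pid, query_start, query
FROM pg_stat_activity
WHERE query LIKE 'autovacuum:%';
```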
“No, really, VACUUM’s not working!”
• Use UTF-8.
• Just. Do. It.
• There is no compelling reason to use any
other character encoding.
• One edge case: the bottleneck is sorting
text strings. This is very, very rare.
Time Representation.
• B-Tree.
• Hash.
• GiST.
• SP-GiST.
• GIN.
B-Tree Indexes.
• Single column.
• Multiple column (composite).
• Expression (“functional”) indexes.
Single Column B-Trees
• Indexes on an expression.
• PostgreSQL can recognize when you are
querying on that expression and use the
index.
• Can be expensive to create, but very fast to
execute.
• Make sure PostgreSQL is really using it!
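A minimal sketch of an expression index, using a hypothetical customers table. The query has to use the same expression as the index, and EXPLAIN shows whether the planner picks it up. On a tiny table the planner may still prefer a sequential scan, so test against realistic data.

```sql
-- Hypothetical table for illustration.
CREATE TABLE customers (
    id    serial PRIMARY KEY,
    email text NOT NULL
);

-- Index on the expression, not on the raw column.
CREATE INDEX customers_lower_email_idx ON customers (lower(email));

-- The WHERE clause matches the indexed expression exactly.
EXPLAIN
SELECT id FROM customers WHERE lower(email) = 'alice@example.com';
```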
Partial Indexes.
• A B-tree of B-trees.
• Tokens organized into B-trees.
• Row pointers also organized into B-trees.
• On-disk footprint can be quite large.
• Recent versions have major optimizations
here.
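As a sketch of a GIN index in use, assuming a hypothetical articles table with an array of tags: the index supports containment queries, and pg_relation_size lets you keep an eye on the on-disk footprint mentioned above.

```sql
-- Hypothetical table for illustration.
CREATE TABLE articles (
    id   serial PRIMARY KEY,
    tags text[] NOT NULL
);

CREATE INDEX articles_tags_gin_idx ON articles USING gin (tags);

-- Containment query that a GIN index on an array column can serve.
SELECT id FROM articles WHERE tags @> ARRAY['postgresql'];

-- Size of the index on disk.
SELECT pg_size_pretty(pg_relation_size('articles_tags_gin_idx'));
```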
“Why isn’t it using my indexes?”
• pg_stat_user_indexes
• Reports the number of times an index is
used.
• If non-constraint indexes are not being
used, drop them.
• Indexes are very expensive to maintain.
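One way to find candidates for dropping, offered as a sketch: list indexes that have never been scanned and do not back a primary key, unique, or exclusion constraint. The counters only cover the period since statistics were last reset, and replicas keep their own counts, so check every server that runs queries before dropping anything.

```sql
SELECT s.schemaname,
       s.relname      AS table_name,
       s.indexrelname AS index_name,
       s.idx_scan,
       pg_size_pretty(pg_relation_size(s.indexrelid)) AS index_size
FROM pg_stat_user_indexes AS s
JOIN pg_index AS i ON i.indexrelid = s.indexrelid
WHERE s.idx_scan = 0          -- never used since the last stats reset
  AND NOT i.indisunique       -- skip unique constraints
  AND NOT i.indisprimary      -- skip primary keys
  AND NOT i.indisexclusion    -- skip exclusion constraints
ORDER BY pg_relation_size(s.indexrelid) DESC;
```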
And finally…
• Do this promptly!
• Only requires installing new binaries.
• If using packages, often as easy as just an
apt-get / yum upgrade.
• Very small amount of downtime.
Major version upgrade.
• pgbadger
• The only choice now for monitoring text
logs.
• pg_stat_statements
• Maintains a buffer of data on statements
executed, within PostgreSQL.
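A sketch of querying pg_stat_statements for the statements that consume the most time. It assumes the module is already listed in shared_preload_libraries (which needs a restart). The column is total_time on the versions this talk covers; PostgreSQL 13 and later call it total_exec_time.

```sql
-- Make the view available in the current database.
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;

-- Ten statements with the highest cumulative execution time.
SELECT calls,
       total_time,   -- total_exec_time on PostgreSQL 13+
       rows,
       query
FROM pg_stat_statements
ORDER BY total_time DESC
LIMIT 10;
```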
Monitor, monitor, monitor.
• https://wiki.postgresql.org/images/6/6a/Dba_toolbelt_2017.pdf
Thank you!