SlideShare a Scribd company logo
© 2013 EnterpriseDB Corporation. All rights reserved. 1
WAL for DBAs – Everything you
want to know
Devrim Gündüz
Principal Systems Engineer @ EnterpriseDB
devrim.gunduz@EnterpriseDB.com
Twitter : @DevrimGunduz
PGDay.Asia 2017
WAL for DBAs
2
●
Who is this guy?
– Using Red Hat (and then Fedora) since 1996.
– Using PostgreSQL since 1998.
– Responsible for PostgreSQL YUM repository.
●
Used to break website, but recently gave up.
– Started some work on PostgreSQL Dockerfiles
recently. https://www.pgdocker.org
– Working at EnterpriseDB since 2011.
– The Guy With The PostgreSQL Tattoo! (imitations may
exist :) )
– Istanbul, Turkiye.
About me
FOSDEM PGDAY 2017
WAL for DBAs
3
●
Please tweet!
– #PostgreSQL
– #pgday
– #pgdayasia
Social media
FOSDEM PGDAY 2017
WAL for DBAs
4
(Did you tweet? Thanks!)
Social media
FOSDEM PGDAY 2017
WAL for DBAs
5
(Did you tweet? Thanks!)
Alternative hashtag:
#blamemagnus
Social media
FOSDEM PGDAY 2017
WAL for DBAs
6
June 26-28, Boston
We want to see great speakers who
can talk to the technical aspects of
using Postgres in the enterprise.
http://postgresvision.com/
Postgres Vision 2017
PGDay.Asia 2017
WAL for DBAs
7
Agenda (in random order)
•
What is WAL?
•
What does it include?
•
How to read it?
•
What about wal_level ?
•
Replication and WAL
•
Backup and WAL
•
PITR and WAL
•
Other topics
FOSDEM PGDAY 2017
WAL for DBAs
8
Please do not delete WAL files
manually.
Please.
Before we actually start:
FOSDEM PGDAY 2017
WAL for DBAs
9
Please do not delete WAL files
manually.
Please.
Please.
Before we actually start:
FOSDEM PGDAY 2017
WAL for DBAs
10
Please do not delete WAL files
manually.
Please.
Please.
PLEASE.
Before we actually start:
PGDay.Asia 2017
WAL for DBAs
11
What is WAL?
•
Write Ahead Log:
•
Logging of transactions
•
a.k.a. xlog (transaction log),
•
16 MB in most of the installations (can be configured, --with-
wal-segsize)
•
8 kB page size (can be configured, --with-wal-blocksize)
•
pg_xlog (<= 9.6)→ pg_wal (10+)
•
Because people deleted files under “log” directory.
FOSDEM PGDAY 2017
WAL for DBAs
12
MAKE WAL GREAT AGAIN!
So:
PGDay.Asia 2017
WAL for DBAs
13
What is WAL?
•
Designed to prevent data loss in most of the situations
•
OS crash, hardware failure, PostgreSQL crash.
•
Write transactions are written to WAL
•
Before transaction result is sent to the client
•
Data files are not changed on each transaction
•
Performance benefit
•
Should be kept in a separate drive.
•
Initdb, or symlink
PGDay.Asia 2017
WAL for DBAs
14
What is WAL?
•
Built-in feature
•
Life before WAL (not before B.C., though):
•
All changes go to durable storage (eventually), but:
•
Data page is loaded to shared_buffers
•
Changes are made there
•
Dirty buffers!
•
But not timely!
•
Crash → Data loss!
PGDay.Asia 2017
WAL for DBAs
15
What is WAL?
•
Life after WAL:
•
Almost all “modifications” are “logged” to WAL files (xlog
record)
•
Even if the transaction is aborted (ROLLBACK)
•
wal_buffers → WAL segments (files)
•
Ability to recover data after a crash
•
Checkpoint!
PGDay.Asia 2017
WAL for DBAs
16
Where is it used?
•
Transaction logging!
•
Replication
•
PITR
•
REDO
•
Sequentially availability is a must.
•
REDO vs UNDO
•
No REDO for temp tables and unlogged tables.
PGDay.Asia 2017
WAL for DBAs
17
WAL file naming
•
24 chars, hex.
•
1st 8 chars: timelineID
•
00000001 is the timelineID created by initdb
•
2nd
8 chars: logical WAL file
•
3rd
8 chars: WAL segment name (physical WAL file)
•
000000010000000000000001 → 000000010000000000000002
•
… 0000000100000000000000FF →
000000010000000100000000
•
Use PostgreSQL’s internal tools to manage them
•
pg_archivecleanup
•
pg_resetxlog
•
pg_xlogdump
•
....
PGDay.Asia 2017
WAL for DBAs
18
pg_xlogdump
•
We are all human.
•
Use pg_xlogdump, if you want to see contents of WAL files
•
rmgr --help to get list of all resource names, -f for follow, -n for limit. -z for stats.
•
pg_xlogdump -n 20 -f 000000010000000700000033
•
rmgr: Heap len (rec/tot): 3/ 59, tx: 389744, lsn: 7/33B66228, prev 7/33B661F0, desc: INSERT+INIT off 1,
blkref #0: rel 1663/13326/190344 blk 0
•
rmgr: Heap len (rec/tot): 3/ 59, tx: 389744, lsn: 7/33B66268, prev 7/33B66228, desc: INSERT off 2, blkref
#0: rel 1663/13326/190344 blk 0
•
rmgr: Transaction len (rec/tot): 8/ 34, tx: 389744, lsn: 7/33B662A8, prev 7/33B66268, desc: COMMIT 2017-02-
03 03:03:49.482223 +03
•
rmgr: Heap len (rec/tot): 14/ 69, tx: 389745, lsn: 7/33B662D0, prev 7/33B662A8, desc: HOT_UPDATE off
1 xmax 389745 ; new off 3 xmax 0, blkref #0: rel 1663/13326/190344 blk 0
•
rmgr: Transaction len (rec/tot): 8/ 34, tx: 389745, lsn: 7/33B66318, prev 7/33B662D0, desc: COMMIT 2017-02-
03 03:03:54.091645 +03
•
rmgr: XLOG len (rec/tot): 80/ 106, tx: 0, lsn: 7/33B66340, prev 7/33B66318, desc:
CHECKPOINT_ONLINE redo 7/33B66340; tli 1; prev tli 1; fpw true; xid 0/389746; oid 198532; multi 1; offset 0; oldest
xid 1866 in DB 129795; oldest multi 1 in DB 90123; oldest/newest commit timestamp xid: 388437/389745; oldest
running xid 0; online
•
rmgr: XLOG len (rec/tot): 0/ 24, tx: 0, lsn: 7/33B663B0, prev 7/33B66340, desc: SWITCH
•
PGDay.Asia 2017
WAL for DBAs
19
Shared Buffers, Bgwriter and checkpointer
•
shared_buffers in PostgreSQL
•
Dirty buffers
•
This is where transactions are performed
•
Side effect: Causes inconsistency(?) on durable storage, due
to dirty buffers.
•
Bgwriter: Background writer
•
LRU
•
Checkpointer
•
Pushing all dirty buffers to durable storage
•
Triggered automatically or manually
•
Backends may also write data to heap
PGDay.Asia 2017
WAL for DBAs
20
WAL: LSN
•
Log Sequence Number
•
Position of the record in WAL file.
•
Provides uniqueness for each xlog record.
•
Per docs: “Pointer to a location in WAL file”
•
During recovery, LSN on the page and LSN in the WAL file are
compared.
•
The larger one wins.
PGDay.Asia 2017
WAL for DBAs
21
●
Probably not the last one in ls list!
WAL: Finding current WAL file
•
postgres=# SELECT * from pg_current_xlog_location();
pg_current_xlog_location
--------------------------
40E6/2C85AC10
•
postgres=# SELECT pg_xlogfile_name(‘40E6/2C85AC10');
pg_xlogfile_name
--------------------------
00000003000040E60000002C
So:
•
postgres=# SELECT pg_xlogfile_name(pg_current_xlog_location());
pg_xlogfile_name
--------------------------
00000003000040E60000002C
PGDay.Asia 2017
WAL for DBAs
22
Checkpoint, and pg_control
●
As soon as the checkpoint starts, REDO point is stored in shared buffers.
●
A WAL record is created referencing checkpoint start, and it is first written to WAL
buffers, and then eventually to pg_control.
– pg_control is under $PGDATA/global
●
Unlike bgwriter, checkpointer writes all of the data in the shared_buffers to
durable storage.
●
PostgreSQL knows the latest REDO point, by looking at pg_control file.
PGDay.Asia 2017
WAL for DBAs
23
Checkpoint, and pg_control
●
pg_controldata:
– Latest checkpoint location: 40E7/E43B16B8
– Prior checkpoint location: 40E7/D8689090
They are LSN.
●
When checkpoint is completed, pg_control is updated with the position of
checkpoint.
●
After checkpoint, old WAL files are either recycled, or removed.
●
An “estimation” is done while recycling (based on previous checkpoint cycles)
●
9.5+: In minimum, min_wal_size WAL files are always recycled for future usage
PGDay.Asia 2017
WAL for DBAs
24
pg_control and REDO
●
postmaster reads pg_control on startup.
/usr/pgsql-10/bin/pg_controldata -D /var/lib/pgsql/10/data | grep state
– “Database cluster state”:
●
starting up
●
shut down
●
shut down in recovery
●
shutting down
●
in crash recovery
●
in archive recovery
●
in production
●
If pg_control says “in production”, but db server is not running, then this instance
is eligible for a recovery!
PGDay.Asia 2017
WAL for DBAs
25
pg_control and REDO
●
pg_control is the critical piece
– Should not be corrupted
– Per docs: “...theoretically a weak spot”
●
REDO: All WAL files must be sequentially available for complete recovery.
PGDay.Asia 2017
WAL for DBAs
26
Moving to the new WAL
●
A WAL segment may be full
●
PostgreSQL archiver will switch to the new xlog, if PostgreSQL reaches
archive_timeout value.
●
DBA issues pg_switch_xlog() function.
PGDay.Asia 2017
WAL for DBAs
27
WAL: Archiving
•
Replication, backup, PITR
•
archive_mode
•
archive_command
•
archive_timeout
PGDay.Asia 2017
WAL for DBAs
28
WAL: Point-In-Time Recovery (PITR)
•
A base backup (pg_basebackup!) and the WAL files are needed.
•
WAL files must be sequentially complete – otherwise PITR won’t
be finished.
•
“Roll-forward recovery”
PGDay.Asia 2017
WAL for DBAs
29
WAL: Point-In-Time Recovery (PITR)
•
PITR: Replaying WAL files on base backups, until recovery
target.
•
recovery_target_{time, xid, name, lsn}
•
If not specified, all archived WAL files are replayed.
•
recovery.conf and backup_label: Enters recovery mode.
•
restore_command,
recovery_target_XXX,recovery_target_inclusive
•
backup_label: Also includes checkpoint location (starting point of
recovery)
•
Almost like regular recovery process (WAL replay)
•
Up to recovery_target_XXX is replayed.
PGDay.Asia 2017
WAL for DBAs
30
WAL: Point-In-Time Recovery (PITR)
•
After recovery process, timelineID is increased by 1 (also
physical WAL file name is also increased by 1)
•
A .history file is created.
•
$ cat 00000003.history
1 403F/58000098 no recovery target specified
2 4048/43000098 before 2017-01-28 11:13:21.124512+03
“WAL files were replayed until the given time above, and theit
replay location is 4048/43000098.
•
•
•
PGDay.Asia 2017
WAL for DBAs
31
Full page writes
●
A WAL record cannot be replayed on a page which is corrupted during
bgwriter and/or checkpointer, because of hardware failure, OS crash,
kernel failure, etc.
●
Full page writes IYF.
●
Enabled by default.
– Please turn it off, if you want to throw a lot of money to PostgreSQL
support companies. Otherwise, don’t do so ;)
●
PostgreSQL writes header data + the entire page as XLOG record,
when a page changes after every checkpoint.
– Increasing checkpoint_timeout helps.
– Full-page image, backup block.
●
PostgreSQL can even recover itself from write failures (not hw failures,
though)
PGDay.Asia 2017
WAL for DBAs
32
WAL parameters
•
wal_level: Minimal, replica or logical
•
Must be > minimal for archiver to be able to run
•
fsync : Always on, please.
•
synchronous_commit: May lose some of the latest transactions
•
Server returns success to the client
•
Server waits a bit to flush the data to durable storage.
•
Less risky than fsync
•
wal_sync_method : fdatasync is usually better. Use
pg_test_fsync for testing.
PGDay.Asia 2017
WAL for DBAs
33
WAL parameters
•
wal_log_hints: When this value is set to on , the server writes the entire content of
each disk page to WAL after a checkpoint and during the first modification of that
page, even for non-critical modifications of so-called hint bits.
•
wal_compression: off by default. Less WAL files, more CPU overhead.
•
wal_buffers: -1: Automatic tuning of wal buffers: 1/32 of shared_buffers (not less than
64kB or no more than 16 MB (1 WAL file)
•
wal_writer_delay : Rounds between WAL writer flushes WAL.
•
wal_writer_flush_after: New in 9.6
FOSDEM PGDAY 2017
WAL for DBAs
34
Questions, comments?
© 2013 EnterpriseDB Corporation. All rights reserved. 35
WAL for DBAs – Everything you
want to know
Devrim Gündüz
Principal Systems Engineer @ EnterpriseDB
devrim.gunduz@EnterpriseDB.com
Twitter : @DevrimGunduz

More Related Content

PDF
Linux tuning to improve PostgreSQL performance
PostgreSQL-Consulting
 
PPTX
Mendix Cloud Hosting on CloudFoundry
Jasper van der Hoek
 
PDF
PostgreSQL and RAM usage
Alexey Bashtanov
 
PPTX
Postgresql
NexThoughts Technologies
 
PDF
Postgresql tutorial
Ashoka Vanjare
 
PPTX
分散システムについて語らせてくれ
Kumazaki Hiroki
 
PDF
Postgresql database administration volume 1
Federico Campoli
 
PDF
PostgreSQL Deep Internal
EXEM
 
Linux tuning to improve PostgreSQL performance
PostgreSQL-Consulting
 
Mendix Cloud Hosting on CloudFoundry
Jasper van der Hoek
 
PostgreSQL and RAM usage
Alexey Bashtanov
 
Postgresql tutorial
Ashoka Vanjare
 
分散システムについて語らせてくれ
Kumazaki Hiroki
 
Postgresql database administration volume 1
Federico Campoli
 
PostgreSQL Deep Internal
EXEM
 

What's hot (20)

PDF
Vacuum徹底解説
Masahiko Sawada
 
PDF
アーキテクチャから理解するPostgreSQLのレプリケーション
Masahiko Sawada
 
PDF
5 Steps to PostgreSQL Performance
Command Prompt., Inc
 
PDF
pg_hint_planを知る(第37回PostgreSQLアンカンファレンス@オンライン 発表資料)
NTT DATA Technology & Innovation
 
PDF
Introduction VAUUM, Freezing, XID wraparound
Masahiko Sawada
 
PDF
Standard Edition 2でも使えるOracle Database 12c Release 2オススメ新機能
Ryota Watabe
 
PDF
トランザクション処理可能な分散DB 「YugabyteDB」入門(Open Source Conference 2022 Online/Fukuoka 発...
NTT DATA Technology & Innovation
 
PPTX
PostgreSQLの統計情報について(第26回PostgreSQLアンカンファレンス@オンライン 発表資料)
NTT DATA Technology & Innovation
 
PPTX
PostgreSQLのfull_page_writesについて(第24回PostgreSQLアンカンファレンス@オンライン 発表資料)
NTT DATA Technology & Innovation
 
PDF
PostgreSQLのリカバリ超入門(もしくはWAL、CHECKPOINT、オンラインバックアップの仕組み)
Hironobu Suzuki
 
PDF
あなたの知らないPostgreSQL監視の世界
Yoshinori Nakanishi
 
PPTX
PostgreSQLクエリ実行の基礎知識 ~Explainを読み解こう~
Miki Shimogai
 
PDF
PostgreSQL 15の新機能を徹底解説
Masahiko Sawada
 
PDF
PostgreSQL replication
NTT DATA OSS Professional Services
 
PDF
Oracle db performance tuning
Simon Huang
 
PDF
統計情報のリセットによるautovacuumへの影響について(第39回PostgreSQLアンカンファレンス@オンライン 発表資料)
NTT DATA Technology & Innovation
 
PDF
まずやっとくPostgreSQLチューニング
Kosuke Kida
 
PDF
YugabyteDBを使ってみよう(NewSQL/分散SQLデータベースよろず勉強会 #1 発表資料)
NTT DATA Technology & Innovation
 
PDF
Oracle Performance Tuning Fundamentals
Enkitec
 
PDF
より深く知るオプティマイザとそのチューニング
Yuto Hayamizu
 
Vacuum徹底解説
Masahiko Sawada
 
アーキテクチャから理解するPostgreSQLのレプリケーション
Masahiko Sawada
 
5 Steps to PostgreSQL Performance
Command Prompt., Inc
 
pg_hint_planを知る(第37回PostgreSQLアンカンファレンス@オンライン 発表資料)
NTT DATA Technology & Innovation
 
Introduction VAUUM, Freezing, XID wraparound
Masahiko Sawada
 
Standard Edition 2でも使えるOracle Database 12c Release 2オススメ新機能
Ryota Watabe
 
トランザクション処理可能な分散DB 「YugabyteDB」入門(Open Source Conference 2022 Online/Fukuoka 発...
NTT DATA Technology & Innovation
 
PostgreSQLの統計情報について(第26回PostgreSQLアンカンファレンス@オンライン 発表資料)
NTT DATA Technology & Innovation
 
PostgreSQLのfull_page_writesについて(第24回PostgreSQLアンカンファレンス@オンライン 発表資料)
NTT DATA Technology & Innovation
 
PostgreSQLのリカバリ超入門(もしくはWAL、CHECKPOINT、オンラインバックアップの仕組み)
Hironobu Suzuki
 
あなたの知らないPostgreSQL監視の世界
Yoshinori Nakanishi
 
PostgreSQLクエリ実行の基礎知識 ~Explainを読み解こう~
Miki Shimogai
 
PostgreSQL 15の新機能を徹底解説
Masahiko Sawada
 
PostgreSQL replication
NTT DATA OSS Professional Services
 
Oracle db performance tuning
Simon Huang
 
統計情報のリセットによるautovacuumへの影響について(第39回PostgreSQLアンカンファレンス@オンライン 発表資料)
NTT DATA Technology & Innovation
 
まずやっとくPostgreSQLチューニング
Kosuke Kida
 
YugabyteDBを使ってみよう(NewSQL/分散SQLデータベースよろず勉強会 #1 発表資料)
NTT DATA Technology & Innovation
 
Oracle Performance Tuning Fundamentals
Enkitec
 
より深く知るオプティマイザとそのチューニング
Yuto Hayamizu
 
Ad

Similar to PostgreSQL WAL for DBAs (20)

PDF
Postgres Vision 2018: WAL: Everything You Want to Know
EDB
 
PPTX
M|18 Battle of the Online Schema Change Methods
MariaDB plc
 
PDF
More Than Just The Tip Of The Iceberg.pdf
Michal Gancarski
 
PDF
The Accidental DBA
PostgreSQL Experts, Inc.
 
PDF
How We Added Replication to QuestDB - JonTheBeach
javier ramirez
 
PDF
Aplicações 10x a 100x mais rápida com o postgre sql
Fabio Telles Rodriguez
 
PDF
Patroni - HA PostgreSQL made easy
Alexander Kukushkin
 
PDF
What's coming in Airflow 2.0? - NYC Apache Airflow Meetup
Kaxil Naik
 
PDF
2008 Collaborate IOUG Presentation
Biju Thomas
 
PDF
Useful PostgreSQL Extensions
EDB
 
PDF
Oracle to Postgres Migration - part 2
PgTraining
 
PDF
High performance json- postgre sql vs. mongodb
Wei Shan Ang
 
DOCX
Oracle GoldenGate
Anar Godjaev
 
PDF
PostgreSQL 9.5 - Major Features
InMobi Technology
 
PDF
GLOC 2014 NEOOUG - Oracle Database 12c New Features
Biju Thomas
 
PDF
Securefile LOBs
Martin Berger
 
PDF
COUG_AAbate_Oracle_Database_12c_New_Features
Alfredo Abate
 
PDF
OSMC 2008 | PostgreSQL Monitoring - Introduction, Internals And Monitoring S...
NETWAYS
 
PDF
Big data should be simple
Dori Waldman
 
PDF
Recent MariaDB features to learn for a happy life
Federico Razzoli
 
Postgres Vision 2018: WAL: Everything You Want to Know
EDB
 
M|18 Battle of the Online Schema Change Methods
MariaDB plc
 
More Than Just The Tip Of The Iceberg.pdf
Michal Gancarski
 
The Accidental DBA
PostgreSQL Experts, Inc.
 
How We Added Replication to QuestDB - JonTheBeach
javier ramirez
 
Aplicações 10x a 100x mais rápida com o postgre sql
Fabio Telles Rodriguez
 
Patroni - HA PostgreSQL made easy
Alexander Kukushkin
 
What's coming in Airflow 2.0? - NYC Apache Airflow Meetup
Kaxil Naik
 
2008 Collaborate IOUG Presentation
Biju Thomas
 
Useful PostgreSQL Extensions
EDB
 
Oracle to Postgres Migration - part 2
PgTraining
 
High performance json- postgre sql vs. mongodb
Wei Shan Ang
 
Oracle GoldenGate
Anar Godjaev
 
PostgreSQL 9.5 - Major Features
InMobi Technology
 
GLOC 2014 NEOOUG - Oracle Database 12c New Features
Biju Thomas
 
Securefile LOBs
Martin Berger
 
COUG_AAbate_Oracle_Database_12c_New_Features
Alfredo Abate
 
OSMC 2008 | PostgreSQL Monitoring - Introduction, Internals And Monitoring S...
NETWAYS
 
Big data should be simple
Dori Waldman
 
Recent MariaDB features to learn for a happy life
Federico Razzoli
 
Ad

More from PGConf APAC (20)

PDF
PGConf APAC 2018: Sponsored Talk by Fujitsu - The growing mandatory requireme...
PGConf APAC
 
PDF
PGConf APAC 2018: PostgreSQL 10 - Replication goes Logical
PGConf APAC
 
PDF
PGConf APAC 2018 - Lightening Talk #3: How To Contribute to PostgreSQL
PGConf APAC
 
PDF
PGConf APAC 2018 - Lightening Talk #2 - Centralizing Authorization in PostgreSQL
PGConf APAC
 
PDF
Sponsored Talk @ PGConf APAC 2018 - Choosing the right partner in your Postgr...
PGConf APAC
 
PDF
PGConf APAC 2018 - A PostgreSQL DBAs Toolbelt for 2018
PGConf APAC
 
PDF
PGConf APAC 2018 - Patroni: Kubernetes-native PostgreSQL companion
PGConf APAC
 
PDF
PGConf APAC 2018 - High performance json postgre-sql vs. mongodb
PGConf APAC
 
PDF
PGConf APAC 2018 - Monitoring PostgreSQL at Scale
PGConf APAC
 
PDF
PGConf APAC 2018 - Where's Waldo - Text Search and Pattern in PostgreSQL
PGConf APAC
 
PDF
PGConf APAC 2018 - Managing replication clusters with repmgr, Barman and PgBo...
PGConf APAC
 
PDF
PGConf APAC 2018 - PostgreSQL HA with Pgpool-II and whats been happening in P...
PGConf APAC
 
PDF
PGConf APAC 2018 - PostgreSQL performance comparison in various clouds
PGConf APAC
 
PDF
Sponsored Talk @ PGConf APAC 2018 - Migrating Oracle to EDB Postgres Approach...
PGConf APAC
 
PDF
PGConf APAC 2018 - Tale from Trenches
PGConf APAC
 
PDF
PGConf APAC 2018 Keynote: PostgreSQL goes eleven
PGConf APAC
 
PDF
Amazon (AWS) Aurora
PGConf APAC
 
PDF
Use Case: PostGIS and Agribotics
PGConf APAC
 
PDF
How to teach an elephant to rock'n'roll
PGConf APAC
 
PDF
PostgreSQL on Amazon RDS
PGConf APAC
 
PGConf APAC 2018: Sponsored Talk by Fujitsu - The growing mandatory requireme...
PGConf APAC
 
PGConf APAC 2018: PostgreSQL 10 - Replication goes Logical
PGConf APAC
 
PGConf APAC 2018 - Lightening Talk #3: How To Contribute to PostgreSQL
PGConf APAC
 
PGConf APAC 2018 - Lightening Talk #2 - Centralizing Authorization in PostgreSQL
PGConf APAC
 
Sponsored Talk @ PGConf APAC 2018 - Choosing the right partner in your Postgr...
PGConf APAC
 
PGConf APAC 2018 - A PostgreSQL DBAs Toolbelt for 2018
PGConf APAC
 
PGConf APAC 2018 - Patroni: Kubernetes-native PostgreSQL companion
PGConf APAC
 
PGConf APAC 2018 - High performance json postgre-sql vs. mongodb
PGConf APAC
 
PGConf APAC 2018 - Monitoring PostgreSQL at Scale
PGConf APAC
 
PGConf APAC 2018 - Where's Waldo - Text Search and Pattern in PostgreSQL
PGConf APAC
 
PGConf APAC 2018 - Managing replication clusters with repmgr, Barman and PgBo...
PGConf APAC
 
PGConf APAC 2018 - PostgreSQL HA with Pgpool-II and whats been happening in P...
PGConf APAC
 
PGConf APAC 2018 - PostgreSQL performance comparison in various clouds
PGConf APAC
 
Sponsored Talk @ PGConf APAC 2018 - Migrating Oracle to EDB Postgres Approach...
PGConf APAC
 
PGConf APAC 2018 - Tale from Trenches
PGConf APAC
 
PGConf APAC 2018 Keynote: PostgreSQL goes eleven
PGConf APAC
 
Amazon (AWS) Aurora
PGConf APAC
 
Use Case: PostGIS and Agribotics
PGConf APAC
 
How to teach an elephant to rock'n'roll
PGConf APAC
 
PostgreSQL on Amazon RDS
PGConf APAC
 

Recently uploaded (20)

PDF
QAware_Mario-Leander_Reimer_Architecting and Building a K8s-based AI Platform...
QAware GmbH
 
PPTX
Maximizing Revenue with Marketo Measure: A Deep Dive into Multi-Touch Attribu...
bbedford2
 
PPTX
classification of computer and basic part of digital computer
ravisinghrajpurohit3
 
PPTX
Why Use Open Source Reporting Tools for Business Intelligence.pptx
Varsha Nayak
 
PDF
Why Use Open Source Reporting Tools for Business Intelligence.pdf
Varsha Nayak
 
PDF
Protecting the Digital World Cyber Securit
dnthakkar16
 
PDF
Key Features to Look for in Arizona App Development Services
Net-Craft.com
 
PPTX
The-Dawn-of-AI-Reshaping-Our-World.pptxx
parthbhanushali307
 
PPTX
Web Testing.pptx528278vshbuqffqhhqiwnwuq
studylike474
 
PPTX
Presentation about variables and constant.pptx
safalsingh810
 
PPTX
PFAS Reporting Requirements 2026 Are You Submission Ready Certivo.pptx
Certivo Inc
 
PPTX
Presentation about variables and constant.pptx
kr2589474
 
PDF
Build Multi-agent using Agent Development Kit
FadyIbrahim23
 
PDF
Salesforce Implementation Services Provider.pdf
VALiNTRY360
 
PPTX
oapresentation.pptx
mehatdhavalrajubhai
 
PDF
lesson-2-rules-of-netiquette.pdf.bshhsjdj
jasmenrojas249
 
PDF
The Role of Automation and AI in EHS Management for Data Centers.pdf
TECH EHS Solution
 
PPTX
GALILEO CRS SYSTEM | GALILEO TRAVEL SOFTWARE
philipnathen82
 
PDF
49784907924775488180_LRN2959_Data_Pump_23ai.pdf
Abilash868456
 
PPTX
TestNG for Java Testing and Automation testing
ssuser0213cb
 
QAware_Mario-Leander_Reimer_Architecting and Building a K8s-based AI Platform...
QAware GmbH
 
Maximizing Revenue with Marketo Measure: A Deep Dive into Multi-Touch Attribu...
bbedford2
 
classification of computer and basic part of digital computer
ravisinghrajpurohit3
 
Why Use Open Source Reporting Tools for Business Intelligence.pptx
Varsha Nayak
 
Why Use Open Source Reporting Tools for Business Intelligence.pdf
Varsha Nayak
 
Protecting the Digital World Cyber Securit
dnthakkar16
 
Key Features to Look for in Arizona App Development Services
Net-Craft.com
 
The-Dawn-of-AI-Reshaping-Our-World.pptxx
parthbhanushali307
 
Web Testing.pptx528278vshbuqffqhhqiwnwuq
studylike474
 
Presentation about variables and constant.pptx
safalsingh810
 
PFAS Reporting Requirements 2026 Are You Submission Ready Certivo.pptx
Certivo Inc
 
Presentation about variables and constant.pptx
kr2589474
 
Build Multi-agent using Agent Development Kit
FadyIbrahim23
 
Salesforce Implementation Services Provider.pdf
VALiNTRY360
 
oapresentation.pptx
mehatdhavalrajubhai
 
lesson-2-rules-of-netiquette.pdf.bshhsjdj
jasmenrojas249
 
The Role of Automation and AI in EHS Management for Data Centers.pdf
TECH EHS Solution
 
GALILEO CRS SYSTEM | GALILEO TRAVEL SOFTWARE
philipnathen82
 
49784907924775488180_LRN2959_Data_Pump_23ai.pdf
Abilash868456
 
TestNG for Java Testing and Automation testing
ssuser0213cb
 

PostgreSQL WAL for DBAs

  • 1. © 2013 EnterpriseDB Corporation. All rights reserved. 1 WAL for DBAs – Everything you want to know Devrim Gündüz Principal Systems Engineer @ EnterpriseDB [email protected] Twitter : @DevrimGunduz
  • 2. PGDay.Asia 2017 WAL for DBAs 2 ● Who is this guy? – Using Red Hat (and then Fedora) since 1996. – Using PostgreSQL since 1998. – Responsible for PostgreSQL YUM repository. ● Used to break website, but recently gave up. – Started some work on PostgreSQL Dockerfiles recently. https://www.pgdocker.org – Working at EnterpriseDB since 2011. – The Guy With The PostgreSQL Tattoo! (imitations may exist :) ) – Istanbul, Turkiye. About me
  • 3. FOSDEM PGDAY 2017 WAL for DBAs 3 ● Please tweet! – #PostgreSQL – #pgday – #pgdayasia Social media
  • 4. FOSDEM PGDAY 2017 WAL for DBAs 4 (Did you tweet? Thanks!) Social media
  • 5. FOSDEM PGDAY 2017 WAL for DBAs 5 (Did you tweet? Thanks!) Alternative hashtag: #blamemagnus Social media
  • 6. FOSDEM PGDAY 2017 WAL for DBAs 6 June 26-28, Boston We want to see great speakers who can talk to the technical aspects of using Postgres in the enterprise. http://postgresvision.com/ Postgres Vision 2017
  • 7. PGDay.Asia 2017 WAL for DBAs 7 Agenda (in random order) • What is WAL? • What does it include? • How to read it? • What about wal_level ? • Replication and WAL • Backup and WAL • PITR and WAL • Other topics
  • 8. FOSDEM PGDAY 2017 WAL for DBAs 8 Please do not delete WAL files manually. Please. Before we actually start:
  • 9. FOSDEM PGDAY 2017 WAL for DBAs 9 Please do not delete WAL files manually. Please. Please. Before we actually start:
  • 10. FOSDEM PGDAY 2017 WAL for DBAs 10 Please do not delete WAL files manually. Please. Please. PLEASE. Before we actually start:
  • 11. PGDay.Asia 2017 WAL for DBAs 11 What is WAL? • Write Ahead Log: • Logging of transactions • a.k.a. xlog (transaction log), • 16 MB in most of the installations (can be configured, --with- wal-segsize) • 8 kB page size (can be configured, --with-wal-blocksize) • pg_xlog (<= 9.6)→ pg_wal (10+) • Because people deleted files under “log” directory.
  • 12. FOSDEM PGDAY 2017 WAL for DBAs 12 MAKE WAL GREAT AGAIN! So:
  • 13. PGDay.Asia 2017 WAL for DBAs 13 What is WAL? • Designed to prevent data loss in most of the situations • OS crash, hardware failure, PostgreSQL crash. • Write transactions are written to WAL • Before transaction result is sent to the client • Data files are not changed on each transaction • Performance benefit • Should be kept in a separate drive. • Initdb, or symlink
  • 14. PGDay.Asia 2017 WAL for DBAs 14 What is WAL? • Built-in feature • Life before WAL (not before B.C., though): • All changes go to durable storage (eventually), but: • Data page is loaded to shared_buffers • Changes are made there • Dirty buffers! • But not timely! • Crash → Data loss!
  • 15. PGDay.Asia 2017 WAL for DBAs 15 What is WAL? • Life after WAL: • Almost all “modifications” are “logged” to WAL files (xlog record) • Even if the transaction is aborted (ROLLBACK) • wal_buffers → WAL segments (files) • Ability to recover data after a crash • Checkpoint!
  • 16. PGDay.Asia 2017 WAL for DBAs 16 Where is it used? • Transaction logging! • Replication • PITR • REDO • Sequentially availability is a must. • REDO vs UNDO • No REDO for temp tables and unlogged tables.
  • 17. PGDay.Asia 2017 WAL for DBAs 17 WAL file naming • 24 chars, hex. • 1st 8 chars: timelineID • 00000001 is the timelineID created by initdb • 2nd 8 chars: logical WAL file • 3rd 8 chars: WAL segment name (physical WAL file) • 000000010000000000000001 → 000000010000000000000002 • … 0000000100000000000000FF → 000000010000000100000000 • Use PostgreSQL’s internal tools to manage them • pg_archivecleanup • pg_resetxlog • pg_xlogdump • ....
  • 18. PGDay.Asia 2017 WAL for DBAs 18 pg_xlogdump • We are all human. • Use pg_xlogdump, if you want to see contents of WAL files • rmgr --help to get list of all resource names, -f for follow, -n for limit. -z for stats. • pg_xlogdump -n 20 -f 000000010000000700000033 • rmgr: Heap len (rec/tot): 3/ 59, tx: 389744, lsn: 7/33B66228, prev 7/33B661F0, desc: INSERT+INIT off 1, blkref #0: rel 1663/13326/190344 blk 0 • rmgr: Heap len (rec/tot): 3/ 59, tx: 389744, lsn: 7/33B66268, prev 7/33B66228, desc: INSERT off 2, blkref #0: rel 1663/13326/190344 blk 0 • rmgr: Transaction len (rec/tot): 8/ 34, tx: 389744, lsn: 7/33B662A8, prev 7/33B66268, desc: COMMIT 2017-02- 03 03:03:49.482223 +03 • rmgr: Heap len (rec/tot): 14/ 69, tx: 389745, lsn: 7/33B662D0, prev 7/33B662A8, desc: HOT_UPDATE off 1 xmax 389745 ; new off 3 xmax 0, blkref #0: rel 1663/13326/190344 blk 0 • rmgr: Transaction len (rec/tot): 8/ 34, tx: 389745, lsn: 7/33B66318, prev 7/33B662D0, desc: COMMIT 2017-02- 03 03:03:54.091645 +03 • rmgr: XLOG len (rec/tot): 80/ 106, tx: 0, lsn: 7/33B66340, prev 7/33B66318, desc: CHECKPOINT_ONLINE redo 7/33B66340; tli 1; prev tli 1; fpw true; xid 0/389746; oid 198532; multi 1; offset 0; oldest xid 1866 in DB 129795; oldest multi 1 in DB 90123; oldest/newest commit timestamp xid: 388437/389745; oldest running xid 0; online • rmgr: XLOG len (rec/tot): 0/ 24, tx: 0, lsn: 7/33B663B0, prev 7/33B66340, desc: SWITCH •
  • 19. PGDay.Asia 2017 WAL for DBAs 19 Shared Buffers, Bgwriter and checkpointer • shared_buffers in PostgreSQL • Dirty buffers • This is where transactions are performed • Side effect: Causes inconsistency(?) on durable storage, due to dirty buffers. • Bgwriter: Background writer • LRU • Checkpointer • Pushing all dirty buffers to durable storage • Triggered automatically or manually • Backends may also write data to heap
  • 20. PGDay.Asia 2017 WAL for DBAs 20 WAL: LSN • Log Sequence Number • Position of the record in WAL file. • Provides uniqueness for each xlog record. • Per docs: “Pointer to a location in WAL file” • During recovery, LSN on the page and LSN in the WAL file are compared. • The larger one wins.
  • 21. PGDay.Asia 2017 WAL for DBAs 21 ● Probably not the last one in ls list! WAL: Finding current WAL file • postgres=# SELECT * from pg_current_xlog_location(); pg_current_xlog_location -------------------------- 40E6/2C85AC10 • postgres=# SELECT pg_xlogfile_name(‘40E6/2C85AC10'); pg_xlogfile_name -------------------------- 00000003000040E60000002C So: • postgres=# SELECT pg_xlogfile_name(pg_current_xlog_location()); pg_xlogfile_name -------------------------- 00000003000040E60000002C
  • 22. PGDay.Asia 2017 WAL for DBAs 22 Checkpoint, and pg_control ● As soon as the checkpoint starts, REDO point is stored in shared buffers. ● A WAL record is created referencing checkpoint start, and it is first written to WAL buffers, and then eventually to pg_control. – pg_control is under $PGDATA/global ● Unlike bgwriter, checkpointer writes all of the data in the shared_buffers to durable storage. ● PostgreSQL knows the latest REDO point, by looking at pg_control file.
  • 23. PGDay.Asia 2017 WAL for DBAs 23 Checkpoint, and pg_control ● pg_controldata: – Latest checkpoint location: 40E7/E43B16B8 – Prior checkpoint location: 40E7/D8689090 They are LSN. ● When checkpoint is completed, pg_control is updated with the position of checkpoint. ● After checkpoint, old WAL files are either recycled, or removed. ● An “estimation” is done while recycling (based on previous checkpoint cycles) ● 9.5+: In minimum, min_wal_size WAL files are always recycled for future usage
  • 24. PGDay.Asia 2017 WAL for DBAs 24 pg_control and REDO ● postmaster reads pg_control on startup. /usr/pgsql-10/bin/pg_controldata -D /var/lib/pgsql/10/data | grep state – “Database cluster state”: ● starting up ● shut down ● shut down in recovery ● shutting down ● in crash recovery ● in archive recovery ● in production ● If pg_control says “in production”, but db server is not running, then this instance is eligible for a recovery!
  • 25. PGDay.Asia 2017 WAL for DBAs 25 pg_control and REDO ● pg_control is the critical piece – Should not be corrupted – Per docs: “...theoretically a weak spot” ● REDO: All WAL files must be sequentially available for complete recovery.
  • 26. PGDay.Asia 2017 WAL for DBAs 26 Moving to the new WAL ● A WAL segment may be full ● PostgreSQL archiver will switch to the new xlog, if PostgreSQL reaches archive_timeout value. ● DBA issues pg_switch_xlog() function.
  • 27. PGDay.Asia 2017 WAL for DBAs 27 WAL: Archiving • Replication, backup, PITR • archive_mode • archive_command • archive_timeout
  • 28. PGDay.Asia 2017 WAL for DBAs 28 WAL: Point-In-Time Recovery (PITR) • A base backup (pg_basebackup!) and the WAL files are needed. • WAL files must be sequentially complete – otherwise PITR won’t be finished. • “Roll-forward recovery”
  • 29. PGDay.Asia 2017 WAL for DBAs 29 WAL: Point-In-Time Recovery (PITR) • PITR: Replaying WAL files on base backups, until recovery target. • recovery_target_{time, xid, name, lsn} • If not specified, all archived WAL files are replayed. • recovery.conf and backup_label: Enters recovery mode. • restore_command, recovery_target_XXX,recovery_target_inclusive • backup_label: Also includes checkpoint location (starting point of recovery) • Almost like regular recovery process (WAL replay) • Up to recovery_target_XXX is replayed.
  • 30. PGDay.Asia 2017 WAL for DBAs 30 WAL: Point-In-Time Recovery (PITR) • After recovery process, timelineID is increased by 1 (also physical WAL file name is also increased by 1) • A .history file is created. • $ cat 00000003.history 1 403F/58000098 no recovery target specified 2 4048/43000098 before 2017-01-28 11:13:21.124512+03 “WAL files were replayed until the given time above, and theit replay location is 4048/43000098. • • •
  • 31. PGDay.Asia 2017 WAL for DBAs 31 Full page writes ● A WAL record cannot be replayed on a page which is corrupted during bgwriter and/or checkpointer, because of hardware failure, OS crash, kernel failure, etc. ● Full page writes IYF. ● Enabled by default. – Please turn it off, if you want to throw a lot of money to PostgreSQL support companies. Otherwise, don’t do so ;) ● PostgreSQL writes header data + the entire page as XLOG record, when a page changes after every checkpoint. – Increasing checkpoint_timeout helps. – Full-page image, backup block. ● PostgreSQL can even recover itself from write failures (not hw failures, though)
  • 32. PGDay.Asia 2017 WAL for DBAs 32 WAL parameters • wal_level: Minimal, replica or logical • Must be > minimal for archiver to be able to run • fsync : Always on, please. • synchronous_commit: May lose some of the latest transactions • Server returns success to the client • Server waits a bit to flush the data to durable storage. • Less risky than fsync • wal_sync_method : fdatasync is usually better. Use pg_test_fsync for testing.
  • 33. PGDay.Asia 2017 WAL for DBAs 33 WAL parameters • wal_log_hints: When this value is set to on , the server writes the entire content of each disk page to WAL after a checkpoint and during the first modification of that page, even for non-critical modifications of so-called hint bits. • wal_compression: off by default. Less WAL files, more CPU overhead. • wal_buffers: -1: Automatic tuning of wal buffers: 1/32 of shared_buffers (not less than 64kB or no more than 16 MB (1 WAL file) • wal_writer_delay : Rounds between WAL writer flushes WAL. • wal_writer_flush_after: New in 9.6
  • 34. FOSDEM PGDAY 2017 WAL for DBAs 34 Questions, comments?
  • 35. © 2013 EnterpriseDB Corporation. All rights reserved. 35 WAL for DBAs – Everything you want to know Devrim Gündüz Principal Systems Engineer @ EnterpriseDB [email protected] Twitter : @DevrimGunduz