How To Set Up REPMGR With WITNESS For PostgreSQL 10 - Official Pythian® Blog
This blog post covers how to set up repmgr, a tool that manages replication between PostgreSQL primary and
replica nodes and allows for quick and easy failover and rebuilding of replicas.
For reference, all commands are run as root. Commands that need to run as the postgres user are run through
su. As an example:
    su - postgres -c 'COMMAND RUN AS POSTGRES USER'
Outline
• The Setup
• SSH
• REPMGR
• PostgreSQL Primary Configuration
• PostgreSQL Replica Configuration
• PostgreSQL Witness Configuration
• Summary
The Setup
The environment that I will be working in consists of four CentOS 7 servers with a default install of PostgreSQL
10, installed using yum from the PostgreSQL 10 repo. With PostgreSQL installed on each server, the postgres
user already exists on each server. I will configure the first server as the primary. The second and third
servers will become replicas of the primary. The fourth and final server will become a witness server
used for voting in automatic failover scenarios.
SSH
2) [ON ALL SERVERS] Create an authorized_keys file and set the permissions:
    su - postgres -c 'touch ~/.ssh/authorized_keys'
    su - postgres -c 'chmod 600 ~/.ssh/authorized_keys'
    su - postgres -c 'vi ~/.ssh/authorized_keys'
3) [ON ALL SERVERS] Place all of the public keys from Step 1 into the authorized_keys file created in Step 2.
The authorized_keys file contents looked like the following on all of my servers, with each key on its own line:
    ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDTXxjFY8dLs2GVRpDY7asAK5SvwITPVSJN9ItnwsVtzCpZgX/Mbnkc/jHgwuIGb0srh/KthByyYJi14QViI+x7xVQm8eyuqMB
    ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDYHmDFOtYc/VDxccNRQEnDYBTE8QDiUTMX46PX1p5tvs6qvP3VPMEccs4um0YVFXZTmbnvyeN3bBPe23NS5Pal6ySfAxIdAAO
    ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCgQNra4/rDOVWr5uV5nSa49yPLlgAJ6crsYKpvhfRr3L5J/T48QE468Am5t3lA318Nst2FbObq7dduqBNhOQDurPlTiPd9cWQ
    ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC5UAsmEkw9INHwXL6cpHqOy5O8VIpvLsfklzoHYfmGxBhGqi6nZzV/+TAzpotrmAf7PIUEdzWOm1lTfii1iRU821ks1bSPN2F
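Steps 2 and 3 can also be scripted instead of pasting keys by hand in vi. The following is a minimal sketch, assuming the public keys from Step 1 have already been copied into a single directory as *.pub files. The function name merge_public_keys and its two arguments are hypothetical and not part of the original procedure.

```shell
# Merge every *.pub file in a directory into one authorized_keys file.
# key_dir and out_file are hypothetical arguments chosen for this sketch.
merge_public_keys() {
    key_dir=$1
    out_file=$2
    # sort -u drops duplicate keys, so re-running the function is harmless.
    cat "$key_dir"/*.pub | sort -u > "$out_file"
    # Same permissions as Step 2; sshd's default StrictModes setting rejects
    # key files that are writable by group or others.
    chmod 600 "$out_file"
}
```

Run as the postgres user, a call such as merge_public_keys /tmp/collected_keys ~/.ssh/authorized_keys would produce the same file contents as the manual edit, with one key per line.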
4) I can now become the postgres user, SSH to any of the other servers, and log in without a
password prompt. This is an example on base-centos-1:
    [root@base-centos-1 vagrant]# su - postgres
    Last login: Mon Jul 22 18:27:34 UTC 2019 on pts/0
    -bash-4.2$ ssh base-centos-2
    Last login: Mon Jul 22 18:22:31 2019
    -bash-4.2$ hostname
    base-centos-2
REPMGR
For repmgr, I first installed the repmgr software and then configured it on each server, checking that each
configuration was correct. When I installed PostgreSQL 10 on my servers, I added the PostgreSQL
repository using the following RPM. This repo also includes repmgr, which is where I am installing the software
from. You can find the different yum repositories at https://yum.postgresql.org/repopackages.php:
    yum install -y https://download.postgresql.org/pub/repos/yum/reporpms/EL-7-x86_64/pgdg-redhat-repo-latest.noarch.rpm
1) [ON ALL SERVERS] I installed repmgr from the PostgreSQL 10 repository. Make sure you install the correct
repmgr package for the version of PostgreSQL that you are running:
    yum install -y repmgr10.x86_64
2) [ON ALL SERVERS] Next, I configured repmgr on each server with that server's specific details. Make sure to
update each file according to the server and its paths. For each server, I updated node_id,
node_name, and conninfo to match the server I was configuring. If you are using a different version of
PostgreSQL, update the paths for the version you are using. Do NOT assume the
paths in my example will be the same on your system. In my setup, the configuration file was located at
/etc/repmgr/10/repmgr.conf.
    node_id=101
    node_name='base-centos-1'
    conninfo='host=base-centos-1 dbname=repmgr user=repmgr'
    data_directory='/var/lib/pgsql/10/data/'
    config_directory='/var/lib/pgsql/10/data'
    log_file='/var/log/repmgr.log'
    repmgrd_service_start_command = '/usr/pgsql-10/bin/repmgrd -d'
    repmgrd_service_stop_command = 'kill `cat $(/usr/pgsql-10/bin/repmgrd --show-pid-file)`'
    promote_command='repmgr standby promote -f /etc/repmgr/10/repmgr.conf --siblings-follow --log-to-file'
    follow_command='repmgr standby follow -f /etc/repmgr/10/repmgr.conf --log-to-file'
    failover=automatic
    reconnect_attempts=3
    reconnect_interval=5
    ssh_options='-q -o StrictHostKeyChecking=no -o ConnectTimeout=10'
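Because only node_id, node_name, and conninfo change between servers, the file can be generated rather than edited by hand four times. This is a minimal sketch, assuming the node name doubles as the host name in conninfo (as it does in my example) and that every other setting is identical on all nodes. The function name render_repmgr_conf and its arguments are hypothetical.

```shell
# Print a repmgr.conf for one node. Only the first three settings vary;
# the rest is copied from the example configuration in this post.
# render_repmgr_conf and its two arguments are hypothetical.
render_repmgr_conf() {
    node_id=$1
    node_name=$2
    printf "node_id=%s\n" "$node_id"
    printf "node_name='%s'\n" "$node_name"
    printf "conninfo='host=%s dbname=repmgr user=repmgr'\n" "$node_name"
    # Quoted heredoc delimiter: backticks and $(...) are written out literally.
    cat <<'EOF'
data_directory='/var/lib/pgsql/10/data/'
config_directory='/var/lib/pgsql/10/data'
log_file='/var/log/repmgr.log'
repmgrd_service_start_command = '/usr/pgsql-10/bin/repmgrd -d'
repmgrd_service_stop_command = 'kill `cat $(/usr/pgsql-10/bin/repmgrd --show-pid-file)`'
promote_command='repmgr standby promote -f /etc/repmgr/10/repmgr.conf --siblings-follow --log-to-file'
follow_command='repmgr standby follow -f /etc/repmgr/10/repmgr.conf --log-to-file'
failover=automatic
reconnect_attempts=3
reconnect_interval=5
ssh_options='-q -o StrictHostKeyChecking=no -o ConnectTimeout=10'
EOF
}
```

For example, render_repmgr_conf 102 base-centos-2 > /etc/repmgr/10/repmgr.conf would write the second server's file. Review the paths before using it on a different PostgreSQL version.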
3) [ON ALL SERVERS] I created the log file that I configured in Step 2 so I would not get an error when starting
the service:
    su - postgres -c 'touch /var/log/repmgr.log'
PostgreSQL Primary Configuration
1) Create the repmgr user account and repmgr database that repmgr will use to manage the cluster. The
repmgr user account will also be used for replication from the primary to the PostgreSQL replica servers.
    su - postgres -c 'createuser --replication --createdb --createrole --superuser repmgr'
    su - postgres -c "psql -c 'ALTER USER repmgr SET search_path TO repmgr, \"\$user\", public;'"
    su - postgres -c 'createdb repmgr --owner=repmgr'
2) Update pg_hba.conf to allow the repmgr account to authenticate. With trust being used, the
repmgr user account in the database can authenticate without a password. If you are building a production
environment, you will want a more secure method using md5 and passwords. These changes won’t take effect
until the PostgreSQL service is restarted, which I will do after the next step. In my setup, this file is located at
/var/lib/pgsql/10/data/pg_hba.conf. I found that I had the best results by specifying the IPs.
    host replication repmgr 192.168.56.101/32 trust
    host replication repmgr 192.168.56.102/32 trust
    host replication repmgr 192.168.56.103/32 trust
    host replication repmgr 192.168.56.104/32 trust
    host repmgr repmgr 192.168.56.101/32 trust
    host repmgr repmgr 192.168.56.102/32 trust
    host repmgr repmgr 192.168.56.103/32 trust
    host repmgr repmgr 192.168.56.104/32 trust
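The eight entries above follow a fixed pattern: one replication rule and one repmgr database rule per node IP. The sketch below generates them from a list of addresses. The function name render_hba_rules and the HBA_METHOD variable are hypothetical; the default of trust matches my lab setup and is an assumption to revisit for production.

```shell
# Print pg_hba.conf entries for the repmgr role: one "replication" rule and
# one "repmgr" database rule per node IP passed as an argument.
# render_hba_rules and HBA_METHOD are hypothetical names for this sketch.
render_hba_rules() {
    method=${HBA_METHOD:-trust}
    for db in replication repmgr; do
        for ip in "$@"; do
            printf 'host %s repmgr %s/32 %s\n' "$db" "$ip" "$method"
        done
    done
}
```

Calling render_hba_rules 192.168.56.101 192.168.56.102 192.168.56.103 192.168.56.104 reproduces the listing above. Setting HBA_METHOD=md5 switches the method, but the repmgr role then also needs a password that each node can supply.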
3) Next, I edited the PostgreSQL configuration file to allow replication by setting wal_level
and related settings. I also added the repmgr shared library to shared_preload_libraries. In my setup, the
PostgreSQL configuration file is at /var/lib/pgsql/10/data/postgresql.conf.
    listen_addresses = '*'
    shared_preload_libraries = 'repmgr'
    wal_level = replica
    archive_mode = on
    max_wal_senders = 10
    hot_standby = on
    archive_command = 'cp -i %p /var/lib/pgsql/10/data/archive/%f'
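The archive_command above is a plain cp -i. The example in the PostgreSQL documentation instead guards the copy with test ! -f, so that the command returns a nonzero status when the target file already exists. The sketch below shows that pattern as a shell function so it can be tried outside the server. The name archive_wal is hypothetical, and its two arguments stand in for %p and the archive directory.

```shell
# Copy one WAL segment into the archive directory, failing if it already exists.
# archive_wal is a hypothetical helper mirroring the documented
# 'test ! -f DEST && cp SRC DEST' pattern.
archive_wal() {
    src=$1                                 # plays the role of %p
    dest_dir=$2                            # archive directory
    dest="$dest_dir/$(basename "$src")"    # file name only, like %f
    test ! -f "$dest" && cp "$src" "$dest"
}
```

The equivalent postgresql.conf setting, assuming the same archive path as in my example, would be archive_command = 'test ! -f /var/lib/pgsql/10/data/archive/%f && cp %p /var/lib/pgsql/10/data/archive/%f'.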
4) I then created the archive directory that I specified in the PostgreSQL configuration file, using the postgres
user to make sure it had the correct permissions, and then I restarted the PostgreSQL server to pick up the new
settings.
    su - postgres -c 'mkdir /var/lib/pgsql/10/data/archive'
    systemctl enable postgresql-10.service
    systemctl restart postgresql-10.service
    systemctl status postgresql-10.service
5) Now that repmgr and PostgreSQL are both configured, I registered my PostgreSQL server with repmgr and
then started the repmgr daemon service so that it monitors the status of the replication cluster.
    su - postgres -c 'repmgr primary register'
    su - postgres -c 'repmgr daemon start'
    su - postgres -c 'repmgr daemon status'
     ID | Name          | Role    | Status    | Upstream      | repmgrd | PID  | Paused? | Upstream last seen
    ----+---------------+---------+-----------+---------------+---------+------+---------+--------------------
    101 | base-centos-1 | primary | * running |               | running | 5496 | no      | n/a
PostgreSQL Replica Configuration
2) Next, I cloned the data directory from the primary server with repmgr standby clone, then started the
PostgreSQL server and checked its status to make sure it was running:
    su - postgres -c "repmgr -h base-centos-1 -U repmgr -d repmgr standby clone"
    systemctl start postgresql-10.service
    systemctl status postgresql-10.service
3) Then I registered the replica with the repmgr cluster and started the repmgr daemon service to monitor the
server in the cluster:
    su - postgres -c 'repmgr standby register -h base-centos-1 -U repmgr'
    su - postgres -c 'repmgr daemon start'
    su - postgres -c 'repmgr daemon status'
     ID | Name          | Role    | Status    | Upstream      | repmgrd | PID  | Paused? | Upstream last seen
    ----+---------------+---------+-----------+---------------+---------+------+---------+--------------------
    101 | base-centos-1 | primary | * running |               | running | 5496 | no      | n/a
    102 | base-centos-2 | standby | running   | base-centos-1 | running | 3414 | no      | 1 second(s) ago
    103 | base-centos-3 | standby | running   | base-centos-1 | running | 4549 | no      | 1 second(s) ago
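The daemon status table can be checked by a script instead of by eye. This is a minimal sketch, assuming the pipe-separated column layout shown above, with repmgrd as the sixth column. The function name nodes_without_repmgrd is hypothetical.

```shell
# Read a `repmgr daemon status` table on stdin and print the name of every
# node whose repmgrd column is not "running". The exit status is nonzero if
# any such node is found. nodes_without_repmgrd is a hypothetical name.
nodes_without_repmgrd() {
    awk -F'|' '
        NR <= 2 || NF < 6 { next }            # skip header, separator, short lines
        {
            name = $2; daemon = $6
            gsub(/^ +| +$/, "", name)         # trim column padding
            gsub(/^ +| +$/, "", daemon)
            if (daemon != "running") { print name; bad = 1 }
        }
        END { exit bad + 0 }
    '
}
```

A possible use is su - postgres -c 'repmgr daemon status' | nodes_without_repmgrd, which prints nothing and exits with status 0 when repmgrd is running on every node.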
4) To verify that replication is up and running, I can run the following commands to check the replication
status:
ON PRIMARY
    su - postgres -c 'psql -c "select pid, usename, client_addr, backend_start, state, sync_state from pg_stat_replication;"'
      pid  | usename |  client_addr   |         backend_start         |   state   | sync_state
    -------+---------+----------------+-------------------------------+-----------+------------
     15347 | repmgr  | 192.168.56.102 | 2019-07-22 18:19:31.232492+00 | streaming | async
     15363 | repmgr  | 192.168.56.103 | 2019-07-22 18:19:36.566369+00 | streaming | async
ON REPLICAS
    su - postgres -c 'psql --pset expanded=auto -c "select * from pg_stat_wal_receiver;"'
    -[ RECORD 1 ]---------+------------------------------------------------------------------
    pid                   | 5408
    status                | streaming
    receive_start_lsn     | 0/B000000
    receive_start_tli     | 3
    received_lsn          | 0/B0025B8
    received_tli          | 3
    last_msg_send_time    | 2019-07-22 18:22:17.942712+00
    last_msg_receipt_time | 2019-07-22 18:22:17.943306+00
    latest_end_lsn        | 0/B0025B8
    latest_end_time       | 2019-07-22 18:19:47.617303+00
    slot_name             |
    conninfo              | user=repmgr host='base-centos-1' application_name='base-centos-3'
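The LSN values in this output are byte positions in the WAL, written as two hexadecimal numbers separated by a slash: the high 32 bits, then the low 32 bits. The sketch below converts them to integers so that two positions can be subtracted. The function names lsn_to_bytes and lsn_diff are hypothetical; inside the database, pg_wal_lsn_diff() performs the same subtraction.

```shell
# Convert a PostgreSQL LSN such as 0/B0025B8 into a byte position.
# lsn_to_bytes and lsn_diff are hypothetical helper names.
lsn_to_bytes() {
    hi=${1%%/*}     # hex digits before the slash (high 32 bits)
    lo=${1#*/}      # hex digits after the slash (low 32 bits)
    echo $(( (0x$hi << 32) + 0x$lo ))
}

# Bytes between two LSNs: second argument minus first argument.
lsn_diff() {
    echo $(( $(lsn_to_bytes "$2") - $(lsn_to_bytes "$1") ))
}

# receive_start_lsn and received_lsn from the output above
lsn_diff 0/B000000 0/B0025B8
```

For the values above this prints 9656, the number of WAL bytes the replica had received since the receiver started streaming.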
PostgreSQL Witness Configuration
1) Create the repmgr user account and repmgr database that repmgr will use to manage the cluster. The
repmgr user account will also be used for replication from the primary to the PostgreSQL replica servers.
    su - postgres -c 'createuser --replication --createdb --createrole --superuser repmgr'
    su - postgres -c "psql -c 'ALTER USER repmgr SET search_path TO repmgr, \"\$user\", public;'"
    su - postgres -c 'createdb repmgr --owner=repmgr'
2) Update pg_hba.conf to allow the repmgr account to authenticate. With trust being used, the repmgr user
can authenticate without a password. If you are building a production environment, you will want a more secure
method using md5 and passwords. These changes won’t take effect until the PostgreSQL service has been restarted,
which I will do after the next step. In my setup, this file is located at /var/lib/pgsql/10/data/pg_hba.conf.
    host replication repmgr 192.168.56.101/32 trust
    host replication repmgr 192.168.56.102/32 trust
    host replication repmgr 192.168.56.103/32 trust
    host replication repmgr 192.168.56.104/32 trust
    host repmgr repmgr 192.168.56.101/32 trust
    host repmgr repmgr 192.168.56.102/32 trust
    host repmgr repmgr 192.168.56.103/32 trust
    host repmgr repmgr 192.168.56.104/32 trust
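The witness still has to be registered with the cluster, as the base-centos-4 row in the cluster status output at the end of this post implies. The sketch below is an assumption rather than the exact commands from my setup: it supposes that the witness's own PostgreSQL instance is already running with the repmgr library loaded, and that registration mirrors the standby registration shown earlier, using repmgr witness register. The function only prints the commands for review; witness_commands is a hypothetical name.

```shell
# Print (do not execute) the commands assumed to register a witness node.
# The argument is the host name of the current primary.
# witness_commands is a hypothetical helper for this sketch.
witness_commands() {
    primary_host=$1
    printf '%s\n' \
        "su - postgres -c 'repmgr witness register -h ${primary_host} -U repmgr'" \
        "su - postgres -c 'repmgr daemon start'" \
        "su - postgres -c 'repmgr daemon status'"
}

witness_commands base-centos-1
```

Printing the commands first makes it easy to check the primary's host name before running anything on the witness server.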
Summary
Now I have a fully configured repmgr cluster that allows me to fail over from the primary to one of the
replicas. As an example, I can run the following to move the primary role to one of the standbys. The
switchover command has to be run on the standby that is becoming the new primary. The first command is a
dry run that checks the prerequisites without making any changes:
    su - postgres -c 'repmgr --dry-run -h base-centos-2 standby switchover --siblings-follow'
    su - postgres -c 'repmgr -h base-centos-2 standby switchover --siblings-follow'
    su - postgres -c 'repmgr cluster show'
    su - postgres -c 'repmgr cluster event'
FAILOVER TO base-centos-2
    su - postgres -c 'repmgr -h base-centos-2 standby switchover --siblings-follow'
    WARNING: following problems with command line parameters detected:
    database connection parameters not required when executing UNKNOWN ACTION
    NOTICE: executing switchover on node "base-centos-2" (ID: 102)
    NOTICE: local node "base-centos-2" (ID: 102) will be promoted to primary; current primary "base-centos-1" (ID: 101) will be demoted to
    NOTICE: stopping current primary node "base-centos-1" (ID: 101)
    NOTICE: issuing CHECKPOINT
    DETAIL: executing server command "/usr/pgsql-10/bin/pg_ctl -D '/var/lib/pgsql/10/data' -W -m fast stop"
    INFO: checking for primary shutdown; 1 of 60 attempts ("shutdown_check_timeout")
    INFO: checking for primary shutdown; 2 of 60 attempts ("shutdown_check_timeout")
    NOTICE: current primary has been cleanly shut down at location 0/C000028
    NOTICE: promoting standby to primary
    DETAIL: promoting server "base-centos-2" (ID: 102) using "/usr/pgsql-10/bin/pg_ctl -w -D '/var/lib/pgsql/10/data' promote"
    waiting for server to promote.... done
    server promoted
    NOTICE: waiting up to 60 seconds (parameter "promote_check_timeout") for promotion to complete
    NOTICE: STANDBY PROMOTE successful
    DETAIL: server "base-centos-2" (ID: 102) was successfully promoted to primary
    INFO: local node 101 can attach to rejoin target node 102
    DETAIL: local node's recovery point: 0/C000028; rejoin target node's fork point: 0/C000098
    NOTICE: setting node 101's upstream to node 102
    WARNING: unable to ping "host=base-centos-1 dbname=repmgr user=repmgr"
    DETAIL: PQping() returned "PQPING_NO_RESPONSE"
    NOTICE: starting server using "/usr/pgsql-10/bin/pg_ctl -w -D '/var/lib/pgsql/10/data' start"
    NOTICE: NODE REJOIN successful
    DETAIL: node 101 is now attached to node 102
    NOTICE: executing STANDBY FOLLOW on 2 of 2 siblings
    INFO: node 104 received notification to follow node 102
    INFO: STANDBY FOLLOW successfully executed on all reachable sibling nodes
    NOTICE: switchover was successful
    DETAIL: node "base-centos-2" is now primary and node "base-centos-1" is attached as standby
    NOTICE: STANDBY SWITCHOVER has completed successfully
CLUSTER STATUS
    su - postgres -c 'repmgr cluster show'
     ID  | Name          | Role    | Status    | Upstream      | Location | Priority | Timeline | Connection string
    -----+---------------+---------+-----------+---------------+----------+----------+----------+------------------------------------------
     101 | base-centos-1 | standby | running   | base-centos-2 | default  | 100      | 3        | host=base-centos-1 dbname=repmgr user=repmgr
     102 | base-centos-2 | primary | * running |               | default  | 100      | 4        | host=base-centos-2 dbname=repmgr user=repmgr
     103 | base-centos-3 | standby | running   | base-centos-2 | default  | 100      | 3        | host=base-centos-3 dbname=repmgr user=repmgr
     104 | base-centos-4 | witness | * running | base-centos-2 | default  | 0        | 1        | host=base-centos-4 dbname=repmgr user=repmgr
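After a switchover, scripts that need to reach the primary have to find out which node now holds that role. This is a minimal sketch that reads it from the cluster status table, assuming the pipe-separated layout shown above with Name, Role, and Status as the second, third, and fourth columns. The function name current_primary is hypothetical.

```shell
# Read a `repmgr cluster show` table on stdin and print the name of the node
# whose Role is "primary" and whose Status contains "running".
# current_primary is a hypothetical helper name.
current_primary() {
    awk -F'|' '
        NF < 4 { next }                       # skip the separator row
        {
            name = $2; role = $3; status = $4
            gsub(/^ +| +$/, "", name)         # trim column padding
            gsub(/^ +| +$/, "", role)
            if (role == "primary" && status ~ /running/) print name
        }
    '
}
```

Piping the table above through it, for instance with su - postgres -c 'repmgr cluster show' | current_primary, prints base-centos-2.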
Author
Kevin Markwardt <https://blog.pythian.com/author/kmarkwardt/>
Kevin Markwardt has twenty years of system administration experience spanning MySQL, Linux, Windows,
and VMware. Over the last six years he has been dedicated to MySQL and Linux administration with a focus on
scripting, automation, HA, and cloud solutions. Kevin has led and assisted with many projects focusing on
larger-scale implementations of technologies, including ProxySQL, Orchestrator, Pacemaker, GCP, AWS RDS,
and MySQL. Kevin Markwardt is a certified GCP Professional Cloud Architect and a certified AWS Solutions
Architect - Associate. Currently he is a Project Engineer at Pythian specializing in MySQL and large-scale client
projects. One of his new directives is Postgres, and he is currently supporting multiple internal production
Postgres instances.
Great tutorial. Just missed pg_bindir in repmgr.conf. It makes error if you create new conf file. It’s really
good. Thank you Kevin.