HOW TO SET UP REPMGR WITH WITNESS FOR POSTGRESQL 10

by Kevin Markwardt <https://blog.pythian.com/author/kmarkwardt/> | July 30, 2019



This blog post covers how to set up repmgr, the PostgreSQL tool for managing replication between primary and
replica nodes, which allows for quick and easy failover and rebuilding of replicas. For reference, all commands
are run as root. Commands that need to be run as the Postgres user are run using su. As an example:
    su - postgres -c 'COMMAND RUN AS POSTGRES USER'

Outline
• The Setup
• SSH
• REPMGR
• PostgreSQL Primary Configuration
• PostgreSQL Replica Configuration
• PostgreSQL Witness Configuration
• Summary

The Setup
The environment that I will be working in consists of four CentOS 7 servers with a default install of PostgreSQL
10, installed using yum from the PostgreSQL 10 repo. With PostgreSQL installed on each server, the Postgres
user will already exist on each server. I will configure the first server as the primary master. The second and
third servers will become replicas of the primary master. The fourth and final server will become a witness
server used for voting in automatic failover scenarios.

base-centos-1 = primary master
base-centos-2 = replica
base-centos-3 = replica
base-centos-4 = witness

SSH
In order for repmgr to work correctly, the Postgres user account needs to be able to SSH to each of the other
servers with no prompt. To do this, I generated an SSH key on each server, then collected all of the public SSH
keys and placed them in an authorized_keys file on each server. This allowed me to SSH between the servers
with no prompt. I have SELinux disabled by default. If you are using SELinux, make sure it is set up correctly
for SSH and authorized keys.

On each server, I did the following:


1) [ON ALL SERVERS] Become the Postgres user, create the SSH key, then cat out the public key:
    su - postgres -c 'ssh-keygen'
    su - postgres -c 'cat ~/.ssh/id_rsa.pub'

2) [ON ALL SERVERS] Create an authorized_keys file and set the permissions:
    su - postgres -c 'touch ~/.ssh/authorized_keys'
    su - postgres -c 'chmod 600 ~/.ssh/authorized_keys'
    su - postgres -c 'vi ~/.ssh/authorized_keys'

3) [ON ALL SERVERS] Place all of the public keys from Step 1 into the authorized_keys files created in Step 2.
The authorized_keys file contents looked like the following on all of my servers, with each key on a new line:
    ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDTXxjFY8dLs2GVRpDY7asAK5SvwITPVSJN9ItnwsVtzCpZgX/Mbnkc/jHgwuIGb0srh/KthByyYJi14QViI+x7xVQm8eyuqMB
    ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDYHmDFOtYc/VDxccNRQEnDYBTE8QDiUTMX46PX1p5tvs6qvP3VPMEccs4um0YVFXZTmbnvyeN3bBPe23NS5Pal6ySfAxIdAAO
    ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCgQNra4/rDOVWr5uV5nSa49yPLlgAJ6crsYKpvhfRr3L5J/T48QE468Am5t3lA318Nst2FbObq7dduqBNhOQDurPlTiPd9cWQ
    ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC5UAsmEkw9INHwXL6cpHqOy5O8VIpvLsfklzoHYfmGxBhGqi6nZzV/+TAzpotrmAf7PIUEdzWOm1lTfii1iRU821ks1bSPN2F

4) I am now able to become the Postgres user and SSH to any of the other servers and log in without a
password prompt. This is an example on base-centos-1:
    [root@base-centos-1 vagrant]# su - postgres
    Last login: Mon Jul 22 18:27:34 UTC 2019 on pts/0
    -bash-4.2$ ssh base-centos-2
    Last login: Mon Jul 22 18:22:31 2019
    -bash-4.2$ hostname
    base-centos-2
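
Steps 1 to 3 above can also be scripted instead of pasting keys by hand in vi. The following is a minimal sketch, assuming the public keys from every server have already been copied into one directory. The function name merge_keys and the example paths are hypothetical and are not part of the original setup.

```shell
#!/bin/sh
# Hypothetical helper: append each public key file to an authorized_keys
# file, skipping any key that is already present.
# Usage: merge_keys AUTHORIZED_KEYS_FILE PUBKEY_FILE...
merge_keys() {
    auth_file=$1
    shift
    # Create the file if needed and keep the permissions restrictive.
    touch "$auth_file" && chmod 600 "$auth_file" || return 1
    for pub in "$@"; do
        while IFS= read -r key || [ -n "$key" ]; do
            [ -n "$key" ] || continue   # skip blank lines
            # -x whole-line match, -F fixed string, -q quiet
            grep -qxF -- "$key" "$auth_file" || printf '%s\n' "$key" >> "$auth_file"
        done < "$pub"
    done
}

# Example (placeholder paths), run as the postgres user:
#   merge_keys ~/.ssh/authorized_keys /tmp/collected-keys/*.pub
```

Because existing keys are skipped, the helper can be re-run whenever a new server is added without producing duplicate lines.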

REPMGR
For repmgr, I first installed the repmgr software, and then configured it on each server to make sure all of the
configurations are correct. When I installed PostgreSQL 10 on my servers, I installed the PostgreSQL
repository using the following RPM. This repo also includes repmgr, which is where I am installing the software
from. You can find the different yum repositories at https://yum.postgresql.org/repopackages.php:
    yum install -y https://download.postgresql.org/pub/repos/yum/reporpms/EL-7-x86_64/pgdg-redhat-repo-latest.noarch.rpm

1) [ON ALL SERVERS] I installed repmgr from the PostgreSQL 10 repository. Make sure you install the correct
repmgr for the version of PostgreSQL that you are running:
    yum install -y repmgr10.x86_64

2) [ON ALL SERVERS] Next, I configured repmgr on each server with the server-specific details. Make sure to
update each file according to the server and paths. For each server, I updated the node_id, node_name, and
conninfo to match the server I was configuring. If you are using a different version of PostgreSQL, make sure
you update the paths for the version you are using. Do NOT assume the paths in my example will be the same
on your system. In my setup, the configuration file was located at /etc/repmgr/10/repmgr.conf.
    node_id=101
    node_name='base-centos-1'
    conninfo='host=base-centos-1 dbname=repmgr user=repmgr'
    data_directory='/var/lib/pgsql/10/data/'
    config_directory='/var/lib/pgsql/10/data'
    log_file='/var/log/repmgr.log'
    repmgrd_service_start_command = '/usr/pgsql-10/bin/repmgrd -d'
    repmgrd_service_stop_command = 'kill `cat $(/usr/pgsql-10/bin/repmgrd --show-pid-file)`'
    promote_command='repmgr standby promote -f /etc/repmgr/10/repmgr.conf --siblings-follow --log-to-file'
    follow_command='repmgr standby follow -f /etc/repmgr/10/repmgr.conf --log-to-file'
    failover=automatic
    reconnect_attempts=3
    reconnect_interval=5
    ssh_options='-q -o StrictHostKeyChecking=no -o ConnectTimeout=10'

3) [ON ALL SERVERS] I created the log file that I configured in Step 2 so I would not get an error when starting
the service:
    su - postgres -c 'touch /var/log/repmgr.log'
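
Since only node_id, node_name, and conninfo change between servers in Step 2, the file can be generated rather than edited by hand. This is a minimal sketch under that assumption. The function name render_repmgr_conf is hypothetical, the IDs 101 to 104 follow the node IDs shown later in this post, and the pg_bindir line is an assumption based on a reader comment, not part of my original file.

```shell
#!/bin/sh
# Hypothetical helper: print a repmgr.conf for one node.
# Usage: render_repmgr_conf NODE_ID NODE_NAME
render_repmgr_conf() {
    # The three per-node settings.
    printf "node_id=%s\nnode_name='%s'\nconninfo='host=%s dbname=repmgr user=repmgr'\n" \
        "$1" "$2" "$2"
    # Settings shared by every node. The quoted 'EOF' stops the shell from
    # expanding the backticks and $(...) inside the stop command.
    cat <<'EOF'
data_directory='/var/lib/pgsql/10/data/'
config_directory='/var/lib/pgsql/10/data'
log_file='/var/log/repmgr.log'
# Assumption: pg_bindir was suggested in a reader comment, adjust or remove.
pg_bindir='/usr/pgsql-10/bin'
repmgrd_service_start_command = '/usr/pgsql-10/bin/repmgrd -d'
repmgrd_service_stop_command = 'kill `cat $(/usr/pgsql-10/bin/repmgrd --show-pid-file)`'
promote_command='repmgr standby promote -f /etc/repmgr/10/repmgr.conf --siblings-follow --log-to-file'
follow_command='repmgr standby follow -f /etc/repmgr/10/repmgr.conf --log-to-file'
failover=automatic
reconnect_attempts=3
reconnect_interval=5
ssh_options='-q -o StrictHostKeyChecking=no -o ConnectTimeout=10'
EOF
}

# Example, run on base-centos-2:
#   render_repmgr_conf 102 base-centos-2 > /etc/repmgr/10/repmgr.conf
```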

PostgreSQL Primary Configuration


Next, I started working on the primary server to get it set up within repmgr. I am only updating pg_hba.conf
and postgresql.conf on the primary master, because when I rebuild the replicas from the master, these changes
will get copied over. If your server keeps the configuration files separate from the data directory, you will
need to update these configurations on all of the servers, not just on the primary master.

1) Create the repmgr user account and repmgr database that repmgr will use to manage the cluster. The
repmgr user account will also be used for replication from the primary master to the PostgreSQL replica servers.
    su - postgres -c 'createuser --replication --createdb --createrole --superuser repmgr'
    su - postgres -c "psql -c 'ALTER USER repmgr SET search_path TO repmgr, \"\$user\", public;'"
    su - postgres -c 'createdb repmgr --owner=repmgr'

2) Update pg_hba.conf to allow the repmgr account to authenticate. With trust being used, the repmgr user
account in the database can authenticate without a password. If you are building a production environment,
you will want a more secure method using md5 and passwords. These changes won't take effect until the
PostgreSQL service is restarted, which I will do after the next step. In my setup this file is located at
/var/lib/pgsql/10/data/pg_hba.conf. I found that I had the best results by specifying the IPs.
    host replication repmgr 192.168.56.101/32 trust
    host replication repmgr 192.168.56.102/32 trust
    host replication repmgr 192.168.56.103/32 trust
    host replication repmgr 192.168.56.104/32 trust
    host repmgr repmgr 192.168.56.101/32 trust
    host repmgr repmgr 192.168.56.102/32 trust
    host repmgr repmgr 192.168.56.103/32 trust
    host repmgr repmgr 192.168.56.104/32 trust

3) Next, I configured the PostgreSQL configuration file to allow replication by setting the wal_level and other
settings. I also added the repmgr shared library to the postgresql.conf file. In my setup, the PostgreSQL
configuration file is at /var/lib/pgsql/10/data/postgresql.conf.
    listen_addresses = '*'
    shared_preload_libraries = 'repmgr'
    wal_level = replica
    archive_mode = on
    max_wal_senders = 10
    hot_standby = on
    archive_command = 'cp -i %p /var/lib/pgsql/10/data/archive/%f'

4) I then created the archive directory that I specified in the PostgreSQL configuration file, using the Postgres
user to make sure it had the correct permissions, and then I restarted the PostgreSQL server to pick up the new
settings.
    su - postgres -c 'mkdir /var/lib/pgsql/10/data/archive'
    systemctl enable postgresql-10.service
    systemctl restart postgresql-10.service
    systemctl status postgresql-10.service

5) Now that repmgr and PostgreSQL are both configured, I will register my PostgreSQL server with repmgr and
then start the repmgr daemon service so that it monitors the status of the replication cluster.
    su - postgres -c 'repmgr primary register'
    su - postgres -c 'repmgr daemon start'
    su - postgres -c 'repmgr daemon status'
     ID  | Name          | Role    | Status    | Upstream      | repmgrd | PID  | Paused? | Upstream last seen
    -----+---------------+---------+-----------+---------------+---------+------+---------+--------------------
     101 | base-centos-1 | primary | * running |               | running | 5496 | no      | n/a
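
The eight pg_hba.conf entries from Step 2 follow one pattern: each node IP appears once for the replication pseudo-database and once for the repmgr database. As a rough sketch, they could be generated like this. The function name hba_lines and the METHOD variable are hypothetical. METHOD defaults to trust to match this post, and changing it to md5 would additionally require setting passwords, which is not shown here.

```shell
#!/bin/sh
# Hypothetical helper: print pg_hba.conf entries for the repmgr user,
# one "replication" line and one "repmgr" database line per node IP.
# Usage: hba_lines IP...
hba_lines() {
    method=${METHOD:-trust}
    for db in replication repmgr; do
        for ip in "$@"; do
            printf 'host %s repmgr %s/32 %s\n' "$db" "$ip" "$method"
        done
    done
}

# Example: append the entries for all four servers.
#   hba_lines 192.168.56.101 192.168.56.102 192.168.56.103 192.168.56.104 \
#       >> /var/lib/pgsql/10/data/pg_hba.conf
```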

PostgreSQL Replica Configuration


At this stage I already have SSH set up, and repmgr installed and configured, on all of the servers. Now I will
use repmgr to back up the databases from the primary server and restore them onto the two replica servers,
base-centos-2 and base-centos-3. Because my pg_hba.conf and postgresql.conf files are in the same directory as
my data, these files will also get copied over during the backup and restore process. I will be running the
following commands on both of the replica servers, but not on the final witness server, which I will configure
later.

1) Stop PostgreSQL and clear the data directory:
    systemctl stop postgresql-10.service
    rm -rf /var/lib/pgsql/10/data/*

2) Next, I backed up and restored the data from the primary server, and then started the PostgreSQL server and
viewed the status to make sure it was running:
    su - postgres -c "repmgr -h base-centos-1 -U repmgr -d repmgr standby clone"
    systemctl start postgresql-10.service
    systemctl status postgresql-10.service

3) Then I registered the replica with the repmgr cluster and started the repmgr daemon service to monitor the
server in the cluster:
    su - postgres -c 'repmgr standby register -h base-centos-1 -U repmgr'
    su - postgres -c 'repmgr daemon start'
    su - postgres -c 'repmgr daemon status'
     ID  | Name          | Role    | Status    | Upstream      | repmgrd | PID  | Paused? | Upstream last seen
    -----+---------------+---------+-----------+---------------+---------+------+---------+--------------------
     101 | base-centos-1 | primary | * running |               | running | 5496 | no      | n/a
     102 | base-centos-2 | standby |   running | base-centos-1 | running | 3414 | no      | 1 second(s) ago
     103 | base-centos-3 | standby |   running | base-centos-1 | running | 4549 | no      | 1 second(s) ago

4) To verify that replication is up and running, I can run the following commands to check the replication
status:

ON PRIMARY
    su - postgres -c 'psql -c "select pid, usename, client_addr, backend_start, state, sync_state from pg_stat_replication;"'
      pid  | usename |  client_addr   |         backend_start         |   state   | sync_state
    -------+---------+----------------+-------------------------------+-----------+------------
     15347 | repmgr  | 192.168.56.102 | 2019-07-22 18:19:31.232492+00 | streaming | async
     15363 | repmgr  | 192.168.56.103 | 2019-07-22 18:19:36.566369+00 | streaming | async

ON REPLICAS
    su - postgres -c 'psql --pset expanded=auto -c "select * from pg_stat_wal_receiver;"'
    -[ RECORD 1 ]---------+------------------------------------------------------------------
    pid                   | 5408
    status                | streaming
    receive_start_lsn     | 0/B000000
    receive_start_tli     | 3
    received_lsn          | 0/B0025B8
    received_tli          | 3
    last_msg_send_time    | 2019-07-22 18:22:17.942712+00
    last_msg_receipt_time | 2019-07-22 18:22:17.943306+00
    latest_end_lsn        | 0/B0025B8
    latest_end_time       | 2019-07-22 18:19:47.617303+00
    slot_name             |
    conninfo              | user=repmgr host='base-centos-1' application_name='base-centos-3'
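
The check on the primary in Step 4 can be turned into a pass/fail test for scripts or monitoring. The sketch below assumes the rows arrive in psql's unaligned, tuples-only format (the -A and -t flags), which separates columns with a | character. The function name all_streaming is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read "client_addr|state" rows on stdin and return
# non-zero if any standby is not streaming, or if there are no rows at all.
all_streaming() {
    awk -F'|' '
        { rows++ }
        $2 != "streaming" { bad++; print "not streaming: " $1 " (" $2 ")" }
        END { exit (rows == 0 || bad > 0) ? 1 : 0 }
    '
}

# Example, run on the primary:
#   su - postgres -c "psql -At -c 'select client_addr, state from pg_stat_replication'" \
#       | all_streaming && echo "all standbys streaming"
```

Treating an empty result as a failure is a design choice: a primary with no connected standbys returns zero rows, which is usually worth flagging in a cluster that is expected to have replicas.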

PostgreSQL Witness Configuration


Having a separate witness is useful in the event of a network outage, when you want an additional server to
vote on which node should become the new master. This helps prevent split-brain scenarios, with the
understanding that you cannot truly prevent split brain unless you implement a distributed coordination service
like Consul or ZooKeeper. The important thing to note when setting up a witness is that the witness has to use
its own repmgr database. You CANNOT make a standby replica server a witness. I will NOT be restoring the
database from the primary onto the witness server, so I will have to repeat a couple of the steps to create the
repmgr user and database, which will be separate. I then have to configure the pg_hba.conf settings to allow the
repmgr user from the other servers to talk to the witness.

1) Create the repmgr user account and repmgr database on the witness server. These are separate from the
ones on the primary master and are used by repmgr to manage the cluster.
    su - postgres -c 'createuser --replication --createdb --createrole --superuser repmgr'
    su - postgres -c "psql -c 'ALTER USER repmgr SET search_path TO repmgr, \"\$user\", public;'"
    su - postgres -c 'createdb repmgr --owner=repmgr'

2) Update pg_hba.conf to allow the repmgr account to authenticate. With trust being used, the PostgreSQL user
can authenticate without a password. If you are building a production environment, you will want a more secure
method using md5 and passwords. These changes won't take effect until the PostgreSQL service has been
restarted, so restart it before registering the witness in the next step. In my setup, this file is located at
/var/lib/pgsql/10/data/pg_hba.conf.
    host replication repmgr 192.168.56.101/32 trust
    host replication repmgr 192.168.56.102/32 trust
    host replication repmgr 192.168.56.103/32 trust
    host replication repmgr 192.168.56.104/32 trust
    host repmgr repmgr 192.168.56.101/32 trust
    host repmgr repmgr 192.168.56.102/32 trust
    host repmgr repmgr 192.168.56.103/32 trust
    host repmgr repmgr 192.168.56.104/32 trust

3) Register the witness with the primary server in the cluster:
    su - postgres -c 'repmgr witness register -h base-centos-1 -U repmgr'
    su - postgres -c 'repmgr daemon start'
    su - postgres -c 'repmgr daemon status'
     ID  | Name          | Role    | Status    | Upstream      | repmgrd | PID  | Paused? | Upstream last seen
    -----+---------------+---------+-----------+---------------+---------+------+---------+--------------------
     101 | base-centos-1 | primary | * running |               | running | 5496 | no      | n/a
     102 | base-centos-2 | standby |   running | base-centos-1 | running | 3414 | no      | 1 second(s) ago
     103 | base-centos-3 | standby |   running | base-centos-1 | running | 4549 | no      | 1 second(s) ago
     104 | base-centos-4 | witness | * running | base-centos-1 | running | 3226 | no      | 0 second(s) ago

Summary
Now I have a fully configured repmgr cluster that allows me to fail over from the primary to one of the
replicas. As an example, I can run the following to move the primary role to one of the standbys. The
switchover command has to be run on the standby that is becoming the new primary:
    su - postgres -c 'repmgr --dry-run -h base-centos-2 standby switchover --siblings-follow'
    su - postgres -c 'repmgr -h base-centos-2 standby switchover --siblings-follow'
    su - postgres -c 'repmgr cluster show'
    su - postgres -c 'repmgr cluster event'

FAILOVER TO base-centos-2
    su - postgres -c 'repmgr -h base-centos-2 standby switchover --siblings-follow'
    WARNING: following problems with command line parameters detected:
    database connection parameters not required when executing UNKNOWN ACTION
    NOTICE: executing switchover on node "base-centos-2" (ID: 102)
    NOTICE: local node "base-centos-2" (ID: 102) will be promoted to primary; current primary "base-centos-1" (ID: 101) will be demoted to
    NOTICE: stopping current primary node "base-centos-1" (ID: 101)
    NOTICE: issuing CHECKPOINT
    DETAIL: executing server command "/usr/pgsql-10/bin/pg_ctl -D '/var/lib/pgsql/10/data' -W -m fast stop"
    INFO: checking for primary shutdown; 1 of 60 attempts ("shutdown_check_timeout")
    INFO: checking for primary shutdown; 2 of 60 attempts ("shutdown_check_timeout")
    NOTICE: current primary has been cleanly shut down at location 0/C000028
    NOTICE: promoting standby to primary
    DETAIL: promoting server "base-centos-2" (ID: 102) using "/usr/pgsql-10/bin/pg_ctl -w -D '/var/lib/pgsql/10/data' promote"
    waiting for server to promote.... done
    server promoted
    NOTICE: waiting up to 60 seconds (parameter "promote_check_timeout") for promotion to complete
    NOTICE: STANDBY PROMOTE successful
    DETAIL: server "base-centos-2" (ID: 102) was successfully promoted to primary
    INFO: local node 101 can attach to rejoin target node 102
    DETAIL: local node's recovery point: 0/C000028; rejoin target node's fork point: 0/C000098
    NOTICE: setting node 101's upstream to node 102
    WARNING: unable to ping "host=base-centos-1 dbname=repmgr user=repmgr"
    DETAIL: PQping() returned "PQPING_NO_RESPONSE"
    NOTICE: starting server using "/usr/pgsql-10/bin/pg_ctl -w -D '/var/lib/pgsql/10/data' start"
    NOTICE: NODE REJOIN successful
    DETAIL: node 101 is now attached to node 102
    NOTICE: executing STANDBY FOLLOW on 2 of 2 siblings
    INFO: node 104 received notification to follow node 102
    INFO: STANDBY FOLLOW successfully executed on all reachable sibling nodes
    NOTICE: switchover was successful
    DETAIL: node "base-centos-2" is now primary and node "base-centos-1" is attached as standby
    NOTICE: STANDBY SWITCHOVER has completed successfully

CLUSTER STATUS
    su - postgres -c 'repmgr cluster show'
     ID  | Name          | Role    | Status    | Upstream      | Location | Priority | Timeline | Connection string
    -----+---------------+---------+-----------+---------------+----------+----------+----------+----------------------------------------------
     101 | base-centos-1 | standby |   running | base-centos-2 | default  | 100      | 3        | host=base-centos-1 dbname=repmgr user=repmgr
     102 | base-centos-2 | primary | * running |               | default  | 100      | 4        | host=base-centos-2 dbname=repmgr user=repmgr
     103 | base-centos-3 | standby |   running | base-centos-2 | default  | 100      | 3        | host=base-centos-3 dbname=repmgr user=repmgr
     104 | base-centos-4 | witness | * running | base-centos-2 | default  | 0        | 1        | host=base-centos-4 dbname=repmgr user=repmgr

CLUSTER EVENTS FOR FAILOVER
    su - postgres -c 'repmgr cluster event'
     Node ID | Name          | Event                      | OK | Timestamp           | Details
    ---------+---------------+----------------------------+----+---------------------+-----------------------------------------------------
     102     | base-centos-2 | child_node_new_connect     | t  | 2019-07-22 20:00:04 | new witness "base-centos-4" (ID: 104) has connected
     102     | base-centos-2 | child_node_new_connect     | t  | 2019-07-22 20:00:04 | new standby "base-centos-3" (ID: 103) has connected
     104     | base-centos-4 | witness_register           | t  | 2019-07-22 19:59:59 | witness registration succeeded; upstream node ID is
     103     | base-centos-3 | standby_follow             | t  | 2019-07-22 19:59:59 | standby attached to upstream node "base-centos-2" (ID:
     104     | base-centos-4 | repmgrd_upstream_reconnect | t  | 2019-07-22 19:59:59 | witness monitoring connection to primary node
     102     | base-centos-2 | repmgrd_reload             | t  | 2019-07-22 19:59:58 | monitoring cluster primary "base-centos-2" (ID: 102)
     101     | base-centos-1 | repmgrd_standby_reconnect  | t  | 2019-07-22 19:59:58 | node has become a standby, monitoring connection to upstrea
     102     | base-centos-2 | standby_switchover         | t  | 2019-07-22 19:59:54 | node 102 promoted to primary, node 101 demoted to standby
     101     | base-centos-1 | node_rejoin                | t  | 2019-07-22 19:59:54 | node 101 is now attached to node 102
     102     | base-centos-2 | standby_promote            | t  | 2019-07-22 19:59:54 | server "base-centos-2" (ID: 102) was successfully promoted to primary
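
The switchover sequence from the Summary can be wrapped so that the real switchover only runs when the dry run passes. This is a minimal sketch, not a tested production script. The function name safe_switchover and the REPMGR override variable are hypothetical. The -h option is left out because the output above warns that connection parameters are not required for this action.

```shell
#!/bin/sh
# Hypothetical helper: dry run first, then the real switchover, then show
# the resulting cluster state. Run as the postgres user on the standby
# that should become the new primary.
safe_switchover() {
    # REPMGR can point at another binary (or a stub when testing).
    repmgr_bin=${REPMGR:-repmgr}
    if ! "$repmgr_bin" standby switchover --siblings-follow --dry-run; then
        echo "dry run failed; not switching over" >&2
        return 1
    fi
    "$repmgr_bin" standby switchover --siblings-follow &&
        "$repmgr_bin" cluster show
}

# Example, run as root on base-centos-2 after saving this file to a
# placeholder path such as /usr/local/lib/safe_switchover.sh:
#   su - postgres -c '. /usr/local/lib/safe_switchover.sh; safe_switchover'
```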


About the Author

Kevin Markwardt <https://blog.pythian.com/author/kmarkwardt/>

MySQL Database Consultant

Kevin Markwardt has twenty years of system administration experience spanning MySQL, Linux, Windows,
and VMware. Over the last six years he has been dedicated to MySQL and Linux administration with a focus on
scripting, automation, HA, and cloud solutions. Kevin has led and assisted with many projects focusing on
larger scale implementations of technologies, including ProxySQL, Orchestrator, Pacemaker, GCP, AWS RDS,
and MySQL. Kevin Markwardt is a certified GCP Professional Cloud Architect, and a certified AWS Solutions
Architect - Associate. Currently he is a Project Engineer at Pythian specializing in MySQL and large scale client
projects. One of his new focus areas is Postgres, and he is currently supporting multiple internal production
Postgres instances.

2 Comments

Jozef, January 7, 2020 5:04 am

Awesome article, thank you, you made my day …

Michał Kępczyński, December 27, 2020 11:09 am

Great tutorial. Just missed pg_bindir in repmgr.conf. It makes error if you create new conf file. It’s really
good. Thank you Kevin.
