repmgr - Automate converting the old primary to a standby after failover

I have two PostgreSQL servers running on CentOS 7 with repmgr 4.1.0-1. I have automated promotion of the standby to primary when the primary server fails. However, when the old primary comes back online, both nodes act as primary, and I don't think the follow_command from repmgr.conf is ever executed. I can fix this manually by deleting the data directory on the old primary, cloning it from the new primary, and then registering the node as a standby.
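For reference, the manual recovery described above (wipe the data directory, re-clone from the new primary, re-register) looks roughly like the sketch below. The paths, service name, and hosts come from the repmgr.conf files in this question. The `DRY_RUN` switch and the `run` and `rebuild_as_standby` helpers are hypothetical names added for illustration. It assumes the script runs as the postgres user on the old primary and that 192.168.0.106 is the node that was promoted. By default it only prints the commands.

```shell
#!/usr/bin/env bash
# Sketch only: rebuild the old primary as a standby of the promoted node.
# DRY_RUN, run() and rebuild_as_standby() are hypothetical helpers, not part of repmgr.
set -euo pipefail

REPMGR=/usr/pgsql-9.6/bin/repmgr
CONF=/var/lib/pgsql/9.6/repmgr/repmgr.conf
DATA_DIR=/var/lib/pgsql/9.6/data
NEW_PRIMARY_HOST=${NEW_PRIMARY_HOST:-192.168.0.106}  # assumption: pgdb2 was promoted
DRY_RUN=${DRY_RUN:-1}                                # 1 = print commands, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

rebuild_as_standby() {
    # Stop the old primary so it no longer accepts writes.
    run sudo systemctl stop postgresql-9.6
    # Remove the stale data directory (destructive).
    run rm -rf "$DATA_DIR"
    # Clone a fresh copy from the new primary.
    run "$REPMGR" -h "$NEW_PRIMARY_HOST" -U repmgr -d repmgr -f "$CONF" standby clone
    # Start PostgreSQL on the cloned data directory.
    run sudo systemctl start postgresql-9.6
    # Re-register this node as a standby, overwriting its old "primary" record.
    run "$REPMGR" -f "$CONF" standby register --force
}

rebuild_as_standby
```

Running it with `DRY_RUN=0` would execute the steps instead of printing them. This is what I would like repmgr to do for me automatically.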

repmgr.conf on server 1:

node_id=1
node_name=pgdb1
conninfo='host=192.168.0.105 user=repmgr dbname=repmgr'
pg_bindir=/usr/pgsql-9.6/bin/
master_response_timeout=5
reconnect_attempts=2
reconnect_interval=2
failover=automatic
promote_command='/usr/pgsql-9.6/bin/repmgr standby promote -f /var/lib/pgsql/9.6/repmgr/repmgr.conf --log-to-file'
follow_command='/usr/pgsql-9.6/bin/repmgr standby follow -f /var/lib/pgsql/9.6/repmgr/repmgr.conf --log-to-file --upstream-node-id=2'
data_directory='/var/lib/pgsql/9.6/data'
log_file='/var/log/repmgr/repmgr.log'
log_level=DEBUG
service_start_command   = 'sudo systemctl start postgresql-9.6'
service_stop_command    = 'sudo systemctl stop postgresql-9.6'
service_restart_command = 'sudo systemctl restart postgresql-9.6'
service_reload_command  = 'sudo systemctl reload postgresql-9.6'

repmgr.conf on server 2:

node_id=2
node_name=pgdb2
conninfo='host=192.168.0.106 user=repmgr dbname=repmgr'
pg_bindir=/usr/pgsql-9.6/bin/
master_response_timeout=5
reconnect_attempts=2
reconnect_interval=2
failover=automatic
promote_command='/usr/pgsql-9.6/bin/repmgr standby promote -f /var/lib/pgsql/9.6/repmgr/repmgr.conf --log-to-file'
follow_command='/usr/pgsql-9.6/bin/repmgr standby follow -f /var/lib/pgsql/9.6/repmgr/repmgr.conf --log-to-file --upstream-node-id=1'
data_directory='/var/lib/pgsql/9.6/data'
log_file='/var/log/repmgr/repmgr.log'
log_level=DEBUG
service_start_command   = 'sudo systemctl start postgresql-9.6'
service_stop_command    = 'sudo systemctl stop postgresql-9.6'
service_restart_command = 'sudo systemctl restart postgresql-9.6'
service_reload_command  = 'sudo systemctl reload postgresql-9.6'

After the old primary starts again, it reconnects to itself and resumes monitoring as if it were still the primary. Here is the log:

[2018-08-16 21:29:56] [DEBUG] connecting to: "user=repmgr dbname=repmgr host=192.168.0.105 connect_timeout=2 fallback_application_name=repmgr"
[2018-08-16 21:29:56] [NOTICE] reconnected to primary node after 22 seconds, resuming monitoring
[2018-08-16 21:31:33] [INFO] monitoring primary node "pgdb1" (node ID: 1) in normal state

Is there a way to automatically switch the old primary to a standby when it starts again, or to have the promoted standby go back to being a standby? Alternatively, could I point follow_command at a script that does this, for example: follow_command='change-to-standby.sh'?

I would appreciate any help.
