Keepalived dedicated network for heartbeat and vrrp_sync_group


I have a requirement for Keepalived running VRRP on 2 load balancers (NGINX) with 2 interfaces: eth0 (external) and eth1 (internal). I am trying to configure a setup where all VRRP traffic (preferably unicast) runs via the dedicated internal eth1 interface, to reduce the risk of a split-brain situation. However, the floating IP address (VRRP IP) will be on the eth0 (external) network.

I was looking at https://github.com/acassen/keepalived/issues/637 and trying to do something similar. My config is below:

global_defs {
   notification_email_from myadmin@myserver
   smtp_server localhost
   smtp_connect_timeout 30
   router_id LVS_DEVEL
}

vrrp_script check_nginx {
    script "/usr/libexec/keepalived/check_nginx.sh"
    interval 3
}

vrrp_sync_group link_instances {
    group {
        real
        stop_duplicate
    }
}

vrrp_instance real {
    state BACKUP
    interface eth0
    virtual_router_id 1
    priority 250               # This will be a lower value on the other router
    version 3                  # not necessary, but you may as well use the current protocol
    advert_int 1
    nopreempt

    track_interface {
        eth1
    }

    track_script {
        check_nginx
    }

    unicast_src_ip 115.197.1.166
    unicast_peer {
        115.197.1.167
    }

    virtual_ipaddress {
        115.197.1.170/32 dev eth0
    }
}

vrrp_instance stop_duplicate {
    state BACKUP
    interface eth1
    virtual_router_id 1
    priority 255             
    version 3
    advert_int 1
    nopreempt

    unicast_src_ip 192.168.0.3
    unicast_peer {
        192.168.0.4
    }

    virtual_ipaddress {
        192.168.0.5/29
    }
}

The problems I have with this setup so far:

  1. On the master, I forcefully brought down eth1 (the internal interface). This triggered a failover: the keepalived state transitioned to FAULT. That is not the behaviour I was expecting, because I thought the idea was for it to keep communicating over the external eth0 (which is still up and still carrying the external VRRP instance), providing resiliency. Is it possible, when eth1 is detected as down, for the failover NOT to be triggered, and instead wait for eth0 to also fail before failing over? (See the first sketch after this list.)

  2. I got a warning that my track_script check_nginx is not used. Are track_scripts still not allowed for instances that are members of a vrrp_sync_group? I still need this to work in the NGINX failure scenario. (See the second sketch after this list.)

  3. Can I use nopreempt within a vrrp_sync_group? I would like to prevent failback once the VIP moves across.

  4. Is there a better way of doing this? What I want to achieve is:
     a. VRRP traffic uses the internal (dedicated) interface eth1, while the floating VRRP IP resides on eth0, so the heartbeat does not depend on the external network. (A possible alternative layout is sketched at the end of this post.)
     b. In the event of a failure of eth0, don't fail over to the BACKUP node if eth1 is still alive. Only fail over if eth1 is also down (as it should).
     c. If check_nginx fails, however, this does trigger a failover.
     d. Once it fails over, it does not fail back, unless the same failure scenarios happen on the new master (e.g. NGINX being down).
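
For point 1, one option seems to be giving the tracked interface a weight, so that eth1 going down only lowers the instance's priority instead of forcing it into FAULT. This is only a sketch (the weight value is arbitrary, and from what I've read, weighted tracking may be ignored or warned about for instances inside a sync group, depending on the keepalived version):

vrrp_instance real {
    # ... rest of the instance exactly as above ...
    track_interface {
        eth1 weight -10    # eth1 down lowers priority by 10 instead of forcing FAULT
    }
}

For this to avoid a failover, the priority gap between the two routers would have to stay larger than the weight; otherwise the BACKUP still ends up with the higher effective priority.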
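
For point 2, newer keepalived releases apparently accept tracking objects directly on the sync group rather than on the member instances, so something along these lines might work (again just a sketch, and I have not confirmed which version introduced group-level tracking):

vrrp_sync_group link_instances {
    group {
        real
        stop_duplicate
    }
    track_script {
        check_nginx    # track NGINX at group level so both instances transition together
    }
}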

Is what I'm trying to achieve possible at all?
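
For 4a specifically, a simpler layout might be to drop the sync group entirely and run a single VRRP instance bound to eth1, declaring the floating address on eth0 with "dev". A sketch only (the instance name external_vip is made up, and the VRID/priority values are illustrative):

vrrp_instance external_vip {
    state BACKUP
    interface eth1                    # adverts only ever travel over the internal link
    virtual_router_id 1
    priority 250                      # lower value on the other router
    version 3
    advert_int 1
    nopreempt

    track_script {
        check_nginx                   # NGINX failure still triggers a failover
    }

    unicast_src_ip 192.168.0.3
    unicast_peer {
        192.168.0.4
    }

    virtual_ipaddress {
        115.197.1.170/32 dev eth0     # VIP still lives on the external network
    }
}

With this, eth0 going down would not by itself move the VIP (nothing on eth0 is tracked), and an NGINX failure is still handled by check_nginx; whether that fully covers 4b I am not sure.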

Thanks J
