BGP Failover and Multi-Homing from Two VPS Locations

10 min read · Matthieu · BIRD2 · BFD · Failover · FRR · BGP · Multihoming

Announce the same prefix from two locations using BGP for automatic failover. Covers LOCAL_PREF, MED, AS-path prepending, BFD, and graceful shutdown with full BIRD2 and FRR configurations.

This tutorial walks through announcing the same IP prefix from two separate VPS locations using BGP. You will configure primary/backup preference with LOCAL_PREF and MED, enable BFD for sub-second failure detection, and implement graceful shutdown for planned maintenance. All examples use BIRD2 and FRR side by side.

Prerequisites:

  • Your own ASN and provider-independent prefixes (at least a /24 IPv4 or a /48 IPv6)
  • Two VPS instances whose upstream providers accept an eBGP session
  • BIRD2 or FRR installed on both nodes

What is BGP multihoming and why use it on a VPS?

BGP multihoming means announcing the same IP prefix from two or more locations via eBGP. Each location maintains an independent BGP session with its upstream provider. If one location fails, the other continues announcing the prefix and absorbs all traffic automatically. Convergence time depends on hold timers (typically 180-240 seconds with default settings) or BFD (sub-second with proper configuration).
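The hold-timer arithmetic behind those convergence numbers can be sketched in a few lines. This is illustrative only; the function name is ours, not part of any BGP implementation:

```python
# Worst-case failure detection using BGP hold timers alone: the session
# only drops once the hold time expires with no keepalive received.
# Keepalives are conventionally sent every hold_time / 3 seconds.
def detection_window_s(hold_time_s: int) -> tuple:
    keepalive_s = hold_time_s // 3
    # Fastest detection: the peer dies just before its next keepalive was
    # due, so only hold_time - keepalive remains on the timer.
    # Slowest: it dies right after sending a keepalive, so nearly the
    # full hold time passes before the timer fires.
    return (hold_time_s - keepalive_s, hold_time_s)

print(detection_window_s(240))  # BIRD2 default -> (160, 240)
print(detection_window_s(180))  # FRR default   -> (120, 180)
```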

On a VPS, multihoming gives you redundancy without relying on a single data center. You run two VPS instances in different locations, both announcing your prefix. One acts as primary, the other as backup. Traffic engineering attributes (LOCAL_PREF, MED, AS-path prepending) control which path handles traffic under normal conditions.

How do you architect BGP failover across two locations?

The setup uses two Virtua VPS instances in different European locations, each running an eBGP session to its local upstream router. Both announce the same /24 and /48.

                    Internet
                   /        \
            Upstream A     Upstream B
            (Frankfurt)    (Amsterdam)
                |              |
           eBGP session   eBGP session
                |              |
          +-----------+  +-----------+
          |  VPS-PRI  |  |  VPS-BKP  |
          | AS 64500  |  | AS 64500  |
          | BIRD2/FRR |  | BIRD2/FRR |
          +-----------+  +-----------+
           announces       announces
          198.51.100.0/24  198.51.100.0/24
          2001:db8::/48    2001:db8::/48

Both nodes belong to your AS (AS 64500 in these examples). Replace ASN, prefixes, and peer IPs with your actual values.

Firewall rules for both nodes:

BGP uses TCP port 179. BFD uses UDP ports 3784 and 3785. Open these between your VPS and the upstream peer before proceeding.

# nftables example - adjust PEER_IP to your upstream
nft add rule inet filter input ip saddr PEER_IP tcp dport 179 accept
nft add rule inet filter input ip saddr PEER_IP udp dport { 3784, 3785 } accept

How do you control BGP path preference?

Three attributes let you influence which path carries traffic. Each operates at a different scope.

Attribute            Direction                Scope                            Sent to peers?             When to use
LOCAL_PREF           Outbound (your exit)     Within your AS                   No (iBGP only)             Control which of your nodes sends outbound traffic
MED                  Inbound (from upstream)  Between you and one upstream AS  Yes (to direct neighbor)   Tell a single upstream which entry point to prefer
AS-path prepending   Inbound (global)         All ASes in the path             Yes (propagated)           Make a path look longer to the entire Internet

LOCAL_PREF and MED are precise. AS-path prepending is a blunt instrument but works when your locations peer with different upstreams.

How do you configure LOCAL_PREF for primary and backup paths?

LOCAL_PREF determines which exit path your AS prefers for outbound traffic. Higher value wins. Default is 100. Set 200 on the primary node and leave 100 on the backup. This only affects traffic leaving your network.

BIRD2 LOCAL_PREF configuration

On the primary node (VPS-PRI), create or modify the import filter:

# /etc/bird/bird.conf - Primary node

filter upstream_import_primary {
    bgp_local_pref = 200;
    accept;
}

protocol bgp upstream_v4 {
    local 192.0.2.2 as 64500;
    neighbor 192.0.2.1 as 64496;
    ipv4 {
        import filter upstream_import_primary;
        export where net = 198.51.100.0/24;
    };
}

protocol bgp upstream_v6 {
    local 2001:db8:1::2 as 64500;
    neighbor 2001:db8:1::1 as 64496;
    ipv6 {
        import filter upstream_import_primary;
        export where net = 2001:db8::/48;
    };
}

On the backup node (VPS-BKP), keep default LOCAL_PREF:

# /etc/bird/bird.conf - Backup node

filter upstream_import_backup {
    bgp_local_pref = 100;
    accept;
}

protocol bgp upstream_v4 {
    local 203.0.113.2 as 64500;
    neighbor 203.0.113.1 as 64497;
    ipv4 {
        import filter upstream_import_backup;
        export where net = 198.51.100.0/24;
    };
}

Reload BIRD2 and check the routes:

birdc configure
birdc show route for 0.0.0.0/0 all
0.0.0.0/0          unicast [upstream_v4 12:00:00] * (100/?) [AS64496i]
        via 192.0.2.1 on eth0
        Type: BGP univ
        BGP.origin: IGP
        BGP.as_path: 64496
        BGP.local_pref: 200

The BGP.local_pref: 200 on the primary node means it will be preferred for outbound traffic.

FRR LOCAL_PREF configuration

On the primary node:

vtysh -c "configure terminal
route-map UPSTREAM-IN permit 10
 set local-preference 200
exit
router bgp 64500
 neighbor 192.0.2.1 remote-as 64496
 address-family ipv4 unicast
  neighbor 192.0.2.1 route-map UPSTREAM-IN in
  network 198.51.100.0/24
 exit-address-family
 neighbor 2001:db8:1::1 remote-as 64496
 address-family ipv6 unicast
  neighbor 2001:db8:1::1 route-map UPSTREAM-IN in
  network 2001:db8::/48
 exit-address-family
exit
exit"

On the backup node, use set local-preference 100 (or omit the route-map entirely, since 100 is the default).

Check the routing table:

vtysh -c "show ip bgp"
   Network          Next Hop            Metric LocPrf Weight Path
*> 0.0.0.0/0        192.0.2.1                     200      0 64496 i

How do you use MED to control inbound traffic?

MED (Multi-Exit Discriminator) tells your upstream which of your entry points to prefer. Lower MED wins. Set MED 0 on the primary and MED 100 on the backup. MED is only compared between paths received from the same neighboring AS, so it works best when both locations peer with the same upstream provider.

BIRD2 MED configuration

On the primary node, set MED in the export filter:

filter upstream_export_primary {
    if net = 198.51.100.0/24 || net = 2001:db8::/48 then {
        bgp_med = 0;
        accept;
    }
    reject;
}

protocol bgp upstream_v4 {
    local 192.0.2.2 as 64500;
    neighbor 192.0.2.1 as 64496;
    ipv4 {
        import filter upstream_import_primary;
        export filter upstream_export_primary;
    };
}

On the backup node:

filter upstream_export_backup {
    if net = 198.51.100.0/24 || net = 2001:db8::/48 then {
        bgp_med = 100;
        accept;
    }
    reject;
}

FRR MED configuration

On the primary node:

vtysh -c "configure terminal
route-map UPSTREAM-OUT permit 10
 set metric 0
exit
router bgp 64500
 address-family ipv4 unicast
  neighbor 192.0.2.1 route-map UPSTREAM-OUT out
 exit-address-family
exit
exit"

On the backup node, use set metric 100.

Check the exported routes:

vtysh -c "show ip bgp neighbors 192.0.2.1 advertised-routes"
   Network          Next Hop            Metric LocPrf Weight Path
*> 198.51.100.0/24  0.0.0.0                  0         32768 i

The Metric column shows 0 on the primary. The backup will show 100.

When should you use AS-path prepending instead of MED?

Use AS-path prepending when your two locations peer with different upstream providers. MED is only compared between paths from the same AS, so it has no effect if your upstreams are different ASes. Prepending makes the backup path look longer, pushing global routing decisions toward the primary.

Prepend your own ASN 1-3 times on the backup node. More than 3 prepends rarely changes routing decisions and just adds noise.

BIRD2 (backup node export filter):

filter upstream_export_backup_prepend {
    if net = 198.51.100.0/24 || net = 2001:db8::/48 then {
        bgp_path.prepend(64500);
        bgp_path.prepend(64500);
        accept;
    }
    reject;
}

FRR (backup node):

vtysh -c "configure terminal
route-map UPSTREAM-OUT permit 10
 set as-path prepend 64500 64500
exit
exit"

After applying, check the AS path from a looking glass or remote host:

# From an external machine
traceroute -A 198.51.100.1

The backup path now shows 64500 64500 64500 (your ASN appears three times: once real, twice prepended) while the primary shows 64500 once.

How do you enable BFD for fast failure detection?

Without BFD, BGP relies on hold timers to detect a peer failure. The default hold time is 240 seconds in BIRD2 and 180 seconds in FRR. With BFD, detection drops to sub-second on low-latency links.

Parameter                  Default   Recommended for VPS
Transmit interval          300 ms    300 ms
Receive interval           300 ms    300 ms
Detect multiplier          3         3
Effective detection time   900 ms    900 ms

For VPS environments on the same provider backbone, 300 ms intervals with a multiplier of 3 give reliable sub-second detection without false positives. Do not set intervals below 100 ms on VPS instances. Virtualization jitter can cause flapping.
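The 900 ms figure falls out of how BFD negotiates timers in RFC 5880: each side declares the session down after the peer's detect multiplier times the negotiated interval with no packets received. A small illustrative calculation (the function name is ours, not a BIRD2 or FRR API):

```python
def bfd_detection_time_ms(local_min_rx_ms: int, peer_min_tx_ms: int,
                          peer_detect_mult: int) -> int:
    # The negotiated interval is the slower of what we are willing to
    # receive and what the peer wants to transmit (RFC 5880, section 6.8.4).
    negotiated_ms = max(local_min_rx_ms, peer_min_tx_ms)
    # The session is declared down after detect-mult intervals pass
    # with no control packet received.
    return negotiated_ms * peer_detect_mult

print(bfd_detection_time_ms(300, 300, 3))  # -> 900
```

Note that if either side configures slower timers, the slower value wins for both: asymmetric settings cannot speed up detection below what the slowest peer allows.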

BIRD2 BFD configuration

Add a BFD protocol and enable it on the BGP session:

protocol bfd {
    interface "*" {
        min rx interval 300 ms;
        min tx interval 300 ms;
        multiplier 3;
    };
}

protocol bgp upstream_v4 {
    local 192.0.2.2 as 64500;
    neighbor 192.0.2.1 as 64496;
    bfd graceful;
    ipv4 {
        import filter upstream_import_primary;
        export filter upstream_export_primary;
    };
}

The bfd graceful option means BIRD2 will trigger a graceful restart (preserving stale routes) rather than a hard session reset when BFD detects a failure. If the peer does not run BFD, the session still establishes normally.

After reloading, check BFD status:

birdc show bfd sessions
BFD sessions:
IP address       Interface  State   Since       Interval  Timeout
192.0.2.1        eth0       Up      12:00:00    300 ms    900 ms

FRR BFD configuration

vtysh -c "configure terminal
bfd
 profile vps-detect
  receive-interval 300
  transmit-interval 300
  detect-multiplier 3
 exit
exit
router bgp 64500
 neighbor 192.0.2.1 bfd profile vps-detect
 neighbor 2001:db8:1::1 bfd profile vps-detect
exit
exit"

Check BFD peer state:

vtysh -c "show bfd peers"
BFD Peers:
        peer 192.0.2.1 vrf default
                ID: 1
                Remote ID: 2
                Status: up
                Uptime: 5 minute(s)
                Diagnostics: ok
                Remote diagnostics: ok
                Peer Type: configured
                Local timers:
                        Receive interval: 300ms
                        Transmission interval: 300ms
                        Echo receive interval: disabled
                        Echo transmission interval: disabled
                Peer timers:
                        Receive interval: 300ms
                        Transmission interval: 300ms
                        Echo receive interval: disabled

BFD requires UDP ports 3784 and 3785 open between peers. If you skipped the firewall step earlier, BFD sessions will stay in Down state.

How do you perform a graceful shutdown for maintenance?

RFC 8326 defines the GRACEFUL_SHUTDOWN well-known community (65535:0). Before planned maintenance, you tag all routes with this community. Peers that honor it set local-preference to 0 for those routes, causing traffic to shift to alternate paths before you shut down the session. This avoids the traffic blackhole that occurs during normal BGP convergence.
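As background, standard BGP communities (RFC 1997) are 32-bit values written as two 16-bit halves, which is why GRACEFUL_SHUTDOWN is written 65535:0. A minimal encoding check (helper names are ours, for illustration only):

```python
# Pack a community's two 16-bit halves into the 32-bit wire value.
def encode_community(high: int, low: int) -> int:
    return (high << 16) | low

GRACEFUL_SHUTDOWN = encode_community(65535, 0)
print(hex(GRACEFUL_SHUTDOWN))  # -> 0xffff0000
```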

The graceful shutdown procedure:

  1. Tag routes with the GRACEFUL_SHUTDOWN community on the node you are taking down
  2. Wait for convergence (30-60 seconds for the Internet to re-route)
  3. Check traffic has shifted using looking glass or traffic counters
  4. Shut down the BGP session
  5. Perform maintenance
  6. Bring the session back up and remove the community tag
  7. Confirm re-convergence

Graceful shutdown in BIRD2

To initiate graceful shutdown on the primary node before maintenance, modify the export filter:

# Temporary export filter for graceful shutdown
filter upstream_export_shutdown {
    if net = 198.51.100.0/24 || net = 2001:db8::/48 then {
        bgp_community.add((65535, 0));
        bgp_med = 65535;
        accept;
    }
    reject;
}

Apply it by changing the export filter in the BGP protocol and reloading:

# Edit bird.conf: change export filter to upstream_export_shutdown
# Then reload
birdc configure

To honor graceful shutdown from peers (apply this on both nodes), add a check in the import filter. The ordering matters: the graceful shutdown check must call accept inside the if block, otherwise a later bgp_local_pref assignment overrides it.

filter upstream_import_backup {
    if (65535, 0) ~ bgp_community then {
        bgp_local_pref = 0;
        accept;
    }
    bgp_local_pref = 100;
    accept;
}

Graceful shutdown in FRR

FRR provides a single command that handles tagging automatically:

vtysh -c "configure terminal
router bgp 64500
 bgp graceful-shutdown
exit
exit"

This adds the GRACEFUL_SHUTDOWN community (65535:0) to all routes and sets local-preference to 0. It triggers a route refresh to all peers.

To confirm the community is being sent:

vtysh -c "show ip bgp neighbors 192.0.2.1 advertised-routes"
   Network          Next Hop            Metric LocPrf Weight Path
*> 198.51.100.0/24  0.0.0.0                  0      0  32768 i
                                         Community: graceful-shutdown

After maintenance, remove it:

vtysh -c "configure terminal
router bgp 64500
 no bgp graceful-shutdown
exit
exit"

For FRR to honor graceful shutdown from peers, configure an inbound route-map:

vtysh -c "configure terminal
bgp community-list standard GRACEFUL_SHUTDOWN permit graceful-shutdown
route-map UPSTREAM-IN permit 5
 match community GRACEFUL_SHUTDOWN
 set local-preference 0
exit
route-map UPSTREAM-IN permit 10
 set local-preference 200
exit
exit"

Sequence 5 matches routes carrying the community and drops local-preference to 0. Sequence 10 handles all other routes normally.

How do you test BGP failover?

Test failover by shutting down the primary BGP session and observing from the backup node and an external vantage point.

Step 1: Check current routing state on both nodes.

BIRD2:

birdc show route for 198.51.100.0/24 all

FRR:

vtysh -c "show ip bgp 198.51.100.0/24"

Step 2: Shut down the primary BGP session.

BIRD2 (on VPS-PRI):

birdc disable upstream_v4
birdc disable upstream_v6

FRR (on VPS-PRI):

vtysh -c "configure terminal
router bgp 64500
 neighbor 192.0.2.1 shutdown
 neighbor 2001:db8:1::1 shutdown
exit
exit"

Step 3: Observe the backup node.

On VPS-BKP, the route should now show as the only path:

# BIRD2
birdc show route for 198.51.100.0/24

# FRR
vtysh -c "show ip bgp summary"

Step 4: Test from outside.

From your local machine or a looking glass, traceroute to your prefix:

traceroute -A 198.51.100.1

Traffic should now enter through the backup location. With BFD enabled, the switchover happens in under 1 second. Without BFD, expect the full hold timer duration before convergence.

Detection method                            Typical failover time
BGP hold timer only (BIRD2 default 240 s)   160-240 s
BGP hold timer only (FRR default 180 s)     120-180 s
Reduced hold timer (e.g. 30 s)              20-30 s
BFD (300 ms intervals, multiplier 3)        < 1 s

Use NLNOG Looking Glass or bgp.tools to confirm global routing convergence.

How do you recover after failover?

Bring the primary session back up and confirm traffic returns to the preferred path.

BIRD2:

birdc enable upstream_v4
birdc enable upstream_v6

FRR:

vtysh -c "configure terminal
router bgp 64500
 no neighbor 192.0.2.1 shutdown
 no neighbor 2001:db8:1::1 shutdown
exit
exit"

After a few seconds, check that the primary path is again preferred:

# BIRD2
birdc show route for 0.0.0.0/0 all | grep local_pref
        BGP.local_pref: 200
# FRR
vtysh -c "show ip bgp"
   Network          Next Hop            Metric LocPrf Weight Path
*> 0.0.0.0/0        192.0.2.1                     200      0 64496 i

Run traceroute from an external host again to confirm traffic re-entered through the primary location.

BIRD2 vs FRR configuration comparison

Feature                        BIRD2                                                       FRR
LOCAL_PREF                     bgp_local_pref = 200; in import filter                      set local-preference 200 in route-map
MED                            bgp_med = 0; in export filter                               set metric 0 in route-map
AS-path prepend                bgp_path.prepend(64500); in export filter                   set as-path prepend 64500 in route-map
BFD                            protocol bfd {} plus bfd graceful; on the BGP session       bfd section plus neighbor X bfd profile Y
Graceful shutdown (initiate)   add (65535, 0) to bgp_community in export filter            bgp graceful-shutdown under router bgp
Graceful shutdown (honor)      match (65535, 0) ~ bgp_community, set bgp_local_pref = 0    match community GRACEFUL_SHUTDOWN, set local-preference 0
Disable session                birdc disable <protocol>                                    neighbor X shutdown
Reload config                  birdc configure                                             write memory, then clear ip bgp * or restart

Monitoring failover events

Set up monitoring so you get an alert when failover actually occurs. The companion guide Monitor BGP Announcements with BGPalerter on Linux covers BGPalerter for route monitoring. At minimum, watch for:

  • BGP session state changes: journalctl -u bird or journalctl -u frr
  • BFD session flaps: birdc show bfd sessions / vtysh -c "show bfd peers"
  • Route count changes: alert if the number of exported prefixes drops to zero
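For the last bullet, one approach is to parse the summary line that birdc prints for a count query. The exact wording of BIRD2's output varies between versions, so treat both the command and this parser as a sketch to adapt:

```python
import re

def exported_route_count(birdc_output: str) -> int:
    """Parse the summary of `birdc show route export upstream_v4 count`.

    BIRD2 ends the output with a line resembling
    '2 of 2 routes for 2 networks in table master4' (wording may vary
    by version); returns 0 if no such line is found.
    """
    m = re.search(r"(\d+) of \d+ routes", birdc_output)
    return int(m.group(1)) if m else 0

# Example: feed in captured birdc output and alert on zero exports.
sample = "2 of 2 routes for 2 networks in table master4"
if exported_route_count(sample) == 0:
    print("ALERT: no prefixes exported upstream")
```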

Troubleshooting

BGP session stuck in Active/Connect state:

  • Check firewall rules for TCP 179
  • Verify the peer IP and ASN match what your upstream expects
  • Check journalctl -u bird -f or journalctl -u frr -f for error messages

BFD session stuck in Down:

  • UDP ports 3784 and 3785 must be open in both directions
  • Confirm the peer supports BFD and has it configured
  • Check for MTU issues on the path

MED not affecting inbound traffic:

  • MED is only compared between paths from the same AS. If your upstreams are different ASes, use AS-path prepending instead
  • Some upstreams ignore MED by policy. Ask your provider

Graceful shutdown community not honored:

  • The peer must explicitly support RFC 8326. Not all upstreams do
  • Check with your provider whether they honor the GRACEFUL_SHUTDOWN community
  • Some implementations require explicit configuration to respect the community

Traffic not failing over:

  • Verify both nodes announce the same prefix with birdc show route export upstream_v4 or vtysh -c "show ip bgp neighbors X advertised-routes"
  • Check from an external looking glass, not from the nodes themselves
  • DNS TTL may keep clients pointed at the old IP if you use per-location IPs for services on top of the anycast prefix