Recommended BFD link down detection time when OSPF graceful-restart is configured in an SRX chassis cluster

Bidirectional Forwarding Detection (BFD) is used to provide sub-second convergence times for routing protocols. However, if BFD detects the link down before the peer receives the “Graceful restart LSA”, graceful restart of the routing protocols may not work as expected.

An SRX chassis cluster is configured as shown in the diagram and configuration below. When the primary node RE (redundancy-group 0, also known as RG0) fails over because of a system reboot triggered with the CLI command “request system reboot”, the new primary node RE sends the “Graceful restart LSA” to the peer devices 6 to 10 seconds later, or even later depending on the configuration, the total number of routes, and so on. During this time, all pass-through traffic that uses the OSPF routes is dropped.

NOTE: This issue does not exist when RG0 fails over through the CLI command “request chassis cluster failover redundancy-group 0 node (new primary node id)”. In this case, the new primary node RE sends the Graceful restart LSA within 1-2 seconds. However, if you define a BFD detection time that is too aggressive (e.g., 1.2 seconds), OSPF graceful restart may still fail if BFDD (the BFD daemon) detects the link down and notifies the OSPF client before the Graceful restart LSA is received. For more details, refer to “Events on the SRX 1, 2 and Router when BFD detect time is 7.5 sec” below.

NOTE: Before RG0 failover, the primary node RE is in SRX 1 and the secondary node RE is in SRX 2.

With the following configuration, a loss of 11 ping packets was observed (from PC 1 to PC 2).

SRX Configuration (BFD detect time is 7.5 sec)

Router Configuration (BFD detect time is 7.5 sec)

Events on the SRX 1, 2 and Router when BFD detect time is 7.5 sec

NOTE: In the above scenario, even if you did not configure BFD but configured a very short OSPF dead-interval (e.g., 10 seconds), graceful restart may not work, because the OSPF neighbor keepalive mechanism has already declared the neighbor down before the “Graceful restart LSA” is received.

When BFD detects the link “Down” on the peer device (Router), it notifies OSPF, which then brings down the OSPF neighbor and deletes its routes from the forwarding table. In this case:

  • If the OSPF neighbor (Router) receives the Graceful restart LSA before the BFD link Down notification, the OSPF routes remain until the configured grace period expires.
  • If not, the OSPF neighbor deletes the routes from the forwarding table immediately.
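The two cases above come down to which event reaches the peer first. A minimal sketch of that ordering check follows; the function name and the sample timestamps are hypothetical and chosen only for illustration, and the real behaviour depends on the router's OSPF helper implementation.

```python
def routes_retained(grace_lsa_arrival_s: float, bfd_down_notify_s: float) -> bool:
    """Return True if the peer keeps the OSPF routes during the restart.

    Both arguments are seconds measured from the start of the RG0 failover.
    The routes survive only when the Graceful restart LSA arrives before
    BFD reports the link Down to OSPF.
    """
    return grace_lsa_arrival_s < bfd_down_notify_s


# Hypothetical timings: the Graceful restart LSA arrives 8 s after failover.
print(routes_retained(8.0, 7.5))   # BFD fires first, routes are deleted
print(routes_retained(8.0, 12.0))  # LSA arrives first, routes are kept
```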

For example, Graceful restart LSA message

To prevent a BFD link Down notification from deleting the OSPF routes, the BFD detection time should be greater than 6-11 seconds (the time the new primary RE takes to send the “Graceful restart LSA”), or even longer depending on the configuration, the total number of routes, and so on. In an SRX chassis cluster environment, Juniper Engineering recommends a bfd-liveness-detection minimum-interval of 2500 ms and a multiplier of 4 or above (a detection time of 10 seconds or more), together with the default OSPF hello-interval of 10 sec and dead-interval of 40 sec.
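The arithmetic behind this recommendation can be sketched as follows. The sketch assumes the detection time is simply minimum-interval multiplied by multiplier, which ignores the interval negotiation with the peer, and the function name is hypothetical. The 6 and 11 second bounds are the lab figures quoted above.

```python
def bfd_detection_time_s(minimum_interval_ms: int, multiplier: int) -> float:
    """Approximate BFD detection time in seconds (interval x multiplier)."""
    return minimum_interval_ms * multiplier / 1000.0


GRACE_LSA_MIN_S = 6.0   # fastest Graceful restart LSA seen in the lab
GRACE_LSA_MAX_S = 11.0  # slowest Graceful restart LSA seen in the lab

for multiplier in (3, 4, 5):
    detect = bfd_detection_time_s(2500, multiplier)
    print(
        f"multiplier {multiplier}: {detect} s, "
        f"covers fastest LSA: {detect > GRACE_LSA_MIN_S}, "
        f"covers slowest LSA: {detect > GRACE_LSA_MAX_S}"
    )
```

Under this simplification, a multiplier of 4 gives 10 seconds, and a multiplier of 5 is the first value that exceeds the 11 second upper bound, which is why the recommendation reads "4 or above".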

NOTE: The times above were measured in the JTAC lab; the time the new primary RE takes to send the “Graceful restart LSA” may vary on production devices. Therefore, we recommend adding some margin, for example (6 to 11) + 3 seconds, which gives a recommended BFD detection time of 9 to 14 seconds.
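As a worked example of this margin calculation, the sketch below derives the 9 to 14 second window and the smallest multiplier that reaches each end of it for a 2500 ms minimum-interval. The helper names are hypothetical, and the detection time is again assumed to be minimum-interval multiplied by multiplier.

```python
import math


def detection_window_s(lab_min_s: float, lab_max_s: float, margin_s: float):
    """Add a safety margin to the lab-measured Graceful restart LSA delay."""
    return (lab_min_s + margin_s, lab_max_s + margin_s)


def smallest_multiplier(minimum_interval_ms: int, target_s: float) -> int:
    """Smallest multiplier whose detection time is at least target_s."""
    return math.ceil(target_s * 1000 / minimum_interval_ms)


low_s, high_s = detection_window_s(6, 11, 3)
print(low_s, high_s)                      # 9 14
print(smallest_multiplier(2500, low_s))   # 4
print(smallest_multiplier(2500, high_s))  # 6
```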

With the following configuration, a loss of 1 ping packet was observed (from PC 1 to PC 2).

SRX Configuration (BFD detect time is 12 sec)

Router Configuration (BFD detect time is 12 sec)

Events on the SRX 1, 2 and Router

NOTE: BFDD does not detect the link Down exactly 12 seconds after node0 reboots. For more details, see RFC 5880, section 6.8.7, “Transmitting BFD Control Packets”: “the average interval between packets will be roughly 12.5% less than that negotiated”.
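The 12.5% figure follows from the jitter rule in that RFC section, which reduces each transmit interval by a random value of 0 to 25%. A small simulation illustrates it; the 3000 ms negotiated interval is a hypothetical value, and a uniform distribution is assumed because the RFC does not prescribe one.

```python
import random


def jittered_interval_ms(negotiated_ms: float, rng: random.Random) -> float:
    """One transmit interval, reduced by a random 0-25% (assumed uniform)."""
    return negotiated_ms * (1.0 - rng.uniform(0.0, 0.25))


rng = random.Random(0)  # fixed seed so the run is repeatable
negotiated_ms = 3000.0  # hypothetical negotiated interval
samples = [jittered_interval_ms(negotiated_ms, rng) for _ in range(100_000)]
mean_ms = sum(samples) / len(samples)

print(f"shortest {min(samples):.0f} ms, longest {max(samples):.0f} ms")
print(f"mean {mean_ms:.0f} ms, about {100 * (1 - mean_ms / negotiated_ms):.1f}% below negotiated")
```

Because packets leave earlier than the negotiated interval on average, the moment the peer declares the session Down does not line up exactly with the configured detection time counted from the reboot.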

Workaround

You can manually send the Graceful restart LSA with the hidden CLI command clear ospf grace-lsa or clear ospf grace-lsa instance (name of instance) before rebooting the primary node RE. Alternatively, use the CLI to fail over all the RGs to the other node before rebooting the current primary node.

 

About the author

Prasanna
