How to automatically trigger failover of redundancy-group 0 and redundancy-group 1 at the same time on a chassis cluster

This article discusses how failover of redundancy-group 0 (RG0) and redundancy-group 1 (RG1) can be automatically triggered at the same time on a chassis cluster.

Some features, such as Unified Threat Management (UTM), are currently only supported in an active/passive cluster scenario in which both RG0 and RG1 need to be primary on the same node. How can the user achieve automatic failover of both redundancy groups whenever a failover needs to occur?

Note: Interface monitoring is supported on Redundancy Group 0 (RG0) for SRX Series high-end devices only; it is not supported on SRX Series branch devices or J Series devices. Also, interface monitoring is intended for monitoring the health of the interfaces belonging to a redundancy group, so monitoring fxp0 is not recommended.

An alternative to interface monitoring is available to achieve the goals stated above. The alternative uses an event script from the Junos Automation Script Library that forces both redundancy groups to fail over at the same time whenever an SNMP_TRAP_LINK_DOWN message is written to the messages log. The steps to employ this solution are below.

Achieving High Availability with the Interface Monitoring Script

Download the monitor-interface.slax script file here.

Upload the monitor-interface.slax file onto both SRX nodes in the following SRX system directory:

Note: If you are using FTP to upload the script on the SRX device, use ASCII mode to ensure proper file formatting.

The script operates just like the built-in interface-monitoring feature. However, the script adds the ability to trigger RG0 and RG1 failover at the same time. The script also prevents RG0 from being failed over rapidly within the secondary-hold state interval. To configure the interface monitoring script under any redundancy group, an administrator creates an “apply-macro monitor-interface” stanza and optionally specifies the weight of the monitored interface. If no weight is specified, it is assumed to be 255, which triggers a failover as soon as the interface fails. If the weight is not large enough to trigger a failover, the intermediate weight is noted in the configuration under the stanza “apply-macro failover-interface-monitor” in the chassis cluster section. When a link flaps, an SNMP_TRAP_LINK_DOWN syslog message is generated with type external and level critical.
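The weight behavior described above can be modeled with a short Python sketch. This is an illustration only, not the logic of monitor-interface.slax itself: the class name, the interface names, and the assumption that the failover threshold equals the default weight of 255 are all hypothetical.

```python
from dataclasses import dataclass, field

# Assumption: failover triggers once accumulated weight reaches 255,
# mirroring the default weight described in the article.
FAILOVER_THRESHOLD = 255


@dataclass
class RedundancyGroupMonitor:
    """Hypothetical model of weighted interface monitoring for one RG."""

    # interface name -> configured weight; None means "no weight specified"
    weights: dict
    accumulated: int = 0  # the "intermediate weight" while below threshold
    down: set = field(default_factory=set)

    def link_down(self, ifname):
        """Record a link-down event; return True if failover should trigger."""
        if ifname not in self.weights:
            return False  # interface is not monitored by this group
        if ifname not in self.down:
            self.down.add(ifname)
            weight = self.weights[ifname]
            # An unspecified weight is treated as 255.
            self.accumulated += FAILOVER_THRESHOLD if weight is None else weight
        return self.accumulated >= FAILOVER_THRESHOLD


rg1 = RedundancyGroupMonitor({"ge-0/0/2": 100, "ge-0/0/3": 155})
print(rg1.link_down("ge-0/0/2"))  # below threshold, weight is only recorded
print(rg1.link_down("ge-0/0/3"))  # threshold reached
```

In this model a single interface with no configured weight triggers failover on its own, while lower weights require several monitored interfaces to fail before the threshold is reached.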

Under the chassis cluster configuration, the macro “monitoring-options” with the value “clear-failover” can be applied. If this is configured, the manual-failover flag is cleared whenever a failover of any redundancy group occurs. This setting is optional but recommended. If it is not configured, the manual-failover flag must be cleared manually by the user. The second option that can be configured under “monitoring-options” is “full-failover”. The full-failover option triggers a full failover of all redundancy groups no matter which redundancy group failed first. This option ensures that failed redundancy groups follow each other.
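The effect of the two “monitoring-options” values can be sketched in Python as follows. The function name and the returned dictionary are hypothetical and exist only to illustrate the decision; the actual script may structure this differently.

```python
def plan_failover(failed_group, all_groups, options):
    """Hypothetical helper: decide what to do when one redundancy group fails.

    failed_group -- number of the redundancy group that failed first
    all_groups   -- set of all configured redundancy group numbers
    options      -- set of monitoring-options values that are configured
    """
    full = "full-failover" in options
    return {
        # With full-failover, every group follows the one that failed first.
        "groups_to_fail_over": sorted(all_groups) if full else [failed_group],
        # With clear-failover, the manual-failover flag is cleared automatically.
        "clear_manual_failover_flag": "clear-failover" in options,
    }


print(plan_failover(1, {0, 1}, {"full-failover", "clear-failover"}))
print(plan_failover(1, {0, 1}, set()))
```

With both options set, a failure in RG1 also fails over RG0 and no manual flag clearing is needed afterward; with neither option, only RG1 fails over and the flag stays set until the user clears it.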

To allow interface monitoring to take effect, the event-options stanza must be configured. This allows interface monitoring to intercept the message “SNMP_TRAP_LINK_DOWN” and allows the monitor-interface.slax script to act on the event. Using the configuration below will activate the script to act on the interface-down messages.
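To illustrate the kind of matching the event policy performs, the Python sketch below picks the interface name out of a link-down syslog line and checks it against a monitored set. It is not part of the Junos configuration. The sample line layout (an "ifName" field following the event tag), the function name, and the interface names are assumptions made for the example.

```python
import re

# Assumed message layout: the event tag followed by fields that include "ifName <name>".
LINK_DOWN = re.compile(r"SNMP_TRAP_LINK_DOWN:.*\bifName\s+(?P<ifname>\S+)")


def monitored_link_down(line, monitored):
    """Return the interface name if the line reports a monitored link going down."""
    match = LINK_DOWN.search(line)
    if match is None:
        return None  # not a link-down event
    ifname = match.group("ifname")
    return ifname if ifname in monitored else None


# Illustrative sample line, not captured from a real device.
sample = ("mib2d[1234]: SNMP_TRAP_LINK_DOWN: ifIndex 530, "
          "ifAdminStatus up(1), ifOperStatus down(2), ifName ge-0/0/1")
print(monitored_link_down(sample, {"ge-0/0/1", "ge-0/0/2"}))
```

Events for interfaces outside the monitored set, and events other than link-down, are ignored in this sketch.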

Configuration Example

apply-macro Parameters:

Specify Monitored Interfaces:

Other Miscellaneous Chassis Cluster Configurations:

event-options Configuration:

Commit the above configurations. Test by unplugging the cable from an interface that is monitored in the configuration.

About the author

Prasanna
