This article is the fourth in a series about performing an ISSU (In-Service Software Upgrade).
- Load the Junos Software package on the device
- Verify the Health of the Cluster (Important step)
- Create backup of the current configuration and set the rescue config
- Start the In-Service Software Upgrade
- Process to follow if the ISSU stalls in the middle of the upgrade
This article primarily addresses the actual ISSU process.
It is strongly recommended that the In-Service Software Upgrade (ISSU) be performed during a maintenance window or during the lowest possible traffic period, because the secondary node is unavailable during the upgrade and ISSU can occasionally fail under heavy CPU load.
It is best to perform ISSU under the following conditions:
- Chassis Cluster Mode
- During System Maintenance Window
- Lowest possible traffic period
- When the Routing Engine CPU is less than 40%
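The CPU condition above can be checked mechanically before starting. The sketch below is an illustration only, not official tooling: it parses the text of `show chassis routing-engine` and derives the busy percentage from the Idle figure, and the "Idle N percent" line layout is an assumption based on typical Junos output.

```python
import re

def re_cpu_busy_percent(show_re_output: str) -> int:
    """Return CPU busy percent (100 - Idle) parsed from
    'show chassis routing-engine' text output (layout assumed)."""
    m = re.search(r"Idle\s+(\d+)\s+percent", show_re_output)
    if m is None:
        raise ValueError("could not find Idle CPU figure in output")
    return 100 - int(m.group(1))

# Abbreviated sample output (format assumed from typical releases).
sample = """\
Routing Engine status:
    CPU utilization:
      User                       2 percent
      Background                 0 percent
      Kernel                     3 percent
      Interrupt                  0 percent
      Idle                      95 percent
"""

busy = re_cpu_busy_percent(sample)
print(busy, "OK to start ISSU" if busy < 40 else "CPU too busy for ISSU")
```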
The ISSU Process
JTAC does not currently recommend ISSU for a traffic-disruption-free SRX software upgrade, due to certain limitations with NAT, VPN, and ALG. As of Junos 11.1, the ISSU limitations for NAT and ALG are resolved, but a VPN configuration can still cause issues. Also check the release notes for both the version you are upgrading from and the version you are upgrading to. If your organization still requires the ISSU upgrade process for minimal traffic disruption, you can follow this guide.
1. Verify that you have console connectivity to the primary and secondary nodes.
2. Verify that ‘logging’ is enabled on both terminal sessions.
This is necessary to verify and monitor the ISSU process as it upgrades the Junos image.
3. Perform the upgrade with the following command:
{primary:node0}
root@test-node0> request system software in-service-upgrade /var/tmp/junos-srx5000-10.4R3.7-domestic.tgz reboot   <--- be sure to include the 'reboot' option
Important: Make sure that reboot is specified in the command. If it is not, node 1 will be upgraded but not rebooted, and a manual reboot of node 1 will be needed before the automatic failover happens.
The messages reported on node 0 and node 1 will be similar to the following. (Unimportant messages have been omitted.)
NODE 0:

{primary:node0}
root@test-node0> request system software in-service-upgrade /var...(complete the package information as shown above)
Chassis ISSU Started
node1:
--------------------------------------------------------------------------
Chassis ISSU Started
ISSU: Validating Image
Inititating in-service-upgrade
node1:
--------------------------------------------------------------------------
Inititating in-service-upgrade
Checking compatibility with configuration
Initializing...
Verified manifest signed by PackageProduction_10_1_0
Verified junos-10.1-domestic signed by PackageProduction_10_1_0
Using /var/tmp/junos-srx5000-domestic.tgz
Checking junos requirements on /
Saving boot file package in /var/sw/pkg/junos-boot-srx5000-10.1R4.4.tgz
Verified manifest signed by PackageProduction_10_1_0
Hardware Database regeneration succeeded
Validating against /config/juniper.conf.gz
mgd: commit complete
Validation succeeded
Validating against /config/rescue.conf.gz
mgd: commit complete
Validation succeeded
ISSU: Preparing Backup RE
Pushing bundle to node1
JUNOS 10. become active at next reboot
WARNING: A reboot is required to load this software correctly
WARNING:     Use the 'request system reboot' command
WARNING:     when software installation is complete
Saving package file in /var/sw/pkg/junos-10..tgz ...
Saving state for rollback ...
Finished upgrading secondary node node1
Rebooting Secondary Node
node1:
--------------------------------------------------------------------------
Shutdown NOW!
[pid 21958]
ISSU: Backup RE Prepare Done
Waiting for node1 to reboot.
node1 booted up.
Waiting for node1 to become secondary
node1 became secondary.
Waiting for node1 to be ready for failover
ISSU: Preparing Daemons
After this is done, the following is reported on NODE 1:
{secondary:node1}
root@test-node1> show chassis cluster status
Cluster ID: 2
Node                  Priority          Status    Preempt  Manual failover

Redundancy group: 0 , Failover count: 2
    node0                  254          primary        no       no
    node1                  2            secondary      no       no

Redundancy group: 1 , Failover count: 2
    node0                  254          primary        no       no
    node1                  0            secondary      no       no
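If you are scripting the health checks, the status output above can be parsed mechanically. A minimal Python sketch, based on the sample layout shown above (column spacing may vary between Junos releases):

```python
def parse_cluster_status(output: str) -> dict:
    """Parse 'show chassis cluster status' text into
    {redundancy_group: {node_name: (priority, status)}}."""
    groups, current = {}, None
    for line in output.splitlines():
        line = line.strip()
        if line.startswith("Redundancy group:"):
            # e.g. "Redundancy group: 0 , Failover count: 2"
            current = int(line.split(":")[1].split(",")[0])
            groups[current] = {}
        elif line.startswith("node") and current is not None:
            # e.g. "node0  254  primary  no  no"
            fields = line.split()
            groups[current][fields[0]] = (int(fields[1]), fields[2])
    return groups

# Sample taken from the output shown above.
sample = """\
Cluster ID: 2
Node                  Priority          Status    Preempt  Manual failover
Redundancy group: 0 , Failover count: 2
    node0                  254          primary        no       no
    node1                  2            secondary      no       no
Redundancy group: 1 , Failover count: 2
    node0                  254          primary        no       no
    node1                  0            secondary      no       no
"""

status = parse_cluster_status(sample)
print(status[0]["node0"])   # priority and status of node0 in RG 0
```

A check that node1 is still secondary in every redundancy group is then a one-liner over the returned dictionary.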
At this stage, Node 1 has rebooted successfully and is running the Junos version you upgraded to. Verify this with the "show version" command. Also check the output of the following commands:
srx> show chassis cluster status
srx> show chassis fpc pic-status   (all the PICs in node 1 should be online; keep monitoring for about 2 minutes to make sure all stay online)
srx> show chassis alarms
srx> show system alarms
srx> show log messages | match issu
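The two-minute PIC monitoring suggested above can be sketched as a small polling loop. This is an illustration only: the "Slot"/"PIC ... Online" line layout is assumed from typical `show chassis fpc pic-status` output, and the transport used to fetch it (console scrape, NETCONF, etc.) is left as a caller-supplied callable.

```python
import time

# Abbreviated sample pic-status text (layout assumed).
SAMPLE = """\
node1:
--------------------------------------------------------------------------
Slot 0   Online       SRX5k SPC
  PIC 0  Online       SPU Cp-Flow
  PIC 1  Online       SPU Flow
Slot 2   Online       SRX5k DPC 40x 1GE
  PIC 0  Online       10x 1GE RichQ
"""

def all_pics_online(pic_status_output: str) -> bool:
    """True when every FPC slot and PIC line reports Online."""
    for line in pic_status_output.splitlines():
        fields = line.split()
        if fields and fields[0] in ("Slot", "PIC"):
            if "Online" not in fields:
                return False
    return True

def wait_for_pics(get_output, checks=12, interval=10):
    """Poll for ~2 minutes (12 checks x 10 s). 'get_output' is any
    callable returning the current pic-status text."""
    for _ in range(checks):
        if not all_pics_online(get_output()):
            return False
        time.sleep(interval)
    return True
```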
Now the automatic failover will occur, after which the upgrade of Node 0 will take place. The messages reported are similar to those above, but you should still monitor the process for any problems or warnings in the boot messages.
Node 0 should come back up in the healthy state.
Also, verify that the redundancy groups are now primary on node 1. To bring the primary role back to node 0, follow the process below:
srx> request chassis cluster failover redundancy-group 0 node 0
srx> request chassis cluster failover redundancy-group 1 node 0
srx> request chassis cluster failover redundancy-group X node 0   (repeat for each remaining redundancy group X)
srx> request chassis cluster failover reset redundancy-group X
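If the cluster has many redundancy groups, the failback sequence above can be generated rather than typed by hand. A hypothetical helper (the command strings mirror those shown above; the group list is supplied by the operator):

```python
def failback_commands(redundancy_groups):
    """Build the CLI sequence to return mastership to node 0, then
    clear the manual-failover flag for each redundancy group."""
    groups = sorted(redundancy_groups)
    cmds = [f"request chassis cluster failover redundancy-group {rg} node 0"
            for rg in groups]
    cmds += [f"request chassis cluster failover reset redundancy-group {rg}"
             for rg in groups]
    return cmds

for cmd in failback_commands([0, 1]):
    print(cmd)
```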
As mentioned earlier, the failover of redundancy group 0 might take some time; the remaining redundancy groups should fail over quickly.
The ISSU process is now complete.