This article is the fourth in a series about performing an ISSU (In-Service Software Upgrade).
- Load the Junos Software package on the device
- Verify the Health of the Cluster (Important step)
- Create backup of the current configuration and set the rescue config
- Start the In-Service Software Upgrade
- Process to follow if the ISSU stalls in the middle of the upgrade
This article primarily addresses the actual ISSU process.
It is strongly recommended that the In-Service Software Upgrade (ISSU) be performed during a maintenance window or during the lowest possible traffic period, because the secondary node is unavailable during the upgrade and ISSU can occasionally fail under heavy CPU load.
It is best to perform ISSU under the following conditions:
- Chassis Cluster Mode
- During System Maintenance Window
- Lowest possible traffic period
- When the Routing Engine CPU is less than 40%
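The CPU condition above can be checked mechanically before starting. The sketch below is an illustration only, not official tooling: it parses the text of `show chassis routing-engine` and derives the busy percentage from the Idle figure, and the "Idle N percent" line layout is an assumption based on typical Junos output.

```python
import re

def re_cpu_busy_percent(show_re_output: str) -> int:
    """Return CPU busy percent (100 - Idle) parsed from
    'show chassis routing-engine' text output (layout assumed)."""
    m = re.search(r"Idle\s+(\d+)\s+percent", show_re_output)
    if m is None:
        raise ValueError("could not find Idle CPU figure in output")
    return 100 - int(m.group(1))

# Abbreviated sample output (format assumed from typical releases).
sample = """\
Routing Engine status:
    CPU utilization:
      User                       2 percent
      Background                 0 percent
      Kernel                     3 percent
      Interrupt                  0 percent
      Idle                      95 percent
"""

busy = re_cpu_busy_percent(sample)
print(busy, "OK to start ISSU" if busy < 40 else "CPU too busy for ISSU")
```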
The ISSU Process
JTAC does not currently recommend ISSU for a traffic-disruption-free SRX software upgrade, due to certain limitations with NAT, VPN, and ALG. As of Junos 11.1, the ISSU limitations for NAT and ALG are resolved, but a VPN configuration can still cause issues. Also check the release notes for both the version you are upgrading from and the version you are upgrading to. If your organization still requires the ISSU upgrade process for minimal traffic disruption, you can follow this guide.
1. Verify that you have console connectivity to the primary and secondary nodes.
2. Verify that ‘logging’ is enabled on both terminal sessions.
This is necessary to verify and monitor the ISSU process as it upgrades the Junos image.
3. Perform the upgrade with the following command:
{primary:node0}
root@test-node0> request system software in-service-upgrade /var/tmp/junos-srx5000-10.4R3.7-domestic.tgz reboot   <--- be sure to include the 'reboot' option
Important: Make sure that reboot is specified in the command. If it is not, node 1 will be upgraded but not rebooted, and a manual reboot of node 1 will be needed before the automatic failover happens.
The messages reported on node 0 and node 1 will be similar to the following. (Unimportant messages have been omitted.)
NODE 0:

{primary:node0}
root@test-node0> request system software in-service-upgrade /var...(complete the package information as shown above)
Chassis ISSU Started
node1:
--------------------------------------------------------------------------
Chassis ISSU Started
ISSU: Validating Image
Inititating in-service-upgrade
node1:
--------------------------------------------------------------------------
Inititating in-service-upgrade
Checking compatibility with configuration
Initializing...
Verified manifest signed by PackageProduction_10_1_0
Verified junos-10.1-domestic signed by PackageProduction_10_1_0
Using /var/tmp/junos-srx5000-domestic.tgz
Checking junos requirements on /
Saving boot file package in /var/sw/pkg/junos-boot-srx5000-10.1R4.4.tgz
Verified manifest signed by PackageProduction_10_1_0
Hardware Database regeneration succeeded
Validating against /config/juniper.conf.gz
mgd: commit complete
Validation succeeded
Validating against /config/rescue.conf.gz
mgd: commit complete
Validation succeeded
ISSU: Preparing Backup RE
Pushing bundle to node1
JUNOS 10. become active at next reboot
WARNING: A reboot is required to load this software correctly
WARNING:     Use the 'request system reboot' command
WARNING:     when software installation is complete
Saving package file in /var/sw/pkg/junos-10..tgz ...
Saving state for rollback ...
Finished upgrading secondary node node1
Rebooting Secondary Node
node1:
--------------------------------------------------------------------------
Shutdown NOW!
[pid 21958]
ISSU: Backup RE Prepare Done
Waiting for node1 to reboot.
node1 booted up.
Waiting for node1 to become secondary
node1 became secondary.
Waiting for node1 to be ready for failover
ISSU: Preparing Daemons
After this is done, the following is reported on NODE 1:
{secondary:node1}
root@test-node1> show chassis cluster status
Cluster ID: 2
Node                  Priority          Status    Preempt  Manual failover

Redundancy group: 0 , Failover count: 2
    node0                  254          primary        no       no
    node1                  2            secondary      no       no

Redundancy group: 1 , Failover count: 2
    node0                  254          primary        no       no
    node1                  0            secondary      no       no
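If you are scripting the health checks, the status output above can be parsed mechanically. A minimal Python sketch, based on the sample layout shown above (column spacing may vary between Junos releases):

```python
def parse_cluster_status(output: str) -> dict:
    """Parse 'show chassis cluster status' text into
    {redundancy_group: {node_name: (priority, status)}}."""
    groups, current = {}, None
    for line in output.splitlines():
        line = line.strip()
        if line.startswith("Redundancy group:"):
            # e.g. "Redundancy group: 0 , Failover count: 2"
            current = int(line.split(":")[1].split(",")[0])
            groups[current] = {}
        elif line.startswith("node") and current is not None:
            # e.g. "node0  254  primary  no  no"
            fields = line.split()
            groups[current][fields[0]] = (int(fields[1]), fields[2])
    return groups

# Sample taken from the output shown above.
sample = """\
Cluster ID: 2
Node                  Priority          Status    Preempt  Manual failover
Redundancy group: 0 , Failover count: 2
    node0                  254          primary        no       no
    node1                  2            secondary      no       no
Redundancy group: 1 , Failover count: 2
    node0                  254          primary        no       no
    node1                  0            secondary      no       no
"""

status = parse_cluster_status(sample)
print(status[0]["node0"])   # priority and status of node0 in RG 0
```

A check that node1 is still secondary in every redundancy group is then a one-liner over the returned dictionary.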
At this stage, Node 1 has rebooted successfully and is running the Junos version you upgraded to. Verify this with the "show version" command. Also check the output of the following commands:
srx> show chassis cluster status
srx> show chassis fpc pic-status   (all the PICs in node 1 should be online; keep monitoring for about 2 minutes to make sure all stay online)
srx> show chassis alarms
srx> show system alarms
srx> show log messages | match issu
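The two-minute PIC monitoring suggested above can be sketched as a small polling loop. This is an illustration only: the "Slot"/"PIC ... Online" line layout is assumed from typical `show chassis fpc pic-status` output, and the transport used to fetch it (console scrape, NETCONF, etc.) is left as a caller-supplied callable.

```python
import time

# Abbreviated sample pic-status text (layout assumed).
SAMPLE = """\
node1:
--------------------------------------------------------------------------
Slot 0   Online       SRX5k SPC
  PIC 0  Online       SPU Cp-Flow
  PIC 1  Online       SPU Flow
Slot 2   Online       SRX5k DPC 40x 1GE
  PIC 0  Online       10x 1GE RichQ
"""

def all_pics_online(pic_status_output: str) -> bool:
    """True when every FPC slot and PIC line reports Online."""
    for line in pic_status_output.splitlines():
        fields = line.split()
        if fields and fields[0] in ("Slot", "PIC"):
            if "Online" not in fields:
                return False
    return True

def wait_for_pics(get_output, checks=12, interval=10):
    """Poll for ~2 minutes (12 checks x 10 s). 'get_output' is any
    callable returning the current pic-status text."""
    for _ in range(checks):
        if not all_pics_online(get_output()):
            return False
        time.sleep(interval)
    return True
```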
Now the automatic failover will occur, after which the upgrade of Node 0 will take place. The messages reported are similar to those above, but you should still monitor the process for any problems or warnings in the boot messages.
Node 0 should come back up in the healthy state.
Also, verify that the redundancy groups are now primary on node 1. To bring the primary role back to node 0, follow the process below:
srx> request chassis cluster failover redundancy-group 0 node 0
srx> request chassis cluster failover redundancy-group 1 node 0
srx> request chassis cluster failover redundancy-group X node 0   (repeat for each remaining redundancy group X)
srx> request chassis cluster failover reset redundancy-group X
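If the cluster has many redundancy groups, the failback sequence above can be generated rather than typed by hand. A hypothetical helper (the command strings mirror those shown above; the group list is supplied by the operator):

```python
def failback_commands(redundancy_groups):
    """Build the CLI sequence to return mastership to node 0, then
    clear the manual-failover flag for each redundancy group."""
    groups = sorted(redundancy_groups)
    cmds = [f"request chassis cluster failover redundancy-group {rg} node 0"
            for rg in groups]
    cmds += [f"request chassis cluster failover reset redundancy-group {rg}"
             for rg in groups]
    return cmds

for cmd in failback_commands([0, 1]):
    print(cmd)
```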
As mentioned earlier, the failover of redundancy group 0 might take some time; the remaining redundancy groups should fail over quickly.
The ISSU process is now complete.