How to perform a health check for an EX Standalone or a Virtual Chassis (VC)

This article explains how to perform a health check on an EX standalone or a VC switch.

From the CLI, use the following commands to check the health of your switch.
1. Confirm the version:
show version
For a Virtual Chassis, use:
show version all-members
For chassis-based switches (EX8200 / EX9200), use:
show version invoke-on all-routing-engines

2. Check whether both partitions on the switch have the same version:
show system snapshot media internal
If the partitions do not have the same version, copy the active partition to the alternate one with the following command:
request system snapshot slice alternate
Note: This command applies to the EX2200, EX3300, EX4200, EX4300, EX4500, EX8200 and EX9200.

3. Check whether any file system has reached, or is close to reaching, 100% of its storage space. Pay particular attention to /var, /tmp, /var/tmp, /var/log, /var/rundb and /config.
show system storage
If any file system is at or near its limit, run the following command. Note that it clears all the log messages and core dumps in /var/tmp on the switch.
request system storage cleanup
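
As a rough illustration of step 3, the following Python sketch scans pasted "show system storage" output and lists the mount points at or above a chosen usage level. It assumes the df-style column layout (Filesystem, Size, Used, Avail, Capacity, Mounted on). The function name full_filesystems, the 90% threshold, the decision to skip the devfs pseudo file system, and the sample text are all illustrative assumptions, not output captured from a real switch.

```python
import re

# Assumed df-style row: filesystem, size, used, avail, capacity%, mount point.
ROW = re.compile(r"^(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\d+)%\s+(\S+)\s*$")


def full_filesystems(output, threshold=90, ignore=("devfs",)):
    """Return (mount_point, capacity_percent) pairs at or above threshold.

    Rows whose filesystem name is in `ignore` are skipped; the default
    assumes devfs always reports 100% and is not a real capacity problem.
    """
    hits = []
    for line in output.splitlines():
        m = ROW.match(line.strip())
        if not m or m.group(1) in ignore:
            continue
        percent = int(m.group(5))
        if percent >= threshold:
            hits.append((m.group(6), percent))
    return hits


# Made-up sample for illustration only.
SAMPLE = """\
Filesystem              Size       Used      Avail  Capacity   Mounted on
/dev/da0s1a             184M       124M        45M       73%  /
devfs                   1.0K       1.0K         0B      100%  /dev
/dev/da0s3e             123M       115M       1.0M       99%  /var
/dev/da0s3f             369M        26M       314M        8%  /var/tmp
"""

if __name__ == "__main__":
    for mount, percent in full_filesystems(SAMPLE):
        print(f"{mount}: {percent}% used")
```

The threshold is a parameter because the article does not define what "close to" 100% means; choose a value that suits your environment.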

4. Verify that all installed components are listed and recognized:
show chassis hardware

5. Check whether any switch component is reporting a failure:
show chassis environment
- Check that every component shows "OK" and none shows a failure. If a component reports a failure, open a JTAC case to confirm whether the hardware is affected.
- If a component shows any other status, check whether a system alarm or chassis alarm has been generated, as described in steps 6 and 7.

6. Check whether any chassis alarms have been generated:
show chassis alarms

7. Check whether any system alarms have been generated:
show system alarms
If you see the alarm "Rescue configuration is not set", save a rescue configuration on the switch with the following command:
request system configuration rescue save

8. Check for suspicious log entries:
show log messages
If you see any, open a technical support case so they can be analyzed and checked against known issues.

9. Check the Routing Engine utilization:
show chassis routing-engine
- Check for unusual values for the idle percentage, interrupts and kernel.
- If the load average or interrupts are high and the idle percentage is very low, continue with step 10.
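
The idle-percentage check in step 9 can be sketched in Python as follows. The sketch assumes that the pasted "show chassis routing-engine" output contains lines of the form "Idle <number> percent". The function names and the floor of 20 are hypothetical (the article does not say what counts as "very low"), and the sample text is made up.

```python
import re

# Assumed line shape: "Idle    95 percent" (label, value, the word "percent").
IDLE = re.compile(r"^\s*Idle\s+(\d+)\s+percent\s*$", re.MULTILINE)


def idle_percentages(output):
    """Return every idle percentage found in the pasted output, in order."""
    return [int(value) for value in IDLE.findall(output)]


def cpu_looks_busy(output, floor=20):
    """True if any reported idle percentage is below the (arbitrary) floor."""
    return any(value < floor for value in idle_percentages(output))


# Made-up sample for illustration only.
SAMPLE = """\
    CPU utilization:
      User                      71 percent
      Kernel                    14 percent
      Interrupt                  6 percent
      Idle                       9 percent
"""

if __name__ == "__main__":
    print(idle_percentages(SAMPLE), cpu_looks_busy(SAMPLE))
```

A list is returned rather than a single value because the output can contain more than one Idle line, for example one per Routing Engine.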

10. Check whether any process is using an abnormal amount of CPU:
show system processes extensive
If a process is using more CPU than usual, check whether a recent configuration change coincides with the spike and may have triggered it.
If performance is affected, open a case with JTAC.

11. Check whether any process has dumped a core file:
show system core-dumps
If you see any core dumps or high CPU, open a technical support case for further troubleshooting.

12. Check whether any interface has errors, drops or pause frames:
show interfaces extensive | match "errors|drops|interface|pause"
- If any errors or drops are found, open a case with JTAC.

13. Check the statistics for all Layer 2 and Layer 3 traffic for bad headers or drops:
show system statistics | match bad
- Check whether any of the counters in the output are nonzero.
- If so, repeat the command a few times to see whether the values are incrementing.
- If the values are incrementing, open a case with JTAC.
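
Comparing repeated runs by eye is error-prone, so here is a minimal Python sketch of the "are the values incrementing" check in step 13. It assumes each line of the filtered output starts with a number followed by a description, and that two runs list the same counters in the same order. The function names and sample lines are illustrative only.

```python
import re

# Assumed line shape: leading counter value, then free-text description.
COUNTER = re.compile(r"^\s*(\d+)\s+(.*\S)\s*$")


def parse_counters(output):
    """Return [(value, description), ...] for lines that start with a number."""
    rows = []
    for line in output.splitlines():
        m = COUNTER.match(line)
        if m:
            rows.append((int(m.group(1)), m.group(2)))
    return rows


def incrementing(before, after):
    """Compare two runs position by position; return the rows whose value grew.

    Position is used because the same description can appear under more
    than one protocol once the section headings have been filtered out.
    """
    grown = []
    pairs = zip(parse_counters(before), parse_counters(after))
    for index, ((old, old_desc), (new, new_desc)) in enumerate(pairs):
        if old_desc == new_desc and new > old:
            grown.append((index, new_desc, old, new))
    return grown


# Made-up samples for illustration only.
FIRST = """\
        0 packets with bad options
       12 bad header checksums
        0 bad header checksums
"""
SECOND = """\
        0 packets with bad options
       40 bad header checksums
        0 bad header checksums
"""

if __name__ == "__main__":
    for index, desc, old, new in incrementing(FIRST, SECOND):
        print(f"line {index}: {desc}: {old} -> {new}")
```

A nonzero counter that stays constant between runs is reported as unchanged, which matches the guidance above: only values that keep incrementing call for a JTAC case.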

14. Confirm that all members of the VC are in the present (Prsnt) state so that they can forward traffic:
show virtual-chassis status

15. Check for link and plane errors on chassis-based switches (EX8200 / EX6200 / EX9200):
show chassis fabric plane
show chassis fabric summary
show chassis fabric fpc
show chassis fabric statistics
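
To keep the sequence handy, the read-only show commands from the steps above can be collected into a small Python helper that returns the list for a given switch type. This is only a convenience sketch: the function name and the platform labels "standalone", "vc" and "chassis" are hypothetical, and the helper merely builds strings, it does not connect to a switch. Commands are spelled with full Junos keywords (for example processes and all-routing-engines), and the interface filter uses "| match", the same pipe the statistics step uses. The request commands are deliberately left out because they change the state of the switch.

```python
def health_check_commands(platform="standalone"):
    """Return the show commands from the health check, in step order.

    platform: "standalone", "vc" (Virtual Chassis) or "chassis"
    (modular switch with a fabric). These labels are illustrative.
    """
    if platform not in ("standalone", "vc", "chassis"):
        raise ValueError(f"unknown platform: {platform!r}")

    # Step 1 differs by platform.
    version = {
        "standalone": "show version",
        "vc": "show version all-members",
        "chassis": "show version invoke-on all-routing-engines",
    }[platform]

    # Steps 2 to 13 are the same for every platform.
    commands = [
        version,
        "show system snapshot media internal",
        "show system storage",
        "show chassis hardware",
        "show chassis environment",
        "show chassis alarms",
        "show system alarms",
        "show log messages",
        "show chassis routing-engine",
        "show system processes extensive",
        "show system core-dumps",
        'show interfaces extensive | match "errors|drops|interface|pause"',
        "show system statistics | match bad",
    ]

    # Step 14 applies to a Virtual Chassis only.
    if platform == "vc":
        commands.append("show virtual-chassis status")

    # Step 15 applies to chassis-based switches only.
    if platform == "chassis":
        commands += [
            "show chassis fabric plane",
            "show chassis fabric summary",
            "show chassis fabric fpc",
            "show chassis fabric statistics",
        ]
    return commands


if __name__ == "__main__":
    for command in health_check_commands("vc"):
        print(command)
```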

About the author

James Palmer
