The ‘Voltage Fail Shutdown’ message is logged to the system messages file on the M320 routing platform when the device indicated in the syslog message has reset or is about to reset, or when improper grounding causes the FRU to go offline. This article describes how to troubleshoot the problem.
The Voltage Fail Shutdown event occurs when M320 equipment that is not properly grounded experiences a surge in the grounding system, causing various FRUs to go offline. This can lead to transient failures.
FPC Resets with Voltage Fail Shutdown message
All reported cases of FPC reset produce a “Voltage Fail Shutdown” log message prior to the FPC resetting.
A message similar to the following is reported in the syslog output:
/kernel: CMB(RD): Voltage Fail Shutdown, device 0xf1, data 0x08
– Device 0xf$ denotes the FPC slot number (e.g., 0xf1 means the FPC in slot 1 is affected).
– Data 0x0$ identifies the power brick failure on the FPC.
Both CBs reset with Voltage Fail Shutdown message and consequently FPCs in the same power zone reset
CBs are susceptible to this issue just like FPCs and SIBs. When the CBs reset, the FPCs in the same power zone, generally slots 0 and 1, reset as well. After the system reboots and the cards reset, the CBs log this message along with similar messages from the FPCs in the same power zone.
/kernel: %KERN-3: CMB(RD): Voltage Fail Shutdown, device 0xf0, data 0x02
/kernel: %KERN-3: CMB(WR): Voltage Fail Shutdown, device 0xe6, data 0x00
/kernel: %KERN-3: CMB(RD): Voltage Fail Shutdown, device 0xe7, data 0x00
– Device 0xf$ denotes the FPC slot number (e.g., 0xf7 means the FPC in slot 7 is affected).
– Device 0xe$ denotes the CB slot number (e.g., 0xe6 means CB0 is affected, 0xe7 means CB1 is affected).
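The device codes described above can be decoded mechanically when scanning a large messages file. The following Python sketch is illustrative only: the function name `parse_voltage_fail` and the `VoltageFailEvent` type are hypothetical, the regular expression assumes the message layout matches the samples in this article, and only the two CB codes named here (0xe6 and 0xe7) are mapped. The data field is returned as a raw value because this article does not list the individual power brick codes.

```python
import re
from typing import NamedTuple, Optional

# Assumed layout, taken from the sample messages in this article.
PATTERN = re.compile(
    r"CMB\((?:RD|WR)\): Voltage Fail Shutdown, "
    r"device (?P<device>0x[0-9a-fA-F]+), data (?P<data>0x[0-9a-fA-F]+)"
)

# Only the CB codes named in this article; other 0xe$ values stay undecoded.
CB_CODES = {0xE6: "CB0", 0xE7: "CB1"}


class VoltageFailEvent(NamedTuple):
    fru: str     # decoded FRU name, e.g. "FPC1" or "CB0"
    device: int  # raw device code from the message
    data: int    # raw data code (power brick failure, not decoded here)


def parse_voltage_fail(line: str) -> Optional[VoltageFailEvent]:
    """Return the decoded event, or None if the line is not a match."""
    match = PATTERN.search(line)
    if match is None:
        return None
    device = int(match.group("device"), 16)
    data = int(match.group("data"), 16)
    if device >> 4 == 0xF:
        # 0xf$: the low nibble is the FPC slot number.
        fru = f"FPC{device & 0x0F}"
    elif device in CB_CODES:
        fru = CB_CODES[device]
    else:
        fru = f"unknown(0x{device:02x})"
    return VoltageFailEvent(fru, device, data)
```

Applied to each line of a saved messages file, this gives a quick list of which FPCs and CBs reported the event, which helps confirm whether the affected FRUs share a power zone.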
This alert is caused by a surge in the grounding system of equipment that is not properly grounded, which takes various FRUs offline. A surge through the chassis ground or the console port has been found to take FRUs offline; the typical underlying cause is poor grounding in the form of inductive grounds. Component failure around the power circuitry is another possible cause.
When a surge is applied to the chassis ground, the impedance of the ground connection increases with the length of the power ground cable and also depends on the surge characteristics. The result is a voltage rise on the chassis, with common-mode voltage appearing on both the router logic circuitry and the power circuitry. The surge then discharges through the available ground paths, which, if long, tend to have higher impedance than desired.
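The dependence on cable length can be illustrated with the standard inductor relation V = L × di/dt. The Python sketch below is a rough model, not a Juniper specification: the default of about 1 µH per metre of conductor is a common rule-of-thumb assumption, the surge slope in the usage note is a made-up illustrative value, and the function name `ground_voltage_rise` is hypothetical. Resistance and frequency-dependent effects are ignored.

```python
def ground_voltage_rise(length_m: float,
                        di_dt_a_per_s: float,
                        inductance_h_per_m: float = 1e-6) -> float:
    """Estimate the inductive voltage rise (in volts) across a ground cable.

    Uses V = L * di/dt, with L taken as cable length times an assumed
    per-metre inductance (rule-of-thumb default, not a measured value).
    """
    inductance_h = length_m * inductance_h_per_m
    return inductance_h * di_dt_a_per_s
```

With an illustrative surge slope of 100 A/µs (1e8 A/s), a 1 m ground cable gives roughly 100 V under these assumptions, and a 10 m cable roughly ten times that. This is why short, low-inductance ground connections are preferred.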
Below are the recommended solutions or follow-up actions:
- Enforce optimal grounding so that the system can tolerate a certain level of surge through the power ground.
- Third-party power consultants or certified electricians may be engaged to ensure optimal system grounding at all sites, although it might be impractical to rectify the site-level grounding at every site.
- Juniper products adhere to all requirements and standards; however, consult your technical support representative to check the console port type and its associated cable, and to check for any floating voltage between chassis GND and RTN, which may help indicate the actual surge level.