The CHASSISD_RE_OVER_TEMP_WARNING message is written to the system messages file whenever the Routing Engine (RE) exceeds its upper temperature threshold. If the temperature does not drop below the threshold within four minutes after the chassis process (chassisd) detects this condition, chassisd shuts down the indicated component. The message states how many seconds remained before shutdown at the time it was logged. This article documents an approach to troubleshooting this problem.
The problem related to this syslog message is described in the following sections:
The CHASSISD_RE_OVER_TEMP_WARNING message is logged each time the Routing Engine temperature exceeds the designated upper threshold.
When the CHASSISD_RE_OVER_TEMP_WARNING event occurs, a message similar to the following is reported:
CHASSISD_RE_OVER_TEMP_WARNING: Routing Engine 0 temperature (101 C) over 100 degrees C, platform will shutdown in 240 seconds if condition persists
The message indicates which Routing Engine (RE) slot is affected, the temperature of the RE, and the threshold value. It warns that the affected RE will shut down after the specified number of seconds if its temperature does not drop below the threshold.
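When many of these messages need to be reviewed, the fields described above can be pulled out of each line with a short script. The following is a minimal sketch in Python, not part of Junos OS. The pattern is modeled only on the sample message quoted above, so it is an assumption that other releases use the same wording, and the function name parse_over_temp_warning is hypothetical.

```python
import re

# Pattern modeled on the sample message in this article. Other Junos OS
# releases may word the message differently (assumption).
PATTERN = re.compile(
    r"CHASSISD_RE_OVER_TEMP_WARNING: Routing Engine (?P<slot>\d+) "
    r"temperature \((?P<temp>\d+) C\) over (?P<threshold>\d+) degrees C, "
    r"platform will shutdown in (?P<seconds>\d+) seconds"
)


def parse_over_temp_warning(line):
    """Return the numeric fields of the warning, or None if the line does not match."""
    match = PATTERN.search(line)
    if match is None:
        return None
    # Every captured group is a run of digits, so int() is safe here.
    return {name: int(value) for name, value in match.groupdict().items()}


sample = (
    "CHASSISD_RE_OVER_TEMP_WARNING: Routing Engine 0 temperature (101 C) "
    "over 100 degrees C, platform will shutdown in 240 seconds if condition persists"
)
fields = parse_over_temp_warning(sample)
print(fields)
```

For the sample line, the result holds the slot (0), the reported temperature (101), the threshold (100), and the seconds remaining (240). A line that does not match the pattern returns None rather than raising an error.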
Possible causes of this message include an excessive ambient temperature at the router's location, dirty air filters in the router, a fan failure, a physical problem with the Routing Engine, or an issue with the temperature sensors.
Examine the following output to help determine the cause of this message:
show log messages
Look for any related events that occurred at or just before the CHASSISD_RE_OVER_TEMP_WARNING message. Messages showing issues with temperature sensors, a physical problem with the routing engine or a cooling system (fan) failure indicate that the CHASSISD_RE_OVER_TEMP_WARNING message is likely due to hardware failure.
If there are no related events, check the physical location to ensure that the temperature of the location is not excessive.
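The log review described above can be sketched as a small script that runs against a saved copy of the show log messages output. This is an illustrative sketch only. The keyword list, the 20-line look-back window, and the function name related_events are all assumptions, and the PLACEHOLDER lines in the sample are invented stand-ins, not real Junos OS messages.

```python
WARNING_TAG = "CHASSISD_RE_OVER_TEMP_WARNING"
# Assumed search terms; adjust them to the messages seen on your platform.
KEYWORDS = ("fan", "sensor", "temperature")


def related_events(lines, window=20, keywords=KEYWORDS):
    """Collect keyword matches found within `window` lines before each warning."""
    found = []
    for index, line in enumerate(lines):
        if WARNING_TAG not in line:
            continue
        start = max(0, index - window)
        for candidate in lines[start:index]:
            if WARNING_TAG in candidate:
                continue  # skip earlier copies of the warning itself
            lowered = candidate.lower()
            if any(word in lowered for word in keywords) and candidate not in found:
                found.append(candidate)
    return found


# Invented placeholder lines standing in for saved log output.
log = [
    "PLACEHOLDER_EVENT_A: unrelated placeholder line",
    "PLACEHOLDER_EVENT_B: placeholder line mentioning a fan",
    "CHASSISD_RE_OVER_TEMP_WARNING: Routing Engine 0 temperature (101 C) "
    "over 100 degrees C, platform will shutdown in 240 seconds if condition persists",
]
hits = related_events(log)
for hit in hits:
    print(hit)
```

The sketch uses a line-count window rather than timestamps to stay simple. An empty result corresponds to the "no related events" case, where the next step is to check the physical location.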
Perform these steps:
- If the temperature at the device location is excessive, take actions necessary to reduce the temperature of the facility.
- Check the environment around the chassis to verify that airflow to and from the unit is not restricted. Air must be able to enter and exit freely through the air vents on the chassis. Restricted vents lower the effectiveness of the system's cooling. Also make sure that the air intake is not next to the exhaust of another system generating heated air, as this also reduces cooling efficiency.
- Check the air filters on the platform and verify that they are clean and properly maintained. There are links below to the air filter maintenance section of many hardware guides.
- Verify that empty FPC slots have blank cover plates installed, to preserve the integrity of the airflow inside the chassis. All system components (FPCs, CBs, PICs, REs, etc.) should fit snugly in their slots, with no gaps that allow air to flow around the component to the outside environment.
- Verify that the fans are operating by looking at and listening to them. If a fan or fan tray does not seem to be running as expected, or there are log messages reporting a fan failure, try reseating the fan in its slot. If there are still messages reporting a fan failure, open a case with your technical support representative for further investigation.
- If a specific component is reporting a temperature sensor failure, try reseating that component. Some components, such as FPCs, may also be moved to a different slot in the chassis to see if that resolves the issue.
- If, after these steps, messages still report a steadily rising component temperature, an over-temperature condition, or a temperature sensor failure, open a case with your technical support representative for further investigation.