The CHASSISD_TIMER_VAL_ERR message is written to the system message file whenever the chassis control process (chassisd) receives a null identifier from a timer and therefore cannot clear the timer. This article describes how to troubleshoot the problem.
The CHASSISD_TIMER_VAL_ERR message is logged each time chassisd receives a null identifier from a timer and therefore cannot clear it.
When chassisd receives a null identifier from a timer and thus cannot clear the timer, it writes the message to the syslog. This can occur when chassisd starts a timer to track the timeout period for an event and the timer returns a null identifier. The following are examples of the message as reported:
/kernel: %KERN-3: rdp keepalive expired, connection dropped - src 1:1021 dest 8:23552
chassisd[3060]: %DAEMON-3-CHASSISD_TIMER_VAL_ERR: Null timer ID
craftd[3073]: %DAEMON-4: Major alarm set, FPC 4 Hard errors
chassisd[3071]: %DAEMON-3-CHASSISD_TIMER_VAL_ERR: Null timer ID
chassisd[90823]: snmp_ipc_try_connect: opened socket 17
chassisd[90823]: CHASSISD_TIMER_VAL_ERR: Null timer ID
chassisd[2505]: CHASSISD_FRU_OFFLINE_NOTICE: Taking SFM 2 offline: shutdown due to error
chassisd[2505]: CHASSISD_TIMER_VAL_ERR: Null timer ID
chassisd[3181]: %DAEMON-3-CHASSISD_TIMER_VAL_ERR: Null timer ID
kernel: rdp keepalive expired, connection dropped - src 1:1021 dest 11:40960
chassisd[2973]: CHASSISD_TIMER_VAL_ERR: Null timer ID
chassisd[2973]: CHASSISD_IPC_CONNECTION_DROPPED: Dropped IPC connection for SFM 3
chassisd[2947]: CHASSISD_FRU_OFFLINE_NOTICE: Taking SFM 3 offline: shutdown due to error
chassisd[2947]: CHASSISD_TIMER_VAL_ERR: Null timer ID
chassisd[3049]: CHASSISD_FRU_OFFLINE_NOTICE: Taking SFM 1 offline: shutdown due to error
chassisd[3049]: CHASSISD_TIMER_VAL_ERR: Null timer ID
sfm0 CM(0): Slot 0: Unrecoverable error; A1: B to A link failure
chassisd[993]: CHASSISD_TIMER_VAL_ERR: Null timer ID
CHASSISD_TIMER_VAL_ERR: Null timer ID
chassisd[2872]: CHASSISD_IPC_CONNECTION_DROPPED: Dropped IPC connection for SFM 0
/kernel: rdp keepalive expired, connection dropped - src 0x00000001:1021 dest 0x00000008:50176
/kernel: peer_inputs: soreceive() error 64
/kernel: pfe_listener_disconnect: conn dropped: listener idx=4, tnpaddr=0x8, reason: socket error
chassisd[4264]: CHASSISD_TIMER_VAL_ERR: Null timer ID
chassisd[4264]: CHASSISD_IPC_CONNECTION_DROPPED: Dropped IPC connection for SFM 0
chassisd[2638]: CHASSISD_FRU_OFFLINE_NOTICE: Taking SFM 3 offline: shutdown due to error
chassisd[2638]: CHASSISD_TIMER_VAL_ERR: Null timer ID
The cause of this message is an ‘event’ on a Field Replaceable Unit (FRU) that caused it to send a null timer ID to chassisd. The ‘event’ may be a hardware failure of the FRU, a fault in the communication link between the FRU and the midplane, a crash of the FRU, a transient issue on the FRU, or a user-initiated action on the FRU (offline/online).
Review the output of the show log messages and show log chassisd commands. Look carefully at the messages that occur just before or after the CHASSISD_TIMER_VAL_ERR message to identify the FRU that was the source of the timer response. Messages of the following types can identify the FRU:
- CHASSISD_IPC_CONNECTION_DROPPED (see KB18823)
- CHASSISD_FRU_OFFLINE_NOTICE
- Major alarm set
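As a rough illustration of the log review described above, the following Python sketch scans saved log lines for CHASSISD_TIMER_VAL_ERR and collects FRU names from nearby messages of the three identifying types. The function name, the two-line window, and the regular expression (which only recognizes the SFM and FPC names seen in the sample messages) are assumptions made for this example; they are not part of Junos OS.

```python
import re

# Message types the article names as useful for identifying the FRU.
FRU_HINTS = (
    "CHASSISD_IPC_CONNECTION_DROPPED",
    "CHASSISD_FRU_OFFLINE_NOTICE",
    "Major alarm set",
)

# Assumption: only FRU names that appear in the sample messages (SFM, FPC).
FRU_NAME = re.compile(r"\b(SFM|FPC)\s+(\d+)\b")


def fru_candidates(lines, window=2):
    """Return FRU names found in hint messages within `window` lines of a timer error."""
    found = []
    for i, line in enumerate(lines):
        if "CHASSISD_TIMER_VAL_ERR" not in line:
            continue
        lo, hi = max(0, i - window), min(len(lines), i + window + 1)
        for neighbor in lines[lo:hi]:
            if any(hint in neighbor for hint in FRU_HINTS):
                match = FRU_NAME.search(neighbor)
                if match:
                    name = f"{match.group(1)} {match.group(2)}"
                    if name not in found:
                        found.append(name)
    return found


sample = [
    "chassisd[2973]: CHASSISD_TIMER_VAL_ERR: Null timer ID",
    "chassisd[2973]: CHASSISD_IPC_CONNECTION_DROPPED: Dropped IPC connection for SFM 3",
    "craftd[3073]: %DAEMON-4: Major alarm set, FPC 4 Hard errors",
    "chassisd[3071]: %DAEMON-3-CHASSISD_TIMER_VAL_ERR: Null timer ID",
]
print(fru_candidates(sample))
```

A small window keeps unrelated messages out of the result, but on a busy router it may need to be widened, and any candidate should still be confirmed by reading the surrounding log entries.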
If the CHASSISD_TIMER_VAL_ERR message occurs only a few times in the logs, then it is likely due to a transient issue on the FRU.
If the surrounding messages indicate that the FRU was taken offline or brought online by a CLI command, then the CHASSISD_TIMER_VAL_ERR message is likely due to a user-initiated action on the FRU.
If the surrounding messages indicate that the FRU restarted without user interaction, then the CHASSISD_TIMER_VAL_ERR message is likely due to a FRU crash.
If the surrounding messages indicate hardware issues with the FRU, then the CHASSISD_TIMER_VAL_ERR message is likely due to a hardware failure.
If the CHASSISD_TIMER_VAL_ERR message occurs continuously, perform the troubleshooting actions outlined in the next section to isolate the source of the error.
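The rules above can be summarized as a small decision function. This is only a sketch: the function and parameter names are hypothetical, the threshold for "a few times" is an arbitrary assumption, and the order in which the checks are applied is a choice made for this example, since the article does not rank the causes.

```python
def likely_cause(occurrences, *, continuous=False, cli_offline_online=False,
                 hardware_errors=False, unattended_restart=False,
                 few_threshold=3):
    """Map log observations to the likely cause described in the article.

    few_threshold is an assumed cut-off for "a few times"; adjust as needed.
    """
    if continuous:
        return "continuous: isolate the source"
    if cli_offline_online:
        return "user-initiated action"
    if hardware_errors:
        return "hardware failure"
    if unattended_restart:
        return "FRU crash"
    if occurrences <= few_threshold:
        return "transient issue"
    return "undetermined"


print(likely_cause(2))
print(likely_cause(1, cli_offline_online=True))
print(likely_cause(40, continuous=True))
```

The boolean inputs are meant to be filled in by the operator after reading the surrounding messages; the sketch does not try to derive them from the log automatically.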
If the CHASSISD_TIMER_VAL_ERR messages are due to a transient issue, monitor the router for a reoccurrence of the messages.
If the CHASSISD_TIMER_VAL_ERR message is due to a user-initiated action on the FRU, then no action is required. These messages are normal when a FRU is taken offline.
If the CHASSISD_TIMER_VAL_ERR messages are due to a hardware failure, open a case with your technical support representative.
If the CHASSISD_TIMER_VAL_ERR messages are due to a FRU crash, monitor the router for a reoccurrence of the FRU crash. If the FRU crashes repeatedly, open a case with your technical support representative.
If the CHASSISD_TIMER_VAL_ERR message occurs continuously, perform the following steps in a maintenance window:
- Take the FRU offline.
- Reseat the FRU.
- If needed, bring the FRU back online.
If the messages do not return immediately, monitor the router for a reoccurrence.
If the messages return, perform the following steps (if possible) in a maintenance window:
- Take the FRU offline.
- Remove the FRU.
- Insert the FRU into a different slot in the router.
- If needed, bring the FRU back online.
If the messages continue, open a case with your technical support representative.
If the messages do not continue, perform the following steps (if possible) in a maintenance window:
- Take the FRU offline.
- Remove the FRU.
- Insert the FRU into the original slot in the router.
- If needed, bring the FRU back online.
- Open a case with your technical support representative.
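The reseat and slot-swap procedure above forms a short decision tree, which the following sketch makes explicit. The function name, argument names, and returned strings are hypothetical and serve only to illustrate the order of the steps; each physical step should still be performed in a maintenance window as described.

```python
def next_action(returned_after_reseat, continued_in_new_slot=None):
    """Return the next step for continuous CHASSISD_TIMER_VAL_ERR messages.

    returned_after_reseat: True if the messages came back after the reseat.
    continued_in_new_slot: None until the FRU has been tried in another slot,
        then True or False depending on whether the messages continued.
    """
    if not returned_after_reseat:
        return "monitor the router for a reoccurrence"
    if continued_in_new_slot is None:
        return "move the FRU to a different slot"
    if continued_in_new_slot:
        return "open a support case"
    return "return the FRU to its original slot, then open a support case"


print(next_action(False))
print(next_action(True))
print(next_action(True, continued_in_new_slot=False))
```

Both outcomes of the slot swap end in a support case; the difference is whether the FRU is first returned to its original slot.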