This article describes the RPD_ISIS_LSPCKSUM message, which indicates that the IS-IS link-state PDU (LSP) identified in the message failed an internal checksum validity test, implying that it was corrupted.
The RPD_ISIS_LSPCKSUM message is logged each time the routing process (rpd) detects a checksum error in a received IS-IS PDU.
When an RPD_ISIS_LSPCKSUM event occurs, a message similar to the following is reported:
rpd: RPD_ISIS_LSPCKSUM: IS-IS L2 LSP checksum error, interface xe-0/0/0.0, LSP id xxxx.00-00, sequence 0x556, checksum 0xcb3d, lifetime 33888
rpd: RPD_ISIS_LSPCKSUM: IS-IS L1 LSP checksum error, interface so-4/1/0.0, LSP id xxxx.00-01, sequence 0x5ba2, checksum 0x631, lifetime 65533
rpd: %DAEMON-4-RPD_ISIS_LSPCKSUM: IS-IS L2 LSP checksum error, interface xe-1/0/0.0, LSP id xxxx.00-00, sequence 0x1a075, checksum 0xb28a, lifetime 1184
At the same time, if IS-IS traceoptions are configured, the trace file records an ERROR message for each RPD_ISIS_LSPCKSUM event:
ERROR: ISIS ignored a bad packet: L2 LSP id xxxx-re0.00-06 checksum error from <Router Name> on interface ge-3/0/0.0
You may also see this log message associated with one or more of the following messages:
rpd: RPD_ISIS_ADJDOWN: IS-IS lost L2 adjacency to xxxx on xe-0/0/0.0, reason: 3-Way Handshake Failed
rpd: RPD_ISIS_ADJDOWN: IS-IS lost L2 adjacency to xxxx on xe-3/0/0.0, reason: Bad Hello
sfm3 Bogus Cell incremented since last reported.
ssb Bogus Cell incremented since last reported.
ssb BCHIP 1: %PFE-0: correctable ECC error
ssb BCHIP 1: %PFE-0: ECC from SDRAM bank 1, at bit 62 was corrected
ssb CM(0): %PFE-3: Slot 1: Recoverable error detected; ECC error
ssb BCHIP 0: multiple correctable ECC errors
ssb BCHIP 0: ECC from SDRAM bank 1, at bit 53 was corrected
ssb CM(0): Slot 0: Recoverable error detected; multiple ECC errors
This message can be caused by several types of hardware and transmission faults, for example in PIC, FPC, or SFM/SSB/CB/SCB components. The routing process detects the checksum errors in corrupted IS-IS PDUs that it receives. However, PDUs can only be received on an IS-IS-enabled interface. As a result, the component reporting the error message may not be the component that is corrupting the PDU.
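To illustrate why a single damaged octet is enough to trigger this message, the following is a minimal sketch of the Fletcher checksum arithmetic that IS-IS specifies for LSPs (ISO/IEC 10589). It is not the rpd implementation and it does not parse a real LSP: the byte string, the checksum offset of 16, and the function names are all hypothetical and chosen only for the example.

```python
def fletcher_sums(data: bytes) -> tuple:
    """Return the two running sums (c0, c1), each modulo 255."""
    c0 = c1 = 0
    for octet in data:
        c0 = (c0 + octet) % 255
        c1 = (c1 + c0) % 255
    return c0, c1


def is_valid(data: bytes) -> bool:
    """A block carrying a correct checksum sums to zero in both c0 and c1."""
    return fletcher_sums(data) == (0, 0)


def insert_checksum(data: bytes, offset: int) -> bytes:
    """Fill the two checksum octets at `offset` (0-based) so the block validates."""
    buf = bytearray(data)
    buf[offset] = buf[offset + 1] = 0
    c0, c1 = fletcher_sums(bytes(buf))
    after = len(buf) - offset - 1  # octets that follow the first checksum octet
    x = (after * c0 - c1) % 255
    y = (c1 - (after + 1) * c0) % 255
    # A zero octet is sent as 255, which is equivalent modulo 255.
    buf[offset], buf[offset + 1] = x or 255, y or 255
    return bytes(buf)


# Hypothetical 40-octet block standing in for the checksummed part of an LSP.
lsp_like = insert_checksum(bytes(range(40)), offset=16)

corrupted = bytearray(lsp_like)
corrupted[30] ^= 0x04  # flip one bit, as a faulty component might

print(is_valid(lsp_like), is_valid(bytes(corrupted)))  # True False
```

The receiver only learns that the sums are non-zero. It cannot tell which octet changed or where along the path the change happened, which is why the reporting router is not necessarily the faulty one.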
Examine the following output to determine the cause of this message:
show isis overview
show isis statistics
show log messages
show route summary
show pfe statistics error
show isis adjacency
show interfaces extensive
Configure IS-IS traceoptions as well, and examine the resulting trace file for further details:
set protocols isis traceoptions file isis-trace
set protocols isis traceoptions file size 100m
set protocols isis traceoptions file files 10
set protocols isis traceoptions flag error detail
set protocols isis traceoptions flag lsp detail
Run a ping test with a particular payload pattern between the affected interfaces to check whether the payload is being distorted:
ping <dst ip address> size 1500 count 100 bypass-routing interface ge-x/x/x pattern 0x1111 rapid
monitor traffic interface ge-x/x/x matching "icmp && dst host <ip address>" print-hex extensive
show system statistics | match checksum
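Comparing a captured payload against the expected pattern by eye is error-prone, so a small helper can do it. The following is a sketch only: it assumes the payload octets have already been copied by hand from the print-hex output into a hex string, and the function name and the skip parameter are hypothetical. How many leading payload octets a given platform fills with something other than the pattern is not stated in this article, so skip is left for the operator to set.

```python
def find_distortions(payload_hex: str, pattern_hex: str = "1111", skip: int = 0):
    """Return (offset, expected, actual) for each octet that deviates from the pattern.

    payload_hex: payload octets as hex text (whitespace is ignored).
    pattern_hex: the pattern given to ping, without the 0x prefix.
    skip:        number of leading octets to ignore (assumption, see lead-in).
    """
    payload = bytes.fromhex(payload_hex)
    pattern = bytes.fromhex(pattern_hex)
    mismatches = []
    for i in range(skip, len(payload)):
        expected = pattern[(i - skip) % len(pattern)]
        if payload[i] != expected:
            mismatches.append((i, expected, payload[i]))
    return mismatches


clean = "1111 1111 1111 1111"
damaged = "1111 1111 1115 1111"  # one bit flipped in the sixth octet

print(find_distortions(clean))    # []
print(find_distortions(damaged))  # [(5, 17, 21)]
```

A fixed pattern such as 0x1111 makes the alignment irrelevant, because every octet is expected to be 0x11.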
If more than one adjacent router is reporting RPD_ISIS_LSPCKSUM messages, it may be necessary to collect logs from all neighbor routers to determine which one has the component that is generating the faulty IS-IS PDUs. Look for any related events that occurred at or just before the RPD_ISIS_LSPCKSUM message.
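When logs from several routers have to be compared, counting the checksum errors per interface makes the common link or component easier to spot. The following is a minimal sketch that assumes the message format shown in the examples earlier in this article; the regular expression and the function name are illustrative, and other Junos releases may format the message differently.

```python
import re
from collections import Counter

# Matches the RPD_ISIS_LSPCKSUM format shown in this article (assumed stable).
CKSUM_RE = re.compile(
    r"RPD_ISIS_LSPCKSUM: IS-IS (?P<level>L[12]) LSP checksum error, "
    r"interface (?P<interface>[^,\s]+), LSP id (?P<lsp_id>[^,\s]+)"
)


def tally_checksum_errors(lines):
    """Count RPD_ISIS_LSPCKSUM events per (interface, level)."""
    counts = Counter()
    for line in lines:
        match = CKSUM_RE.search(line)
        if match:
            counts[(match["interface"], match["level"])] += 1
    return counts


# Sample input taken from the messages quoted in this article.
sample = [
    "rpd: RPD_ISIS_LSPCKSUM: IS-IS L2 LSP checksum error, interface xe-0/0/0.0, "
    "LSP id xxxx.00-00, sequence 0x556, checksum 0xcb3d, lifetime 33888",
    "rpd: RPD_ISIS_LSPCKSUM: IS-IS L1 LSP checksum error, interface so-4/1/0.0, "
    "LSP id xxxx.00-01, sequence 0x5ba2, checksum 0x631, lifetime 65533",
    "rpd: %DAEMON-4-RPD_ISIS_LSPCKSUM: IS-IS L2 LSP checksum error, interface xe-1/0/0.0, "
    "LSP id xxxx.00-00, sequence 0x1a075, checksum 0xb28a, lifetime 1184",
    "ssb Bogus Cell incremented since last reported.",
]

counts = tally_checksum_errors(sample)
for (interface, level), n in counts.most_common():
    print(f"{interface} {level}: {n}")
```

Running the same tally on the messages file of each neighbor and comparing the results shows whether the errors cluster on one link, one PIC, or every IS-IS interface of a single router.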
Perform the following procedure:
1. Check the show interfaces extensive output. Traffic loss on the interfaces of a PIC is an indicator that the PIC may be the cause.
2. Use the output from show isis adjacency to see the ISIS neighbors for each router and which interface is being used for that adjacency. This can help determine the common component.
3. If a router is reporting checksum errors on all its ISIS interfaces, you will need to check the FPCs on that router.
4. If a PIC is suspected of causing the issue, perform the following procedure during a maintenance window, as it will impact transit traffic:
- Reset the PIC.
- Disable ISIS on all the interfaces on that PIC to see if that stops the messages.
- Reseat the PIC in its slot.
- If the messages continue even after a reseat, but stop when ISIS is disabled on the PIC interfaces, open a case with your technical support representative to further investigate the issue.
5. If an FPC/CB/SFM is suspected of causing the errors, check the log messages output to see if any router component is reporting ECC memory errors. You can also check the output of show pfe statistics error to detect the component showing errors.
6. Verify that the suspected component is corrupting ISIS PDU messages by taking that component offline. If the messages stop when the component (FPC, CB, SFM, and so forth) is offline, open a case with your technical support representative to further investigate the issue.
7. If analysis of the messages log and the show pfe statistics error output reveals no obvious component causing the corruption, or if the messages occur infrequently but consistently, you will need to test each FPC individually. This should be done during a maintenance window, as it will impact transit traffic:
- Set the ISIS overload bit, so that the ISIS area will redirect transit traffic.
- Offline all the FPCs except one (for example, FPC0).
- Observe the output of show log messages to see if the ISIS PDU checksum messages are being generated. It may be necessary to watch for 30 minutes or so to verify that the messages do not appear. If RPD_ISIS_LSPCKSUM messages do appear, open a case with your technical support representative to further investigate the issue.
- If the FPC does not show the ISIS PDU messages after 30 minutes, then online the next FPC and observe the log messages again. Repeat this for each FPC until the component is found, then open a case with your technical support representative to further investigate the issue.