This article describes the following syslog message:
PFE: .*Multiple UnCorrectable ECC.*
The PFE board is reporting multiple uncorrectable ECC memory errors.
ECC (error-correcting code) allows data written to and read from memory to be checked for errors and, where possible, corrected. The Packet Forwarding Engine (PFE) reports the ECC errors it detects.
In the example log messages below, ECC is unable to correct the error.
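The difference between a correctable and an uncorrectable error can be illustrated with a textbook SECDED (single-error-correct, double-error-detect) Hamming code. This is only a sketch for intuition: the article does not state which ECC scheme the PFE memory actually uses, and the function names `encode` and `classify` are hypothetical.

```python
# Illustrative only: an extended Hamming(8,4) SECDED code.
# Assumption: this is NOT the PFE's real ECC scheme, just a textbook example.

def encode(nibble: int) -> list[int]:
    """Encode 4 data bits into an 8-bit codeword (index 0 = overall parity)."""
    d = [(nibble >> i) & 1 for i in range(4)]
    code = [0] * 8
    # Data bits occupy positions 3, 5, 6, 7; parity bits occupy 1, 2, 4.
    code[3], code[5], code[6], code[7] = d
    code[1] = code[3] ^ code[5] ^ code[7]
    code[2] = code[3] ^ code[6] ^ code[7]
    code[4] = code[5] ^ code[6] ^ code[7]
    # Overall parity bit makes the total number of 1 bits even.
    code[0] = sum(code[1:]) % 2
    return code

def classify(code: list[int]) -> str:
    """Return 'ok', 'correctable', or 'uncorrectable' for a stored codeword."""
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos          # XOR of the positions of all set bits
    overall_bad = sum(code) % 2 == 1  # odd total parity means an odd number of flips
    if syndrome == 0 and not overall_bad:
        return "ok"
    if overall_bad:
        # Single-bit error: the syndrome gives the flipped position
        # (0 means the overall parity bit itself was flipped).
        return "correctable"
    # Non-zero syndrome with even overall parity: two bits flipped.
    return "uncorrectable"

word = encode(0b1011)
single = word.copy(); single[5] ^= 1
double = word.copy(); double[2] ^= 1; double[6] ^= 1
print(classify(word), classify(single), classify(double))
```

With one flipped bit the syndrome points at the bad position, so the error can be repaired. With two flipped bits the code can only tell that the data is wrong, which is the situation an uncorrectable ECC message describes.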
When a Multiple UnCorrectable ECC event occurs, a message similar to the following is reported:
feb1 ICHIP(0): %PFE-3: multiple uncorrectable ECC errors, count 65535
feb1 ICHIP(0): %PFE-3: syndrome 0x00000078, addr 0x3b572d8 (bank 3 cell 1944470), dimm 2)
feb1 CMALARM: %PFE-3: Error (code: 1219, type:Major) encountered, cmalarm_passive_alarm_signal
feb1 CMALARM: %PFE-3: Error (code: 1269, type:Major) encountered, cmalarm_passive_alarm_signal
feb1 ICHIP(0): %PFE-3: multiple correctable ECC errors, count 49518
The log output above identifies the Forwarding Engine Board (FEB) in slot 1 as the problem field-replaceable unit (FRU).
The following log entries identify the Flexible PIC Concentrator (FPC) in slot 6 as the problem FRU:
fpc6 smchip_isr_base(MMB0-MD(1)): Multiple UnCorrectable ECC DMC(0)
fpc6 smchip_isr_base(MMB0-MD(1)): Multiple UnCorrectable ECC DMC(0)
fpc6 smchip_isr_base(MMB0-MD(1)): Multiple UnCorrectable ECC DMC(0)
The following log entries identify the FPC in slot 7 as the problem FRU:
fpc7 smchip_isr_base(MMB1-MD(2)): Multiple Correctable ECC DMC(3)
fpc7 smchip_isr_base(MMB1-MD(2)): Multiple UnCorrectable ECC DMC(3)
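The slot identification shown in these examples can be sketched as a short script that scans saved log lines for the uncorrectable ECC message and extracts the reporting FRU. This is a minimal sketch, assuming the FRU name (`fpcN` or `febN`) appears before the message text as it does in the samples above; the function name `find_faulty_frus` is hypothetical and not part of any Junos tooling.

```python
import re

# Assumption: the reporting FRU appears as "fpc<slot>" or "feb<slot>" somewhere
# before the message text, as in the sample log lines in this article.
ECC_PATTERN = re.compile(
    r"\b(?P<fru>fpc|feb)(?P<slot>\d+)\b.*multiple uncorrectable ecc",
    re.IGNORECASE,
)

def find_faulty_frus(lines):
    """Return a sorted list of (FRU type, slot) pairs reporting uncorrectable ECC errors."""
    hits = set()
    for line in lines:
        match = ECC_PATTERN.search(line)
        if match:
            hits.add((match.group("fru").upper(), int(match.group("slot"))))
    return sorted(hits)

sample = [
    "feb1 ICHIP(0): %PFE-3: multiple uncorrectable ECC errors, count 65535",
    "feb1 ICHIP(0): %PFE-3: multiple correctable ECC errors, count 49518",
    "fpc6 smchip_isr_base(MMB0-MD(1)): Multiple UnCorrectable ECC DMC(0)",
    "fpc7 smchip_isr_base(MMB1-MD(2)): Multiple Correctable ECC DMC(3)",
    "fpc7 smchip_isr_base(MMB1-MD(2)): Multiple UnCorrectable ECC DMC(3)",
]
for fru, slot in find_faulty_frus(sample):
    print(f"{fru} in slot {slot} reported uncorrectable ECC errors")
```

Lines that report only correctable errors are deliberately ignored, because the pattern requires the word "uncorrectable".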
Traffic impact from the faulty FRU is not always visible right away.
If the errors continue, traffic may eventually be impacted.
Uncorrectable ECC errors are raised when the PFE detects hardware errors, most often in a memory component.
Perform these steps to determine the cause and resolve the problem (if any). Continue through each step until the problem is resolved.
1. Collect the show command output on the Routing Engine.
Capture the output to a file (in case you have to open a technical support case). To do this, configure your SSH client or terminal emulator to log the session.
> show log messages | no-more
If FPC is reporting the message, then:
> start shell pfe network fpcX
# show nvram
# show syslog messages
# exit
If FEB is reporting the message, then:
> start shell pfe network febX
# show nvram
# show syslog messages
# exit
Note: X is the FPC/FEB slot reporting the error. It should be indicated in the message.
2. Replace the FRU reporting the error message to resolve the issue.
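The collection commands from step 1 can be generated for a given FRU with a small helper, which is convenient when several slots report the error. This is only a sketch: it builds the command strings listed in this article and does not connect to a device. The function name `collection_commands` is hypothetical.

```python
def collection_commands(fru: str, slot: int) -> list[str]:
    """Build the step 1 command sequence for an FPC or FEB in the given slot."""
    fru = fru.lower()
    if fru not in ("fpc", "feb"):
        raise ValueError("fru must be 'fpc' or 'feb'")
    return [
        "show log messages | no-more",            # run on the Routing Engine
        f"start shell pfe network {fru}{slot}",   # open the PFE shell on the FRU
        "show nvram",
        "show syslog messages",
        "exit",
    ]

for command in collection_commands("fpc", 6):
    print(command)
```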