RPD_KRT_Q_RETRIES

This article explains why the event message RPD_KRT_Q_RETRIES appears in the syslog.

An RPD_KRT_Q_RETRIES message is sent to the log file each time the routing protocol daemon (RPD) fails to update the kernel. The routing protocol daemon continues retrying.

When a major change occurs, the routing protocol daemon (RPD) sends updates to the kernel to maintain the current status of the routing tables. These changes can include:

  • A Routing Engine mastership switchover
  • A Routing Engine reboot
  • A restart of the routing daemon, which causes a rebuild of the routing tables
  • Links to next hops flapping
  • IGP/BGP convergence

These messages can also be generated when the router is affected by PR836197: higher-priority rpd tasks may be scheduled too often, causing lower-priority tasks to appear stalled.

The updates are processed through the kernel routing table (KRT) queue. During this period of high activity, the socket connection might run out of buffer space. The built-in flow control then tries to complete processing by repeatedly resending the update to the KRT queue. Each time a repeat attempt is made, the RPD_KRT_Q_RETRIES message is sent to the log file.
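The retry behavior described above can be illustrated with a minimal Python sketch. It is not how rpd is implemented; it only shows the general pattern of resending an update when the send fails with "No buffer space available" (ENOBUFS) and logging each repeat attempt. The names send_with_retries and FlakySender, and the retry limit, are hypothetical and exist only for this illustration.

```python
import errno
import time


def send_with_retries(send, update, max_retries=5, delay=0.0, log=print):
    """Try to send one update; on ENOBUFS, log a retry and try again.

    Returns the number of retries that were needed.
    """
    retries = 0
    while True:
        try:
            send(update)
            return retries
        except OSError as exc:
            # Only "no buffer space" is treated as transient back-pressure;
            # any other error, or too many retries, is passed to the caller.
            if exc.errno != errno.ENOBUFS or retries >= max_retries:
                raise
            retries += 1
            log(f"retry {retries} for {update!r}: {exc.strerror}")
            time.sleep(delay)


class FlakySender:
    """Hypothetical stand-in for a socket whose buffer is full for the first N sends."""

    def __init__(self, failures):
        self.failures = failures
        self.sent = []

    def __call__(self, update):
        if self.failures > 0:
            self.failures -= 1
            raise OSError(errno.ENOBUFS, "No buffer space available")
        self.sent.append(update)


# Illustration: the first two sends fail, the third succeeds.
demo_sender = FlakySender(failures=2)
demo_retries = send_with_retries(demo_sender, "example update")
```

In this sketch a condition that clears after a few attempts produces a short burst of retry messages and then stops, which matches the transient case. A condition that never clears keeps failing until the retry limit is reached.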

There are several variations of the RPD_KRT_Q_RETRIES message. Following are some examples of RPD_KRT_Q_RETRIES messages from the log:

When RPD resends an update, the following message is also sent to the log file: “Route Update: No buffer space available.” Retries logged with this message are caused by flow control and are a transient condition. They have no effect on performance.

Example:

If the retries are not transient and continue to be logged, further investigation is needed. Contact your technical support representative to open a case.
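One way to get a first impression from a saved log file is to count the RPD_KRT_Q_RETRIES lines by the text that follows the tag. The Python sketch below assumes each log line contains the tag followed by a colon and the reason text. The sample lines are made up for illustration and were not captured from a router, and the function names are hypothetical.

```python
from collections import Counter

TAG = "RPD_KRT_Q_RETRIES"
FLOW_CONTROL = "Route Update: No buffer space available"


def count_retry_reasons(lines):
    """Count RPD_KRT_Q_RETRIES lines by the text that follows the tag."""
    reasons = Counter()
    for line in lines:
        _, found, rest = line.partition(TAG + ":")
        if found:
            reasons[rest.strip()] += 1
    return reasons


def only_flow_control(reasons, transient=FLOW_CONTROL):
    """True when every retry seen was the flow-control case named in the article."""
    return bool(reasons) and set(reasons) == {transient}


# Made-up sample lines (assumed layout, not real router output).
sample = [
    "host rpd: RPD_KRT_Q_RETRIES: Route Update: No buffer space available",
    "host rpd: RPD_KRT_Q_RETRIES: Route Update: No buffer space available",
    "host rpd: an unrelated message",
]
sample_reasons = count_retry_reasons(sample)
```

If only_flow_control returns True and the messages stop after the burst of activity ends, the log is consistent with the transient case. Any other reason text, or messages that never stop, points toward the steps below.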

Perform these steps to determine whether the messages are caused by transient queue operations or by an error that is permanently blocking the queue. Continue through each step until the problem is resolved.

1. Collect the show command output.

Capture the output to a file (in case you have to open a technical support case). To do this, configure the SSH client/terminal emulator to log your session.

2. Analyze the show command output.

The output of show krt state lists the labels for the various events. While RPD_KRT_Q_RETRIES messages are being generated, if the number to the right of a label is not zero and is increasing, the kernel is continuing to process the updates correctly. If the numbers to the right of the labels are not increasing, an error record is stuck in the KRT queue and is being continuously rejected by the kernel.

The output of show krt queue helps determine whether the rejection is due to ‘no buffer space’ or to an error. It also shows which update the kernel is rejecting.

Ideally, all of the operation queues should show 0. If there are operations in a queue, the number shown is the count of operations waiting in that queue. If the queue is not stuck and is draining, the number drops to 0 once the queue is empty.
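Comparing two captures taken a short time apart makes it easier to see whether the numbers are moving. The Python sketch below assumes the captured text contains lines of the form "label: number"; that layout, the generic labels in the usage note, and the function names are assumptions for illustration, not the exact command output.

```python
import re

# Matches lines such as "  Some label: 12" and ignores everything else.
LINE = re.compile(r"^\s*(?P<label>[^:]+):\s*(?P<value>\d+)\s*$")


def parse_counters(text):
    """Collect 'label: number' lines from captured output into a dict."""
    counters = {}
    for line in text.splitlines():
        match = LINE.match(line)
        if match:
            counters[match.group("label").strip()] = int(match.group("value"))
    return counters


def compare(before, after):
    """Summarize how the counters moved between two captures.

    "empty"     : every counter in the later capture is 0
    "changing"  : at least one counter differs between the captures
    "unchanged" : nonzero counters are identical in both captures
    """
    if all(value == 0 for value in after.values()):
        return "empty"
    moved = [label for label in after if after[label] != before.get(label)]
    return "changing" if moved else "unchanged"
```

For example, compare(parse_counters("Queue one: 5"), parse_counters("Queue one: 2")) returns "changing". An "unchanged" result across several captures is the pattern that suggests a stuck record and is worth reporting when opening a case.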

3. Deactivate GRES and NSR to drain the KRT queues.
Note: You may wish to contact your technical support engineer before proceeding with these actions.

If the router has both graceful Routing Engine switchover (GRES) and nonstop routing (NSR) configured, deactivate both functions. If only GRES is configured, deactivate GRES.

For example, if NSR is enabled:

Then deactivate an item under the chassis hierarchy. For example:

Commit the changes, and the KRT queues will be drained.

4. Restart routing.
You can clear possibly corrupt updates that are stuck in the KRT queue by restarting the routing protocol daemon (see http://www.juniper.net/techpubs/en_US/junos/topics/task/operational/junos-process-restarting.html). Doing so disrupts all traffic through the router for as long as it takes the router to rebuild the routing tables.

If the issue persists, then please contact your technical support representative for assistance.

About the author

Prasanna
