Troubleshooting BGP Route-Reflection Issues
Route reflectors (RRs), discussed in RFCs 1966 and 2796, are used to avoid the IBGP full mesh in an AS, as required by RFC 1771. A full mesh of n IBGP speakers requires n(n-1)/2 sessions; for example, 10 routers would need 45 IBGP sessions, whereas with a single route reflector each of the 9 clients needs only one session to the RR. Route reflection ensures that all IBGP speakers in an AS receive BGP updates from all parts of the network without having to run IBGP between every pair of routers in the network. Route reflection reduces the number of required IBGP connections and also offers faster convergence when compared with a full-mesh IBGP network.
Route-reflector clients (RRCs) typically peer IBGP with one or more RRs, and they can have EBGP connections without restriction. Logical BGP connections between an RR and its RRCs typically follow the physical connection topology. Keeping these common rules in mind helps BGP operators troubleshoot route-reflector issues.
This section discusses various issues seen in BGP networks related to route reflection. The most common problems in route-reflection networks are as follows:
- Configuration mistakes
- An extra BGP update stored by a route-reflector client
- Convergence time improvement for route reflectors and clients
- Loss of redundancy between route reflectors and route-reflector clients
Problem: Configuration Mistakes—Cause: Failed to Configure IBGP Neighbor as a Route-Reflector Client
Configuring route reflectors is fairly simple: in the RR's BGP configuration, the IBGP neighbors' peering addresses are listed as route-reflector clients. However, a BGP operator might inadvertently configure an incorrect IBGP peering address as a route-reflector client.
Figure 15-27 shows that R1 is an RR. R8 and R2 are RRCs of R1.
Debugs and Verification
Example 15-59 shows the configuration required to make R1 an RR for R8 and R2. No additional configuration is needed on R8 and R2 to become RRCs beyond the normal IBGP configuration to peer with R1.
Example 15-59 Configuring R1 as a Route Reflector with R8 and R2 as Clients
R1#
router bgp 109
 no synchronization
 neighbor 131.108.1.2 remote-as 109
 neighbor 131.108.1.2 route-reflector-client
 neighbor 206.56.89.1 remote-as 109
 neighbor 206.56.89.1 route-reflector-client
The neighbor IP address in the route-reflector-client statement must match the address in the remote-as configuration. The Cisco IOS Software BGP parser catches a misconfigured RRC IP address if BGP has no IBGP neighbor configured with that address.
For example, if the BGP operator types in this command
R1# router bgp 109
neighbor 131.108.1.8 route-reflector-client

Cisco IOS Software will immediately display an error:

% Specify remote-as or peer-group commands first
BGP detects that 131.108.1.8 is not configured as a neighbor, so it cannot be associated as an RRC.
Use the show ip bgp neighbor command, as demonstrated in Example 15-60, to verify that the neighbor is configured as an RRC.
Example 15-60 Verifying Neighbor Configuration as an RRC
R1# show ip bgp neighbor 131.108.1.2
BGP neighbor is 131.108.1.2, remote AS 109, internal link
 Index 1, Offset 0, Mask 0x2
 Route-Reflector Client
Solution
A BGP operator accidentally might configure a different IP address in the route-reflector-client statement than in the neighbor statement where the remote AS is configured. If this problem is detected, the IP address must be corrected.
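The parser check helps only when the mistyped address matches no configured neighbor; if the incorrect address happens to belong to another configured IBGP neighbor, the route-reflector-client statement is accepted without complaint. As a minimal corrective sketch, using 131.108.1.8 as a stand-in for whatever incorrect address was accepted, the fix is to remove the misapplied statement and re-enter it against the address used in the remote-as configuration:

R1(config)#router bgp 109
! Remove the client statement applied to the wrong neighbor address
R1(config-router)#no neighbor 131.108.1.8 route-reflector-client
! Re-apply it to the address configured with remote-as 109
R1(config-router)#neighbor 131.108.1.2 route-reflector-client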
Problem: Route-Reflector Client Stores an Extra BGP Update—Cause: Client-to-Client Reflection
The problem here stems from RRCs receiving extra BGP updates, which consume extra memory and CPU cycles to process.
In Figure 15-28, RRC R8 peers IBGP with RR R1; R8 is peering IBGP with RRC R2 as well. Because of this peering relationship, R2 receives an extra BGP update for all the routes originated/propagated by R8. Such a setup typically is done when a physical circuit exists between RRCs and the BGP operator wants to run BGP directly over them. In standard network design, such BGP connections between RRCs do not exist, and all RRCs simply peer with their respective route reflector(s) only.
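The peering that triggers the duplicate can be summarized in a minimal configuration sketch; the loopback addresses (131.108.10.1 for R1, 131.108.10.2 for R2, and 131.108.10.8 for R8) are assumed from the surrounding examples rather than taken from Figure 15-28:

R8#
router bgp 109
 ! IBGP session to the route reflector, R1
 neighbor 131.108.10.1 remote-as 109
 ! Direct IBGP session to fellow client R2 over the physical circuit
 neighbor 131.108.10.2 remote-as 109
_____________________________________________________________________________________
R2#
router bgp 109
 ! IBGP session to the route reflector, R1
 neighbor 131.108.10.1 remote-as 109
 ! Direct IBGP session to fellow client R8; this session delivers the extra copy
 neighbor 131.108.10.8 remote-as 109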
Figure 15-29 shows the flowchart to follow to resolve this problem.
Debugs and Verification
The output in Example 15-61 shows that R2 is receiving two updates for 100.100.100.0, one from R8 and another reflected from R1.
Example 15-61 R2’s BGP Table Indicates Updates Received from Both the RR and Another RRC
R2#show ip bgp 100.100.100.0
BGP routing table entry for 100.100.100.0/24, version 3
Paths: (2 available, best #1, table Default-IP-Routing-Table)
  Not advertised to any peer
  Local
    131.108.10.8 (metric 20) from 131.108.10.8 (131.108.10.8)
      Origin IGP, metric 0, localpref 100, valid, internal, best
  Local
    131.108.10.8 (metric 20) from 131.108.10.1 (131.108.10.1)
      Origin IGP, metric 0, localpref 100, valid, internal
      Originator: 131.108.10.8, Cluster list: 0.0.0.109
Solution
Turning off client-to-client reflection solves this problem. This problem arises only when an RRC peers IBGP with another RRC. When an RRC peers only with the RR, BGP does not run into this issue. Example 15-62 shows the configuration needed on an RR to turn off client-to-client reflection.
Example 15-62 Disabling Client-to-Client Reflection
R1#
router bgp 109
 no bgp client-to-client reflection
After this command is configured, the RR no longer reflects updates from one RRC to another, but it still advertises them to normal IBGP and EBGP neighbors. Before making this modification, the BGP operator must be certain that the RRCs peer IBGP directly with one another; otherwise, clients will no longer receive each other's routes.
When R1 is configured in this manner, it does not advertise 100.100.100.0/24 to the other client, R2, but it does advertise it to other IBGP and EBGP neighbors. Example 15-63 shows that R1 receives 100.100.100.0/24 from R8 but, in this topology, does not propagate it to any peer.
Example 15-63 R1’s BGP Table Confirms That Disabling Client-to-Client Reflection Is Successful
R1#show ip bgp 100.100.100.0
BGP routing table entry for 100.100.100.0/24, version 2
Paths: (1 available, best #1, table Default-IP-Routing-Table)
Flag: 0x208
  Not advertised to any peer
  Local, (Received from a RR-client)
    131.108.10.8 from 131.108.10.8 (131.108.10.8)
      Origin IGP, metric 0, localpref 100, valid, internal, best
Problem: Convergence Time Improvement for RR and Clients—Cause: Use of Peer Groups
When an RR serves many clients, any update that it receives from IBGP/EBGP peers must be generated and propagated as a separate update for each RRC. If the number of BGP updates and RRCs is large, this process can become CPU-intensive for the RR, which slows the propagation of BGP updates and hence overall convergence in the network. A peer group gathers BGP neighbors into a single group: any common update destined for all members of the peer group is processed only once, and every member receives a copy of that processed update. Because the router does not process the update separately for each member of the group, the CPU savings are substantial, and overall network convergence improves greatly.
Figure 15-30 shows a route-reflection environment in which peer groups can be used.
Figure 15-31 shows the flowchart to follow to resolve this problem.
Debugs and Verification
In practice, a peer group contains many clients; only two are used here to illustrate peer group usage. Example 15-64 shows the configuration required on R1 to put R8 and R6 in a peer group named INTERNAL.
Example 15-64 Configuring R8 and R6 as Peer Group Members
R1#
router bgp 109
 no synchronization
 neighbor INTERNAL peer-group
 neighbor 131.108.10.8 remote-as 109
 neighbor 131.108.10.8 update-source Loopback0
 neighbor 131.108.10.8 peer-group INTERNAL
 neighbor 131.108.10.6 remote-as 109
 neighbor 131.108.10.6 update-source Loopback0
 neighbor 131.108.10.6 peer-group INTERNAL
R1 calculates one update for the first member of the peer group INTERNAL and replicates it to the others. The output in Example 15-65 shows that 131.108.10.8 (R8) is the first member in the list; therefore, R1 calculates updates for R8 and replicates them to the rest of the members of INTERNAL, avoiding a separate calculation for each.
In Example 15-65, R6 is the other member in the list INTERNAL.
Example 15-65 Displaying Peer Group Members

R1#show ip bgp peer-group INTERNAL
BGP peer-group is INTERNAL
 BGP version 4
 Default minimum time between advertisement runs is 5 seconds
 BGP neighbor is INTERNAL, peer-group internal, members:
  131.108.10.8 131.108.10.6
 Index 1, Offset 0, Mask 0x2
 Update messages formatted 4, replicated 2
Solution
When peering with several neighbors, use the Cisco IOS Software BGP peer group feature to avoid the processing duplication required to generate the same update for every neighbor. In a peer group, BGP neighbors (in this case, all RRCs) are listed as members that share the same outbound policy. The RR computes an update for the first member of the peer group and simply replicates the same update to all other members. This greatly reduces the CPU cycles that the RR must spend computing an update for each RRC. In addition, peer groups speed up the propagation of BGP updates to RRCs, so RRCs converge faster in case of any churn. Peer groups can be used in normal IBGP and EBGP scenarios for the same benefit, with the condition that all peer-group members share the same outbound policy.
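Note that because the client role is itself part of the shared outbound policy, it can be attached to the peer group once instead of being repeated per neighbor. The following is a minimal sketch, assuming R1 also acts as RR for the members of INTERNAL (addresses as in Example 15-64):

router bgp 109
 neighbor INTERNAL peer-group
 ! Apply the shared outbound policy, including the client role, to the group
 neighbor INTERNAL route-reflector-client
 neighbor INTERNAL update-source Loopback0
 neighbor 131.108.10.8 remote-as 109
 neighbor 131.108.10.8 peer-group INTERNAL
 neighbor 131.108.10.6 remote-as 109
 neighbor 131.108.10.6 peer-group INTERNAL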
Problem: Loss of Redundancy Between Route Reflectors and Route-Reflector Client—Cause: Cluster List Check in RR Drops Redundant Route from Other RR
A cluster is made up of an RR and its clients. A cluster can have one or more RRs and is identified by a cluster ID, which defaults to the router ID of the RR. Because each RR has a unique router ID, each cluster has only one RR by default; network operators must manually configure identical cluster IDs on two or more RRs to place them in the same cluster. When a BGP update traverses an RR, the RR prepends its cluster ID to the cluster list, which records every cluster that the update has traversed. The cluster list is analogous to the AS_PATH attribute, which records the autonomous systems that an update has traversed. Just as AS_PATH loop detection drops updates whose AS_PATH contains the local AS, cluster-list loop detection drops updates whose cluster list contains the local cluster ID.
When a route-reflector client is connected to two different RRs that are in the same cluster, each RR will reject the path reflected by the other and therefore lose its redundant path to the client.
Figure 15-32 shows two RRs configured in the same cluster. Any update that one RR receives from the other carries the local cluster ID in the cluster list and therefore is dropped.
Figure 15-32 shows how the RRs and RRCs are connected in a single cluster. Each RR must be configured with the same cluster ID, as shown in the “Debugs and Verification” section. R8 advertises 100.100.100.0/24 to its IBGP neighbors R1 and R2, which are the RRs for R6 and R8. R1 reflects the route to R6 and R2, whereas R2 reflects it to R1 and R6. Because both RRs are configured with the same cluster ID, 109, the cluster list in updates from either RR contains cluster ID 109, represented as 0.0.0.109 in Cisco IOS Software output.
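As a point of syntax, Cisco IOS Software accepts the cluster ID either as a 32-bit decimal number or in dotted-decimal notation, which is why 109 appears as 0.0.0.109 in the output. The two forms below, shown purely for illustration, configure the same cluster ID:

R1(config)#router bgp 109
R1(config-router)#bgp cluster-id 109
! Equivalent dotted-decimal form of the same value
R1(config-router)#bgp cluster-id 0.0.0.109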
Figure 15-33 illustrates how the RR loses redundancy to the client.
Debugs and Verification
Example 15-66 shows the configuration of the two RRs when they are configured with identical cluster IDs of 109.
Example 15-66 RRs Configured with Identical Cluster IDs
R1#
router bgp 109
 no synchronization
 bgp cluster-id 109
 neighbor 172.16.18.8 remote-as 109
 neighbor 172.16.18.8 route-reflector-client
 neighbor 172.16.126.2 remote-as 109
 neighbor 172.16.126.6 remote-as 109
 neighbor 172.16.126.6 route-reflector-client
_____________________________________________________________________________________
R2#
router bgp 109
 no synchronization
 bgp cluster-id 109
 neighbor 172.16.28.8 remote-as 109
 neighbor 172.16.28.8 route-reflector-client
 neighbor 172.16.126.1 remote-as 109
 neighbor 172.16.126.6 remote-as 109
 neighbor 172.16.126.6 route-reflector-client
As depicted in Figure 15-33, R8, an RRC, advertises 100.100.100.0/24 to both of its RRs, R1 and R2. When R1 and R2 are configured with the same cluster ID, R1 and R2 have only a single update in their BGP table for 100.100.100.0/24, learned from the RRC itself.
Example 15-67 shows that the RRs have only a single entry in their BGP tables for network 100.100.100.0/24, and this entry is from the RRC.
Example 15-67 RRs R1 and R2 Have Only One Update for 100.100.100.0/24, Resulting in Loss of Redundancy
R1#show ip bgp 100.100.100.0
BGP routing table entry for 100.100.100.0/24, version 2
Paths: (1 available, best #1, table Default-IP-Routing-Table)
Flag: 0x208
  Advertised to non peer-group peers:
  131.108.10.2 131.108.10.6
  Local, (Received from a RR-client)
    131.108.10.8 from 131.108.10.8 (131.108.10.8)
      Origin IGP, metric 0, localpref 100, valid, internal, best
R1#
_____________________________________________________________________________________
R2#show ip bgp 100.100.100.0
BGP routing table entry for 100.100.100.0/24, version 2
Paths: (1 available, best #1, table Default-IP-Routing-Table)
  Advertised to non peer-group peers:
  131.108.10.1 131.108.10.6
  Local, (Received from a RR-client)
    131.108.10.8 from 131.108.10.8 (131.108.10.8)
      Origin IGP, metric 0, localpref 100, valid, internal, best
Each RR has an update for 100.100.100.0/24 only from R8, not from the other RR.
Example 15-68 shows the output of debug ip bgp update from R2. Notice that R2 drops the update for 100.100.100.0/24 from R1 because it sees its own cluster ID, 109 (represented as 0.0.0.109), in the cluster list.
Example 15-68 debug ip bgp update Command Output from R2
R2# debug ip bgp update
*Mar 3 11:29:11: BGP(0): 172.16.10.8 rcvd UPDATE w/ attr: nexthop 172.16.10.8, origin i, localpref 100, metric 0
*Mar 3 11:29:11: BGP(0): 172.16.10.8 rcvd 100.100.100.0/24
*Mar 3 11:29:11: BGP(0): Revise route installing 100.100.100.0/24 -> 172.16.10.8 to main IP table
*Mar 3 11:29:11: BGP: 172.16.126.1 RR in same cluster. Reflected update dropped
*Mar 3 11:29:11: BGP(0): 172.16.126.1 rcv UPDATE w/ attr: nexthop 172.16.10.8, origin i, localpref 100, metric 0, originator 172.16.8.8, clusterlist 0.0.0.109, path, community, extended community
*Mar 3 11:29:11: BGP(0): 172.16.126.1 rcv UPDATE about 100.100.100.0/24 -- DENIED due to: reflected from the same cluster;
Solution
If a link or IBGP connection between R8 and R2 goes down, R2 has no way to reach 100.100.100.0/24. This is because R2 has rejected the 100.100.100.0/24 advertisement from R1 as a result of the cluster list check.
In cases similar to those depicted in Figure 15-33, it is recommended that the RRs not be put in the same cluster. If no cluster ID is configured, each RR uses its own router ID (RID) as the cluster ID, and uniqueness is guaranteed because all RIDs in a network are unique.
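A minimal corrective sketch under this recommendation is simply to remove the manually configured cluster ID from both RRs (commands assume the configuration of Example 15-66) so that each falls back to its router ID; the IBGP sessions then typically must be reset, for example with clear ip bgp *, before readvertised updates carry the new cluster ID:

R1(config)#router bgp 109
R1(config-router)#no bgp cluster-id
!
R2(config)#router bgp 109
R2(config-router)#no bgp cluster-id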
Example 15-69 shows the configuration of all the RRs and RRCs, now in different clusters, including the configuration in R8 to advertise 100.100.100.0/24 to R1 and R2, along with the resulting BGP table output. The output from R1 and R2 shows that each has a redundant path for 100.100.100.0/24: one directly through R8 and the other through each other. If the link or BGP session between R1 and R8 is lost, R1 has a backup path through R2 to reach R8.
Example 15-69 Unique Router ID of Each RR Yields a Unique Cluster ID per RR
R1#
router bgp 109
 no synchronization
 neighbor 131.108.10.8 remote-as 109
 neighbor 131.108.10.8 route-reflector-client
 neighbor 131.108.10.6 remote-as 109
 neighbor 131.108.10.6 route-reflector-client
 neighbor 131.108.10.2 remote-as 109
_____________________________________________________________________________________
R2#
router bgp 109
 no synchronization
 neighbor 131.108.10.8 remote-as 109
 neighbor 131.108.10.8 route-reflector-client
 neighbor 131.108.10.6 remote-as 109
 neighbor 131.108.10.6 route-reflector-client
 neighbor 131.108.10.1 remote-as 109
_____________________________________________________________________________________
R6#
router bgp 109
 no synchronization
 neighbor 131.108.10.1 remote-as 109
 neighbor 131.108.10.2 remote-as 109
_____________________________________________________________________________________
R8#
router bgp 109
 no synchronization
 network 100.100.100.0 mask 255.255.255.0
 neighbor 131.108.10.1 remote-as 109
 neighbor 131.108.10.2 remote-as 109
!
ip route 100.100.100.0 255.255.255.0 Null0
_____________________________________________________________________________________
R8#show ip bgp 100.100.100.0
BGP routing table entry for 100.100.100.0/24, version 6
Paths: (1 available, best #1, table Default-IP-Routing-Table)
Flag: 0x208
  Advertised to non peer-group peers:
  131.108.10.1 131.108.10.2
  Local
    0.0.0.0 from 0.0.0.0 (131.108.10.8)
      Origin IGP, metric 0, localpref 100, weight 32768, valid, sourced, local, best
_____________________________________________________________________________________
R1#show ip bgp 100.100.100.0
BGP routing table entry for 100.100.100.0/24, version 2
Paths: (2 available, best #2, table Default-IP-Routing-Table)
  Advertised to non peer-group peers:
  131.108.10.2 131.108.10.6
  Local
    131.108.10.8 (metric 20) from 131.108.10.2 (131.108.10.8)
      Origin IGP, metric 0, localpref 100, valid, internal
      Originator: 131.108.10.8, Cluster list: 131.108.10.2
  Local, (Received from a RR-client)
    131.108.10.8 from 131.108.10.8 (131.108.10.8)
      Origin IGP, metric 0, localpref 100, valid, internal, best
_____________________________________________________________________________________
R2#show ip bgp 100.100.100.0
BGP routing table entry for 100.100.100.0/24, version 2
Paths: (2 available, best #2, table Default-IP-Routing-Table)
  Advertised to non peer-group peers:
  131.108.10.1 131.108.10.6
  Local
    131.108.10.8 (metric 20) from 131.108.10.1 (131.108.10.8)
      Origin IGP, metric 0, localpref 100, valid, internal
      Originator: 131.108.10.8, Cluster list: 131.108.10.1
  Local, (Received from a RR-client)
    131.108.10.8 from 131.108.10.8 (131.108.10.8)
      Origin IGP, metric 0, localpref 100, valid, internal, best
Both R1 and R2 have a redundant path to reach 100.100.100.0/24 only because each RR now has a unique cluster ID. The example derives the unique cluster ID from the unique router ID; an alternate way to ensure cluster-ID uniqueness is to manually configure a distinct cluster ID on each RR.
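If explicit cluster IDs are preferred for administrative clarity, the manual alternative is to give each RR its own distinct value; the values 1 and 2 below are arbitrary illustrations, not taken from the original lab:

R1(config)#router bgp 109
R1(config-router)#bgp cluster-id 1
!
R2(config)#router bgp 109
R2(config-router)#bgp cluster-id 2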