STP Behavior in the Baseline Network: A Spanning Tree Review
This section analyzes Spanning Tree’s default behavior in a network such as that in Figure 7-2. Not only does this serve as a useful review of the material in Chapter 6, it provides a baseline network that can be used throughout the remainder of this chapter. In doing so, the text makes no attempt to be an exhaustive tutorial—it is only capturing the critical and easily missed aspects of the protocol (see Chapter 6 for a more complete discussion of Spanning Tree basics).
General BPDU Processing
Recall from Chapter 6 that bridges share information using Bridge Protocol Data Units (BPDUs). There are two types of BPDUs:
- Configuration BPDUs— Account for the majority of BPDU traffic and allow bridges to carry out STP elections
- Topology Change Notification (TCN) BPDUs— Assist in STP failover situations
When people use the term BPDU without indicating a specific type, they are almost always referring to a Configuration BPDU.
Determining the Best Configuration BPDU
Every bridge port running the Spanning-Tree Protocol saves a copy of the best Configuration BPDU it has seen. In doing this, the port not only considers every BPDU it receives from other bridges, but it also evaluates the BPDU that it would send out that port.
To determine the best Configuration BPDU, STP uses a four-step decision sequence as follows:
- The bridges look for the lowest Root Bridge Identifier (BID), an eight-byte field composed of a Bridge Priority and a Media Access Control (MAC) address. This allows the entire bridged network to elect a single Root Bridge.
- The bridges consider Root Path Cost, the cumulative cost of the path to the Root Bridge. Every non-Root Bridge uses this to locate a single, least-cost path to the Root Bridge.
- If the cost values are equal, the bridges consider the BID of the sending device.
- Port ID (a unique index value for every port in a bridge or switch) is evaluated if all three of the previous criteria tie.
Tip
A shortened, easy-to-remember outline of the four-step STP decision sequence to determine the best Configuration BPDU is as follows:
- Lowest Root BID
- Lowest Root Path Cost
- Lowest Sending BID
- Lowest Port ID
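The four-step sequence amounts to a field-by-field comparison in which the lowest value wins at the first field that differs. The following is a minimal sketch of that idea, not an implementation taken from any bridge; the class name, field names, and all BID and Port ID values are hypothetical and chosen only for illustration.

```python
from typing import NamedTuple


class ConfigBpdu(NamedTuple):
    """Fields are ordered to match the four-step decision sequence."""
    root_bid: int        # Step 1: lowest Root BID
    root_path_cost: int  # Step 2: lowest Root Path Cost
    sender_bid: int      # Step 3: lowest Sending BID
    port_id: int         # Step 4: lowest Port ID


def best_bpdu(candidates):
    """Return the most attractive BPDU; tuples compare field by field, lowest first."""
    return min(candidates)


# Hypothetical values: both BPDUs agree on the Root Bridge and the cost,
# so the third criterion (Sending BID) decides.
via_b = ConfigBpdu(root_bid=0x8000_00000000000A, root_path_cost=38,
                   sender_bid=0x8000_00000000000B, port_id=0x8001)
via_c = ConfigBpdu(root_bid=0x8000_00000000000A, root_path_cost=38,
                   sender_bid=0x8000_00000000000C, port_id=0x8001)

winner = best_bpdu([via_c, via_b])  # via_b, because its Sending BID is lower
```

Because the fields are declared in decision order, the ordinary tuple comparison reproduces the sequence without any explicit tie-breaking code.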
As long as a port sees its own Configuration BPDU as the most attractive, it continues sending Configuration BPDUs. A port begins this process of sending Configuration BPDUs in what is called the Listening state. Although BPDU processing is occurring during the Listening state, no user traffic is being passed. After waiting for a period of time defined by the Forward Delay parameter (default=15 seconds), the port moves into the Learning state.
At this point, the port starts adding source MAC addresses to the bridging table, but all incoming data frames are still dropped. After another period equal to the Forward Delay, the port finally moves into the Forwarding state and begins passing end-user data traffic. However, if at any point during this process the port hears a more attractive BPDU, it immediately transitions into the Blocking state and stops sending Configuration BPDUs.
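The timing of these transitions can be worked through with a small sketch. It assumes the default Forward Delay of 15 seconds and reduces "hears a more attractive BPDU" to a single flag; the function and constant names are hypothetical.

```python
FORWARD_DELAY = 15  # seconds, the default quoted above

# (state, seconds spent in it before moving on); Forwarding has no exit timer.
SEQUENCE = [("Listening", FORWARD_DELAY),
            ("Learning", FORWARD_DELAY),
            ("Forwarding", None)]


def state_at(elapsed, heard_better_bpdu=False):
    """Return the port state `elapsed` seconds after the port enters Listening."""
    if heard_better_bpdu:
        # A more attractive BPDU sends the port straight to Blocking.
        return "Blocking"
    start = 0
    for state, duration in SEQUENCE:
        if duration is None or elapsed < start + duration:
            return state
        start += duration
```

With the defaults, a port that never hears a better BPDU begins passing user traffic 30 seconds (two Forward Delay periods) after it enters the Listening state.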
Converging on an Active Topology
Configuration BPDUs allow bridges to complete a three-step process to initially converge on an active topology:
- Elect a single Root Bridge for the entire Spanning Tree domain.
- Elect one Root Port for every non-Root Bridge.
- Elect one Designated Port for every segment.
First, the bridges elect a single Root Bridge by looking for the device with the lowest Bridge ID (BID). By default, all bridges use a Bridge Priority of 32,768, causing the lowest MAC address to win this Root War. In the case of Figure 7-2, Cat-A becomes the Root Bridge.
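To see why the lowest MAC address wins when every bridge keeps the default priority, it helps to pack the BID into a single number. The sketch below assumes the usual layout of a 2-byte Bridge Priority followed by a 6-byte MAC address; the MAC addresses are hypothetical and were chosen so that Cat-A wins, as it does in Figure 7-2.

```python
DEFAULT_PRIORITY = 32768


def bridge_id(mac, priority=DEFAULT_PRIORITY):
    """Pack a 2-byte Bridge Priority and a 6-byte MAC address into an 8-byte BID."""
    mac_value = int(mac.replace(":", ""), 16)
    return (priority << 48) | mac_value


def elect_root(bridges):
    """Return the name of the bridge with the lowest BID."""
    return min(bridges, key=bridges.get)


# Hypothetical MAC addresses for the four bridges in the baseline network.
bridges = {
    "Cat-A": bridge_id("00:00:0c:aa:aa:aa"),
    "Cat-B": bridge_id("00:00:0c:bb:bb:bb"),
    "Cat-C": bridge_id("00:00:0c:cc:cc:cc"),
    "Cat-D": bridge_id("00:00:0c:dd:dd:dd"),
}
root = elect_root(bridges)  # "Cat-A": equal priorities, lowest MAC address
```

Because the priority occupies the high-order bytes, lowering the priority on any one bridge outweighs every possible MAC address difference.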
Tip
As with all Spanning Tree parameters, the lowest numeric Bridge ID value represents the highest priority. To avoid the potential confusion of lowest value and highest priority, this text always refers to values (in other words, the lower amount is preferred by STP).
Second, every non-Root Bridge elects a single Root Port, its port that is closest to the Root Bridge. Cat-B has to choose between three ports: Port 1/1 with a Root Path Cost of 57, Port 1/2 with a cost of 38, or Port 2/1 with a cost of 19. Obviously, Port 2/1 is the most attractive and becomes the Root Port. Similarly, Cat-C chooses Port 2/1. However, Cat-D calculates a Root Path Cost of 38 on both ports—a tie. This causes Cat-D to evaluate the third decision criterion—the Sending BID. Because Cat-B has a lower Sending BID than Cat-C, Cat-D:Port-1/1 becomes the Root Port.
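The Root Path Cost figures for Cat-B can be reproduced with a little arithmetic. The sketch assumes that every link in the baseline network carries a port cost of 19 (which is what the multiples of 19 in the text suggest) and that a port's Root Path Cost is the cost advertised in the BPDU it hears plus its own port cost; the names are hypothetical.

```python
LINK_COST = 19  # assumed port cost of every link in the baseline network


def root_path_cost(advertised_cost, port_cost=LINK_COST):
    """Cost to the Root Bridge through a port: advertised cost plus the port's own cost."""
    return advertised_cost + port_cost


# Cost advertised in the BPDU heard on each Cat-B port:
# Cat-D advertises 38, Cat-C advertises 19, and Cat-A (the Root Bridge) advertises 0.
heard_on_cat_b = {"1/1": 38, "1/2": 19, "2/1": 0}

costs = {port: root_path_cost(cost) for port, cost in heard_on_cat_b.items()}
root_port = min(costs, key=costs.get)  # "2/1", the port with the lowest cost
```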
Finally, a Designated Port is elected for every LAN segment (the device containing the Designated Port is referred to as the Designated Bridge). By functioning as the only port that both sends and receives traffic between that segment and the Root Bridge, Designated Ports are the mechanism that actually implements a loop-free topology. It is best to analyze Designated Port elections on a per-segment basis.
In Figure 7-2, there are five segments. Segment 1 is touched by two bridge ports: Cat-A:Port-1/1 at a cost of zero and Cat-B:Port-2/1 at a cost of 19. Because the directly connected Root Bridge has a cost of zero, Cat-A:Port-1/1 obviously becomes the Designated Port. A similar process elects Cat-A:Port-1/2 as the Designated Port for Segment 2. Segment 3 also has two bridge ports: Cat-B:Port-1/1 at a cost of 19 and Cat-D:Port-1/1 at a cost of 38. Because it has the lower cost, Cat-B:Port-1/1 becomes the Designated Port. Using the same logic, Cat-C:Port-1/1 becomes the Designated Port for Segment 4. In the case of Segment 5, there are once again two options (Cat-B:Port-1/2 and Cat-C:Port-1/2); however, both are a cost of 19 away from the Root Bridge. By applying the third decision criterion, both bridges determine that Cat-B:Port-1/2 should become the Designated Port because it has the lower Sending BID.
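The per-segment analysis can be expressed as a short sketch covering the three segments whose costs are spelled out in the text (Segments 1, 3, and 5). The BID values are hypothetical placeholders; only their order (Cat-A lowest, then Cat-B, Cat-C, and Cat-D) matters.

```python
# Hypothetical BIDs; only the ordering A < B < C < D is taken from the text.
BID = {"Cat-A": 0xA, "Cat-B": 0xB, "Cat-C": 0xC, "Cat-D": 0xD}


def elect_designated_port(ports):
    """ports: list of (root_path_cost, bridge_bid, label) tuples on one segment."""
    return min(ports)[2]


segments = {
    1: [(0, BID["Cat-A"], "Cat-A:Port-1/1"), (19, BID["Cat-B"], "Cat-B:Port-2/1")],
    3: [(19, BID["Cat-B"], "Cat-B:Port-1/1"), (38, BID["Cat-D"], "Cat-D:Port-1/1")],
    5: [(19, BID["Cat-B"], "Cat-B:Port-1/2"), (19, BID["Cat-C"], "Cat-C:Port-1/2")],
}

designated = {seg: elect_designated_port(ports) for seg, ports in segments.items()}
```

Segments 1 and 3 are settled by cost alone, whereas Segment 5 ties at 19 and falls through to the BID comparison.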
Figure 7-3 shows the resulting active topology and port states.
Figure 7-3. Active Topology and Port States in the Baseline Network
Two ports remain in the Blocking state: Cat-C:Port-1/2 and Cat-D:Port-1/2. These ports are often referred to as non-Designated Ports. With these two ports blocked, the remaining Root and Designated Ports provide a single loop-free path from every segment to every other segment. The Designated Ports are used to send traffic away from the Root Bridge, whereas Root Ports are used to send traffic toward the Root Bridge.
Note
From a technical perspective, it is possible to debate the correct ordering of Steps 2 and 3 in what I have called the 3-Step Spanning Tree Convergence Process. Because 802.1D (the Spanning Tree standards document) specifically excludes Designated Ports from the Root Port election process, the implication is that Designated Ports must be determined first. However, 802.1D also lists the Root Port selection process before the Designated Port selection process in its detailed pseudo-code listing of the complete STP algorithm. This text avoids such nerdy debates. The fact of the matter is that both occur constantly and at approximately the same time. Therefore, from the perspective of learning how the protocol operates, the order is irrelevant.
In addition to determining the path and direction of data forwarding, Root and Designated Ports also play a key role in the sending of BPDUs. In short, Designated Ports send Configuration BPDUs, whereas Root Ports send TCN BPDUs. The following sections explore the two types of BPDUs in detail.
Configuration BPDU Processing
Configuration BPDUs are sent in three cases. The discussion that follows breaks these into two categories, normal processing and exception processing. (This terminology is non-standard, but useful for understanding how STP works.)
Normal Configuration BPDU Processing
Normal processing occurs every Hello Time seconds on all ports of the Root Bridge (unless there is a physical layer loop). This results in the origination of Configuration BPDUs at the Root Bridge.
Normal processing also occurs when a non-Root Bridge receives a Configuration BPDU on its Root Port and sends an updated version of this BPDU out every Designated Port. This results in the propagation of Configuration BPDUs away from the Root Bridge and throughout the entire Layer 2 network.
These two conditions account for the normal flow of Configuration BPDUs that constantly stream away from the Root Bridge during steady state processing. The Root Bridge originates Configuration BPDUs on its Designated Ports every two seconds (the default value of Hello Time). Note that every active port on the Root Bridge should be a Designated Port unless there is a physical layer loop to multiple ports on this bridge. As these Configuration BPDUs arrive at the Root Ports of downstream bridges, these bridges then propagate Configuration BPDUs on their Designated Ports. Figure 7-4 illustrates how this process propagates Configuration BPDUs away from the Root Bridge.
Figure 7-4. Normal Sending of Configuration BPDUs
Figure 7-4 shows Cat-A, the Root Bridge, originating Configuration BPDUs every two seconds. As these arrive on Cat-B:Port-1/1 (the Root Port for Cat-B), Configuration BPDUs are sent out Cat-B’s Designated Ports, in this case Port 1/2.
Several observations can be made about the normal processing of Configuration BPDUs:
- Configuration BPDUs flow away from the Root Bridge.
- Root Ports receive Configuration BPDUs.
- Root Ports do not send Configuration BPDUs.
- Blocking ports do not send Configuration BPDUs.
- If the Root Bridge fails, Configuration BPDUs stop flowing throughout the network. This absence of Configuration BPDUs continues until the Max Age timer expires on another bridge and that bridge starts taking over as the new Root Bridge.
- If the path to the Root Bridge fails (but the Root Bridge is still active), Configuration BPDUs stop flowing downstream of the failure. If an alternate path to the Root Bridge is available, this absence of Configuration BPDUs continues until another path is taken out of the Blocking state. If an alternate path to the Root Bridge is not available, the bridged network has been partitioned and a new Root Bridge is elected for the isolated segment of the network.
Therefore, under the normal behavior, a non-Root Bridge only sends Configuration BPDUs when a Root Bridge-originated BPDU arrives on its Root Port.
Exception Configuration BPDU Processing
Exception Configuration BPDU processing, in contrast to normal processing, occurs when a Designated Port hears an inferior BPDU from some other device and sends a Configuration BPDU in response. The Spanning Tree algorithm includes this exception processing to squelch less attractive information as quickly as possible and speed convergence. For example, consider Figure 7-5 where the Root Bridge failed (Step 1 in the figure) just before Cat-C was connected to the Ethernet hub (Step 2 in the figure).
Figure 7-5. The Root Bridge Failed Just Before Cat-C Was Connected
Figure 7-6 illustrates the conversation that ensues between Cat-C and Cat-B.
Figure 7-6. Exception Processing of Configuration BPDUs
As discussed in Chapter 6, Cat-C initially assumes it is the Root Bridge and immediately starts sending BPDUs to announce itself as such. Because the Root Bridge is currently down, Cat-B:Port-1/2 has stopped sending Configuration BPDUs as a part of the normal processing. However, because Cat-B:Port-1/2 is the Designated Port for this segment, it immediately responds with a Configuration BPDU announcing Cat-A as the Root Bridge. By doing so, Cat-B prevents Cat-C from accidentally trying to become the Root Bridge or creating loops in the active topology.
The sequence illustrated in Figure 7-6 raises the following points about Configuration BPDU exception processing:
- Designated Ports can respond to inferior Configuration BPDUs at any time.
- As long as Cat-B saves a copy of Cat-A’s information, Cat-B continues to refute any inferior Configuration BPDUs.
- Cat-A’s information ages out on Cat-B in Max Age seconds (default=20 seconds). In the case of Figure 7-5, Cat-B begins announcing itself as the Root Bridge at that time.
- By immediately refuting less attractive information, the network converges more quickly. Consider what might happen if Cat-B only used the normal conditions to send a Configuration BPDU—Cat-C would have 20 seconds to incorrectly assume that it was functioning as the Root Bridge and might inadvertently open up a bridging loop. Even if this did not result in the formation of a bridging loop, it could lead to unnecessary Root and Designated Port elections that could interrupt traffic and destabilize the network.
- Because Cat-D:Port-1/1 is not the Designated Port for this segment, it does not send a Configuration BPDU to refute Cat-C.
Tip
Configuration BPDUs are sent in three cases:
- When the Hello Timer expires (every two seconds by default), the Root Bridge originates a Configuration BPDU on every port (assuming no self-looped ports). This is a part of the normal Configuration BPDU processing.
- When non-Root Bridges receive a Configuration BPDU on their Root Port, they send (propagate) updated Configuration BPDUs on all of their Designated Ports (normal processing).
- When a Designated Port hears an inferior Configuration BPDU from another switch, it sends a Configuration BPDU of its own to suppress the less attractive information.
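The three cases can be summarized as a decision function. This is a sketch only: the event names and the function signature are hypothetical, and details such as the Hold Time limit are ignored.

```python
from enum import Enum, auto


class Event(Enum):
    HELLO_TIMER_EXPIRED = auto()
    BPDU_ON_ROOT_PORT = auto()
    INFERIOR_BPDU_ON_DESIGNATED_PORT = auto()


def ports_to_send_on(event, is_root_bridge, designated_ports, receiving_port=None):
    """Return the ports on which a bridge sends a Configuration BPDU for one event."""
    if event is Event.HELLO_TIMER_EXPIRED and is_root_bridge:
        return list(designated_ports)   # Case 1: origination at the Root Bridge
    if event is Event.BPDU_ON_ROOT_PORT and not is_root_bridge:
        return list(designated_ports)   # Case 2: propagation away from the Root Bridge
    if (event is Event.INFERIOR_BPDU_ON_DESIGNATED_PORT
            and receiving_port in designated_ports):
        return [receiving_port]         # Case 3: exception processing
    return []
```

Note that the third case answers only on the port that heard the inferior BPDU, whereas the first two cases send on every Designated Port.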
TCN BPDU Processing
Whereas Configuration BPDUs are the general workhorse of the STP algorithm, TCN BPDUs perform a very specific role by assisting in network recovery after changes in the active topology. When a non-Root Bridge detects a change in the active topology, a TCN BPDU is propagated upstream through the network until the Root Bridge is reached. The Root Bridge then tells every bridge in the network to shorten its bridge table aging period from 300 seconds to the interval specified by Forward Delay. In other words, TCN BPDUs are used to tell the Root Bridge that the topology has changed so that Configuration BPDUs can be used to tell every other bridge of the event.
TCN BPDUs are sent in three cases. It is useful to group these into two categories, change detection and propagation:
- Change detection— Occurs in the event that a bridge port is put into the Forwarding state and the bridge has at least one Designated Port. Change detection also occurs when a port in the Forwarding or Learning states transitions to the Blocking state.
- Propagation— Occurs in the event that a non-Root Bridge receives a TCN (from a downstream bridge) on a Designated Port.
The first two conditions categorized under change detection constitute a change in the active topology that needs to be reflected in bridging tables throughout the network. The last condition is used to propagate TCN BPDUs up through the branches of the Spanning Tree until they reach the Root Bridge.
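The upstream propagation can be traced with a short sketch. The map below is an assumption derived from the Root Port elections described for the baseline network (Cat-D reaches the Root Bridge through Cat-B, whereas Cat-B and Cat-C connect to Cat-A directly); the function name is hypothetical.

```python
# Bridge reached through each non-Root Bridge's Root Port in the baseline network.
UPSTREAM = {"Cat-B": "Cat-A", "Cat-C": "Cat-A", "Cat-D": "Cat-B"}


def tcn_path(origin, upstream=UPSTREAM):
    """List the bridges a TCN BPDU visits on its way from `origin` to the Root Bridge."""
    path = [origin]
    while path[-1] in upstream:   # the Root Bridge has no upstream entry
        path.append(upstream[path[-1]])
    return path
```

For example, a change detected on Cat-D yields the path Cat-D, Cat-B, Cat-A: Cat-D sends the TCN out its Root Port, Cat-B receives it on a Designated Port and forwards it out its own Root Port, and Cat-A receives it as the Root Bridge.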
Tip
TCN BPDUs are sent in three cases:
- When a port is put in the Forwarding state and the bridge has at least one Designated Port (this is a part of change detection).
- When a port is transitioned from the Forwarding or Learning states back to the Blocking state (change detection).
- When a TCN BPDU is received on a Designated Port, it is forwarded out the bridge’s Root Port (propagation).
Several observations can be made about TCN BPDUs:
- TCN BPDUs are only sent out Root Ports.
- TCN BPDUs are the only BPDUs sent out Root Ports (Configuration BPDUs are only sent out Designated Ports, not Root Ports).
- TCN BPDUs are received by Designated Ports.
- TCN BPDUs flow upstream toward the Root Bridge.
- TCN BPDUs use a reliable mechanism to reach the Root Bridge. When a bridge sends a TCN BPDU, it continues repeating the BPDU every Hello Time seconds until the upstream bridge acknowledges receipt with a Topology Change Acknowledgement flag in a Configuration BPDU.
- TCN BPDUs are not periodic in the same sense as Configuration BPDUs. Other than the retransmission of already generated TCN BPDUs discussed in the previous bullet and used as a reliability mechanism, completely new TCN BPDUs are not sent until the next topology change occurs (this could be hours, days, or weeks later).
- TCN BPDUs are acknowledged even if the normal Configuration BPDU processing discussed earlier has stopped (because the Root Bridge’s Configuration BPDUs are no longer arriving).
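The retransmission behavior can be illustrated with a simple timeline. The sketch assumes the default Hello Time of two seconds and treats the arrival time of the acknowledgement as a given input; the function name is hypothetical.

```python
HELLO_TIME = 2  # seconds, the default


def tcn_transmissions(ack_arrives_at, hello_time=HELLO_TIME):
    """Return the times (in seconds) at which a TCN BPDU is sent out the Root Port.

    The first TCN goes out at time zero and is repeated every Hello Time
    until the acknowledgement from the upstream bridge has arrived.
    """
    times = [0]
    next_send = hello_time
    while next_send < ack_arrives_at:
        times.append(next_send)
        next_send += hello_time
    return times
```

If the acknowledgement arrives within the first Hello Time, only a single TCN BPDU is ever sent; a slower acknowledgement simply adds one retransmission per elapsed Hello Time.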
Tip
The TCN process is discussed in considerably more detail in Chapter 6 (see the section “Topology Change Notification BPDUs”).
STP Timers
The Spanning-Tree Protocol provides three user-configurable timers: Hello Time, Forward Delay, and Max Age. To avoid situations where each bridge is using a different set of timer values, all bridges adopt the values specified by the Root Bridge. The current Root Bridge places its timer values in the last three fields of every Configuration BPDU it sends. Other bridges do not alter these values as the BPDUs propagate throughout the network. Therefore, timer values can only be adjusted on Root Bridges.
Tip
Avoid the frustration of trying to modify timer values from non-Root Bridges—they can only be changed from the Root Bridge. However, do not forget to modify the timer values on any backup Root Bridges so that you can keep a consistent set of timers even after a primary Root Bridge failure.
The Hello Time timer is used to control the sending of BPDUs. Its main duty is to control how often the Root Bridge originates Configuration BPDUs; however, it also controls how often TCN BPDUs are sent. By repeating TCN BPDUs every Hello Time seconds until a Topology Change Acknowledgement (TCA) flag is received from the upstream bridge, TCN BPDUs are propagated using a reliable mechanism. 802.1D, the Spanning Tree standard document, specifies an allowable range of 1–10 seconds for Hello Time. The syntax for changing the Hello Time is as follows (see the section “Lowering Hello Time to One Second” for more information):
set spantree hello interval [vlan]
The Forward Delay timer primarily controls the length of time a port spends in the Listening and Learning states. It is also used in other situations related to the topology change process. First, when the Root Bridge receives a TCN BPDU, it sets the Topology Change (TC) flag for Forward Delay + Max Age seconds. Second, this action causes all bridges in the network to shorten their bridge table aging periods from 300 seconds to Forward Delay seconds. Valid Forward Delay values range from 4 to 30 seconds with the default being 15 seconds. To change the Forward Delay, use the following command (see the section “Tuning Forward Delay” for more information):
set spantree fwddelay delay [vlan]
The Max Age timer controls the maximum length of time that a bridge port saves Configuration BPDU information. This allows the network to revert to a less attractive topology when the more attractive topology fails. As discussed in the previous paragraph, Max Age also plays a role in controlling how long the TC flag remains set after the Root Bridge receives a TCN BPDU. Valid Max Age values are 6 to 40 seconds with the default being 20 seconds. The command for changing the Max Age time is as follows (see the section “Tuning Max Age” for more information):
set spantree maxage agingtime [vlan]
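The ranges and defaults quoted for the three timers, along with the two values derived from them during a topology change, can be gathered into one small sketch. The class and property names are hypothetical, and the sketch checks only the individual ranges given in the text, not any relationships between the timers.

```python
from dataclasses import dataclass

# Allowable ranges in seconds, as quoted in the text.
RANGES = {"hello_time": (1, 10), "forward_delay": (4, 30), "max_age": (6, 40)}


@dataclass(frozen=True)
class StpTimers:
    hello_time: int = 2
    forward_delay: int = 15
    max_age: int = 20

    def __post_init__(self):
        # Reject any value outside the range allowed for that timer.
        for name, (low, high) in RANGES.items():
            value = getattr(self, name)
            if not low <= value <= high:
                raise ValueError(f"{name}={value} is outside {low}-{high} seconds")

    @property
    def tc_flag_seconds(self):
        """How long the Root Bridge keeps the TC flag set after receiving a TCN BPDU."""
        return self.forward_delay + self.max_age

    @property
    def shortened_aging_seconds(self):
        """Bridge table aging period used while the TC flag is set."""
        return self.forward_delay
```

With the defaults, the TC flag stays set for 35 seconds, during which bridge table entries age out after 15 seconds rather than 300.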
Configuration BPDUs also pass a fourth time-related value, the Message Age field (don’t confuse this with Max Age). The Message Age field is not a periodic timer value—it contains the length of time since a BPDU’s information was first originated at the Root Bridge. When Configuration BPDUs are originated by the Root Bridge, the Message Age field contains the value zero. As other bridges propagate these BPDUs through the network, the Message Age field is incremented by one at every bridge hop.
Although 802.1D allows for more precise timer control, in practice, bridges simply add one to the existing value, resulting in something akin to a reverse TTL. If connectivity to the Root Bridge fails and all normal Configuration BPDU processing stops, this field can be used to track the age of any information that is sent during this outage as a part of Configuration BPDU exception processing discussed earlier.
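The reverse-TTL behavior of Message Age can be sketched as follows. The first function follows directly from the add-one-per-hop description above. The second rests on an assumption worth stating plainly: that a bridge ages received information starting from the Message Age it arrived with and discards it when Max Age is reached. The function names are hypothetical.

```python
def message_age_after(hops):
    """Message Age seen after `hops` bridge hops from the Root Bridge (starts at zero)."""
    return hops


def seconds_until_aged_out(hops, max_age=20):
    """Assumed remaining lifetime of the information: Max Age minus Message Age."""
    return max(max_age - message_age_after(hops), 0)
```

Under that assumption and the default Max Age, information received three hops from the Root Bridge is held for 17 seconds after the last BPDU arrives rather than the full 20.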
The Spanning-Tree Protocol also uses a separate Hold Time value to prevent excessive BPDU traffic. The Hold Time determines the minimum time between the sending of any two back-to-back Configuration BPDUs on a given port. It prevents a cascade effect where BPDUs spawn more and more other BPDUs. This parameter is fixed at a non-configurable value of one second.
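The effect of Hold Time is that of a per-port rate limiter. The following sketch reports only whether a Configuration BPDU may be sent at a given moment; what a real bridge does with a BPDU that must wait is outside its scope, and the class and method names are hypothetical.

```python
HOLD_TIME = 1.0  # seconds, fixed and non-configurable


class PortTransmitter:
    """Tracks when a port last sent a Configuration BPDU."""

    def __init__(self):
        self.last_sent = None

    def try_send(self, now):
        """Return True and record the send if Hold Time has passed since the last one."""
        if self.last_sent is not None and now - self.last_sent < HOLD_TIME:
            return False
        self.last_sent = now
        return True
```

A port that sends at time 0.0 is therefore refused at 0.4 seconds and allowed again at 1.0 second, no matter how many inferior BPDUs arrive in between.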
How Far Does Spanning Tree Reach?
It is important to note the impact of placing routers (or Layer 3 switches) in a campus network. Figure 7-7 illustrates an example.
Figure 7-7. A Network Consisting of Two Bridge Domains
Each half in this network has a completely independent Spanning Tree. For example, each half elects a single Root Bridge. When a topology change occurs on the left side of the network, it has no effect on the right side of the network (at least from a Spanning Tree perspective). As Chapter 14, “Campus Design Models,” discusses, you should strive to use Layer 3 routers and Layer 2 switches together to maximize the benefit that this sort of separation can have on the stability and scalability of your network.
On the other hand, routers do not always break a network into separate Spanning Trees. For example, there could be a backdoor connection that allows the bridged traffic to bypass the router as illustrated in Figure 7-8.
Figure 7-8. Although This Network Contains a Router, It Represents a Single Bridge Domain
In this case, there is a single, contiguous, Layer 2 domain that is used to bypass the router. The network only elects a single Root Bridge, and topology changes can create network-wide disturbances.
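The difference between the two figures comes down to which switches remain connected at Layer 2. The sketch below groups switches into bridge domains by following Layer 2 links only; the switch names and links are hypothetical and are not taken from Figures 7-7 and 7-8.

```python
def bridge_domains(switches, l2_links):
    """Group switches into bridge domains: connected components over Layer 2 links."""
    neighbors = {switch: set() for switch in switches}
    for a, b in l2_links:
        neighbors[a].add(b)
        neighbors[b].add(a)

    domains, seen = [], set()
    for start in switches:
        if start in seen:
            continue
        stack, domain = [start], set()
        while stack:                       # depth-first walk of one domain
            node = stack.pop()
            if node in domain:
                continue
            domain.add(node)
            stack.extend(neighbors[node] - domain)
        seen |= domain
        domains.append(domain)
    return domains


switches = ["SW-1", "SW-2", "SW-3", "SW-4"]
# A router sits between SW-2 and SW-3, so no Layer 2 link joins them.
routed = [("SW-1", "SW-2"), ("SW-3", "SW-4")]
# A backdoor Layer 2 link bypasses the router.
with_backdoor = routed + [("SW-2", "SW-3")]
```

Here `bridge_domains(switches, routed)` yields two domains, each of which elects its own Root Bridge, whereas `bridge_domains(switches, with_backdoor)` yields one.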
Tip
By default, Route Switch Modules (RSMs) (and other router-on-a-stick implementations) do not partition the network as shown in Figure 7-7. This can significantly limit the scalability and stability of the network. See Chapter 11, “Layer 3 Switching,” Chapter 14, “Campus Design Models,” and Chapter 15, “Campus Design Implementation,” for more information.
As Chapter 6 explained, Cisco uses a separate instance of Spanning Tree for each VLAN (per-VLAN Spanning Tree—PVST). This provides two main benefits: control and isolation.
Spanning Tree control is critical to network design. It allows each VLAN to have a completely independent STP configuration. For example, each VLAN can have the Root Bridge located in a different place. Cost and Priority values can be tuned on a per-VLAN basis. Per-VLAN control allows the network designer total flexibility when it comes to optimizing data flows within each VLAN. It also makes possible Spanning Tree load balancing, the subject of the next section.
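As a sketch of per-VLAN control, the following example places the Root Bridge on a different switch in each of two VLANs by lowering one Bridge Priority per VLAN. The VLAN numbers, the priority of 100, and the MAC values are all hypothetical.

```python
def root_per_vlan(bids_by_vlan):
    """Map each VLAN to the bridge holding the lowest (priority, MAC) pair in that VLAN."""
    return {vlan: min(bids, key=bids.get) for vlan, bids in bids_by_vlan.items()}


MAC = {"Cat-A": 0xAA, "Cat-B": 0xBB}  # hypothetical MAC address values

bids_by_vlan = {
    2: {"Cat-A": (100, MAC["Cat-A"]), "Cat-B": (32768, MAC["Cat-B"])},
    3: {"Cat-A": (32768, MAC["Cat-A"]), "Cat-B": (100, MAC["Cat-B"])},
}

roots = root_per_vlan(bids_by_vlan)  # VLAN 2 -> "Cat-A", VLAN 3 -> "Cat-B"
```

Each VLAN runs its own election, so the same pair of switches can end up with opposite roles in different VLANs.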
Spanning Tree isolation is critical to the troubleshooting and day-to-day management of your network. It prevents Spanning Tree topology changes in one VLAN from disturbing other VLANs. If a single VLAN loses its Root Bridge, connectivity should not be interrupted in other VLANs.
Note
Although Spanning Tree processing is isolated between VLANs, don’t forget that a loop in a single VLAN can saturate your trunk links and starve out resources in other VLANs. This can quickly lead to a death spiral that brings down the entire network. See Chapter 15 for more detail and specific recommendations for solving this nasty problem.
However, there are several technologies that negate the control and isolation benefits of PVST. First, standards-based 802.1Q specifies a single instance of Spanning Tree for all VLANs. Therefore, if you are using vanilla 802.1Q devices, you lose all of the advantages offered by PVST. To address this limitation, Cisco developed a feature called PVST+ that is discussed later in this chapter. Second, bridging between VLANs defeats the advantages of PVST by merging the multiple instances of Spanning Tree into a single tree.
Note
Although the initial version of 802.1Q only specified a single instance of the Spanning-Tree Protocol, future enhancements will likely add support for multiple instances. At press time, this issue was being explored in the IEEE 802.1s committee.
To avoid the confusion introduced by these issues and exceptions, it is often useful to employ the term Spanning Tree domain. Each Spanning Tree domain contains its own set of STP calculations, parameters, and BPDUs. Each Spanning Tree domain elects a single Root Bridge and converges on one active topology. Topology changes in one domain do not affect other domains (other than the obvious case of shared equipment failures). The term Spanning Tree domain provides a consistent nomenclature that can be used to describe STP’s behavior regardless of whether the network is using any particular technology.