Catalyst Troubleshooting Tools
Cisco built several mechanisms into the Catalyst to facilitate troubleshooting and diagnostics. Some stand alone, and others work in conjunction with external troubleshooting tools that you need to provide. These built-in tools help you troubleshoot Layer 1 and Layer 2. You usually need to do Layer 3 troubleshooting in your routers and workstations. Just make sure that a Layer 1 or Layer 2 problem isn’t preventing Layer 3 from working, as demonstrated in the previous section. The following sections examine key troubleshooting tools for evaluating the health of your switched network, including the following:
- Various show commands not discussed in previous chapters offer more insight into potential problem areas in your network.
- The Catalyst has extended SNMP MIB definitions and RMON (remote monitoring) capabilities to accumulate statistics on network performance and behavioral anomalies.
- A logging feature allows you to automatically record significant Catalyst events on a server. You can review the history of your Catalyst in the text file generated by the logging feature.
- A Switched Port Analyzer (SPAN) port gives the Catalyst an inherent capability to examine traffic. The SPAN port allows you to connect an external analyzer to the Catalyst and capture traffic from a port or VLAN within the Catalyst. You can look at both access and trunk links.
show Commands
Throughout this book, each chapter has presented show commands relevant to the chapter subject material. However, several additional show commands exist in the Catalyst to further enable you to diagnose your switched network environment.
show test Command
For example, you can obtain detailed information about the health of your Catalyst hardware through the show test command. It displays the results of the built-in hardware tests in the Catalyst. On power up, the Catalyst tests the power supply and components on the Supervisor module, including the bridging table memory and related chipsets. Example 16-2 shows an abbreviated output from the show test command.
Example 16-2 Output from show test
Console> show test

Environmental Status (. = Pass, F = Fail, U = Unknown)
  PS (3.3V): .    PS (12V): .    PS (24V): .    PS1: .    PS2: .
  Temperature: .  Fan: .

Module 1 : 2-port 10/100BaseTX Supervisor
Network Management Processor (NMP) Status: (. = Pass, F = Fail, U = Unknown)
  ROM: .    Flash-EEPROM: .    Ser-EEPROM: .    NVRAM: .    MCP Comm: .

  EARL Status :
    NewLearnTest: .
    IndexLearnTest: .
    DontForwardTest: .
    MonitorTest .
    DontLearn: .
    FlushPacket: .
    ConditionalLearn: .
    EarlLearnDiscard: .
    EarlTrapTest: .

LCP Diag Status for Module 1 (. = Pass, F = Fail, N = N/A)
  CPU : .    Sprom : .    Bootcsum : .    Archsum : .    RAM : .    LTL : .
  CBL : .    DPRAM : .    SAMBA : .    Saints : .    Pkt Bufs : .    Repeater : N
  FLASH : .    Phoenix : .    TrafficMeter: .    UplinkSprom : .    PhoenixSprom: .
The first highlighted portion, the Environmental Status block, shows the results of the power supply tests. Because no F appears next to the supply entries, they passed the test. Other environmental test results are shown in the same block. The second highlighted portion, the EARL Status block, tests the Enhanced Address Recognition Logic (EARL) functionality. The EARL manages the bridge tables. Again, only dots (“.”) appear next to each test, so every test passed.
show port counters Command
Another command displays a great deal of information about Layer 2 media and access operations. Use the show port counters command to gain insight into the operation of a segment, as shown in Example 16-3.
Example 16-3 Output from show port counters
Console> show port counters

Port  Align-Err  FCS-Err    Xmit-Err   Rcv-Err    UnderSize
----- ---------- ---------- ---------- ---------- ---------
1/1            0          0          0          0         0
1/2            0          0          0          0         0
4/1            0          0          0          0         0
4/2            0          0          0          0         0
4/3            0          0          0          0         0
4/4            0          0          0          0         0

Port  Single-Col Multi-Coll Late-Coll  Excess-Col Carri-Sen Runts     Giants
----- ---------- ---------- ---------- ---------- --------- --------- ---------
1/1           12          0          0          0         0         0         -
1/2            0          0          0          0         0         0         0
4/1            0          0          0          0         0         0         0
4/2            0          0          0          0         0         0         0
4/3            0          0          0          0         0         0         0
4/4            0          0          0          0         0         0         0

                                        Ler
Port  CE-State Conn-State Type Neig Con Est Alm Cut Lem-Ct     Lem-Rej-Ct Tl-Min
----- -------- ---------- ---- ---- --------------- ---------- ---------- ------
3/1   isolated connecting A    U    no    9   9   7          0          0    102
3/2   isolated connecting B    U    no    9   8   7          0          0     40
Several of these fields merit discussion because values in some columns suggest areas to investigate. Nonzero values in the Align-Err and FCS-Err fields indicate that the media cable has deteriorated or that a station NIC no longer operates correctly. These counters increment whenever a received frame contains errors, which the receiver detects with the CRC field of the frame. Align-Err further indicates that the frame did not contain a whole number of octets. This can point strongly to a NIC failure.
The Xmit-Err and Rcv-Err fields indicate that the port buffers overflowed, causing the Catalyst to discard frames. This happens when congestion prevents the Catalyst from forwarding frames onto the switching BUS or out the interface onto the media. To help resolve the first case, where the port cannot transfer the frame over the BUS (Rcv-Err), increase the port priority to high with the set port priority command. When the priority is set to high, the BUS arbiter grants the port access to the BUS five times more frequently than normal, which empties the buffer faster.
Tip
Do not set all ports to high priority, because doing so eliminates any advantage. Use this setting on the ports that connect your high-volume servers.
If the Catalyst drops frames because it cannot place frames onto the media, this can indicate a congestion situation where there is not enough bandwidth on the media to support the amount of traffic trying to transmit through it. Figure 16-3 illustrates a switched network where multiple sources need to communicate with the same device.
Figure 16-3. A LAN Congestion Situation
In Figure 16-3, the aggregate traffic from the sources exceeds the bandwidth available. The upper devices connect at 100 Mbps, but attempt to access a device running at 10 Mbps. If all of the stations transmit at the same time, they quickly overwhelm the 10 Mbps link. This forces the Catalyst to internally buffer the frames until bandwidth becomes available. Like any LAN device, however, the Catalyst does not hold onto the frame indefinitely. If it cannot transmit the frame in a fairly short period of time, the frame is discarded.
This can happen if the Catalyst repeatedly experiences collisions when it attempts to transmit the frame. Like other LAN devices, the Catalyst attempts to transmit the frame up to 16 times. After 16 collisions, the Catalyst drops the frame. A Catalyst can also discard a frame if there is no more buffer space available. To fix this, you might need to increase the port bandwidth, or create multiple collision or broadcast domains on the egress side of the system.
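The congestion in Figure 16-3 can be put in rough numbers. The following Python sketch estimates how long a port buffer lasts when senders outpace the egress link. The function name, the three-sender count, and the 145,000-byte buffer are illustrative assumptions, not Catalyst specifications.

```python
def seconds_until_buffer_full(ingress_mbps, egress_mbps, buffer_bytes):
    """Estimate how long an egress buffer lasts under sustained overload.

    ingress_mbps: list of sender rates in Mbps, all aimed at one egress port.
    egress_mbps:  rate of the egress link in Mbps.
    buffer_bytes: hypothetical buffer size; real values depend on the hardware.
    Returns None when the egress link can carry the offered load.
    """
    excess_mbps = sum(ingress_mbps) - egress_mbps
    if excess_mbps <= 0:
        return None  # no sustained backlog, so the buffer never fills
    return (buffer_bytes * 8) / (excess_mbps * 1_000_000)


# Three 100 Mbps senders converge on one 10 Mbps port (assumed numbers).
fill_time = seconds_until_buffer_full([100, 100, 100], 10, 145_000)
print(f"buffer full after about {fill_time * 1000:.1f} ms")
```

Under these assumed figures the buffer fills within a few milliseconds, which is why sustained oversubscription leads to discarded frames no matter how the buffer is sized.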
Three fields indicate bad frame sizes: UnderSize, Runts, and Giants. The first two count frames smaller than the legal minimum frame size for the media, whereas Giants counts frames larger than the media specification allows. UnderSize and Giants usually mean that the frame format and CRC are valid, but the frame size falls outside of the media parameters. For example, a malfunctioning Ethernet station might create an Ethernet frame less than 64 bytes in length. Although the MAC header and CRC values are valid, the frame does not meet the Ethernet frame size requirements. The Catalyst discards any such frame. Runt frames differ from UnderSize frames in that they are usually a byproduct of a collision on shared media. Runts, unlike UnderSize frames, do not carry valid CRC values.
If the Runts counter increments continuously over time, you either have too many devices contending for bandwidth in the collision domain or a media problem that generates collisions, with runts as the byproduct. If the problem stems from bandwidth contention, break the segment into smaller collision domains. If the problem is from media (such as a 10BASE2 termination problem), fix it!
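The distinction among the three size counters can be summarized in a short Python sketch. The function name is hypothetical, and the 1518-byte maximum is an assumption for untagged Ethernet; the text itself cites only the 64-byte minimum.

```python
ETHERNET_MIN_FRAME = 64    # bytes, the minimum cited in the text
ETHERNET_MAX_FRAME = 1518  # bytes, assumed maximum for untagged Ethernet


def classify_frame(length, crc_ok):
    """Return the counter a received frame would fall under, or "ok"."""
    if length < ETHERNET_MIN_FRAME:
        # Short frame: a valid CRC suggests a misbehaving station,
        # a bad CRC suggests a collision fragment.
        return "UnderSize" if crc_ok else "Runts"
    if length > ETHERNET_MAX_FRAME:
        # The text says giants usually carry a valid CRC;
        # this sketch does not check it.
        return "Giants"
    return "ok" if crc_ok else "FCS-Err"


print(classify_frame(40, crc_ok=True))   # short frame with a valid CRC
print(classify_frame(40, crc_ok=False))  # short frame with a bad CRC
```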
Four fields describe collision combinations: Single-Col, Multi-Coll, Excess-Col, and Late-Coll. Single-Col counts how many times the Catalyst experienced one and only one collision when transmitting a frame, then transmitted it successfully on the next attempt. Multi-Coll counts how many times the Catalyst experienced from 2 through 15 collisions on a frame before eventually transmitting it successfully. Excess-Col increments whenever the Catalyst tries 16 times to transmit a frame without success, at which point the Catalyst discards the frame. Late-Coll stands for late collision.
A late collision occurs when the Catalyst detects a collision outside of the collision time domain described in Chapter 1, “Desktop Technologies.” This means that the collision domain is too large: too much cable or too many repeaters extend the end-to-end distance beyond the media timeslot specifications. Shorten the collision domain with bridges or by removing the offending equipment.
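As a minimal sketch of how the collision count for one frame maps onto these counters, consider the following Python function. The function name is hypothetical; the thresholds come from the description above.

```python
MAX_ATTEMPTS = 16  # after 16 collisions the frame is discarded


def collision_counter(collisions):
    """Return the counter that increments for one frame, or None.

    collisions: number of collisions experienced while trying to send
    a single frame (0 through 16).
    """
    if not 0 <= collisions <= MAX_ATTEMPTS:
        raise ValueError("collisions must be between 0 and 16")
    if collisions == 0:
        return None              # sent cleanly, no counter increments
    if collisions == 1:
        return "Single-Col"      # one collision, then success
    if collisions < MAX_ATTEMPTS:
        return "Multi-Coll"      # 2 through 15 collisions, then success
    return "Excess-Col"          # 16 collisions, frame discarded


for n in (0, 1, 7, 16):
    print(n, collision_counter(n))
```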
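Because a counter matters most when it keeps climbing, one approach is to compare two samples of show port counters taken some time apart. The Python sketch below assumes you have already copied the values for one port into dictionaries. The function name, the dictionary layout, and the hint wording are illustrative, with the hints paraphrasing the guidance in this section.

```python
# Counter name -> what an increasing value suggests, per this section.
HINTS = {
    "Align-Err": "check cabling and the station NIC",
    "FCS-Err": "check cabling and the station NIC",
    "Rcv-Err": "port buffer overflow toward the bus; consider high port priority",
    "Xmit-Err": "egress congestion; consider more bandwidth on the port",
    "UnderSize": "a station is sending short frames with valid CRCs",
    "Runts": "collisions; shrink the collision domain or check the media",
    "Excess-Col": "frames discarded after 16 collisions",
    "Late-Coll": "collision domain too large",
}


def diagnose(before, after):
    """Compare two samples for one port and list the counters that grew."""
    findings = []
    for name, hint in HINTS.items():
        delta = after.get(name, 0) - before.get(name, 0)
        if delta > 0:
            findings.append(f"{name} +{delta}: {hint}")
    return findings


earlier = {"Single-Col": 12, "Late-Coll": 0}
later = {"Single-Col": 20, "Late-Coll": 3}
for line in diagnose(earlier, later):
    print(line)
```

Single-Col is deliberately absent from the hint table, because occasional single collisions are normal on shared media.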
show mac Command
The show mac command provides information on the number of frames transmitted and received on each Catalyst interface. Specifically, the show mac command provides a count of the total number of frames on the interface, the number of multicast frames, and the number of broadcast frames.
Although most of the column headers are fairly self-explanatory, a couple deserve additional clarification. Example 16-4 shows a partial listing of the show mac output.
Example 16-4 Partial show mac Output
Console> show mac 3/4

MAC      Rcv-Frms   Xmit-Frms  Rcv-Multi  Xmit-Multi Rcv-Broad  Xmit-Broad
-------- ---------- ---------- ---------- ---------- ---------- ----------
3/4               0          0          0          0          0          0

MAC      Dely-Exced MTU-Exced  In-Discard Lrn-Discrd In-Lost    Out-Lost
-------- ---------- ---------- ---------- ---------- ---------- ----------
3/4               0          0          0          0          0          0
The first block of the output shows the frame counters mentioned previously. The second block, highlighted in this example, counts other events. Dely-Exced indicates how many times the Catalyst discarded a frame because it had to defer (wait to transmit) for too long while the media was busy. The wait becomes excessive when a source transmits longer than the media allows. This is sometimes referred to as jabber and is caused by a malfunctioning NIC in a shared media network. Rather than holding the frame indefinitely, the Catalyst discards it. Therefore, this counter displays the number of frames discarded because of jabber. This should occur only when the port is attached to shared media.
MTU-Exced counts how many times the port received a frame that exceeded the Maximum Transmission Unit (MTU) frame size configured on the interface. The size is set to the media maximum by default, but you can elect to reduce this value. You might do this when an FDDI source communicates with an Ethernet device and you want to ensure that the switch discards any frames over the Ethernet MTU.
In-Discard reflects the number of times that the Catalyst discards a frame due to bridge filtering. This occurs when the source and destination reside on the same interface. See Chapter 3, “Bridging Technologies,” for details on filtering.
Bridges (and Catalysts) have a finite amount of memory space for the bridge tables. The bridge fills the table through the bridge learning process described in Chapter 3. Depending upon the model of Catalyst you have, the Catalyst can remember up to 16,000 entries. But if you have a very large system where this memory space gets filled because of many stations, the Catalyst must replace existing entries until older entries are aged from the table to free space. The Lrn-Discrd counter tracks the number of times the switch would normally have learned a source address but could not because the bridge table was already full.
In-Lost and Out-Lost represent the number of frames dropped by the Catalyst due to insufficient buffer space. In-Lost counts frames coming into the port from the LAN. Out-Lost counts frames going out the port to the LAN.
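To illustrate what a learn-discard counter measures, here is a deliberately simplified Python model of a bounded learning table. It is a sketch under stated assumptions, not a description of the EARL: the class name, the tiny capacity, and the aging rule are all hypothetical, and the model simply refuses new addresses while the table is full.

```python
class BridgeTable:
    """Toy learning table with a fixed capacity and simple aging."""

    def __init__(self, capacity, max_age):
        self.capacity = capacity    # maximum number of MAC entries
        self.max_age = max_age      # seconds before an idle entry expires
        self.entries = {}           # mac -> (port, last_seen)
        self.lrn_discard = 0        # addresses that could not be learned

    def age_out(self, now):
        """Remove entries that have been idle longer than max_age."""
        stale = [mac for mac, (_, seen) in self.entries.items()
                 if now - seen > self.max_age]
        for mac in stale:
            del self.entries[mac]

    def learn(self, mac, port, now):
        """Learn or refresh a source address; count a discard if full."""
        self.age_out(now)
        if mac in self.entries or len(self.entries) < self.capacity:
            self.entries[mac] = (port, now)
            return True
        self.lrn_discard += 1
        return False


table = BridgeTable(capacity=2, max_age=300)
table.learn("00:00:0c:00:00:01", "3/1", now=0)
table.learn("00:00:0c:00:00:02", "3/2", now=1)
table.learn("00:00:0c:00:00:03", "3/3", now=2)    # table full, not learned
print(table.lrn_discard)
table.learn("00:00:0c:00:00:03", "3/3", now=400)  # older entries have aged out
print(sorted(table.entries))
```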
show counters Command
The undocumented show counters command allows you to view a number of SNMP and RMON counters. For details on what each counter means, consult the appropriate RFCs for the media.
SPAN
Sometimes you want to examine traffic flowing in and out of a port, or within a VLAN. In a shared network, you attach a network analyzer to an available port and your analyzer promiscuously listens to all traffic on the segment. Your analyzer can then decode the frames and provide you with a detailed analysis of the frame content. In a switched network, however, this is not nearly as simple. For one thing, a switch does not transmit a frame out a port unless the bridge table indicates that the destination is on that port, or unless the bridge needs to flood the frame. An analyzer on an ordinary port therefore sees only a fraction of the traffic, which is clearly inadequate for traffic analysis purposes. The normal Catalyst behavior must be modified to capture traffic on other ports. The Catalyst feature called Switched Port Analyzer (SPAN) enables you to attach an analyzer to a switch port and capture traffic from other ports in the switch.
High-performance analysis tools are also available, such as the Network Analysis Module, which provides enhanced RMON reporting to your network management station. This module plugs into a slot in your Catalyst and monitors traffic from a SPAN port or from NetFlow.
Another Cisco monitoring tool, the SwitchProbe, attaches externally to a Catalyst SPAN port or network segment and gathers RMON statistics that can then be retrieved by your network management station.
By default, SPAN is disabled. You need to explicitly enable SPAN to capture traffic from other ports. When you enable SPAN, you need to specify what you want to monitor and where you want to monitor it.
What you can monitor includes:
- An individual port
- Multiple ports on the local Catalyst
- Local traffic for a VLAN
- Local traffic for multiple VLANs
Monitored traffic goes to a port on the local Catalyst. Figure 16-4 illustrates that the traffic from VLAN 100 is monitored and directed to the analyzer attached to Port 4/1.
Figure 16-4. A SPAN VLAN Example
Although the set span 100 4/1 command says to monitor VLAN 100, note that only VLAN 100 traffic local to Cat-A is captured. If stations on Cat-B transmit unicast traffic to each other, and the frames are not flooded, the analyzer does not see that traffic. The only traffic that the analyzer can see is traffic flooded within VLAN 100, and any local unicast traffic.
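The visibility rule can be expressed as a small Python predicate. This is a sketch that assumes the two-switch topology of Figure 16-4 and that every frame belongs to the monitored VLAN. The function and parameter names are hypothetical.

```python
def analyzer_sees(src_switch, dst_switch, flooded, span_switch="Cat-A"):
    """Return True if a VLAN SPAN session on span_switch captures the frame.

    src_switch, dst_switch: the Catalysts where the source and destination
    stations attach.
    flooded: True when the frame is flooded through the VLAN (broadcast,
    multicast, or unknown unicast).
    """
    if flooded:
        return True  # flooded frames reach every switch in the VLAN
    # Known unicast is visible only if it enters or leaves the local switch.
    return span_switch in (src_switch, dst_switch)


print(analyzer_sees("Cat-B", "Cat-B", flooded=False))  # stays on Cat-B
print(analyzer_sees("Cat-B", "Cat-A", flooded=False))  # forwarded to Cat-A
```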
Tip
Be careful when monitoring Gigabit Ethernet. The 9-port Gigabit Ethernet module provides local switching and cannot SPAN the switch backplane. The 3-port Gigabit Ethernet module does not have this limitation, so you can monitor all traffic within the Catalyst.
Tip
Although you can direct VLAN traffic to a SPAN port, the port sees only the local VLAN traffic. If a VLAN has a presence in multiple Catalysts, the SPAN port displays the VLAN traffic found in the local Catalyst where you enable SPAN. Therefore, you get all of the local traffic. You see VLAN traffic from other Catalysts only if the frame is forwarded or flooded to your local Catalyst. If a frame stays local in a remote Catalyst, your SPAN port does not detect this frame.
Tip
Be careful with the syntax for this command. It is very similar to the set spantree command. set span and set spant are the short forms of two different commands.
Destination Port Attributes
Depending upon the source traffic you are monitoring, you might need to be careful to avoid congestion on your SPAN port. For example, if you monitor a busy VLAN, the aggregate source traffic from the VLAN is sent to the SPAN port. You would not want, therefore, to monitor an entire VLAN and have the traffic sent to a 10 Mbps interface. This is especially true if the VLAN member ports are 100 Mbps interfaces. Make sure that your SPAN port has adequate bandwidth to effectively capture the traffic you want to monitor.
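A quick worst-case check makes the point concrete. The Python sketch below divides the combined speed of the monitored ports by the speed of the SPAN port. The function name and port speeds are illustrative assumptions, and real traffic is usually well below line rate.

```python
def span_oversubscription(source_port_mbps, span_port_mbps):
    """Worst-case ratio of monitored bandwidth to SPAN port bandwidth.

    A result above 1.0 means the SPAN port can be overrun if the
    monitored ports are busy enough.
    """
    return sum(source_port_mbps) / span_port_mbps


# Three 100 Mbps VLAN member ports copied to a 10 Mbps analyzer port.
ratio = span_oversubscription([100, 100, 100], 10)
print(f"worst-case oversubscription: {ratio:.0f}:1")
```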
Logging
It is a good idea to maintain a log of significant events on your equipment. An automatic feature in your Catalyst can transmit information that you deem important to a TFTP file server for you to evaluate at a later time. You might want this information for troubleshooting or security reasons. You can use the file, for example, to answer questions such as “What was the last configuration?” or “Did any ports experience unusual conditions?”
A number of configuration commands modify the logging behavior. By default, logging is disabled. However, you can enable logging and direct the output to an internal buffer, to the console, or to a TFTP server. The following commands send events to the server:
- set logging server {enable | disable}— This command enables or disables the log to server feature. You must enable it if you plan to automatically record events on the server.
- set logging server ip_addr— Use this command to inform your Catalyst about the IP address for the TFTP server.
- set logging server facility server_facility_parameter— A number of processes can be monitored and logged to the server. For example, significant VTP, CDP, VMPS, and security services can be monitored. Reference the Catalyst documentation for a detailed list.
- set logging server severity server_severity_level— Severity levels ranging in value from 0 through 7 describe the events. 0 indicates emergency situations, 6 is informational, and 7 is used for debugging. If you set the severity level to 6, you will have a lot of entries in the logging database because it provides information on both trivial and significant events. If you set the level to 0, you will get records only when something catastrophic happens. An intermediate level is appropriate for most networks.
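The effect of the severity setting can be sketched in a few lines of Python. The text names only levels 0, 6, and 7. The remaining names below are the conventional syslog labels and are included as an assumption, as is the function name.

```python
# Levels 0, 6, and 7 are named in the text; the others are the
# conventional syslog names, assumed here for completeness.
SEVERITY_NAMES = {
    0: "emergency",
    1: "alert",
    2: "critical",
    3: "error",
    4: "warning",
    5: "notice",
    6: "informational",
    7: "debug",
}


def is_recorded(event_severity, configured_level):
    """Return True if an event is logged at the configured level.

    Lower numbers are more severe, so an event is recorded when its
    severity number is less than or equal to the configured level.
    """
    for value in (event_severity, configured_level):
        if value not in SEVERITY_NAMES:
            raise ValueError("severity must be an integer from 0 through 7")
    return event_severity <= configured_level


# With an intermediate level of 4, errors are kept and informational
# messages are dropped.
print(is_recorded(3, 4), is_recorded(6, 4))
```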