Diagnosing and Troubleshooting Intermittent Cisco 2960 Switch Port Failures After Lightning Strike


Electrical surges from lightning strikes often cause subtle hardware degradation that only manifests weeks later. In this case, we're observing intermittent packet loss (5-10%) across multiple 2960 switch ports even though the cabling passes basic cable tests. The 77-meter trunked link serving VoIP phones and PCs shows symptoms including:

  • Dropped RTP packets in VoIP calls
  • MS Exchange connection timeouts
  • TCP retransmissions visible in packet captures

Beyond the standard show interface output, these Cisco IOS commands provide deeper insight:

# Per-port error counters (re-run to watch which counters increment)
show interfaces gigabitethernet4/0/9 counters errors

# Check for ASIC-level buffer and queue drops (2960/3750 platform command)
show platform port-asic stats drop

# PHY layer diagnostics (2960X/S specific)
show controllers ethernet-controller gigabitethernet4/0/9 phy
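
Since several of these counters only make sense as deltas, it helps to re-baseline before each observation window:

# Zero the interface counters so fresh errors stand out (prompts for confirmation)
clear counters gigabitethernet4/0/9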

The pastebin output reveals CRC errors incrementing without corresponding carrier transitions. This suggests:

  1. Marginal signal integrity at the PHY layer
  2. Possible damage to the switch ASIC's SerDes circuitry
  3. Electrostatic discharge residue on port connectors
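
One quick way to sanity-check that interpretation from the CLI is to watch the CRC counter climb across repeated runs while confirming the log shows no link up/down events for the port:

# CRC total should keep rising between runs...
show interfaces gigabitethernet4/0/9 | include CRC
# ...while no link flaps are logged for the port
show logging | include LINK-3-UPDOWN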

Create a test harness with iperf3 running between known-good endpoints so the test traffic crosses the suspect switch port:

# On test server:
iperf3 -s -p 5201

# On client (run for each suspect port):
iperf3 -c server_ip -t 300 -P 8 -O 5 -J > port_test_g4_0_9.json

Compare the JSON outputs (see the jq sketch after this list) for:

  • retransmits values
  • TCP MSS variations
  • jitter_ms values (reported only for UDP runs)
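
A minimal way to pull those fields out of each JSON file, assuming jq is available on the test client (field paths follow the structure current iperf3 releases emit):

# Sender-side aggregate retransmit count for the TCP run
jq '.end.sum_sent.retransmits' port_test_g4_0_9.json

# Per-interval retransmits, to spot bursts that line up with peak traffic
jq '.intervals[].sum.retransmits' port_test_g4_0_9.json

# jitter_ms only appears for UDP runs, e.g. after:
#   iperf3 -c server_ip -u -b 5M -t 300 -J > port_test_g4_0_9_udp.json
jq '.end.sum.jitter_ms' port_test_g4_0_9_udp.json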

For mission-critical environments:

! Contain broadcast storms and congestion while the hardware issue is investigated
interface GigabitEthernet4/0/14
 flowcontrol receive on
 storm-control broadcast level 1.00
 storm-control action trap
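
After applying this, verify the settings took effect and watch for suppression events:

# Confirm flow control and storm-control state on the protected port
show flowcontrol interface GigabitEthernet4/0/14
show storm-control GigabitEthernet4/0/14 broadcast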

Consider hardware workarounds:

  • Insert managed media converters as buffers
  • Implement fiber uplinks for lightning-prone buildings
  • Use SFP-based connections instead of built-in copper ports

The smoking gun appears in the TDR results when comparing multiple ports:

# Healthy port shows consistent pairs:
Pair A: 79 +/- 0 meters  
Pair B: 75 +/- 0 meters

# Degraded port shows:
Pair A: 79 +/- 2 meters
Pair B: 112 +/- 5 meters (impossible length)

This points to physical-layer damage within the switch port itself.


Six weeks post-lightning strike, our Cisco 2960 stack began exhibiting intermittent packet loss (5-10%) on specific ports serving 77-meter runs to VoIP phones and workstations. Symptoms included:

  • Dropped VoIP calls (SIP timeout errors)
  • Exchange connectivity interruptions (Outlook disconnects)
  • TCP retransmissions visible in Wireshark captures (quantified with the tshark filter below)
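
To quantify the retransmissions called out above, a couple of tshark one-liners over the saved capture are enough (the capture filename here is just an example):

# Count retransmissions and duplicate ACKs in the capture
tshark -r floor2_voip.pcapng -Y "tcp.analysis.retransmission" | wc -l
tshark -r floor2_voip.pcapng -Y "tcp.analysis.duplicate_ack" | wc -l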

We executed a full troubleshooting matrix:

# Interface error monitoring
show interface GigabitEthernet4/0/14 | include errors
  154 input errors, 32 CRC, 0 frame, 0 overrun, 0 ignored
  87 output errors, 0 collisions, 5 interface resets

Cable diagnostics returned clean results:

# TDR test example (results take a few seconds to populate after the test command)
test cable-diagnostics tdr interface Gi4/0/14
show cable-diagnostics tdr interface Gi4/0/14

The issue manifested differently from port to port:

Port      Error pattern                               Stability after port change
Gi4/0/9   CRC errors increasing during peak traffic   Unstable
Gi4/0/14  Intermittent interface resets               Unstable
Gi4/0/23  No errors logged                            Stable (solution)
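
Since Gi4/0/23 became the working port, the quickest fix was to mirror the old port's access/voice configuration onto it; a minimal sketch, with placeholder VLAN IDs since the real ones aren't shown here:

! VLAN numbers below are placeholders for this site's data/voice VLANs
interface GigabitEthernet4/0/23
 description Relocated from Gi4/0/14 after post-lightning degradation
 switchport mode access
 switchport access vlan 10
 switchport voice vlan 20
 spanning-tree portfast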

For borderline cases, we recommend:

# Capture on the suspect port while stress-testing it
! IOS XE Embedded Packet Capture example
monitor capture CAP interface Gi4/0/14 both
monitor capture CAP match any
monitor capture CAP buffer size 100
monitor capture CAP limit duration 60
monitor capture CAP start
! After the duration expires, review the buffer:
show monitor capture CAP buffer brief

Critical SNMP OIDs to monitor:

IF-MIB::ifInErrors.1014
IF-MIB::ifOutErrors.1014
CISCO-ENHANCED-MEMPOOL-MIB::cempMemPoolUsed.1
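
These can be polled from any monitoring host running net-snmp; the community string, switch address, and ifIndex below are placeholders:

# Resolve the ifIndex for the suspect port, then poll its error counters
snmpwalk -v2c -c public 192.0.2.10 IF-MIB::ifDescr | grep 4/0/14
snmpget  -v2c -c public 192.0.2.10 IF-MIB::ifInErrors.1014 IF-MIB::ifOutErrors.1014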

When ports exhibit erratic behavior:

  1. Inspect RJ45 contacts for carbon scoring (magnifying glass)
  2. Check PHY chip temperature (infrared thermometer)
  3. Verify port ASIC voltage (1.0V ±5% for 2960 Series)

For lightning-prone areas:

  • Implement ESD-rated patch panels (e.g., Panduit Pan-Net)
  • Schedule quarterly test cable-diagnostics tdr sweeps (an EEM sketch follows this list)
  • Keep roughly 30% of ports unused as spares for quick failover
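
For the quarterly TDR sweeps, a small EEM applet can handle the scheduling. This is only a sketch: EEM availability varies by 2960 model and software train, and a TDR run can briefly disturb the link on some platforms, hence the off-hours cron entry.

! Run a TDR test at 02:00 on the 1st of Jan/Apr/Jul/Oct on one suspect port;
! read results afterwards with "show cable-diagnostics tdr interface Gi4/0/14"
event manager applet TDR_SWEEP
 event timer cron cron-entry "0 2 1 1,4,7,10 *"
 action 1.0 cli command "enable"
 action 2.0 cli command "test cable-diagnostics tdr interface gigabitethernet4/0/14"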