Diagnosing and Troubleshooting Intermittent Cisco 2960 Switch Port Failures After Lightning Strike


Electrical surges from lightning strikes often cause subtle hardware degradation that only manifests weeks later. In this case, we're observing intermittent packet loss (5-10%) across multiple 2960 switch ports even though the cabling passes basic tests. The 77-meter trunked link serving VoIP phones and PCs shows symptoms including:

  • Dropped RTP packets in VoIP calls
  • MS Exchange connection timeouts
  • TCP retransmissions visible in packet captures

Beyond standard show interface, these Cisco IOS commands provide deeper insights:

# Per-port error counters (re-run to watch them increment)
show interfaces gigabitethernet4/0/9 counters errors

# Check for buffer allocation failures
show platform hardware interface gigabitethernet4/0/9 statistics

# PHY layer diagnostics (2960X/S specific)
show controllers ethernet-controller gigabitethernet4/0/9 phy

The pasted counter output shows CRC errors incrementing without corresponding carrier transitions. This suggests:

  1. Marginal signal integrity at the PHY layer
  2. Possible damage to the switch ASIC's SerDes circuitry
  3. Electrostatic discharge residue on port connectors

Create a test harness with iPerf3 running between known-good hosts, moving the client connection across each suspect port:

# On test server:
iperf3 -s -p 5201

# On client (run for each suspect port):
iperf3 -c server_ip -t 300 -P 8 -O 5 -J > port_test_g4_0_9.json

Compare JSON outputs for:

  • retransmits values
  • TCP MSS variations
  • Jitter in interval measurements
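
These fields can be pulled straight from each JSON file with jq (a sketch assuming the standard iperf3 -J output layout; field names can differ slightly between iperf3 versions):

# Summarize one run: negotiated MSS, total TCP retransmits, mean per-interval throughput
jq '{mss: .start.tcp_mss_default,
     retransmits: .end.sum_sent.retransmits,
     mean_bps: ([.intervals[].sum.bits_per_second] | add / length)}' port_test_g4_0_9.json

Running the same filter over every port's result file makes an outlier easy to spot.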

For mission-critical environments:

! Contain the impact of a degraded port
interface GigabitEthernet4/0/14
 flowcontrol receive on
 storm-control broadcast level 1.00
 storm-control action trap
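
To confirm the settings are in effect, the standard Catalyst verification commands apply:

# Verify flow control and storm control state on the port
show flowcontrol interface GigabitEthernet4/0/14
show storm-control GigabitEthernet4/0/14 broadcast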

Consider hardware workarounds:

  • Insert managed media converters as buffers
  • Implement fiber uplinks for lightning-prone buildings
  • Use SFP-based connections instead of built-in copper ports

The smoking gun appears in the TDR results when comparing multiple ports:

# Healthy port shows consistent pairs:
Pair A: 79 +/- 0 meters  
Pair B: 75 +/- 0 meters

# Degraded port shows:
Pair A: 79 +/- 2 meters
Pair B: 112 +/- 5 meters (impossible length)

A pair length far beyond the known run points to damage at the physical layer inside the switch port itself rather than in the cabling.


Six weeks post-lightning strike, our Cisco 2960 stack began exhibiting intermittent packet loss (5-10%) on specific ports serving 77-meter runs to VoIP phones and workstations. Symptoms included:

  • Dropped VoIP calls (SIP timeout errors)
  • Exchange connectivity interruptions (Outlook disconnects)
  • TCP retransmissions visible in Wireshark captures

We executed a full troubleshooting matrix:

# Interface error monitoring
show interface GigabitEthernet4/0/14 | include errors
     154 input errors, 32 CRC, 0 frame, 0 overrun, 0 ignored
     87 output errors, 0 collisions, 5 interface resets
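
When trending these counters it helps to zero them first, so any new increment is unambiguous:

# Reset counters, generate traffic for a fixed interval, then re-check
clear counters GigabitEthernet4/0/14
show interface GigabitEthernet4/0/14 | include errors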

Cable diagnostics returned clean results:

# TDR test example (repeat for each suspect port)
test cable-diagnostics tdr interface Gi4/0/14
# Wait a few seconds for the test to complete, then:
show cable-diagnostics tdr interface Gi4/0/14

The issue manifested differently across the suspect ports:

Port       Error pattern                               Stability after port change
Gi4/0/9    CRC errors increasing during peak traffic   Unstable
Gi4/0/14   Intermittent interface resets               Unstable
Gi4/0/23   No errors logged                            Stable (solution)
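
Moving the affected devices to the healthy port and parking the suspect ones is a reasonable interim fix. The sketch below assumes a typical access port with a voice VLAN; the VLAN IDs are placeholders, not values from this network:

! Replicate the edge-port configuration on the known-good port (VLAN IDs are placeholders)
interface GigabitEthernet4/0/23
 switchport mode access
 switchport access vlan 10
 switchport voice vlan 20
 spanning-tree portfast
!
! Take the suspect ports out of service and label them
interface range GigabitEthernet4/0/9 , GigabitEthernet4/0/14
 description SUSPECT - post-lightning CRC errors
 shutdown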

For borderline cases, we recommend generating stress traffic (iPerf3 works here too) and capturing on the suspect port while it is loaded:

# Embedded Packet Capture during stress testing
! IOS XE example
monitor capture CAP buffer size 100
monitor capture CAP limit duration 60
monitor capture CAP interface Gi4/0/14 both
monitor capture CAP match any
monitor capture CAP start
! Review afterwards with: show monitor capture CAP buffer brief

Critical SNMP OIDs to monitor:

IF-MIB::ifInErrors.1014
IF-MIB::ifOutErrors.1014
CISCO-ENHANCED-MEMPOOL-MIB::cempMemPoolUsed.1
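
A minimal polling sketch with the Net-SNMP command-line tools (the community string, management address, and ifIndex 1014 are placeholders; the symbolic names assume the standard MIBs are installed):

# Poll the per-port error counters once a minute
watch -n 60 'snmpget -v2c -c public 192.0.2.10 IF-MIB::ifInErrors.1014 IF-MIB::ifOutErrors.1014'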

When ports exhibit erratic behavior:

  1. Inspect RJ45 contacts for carbon scoring (magnifying glass)
  2. Check PHY chip temperature (infrared thermometer)
  3. Verify port ASIC voltage (1.0V ±5% for 2960 Series)
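
Before physically probing the hardware, the switch's own environmental sensors are worth a look (available on 2960-X/XR models; output fields vary by platform):

# On-board temperature and power status
show env temperature
show env power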

For lightning-prone areas:

  • Implement ESD-rated patch panels (e.g., Panduit Pan-Net)
  • Schedule quarterly test cable-diagnostics tdr sweeps (a kron sketch follows below)
  • Keep roughly 30% of ports free as spares for failover
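
To automate those quarterly TDR sweeps, the built-in kron scheduler is one option. A minimal sketch, assuming a 90-day interval and the two suspect ports; adjust the ports and cadence to what your image supports:

! Run TDR on the suspect ports roughly every 90 days
kron policy-list TDR-SWEEP
 cli test cable-diagnostics tdr interface GigabitEthernet4/0/9
 cli test cable-diagnostics tdr interface GigabitEthernet4/0/14
!
kron occurrence TDR-QUARTERLY in 90:0:0 recurring
 policy-list TDR-SWEEP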