Debugging RX_Missed_Errors: Comprehensive Guide for Network Packet Drops on Linux Servers

When examining packet drops on a NIC (Network Interface Card), you may find that ifconfig reports significant RX drops while /sys/class/net/ethX/statistics/rx_dropped shows zero. This usually means the drops are being counted as rx_missed_errors: ifconfig reads /proc/net/dev, whose drop column is the sum of rx_dropped and rx_missed_errors.

# Typical output showing the discrepancy
$ ifconfig eth2 | grep 'RX.*drop'
          RX packets:2059646370 errors:0 dropped:7142467 overruns:0 frame:0
$ cat /sys/class/net/eth2/statistics/rx_dropped
0
$ cat /sys/class/net/eth2/statistics/rx_missed_errors
7142467
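
The figures above can be cross-checked by adding the two sysfs counters and comparing the result with the dropped field printed by ifconfig. The following is a minimal sketch, not a definitive tool: the function name combined_rx_drops and its optional second argument (an alternate sysfs root, useful for dry runs) are illustrative assumptions.

```shell
# combined_rx_drops IFACE [SYSFS_NET_ROOT]
# Prints rx_dropped + rx_missed_errors for the interface, the value that
# the "dropped" field of ifconfig is expected to match.
combined_rx_drops() {
    local iface="$1"
    local root="${2:-/sys/class/net}"   # hypothetical override, for testing
    local dir="$root/$iface/statistics"
    local dropped missed
    dropped=$(cat "$dir/rx_dropped") || return 1
    missed=$(cat "$dir/rx_missed_errors") || return 1
    echo $(( dropped + missed ))
}
```

Running combined_rx_drops eth2 on the server above should print the same number as the dropped field in the ifconfig output.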

For Intel 10GbE NICs (ixgbe driver), these are the most frequent causes:

Insufficient RX descriptor ring buffer size
CPU saturation preventing timely packet processing
IRQ (Interrupt Request) balancing issues
Network traffic bursts exceeding processing capacity
Hardware limitations or firmware bugs

Start with these essential diagnostics:

# Check driver settings and hardware info
$ ethtool -i eth2
driver: ixgbe
version: 3.15.1-k
firmware-version: 0x800003e1

# Examine current ring buffer sizes
$ ethtool -g eth2
Ring parameters for eth2:
Pre-set maximums:
RX:             4096
RX Mini:        0
RX Jumbo:       0
TX:             4096
Current hardware settings:
RX:             512
RX Mini:        0
RX Jumbo:       0
TX:             512

# Check interrupt distribution
$ grep eth2 /proc/interrupts
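
The raw /proc/interrupts rows are hard to read on machines with many cores. As a rough sketch, the per-CPU totals for one interface can be summed with awk. The function name irq_totals_per_cpu and the optional file argument are assumptions made for illustration, and the match is a plain substring test, so eth2 would also match eth20.

```shell
# irq_totals_per_cpu IFACE [INTERRUPTS_FILE]
# Sums the interrupt counts of every row that mentions IFACE, per CPU column.
irq_totals_per_cpu() {
    local iface="$1"
    local file="${2:-/proc/interrupts}"   # hypothetical override, for testing
    awk -v iface="$iface" '
        NR == 1 {                          # header row: one name per CPU
            ncpu = NF
            for (i = 1; i <= NF; i++) name[i] = $i
            next
        }
        index($0, iface) {                 # row belongs to the interface
            for (i = 1; i <= ncpu; i++) total[i] += $(i + 1)
        }
        END {
            for (i = 1; i <= ncpu; i++) printf "%s %d\n", name[i], total[i]
        }
    ' "$file"
}
```

If nearly all of the counts land on a single CPU, interrupt balancing is a likely contributor to the missed packets.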

Increasing the RX ring buffer often resolves missed errors:

# Temporarily increase ring buffer (does not survive a reboot)
$ ethtool -G eth2 rx 2048

# Make permanent by adding to /etc/rc.local
ethtool -G eth2 rx 2048

For systems handling high traffic (10Gbps+), consider values between 2048 and 4096, staying within the pre-set maximum reported by ethtool -g.
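
Before resizing, it helps to know how far the current RX ring is from the hardware maximum. The sketch below parses the ethtool -g output format shown earlier. The function name rx_ring_headroom is an assumption, and the parser relies on the section headings "Pre-set maximums:" and "Current hardware settings:" appearing exactly as in that output.

```shell
# rx_ring_headroom
# Reads `ethtool -g IFACE` output on stdin and prints the current and
# maximum RX ring sizes on one line.
rx_ring_headroom() {
    awk '
        /^Pre-set maximums:/          { section = "max" }
        /^Current hardware settings:/ { section = "cur" }
        $1 == "RX:"                   { rx[section] = $2 }   # skips "RX Mini:" etc.
        END {
            if (!("max" in rx) || !("cur" in rx)) exit 1
            printf "current=%d max=%d\n", rx["cur"], rx["max"]
        }
    '
}
```

Typical use is ethtool -g eth2 | rx_ring_headroom, which for the settings above would report a current size of 512 against a maximum of 4096.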

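Add these to /etc/sysctl.conf to give the kernel more room once packets have left the NIC. They act after the hardware receive path, so treat them as a complement to the ring buffer change rather than a direct fix for rx_missed_errors: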

# Increase socket read buffers
net.core.rmem_max = 16777216
net.core.rmem_default = 16777216

# Increase the per-CPU queue of received packets waiting for the network stack
net.core.netdev_max_backlog = 30000

Interrupt moderation is configured with ethtool rather than sysctl.conf:

# Set the interrupt moderation interval (adjust usecs as needed)
$ ethtool -C eth2 rx-usecs 50

After making changes, monitor improvements with:

# Watch error counters in real-time
$ watch -n1 "cat /sys/class/net/eth2/statistics/rx_missed_errors"

# Check driver-level counters (watch for the missed and no-buffer counts growing)
$ ethtool -S eth2 | grep -E 'rx_pkts|rx_missed|rx_no_buffer'
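
Because the counter is cumulative, the useful signal is how fast it grows, not its absolute value. A minimal sketch of a rate check follows. The function name missed_rate, the whole-second interval argument, and the optional sysfs root argument are all illustrative assumptions.

```shell
# missed_rate IFACE [INTERVAL_SECONDS] [SYSFS_NET_ROOT]
# Samples rx_missed_errors twice and prints the average growth per second
# (integer division, so INTERVAL_SECONDS must be a whole number >= 1).
missed_rate() {
    local iface="$1"
    local interval="${2:-1}"
    local root="${3:-/sys/class/net}"   # hypothetical override, for testing
    local file="$root/$iface/statistics/rx_missed_errors"
    local before after
    before=$(cat "$file") || return 1
    sleep "$interval"
    after=$(cat "$file") || return 1
    echo $(( (after - before) / interval ))
}
```

A result of 0 over several intervals after a tuning change suggests the change helped, while a steady non-zero rate means packets are still being missed.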

For persistent issues:

Update ixgbe driver and firmware
Test with different MTU sizes (try 9000 for jumbo frames, provided every device on the path supports it)
Disable power saving features:
ethtool --set-eee eth2 eee off
Experiment with RSS queues:
ethtool -L eth2 combined 16

Remember to test changes methodically and monitor their impact.

As a concrete scenario, suppose the interface statistics show a significant packet drop count after services were migrated between servers:

$ ifconfig eth2 | grep 'RX.*drop'
      RX packets:2059646370 errors:0 dropped:7142467 overruns:0 frame:0

The sysfs counters show where those drops are actually being counted:

$ cat /sys/class/net/eth2/statistics/rx_dropped
0
$ cat /sys/class/net/eth2/statistics/rx_missed_errors
7142467

Your NIC uses the ixgbe driver (version 3.15.1-k), which handles Intel 10GbE network interfaces. The rx_missed_errors counter tracks packets that arrived at the NIC but were dropped by the device for lack of receive buffer space, which usually means the host is not draining the RX ring as fast as packets arrive.
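
To put the counter in proportion, the share of missed packets can be estimated from the two numbers in the ifconfig output. This is a rough sketch that assumes the RX packets figure counts only frames that reached the driver, so missed packets are added to the denominator. The function name missed_percent is illustrative.

```shell
# missed_percent MISSED DELIVERED
# Prints missed packets as a percentage of all packets that arrived
# at the NIC (missed + delivered), to two decimal places.
missed_percent() {
    awk -v missed="$1" -v delivered="$2" 'BEGIN {
        total = missed + delivered
        if (total == 0) { print "0.00"; exit }
        printf "%.2f\n", 100 * missed / total
    }'
}
```

For the figures above, missed_percent 7142467 2059646370 gives roughly 0.35, meaning about one packet in 290 was missed since the counters were last reset.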

1. Check ring buffer sizes:

$ ethtool -g eth2
Ring parameters for eth2:
Pre-set maximums:
RX:             4096
RX Mini:        0
RX Jumbo:       0
TX:             4096
Current hardware settings:
RX:             512
RX Mini:        0
RX Jumbo:       0
TX:             512

2. Inspect interrupt coalescing:

$ ethtool -c eth2
Coalesce parameters for eth2:
Adaptive RX: on  TX: on
stats-block-usecs: 0
sample-interval: 0
pkt-rate-low: 0
pkt-rate-high: 0

Based on the ixgbe driver behavior and your migration scenario, these are the most likely culprits:

Receive buffer exhaustion - The NIC's ring buffers are too small for the new traffic pattern
IRQ balancing issues - CPU cores aren't properly handling interrupts
DMA or memory locality problems - The new server's memory or NUMA layout differs from the old one
Flow control misconfiguration - Pause frames are not being sent or honored during traffic bursts

Increase ring buffers:

$ sudo ethtool -G eth2 rx 4096
$ sudo ethtool -G eth2 tx 4096

Adjust interrupt moderation (rx-frames is not accepted by every driver; if the command is rejected, set rx-usecs on its own):

$ sudo ethtool -C eth2 rx-usecs 50 rx-frames 32

Check NUMA settings (if applicable):

$ cat /sys/class/net/eth2/device/numa_node
0
$ numactl --hardware

Create a monitoring script to track the issue:

#!/bin/bash
# Log the RX drop counters for one interface every 5 seconds.
INTERFACE="eth2"
LOG_FILE="/var/log/nic_stats.log"
STATS_DIR="/sys/class/net/$INTERFACE/statistics"

while true; do
    RX_MISSED=$(<"$STATS_DIR/rx_missed_errors")
    RX_DROPPED=$(<"$STATS_DIR/rx_dropped")
    # Keep the ifconfig line for comparison, with leading whitespace trimmed
    RX=$(ifconfig "$INTERFACE" | grep 'RX.*drop' | sed 's/^[[:space:]]*//')
    TIMESTAMP=$(date '+%Y-%m-%d %H:%M:%S')

    echo "$TIMESTAMP - RX_MISSED: $RX_MISSED | RX_DROPPED: $RX_DROPPED | IFCONFIG: $RX" >> "$LOG_FILE"
    sleep 5
done

For persistent issues, consider these deeper investigations:

# Only works if the driver build exposes tracepoints; check first with: perf list 'ixgbe:*'
$ perf stat -e 'ixgbe:*' -a sleep 10
$ ethtool --register-dump eth2 | grep -i 'miss\|drop'
$ dmesg | grep -i 'ixgbe\|dma\|buffer'

Remember that rx_missed_errors usually means packets reached the NIC hardware but were dropped before the driver could process them, typically because of host-side resource constraints rather than problems on the network itself.
