Optimizing 10Gbps TCP Throughput on Linux: Troubleshooting Checksum Offload Errors and Buffer Tuning


When dealing with high-speed network connections, several factors can bottleneck performance. In this case, we're observing the following (the commands used to collect these numbers are sketched after the list):

  • Asymmetric throughput (6.67 Gb/s vs 20 Mb/s in opposite directions)
  • High interrupt counts on Server A's NIC
  • Significant rx_csum_offload_errors (123,049 on Server A)
  • Maxed-out ring buffers (4096 RX/TX on Server A, 4078 on Server B)
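
These numbers can be re-collected with standard counters; a quick sketch for either server (interface name em1 as used throughout):

# Checksum and drop counters reported by the NIC driver
ethtool -S em1 | grep -E 'csum|err|drop'
# Per-queue interrupt volume
grep em1 /proc/interrupts
# Current and maximum ring buffer sizes
ethtool -g em1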

First, let's examine the critical TCP parameters that affect 10Gbps performance:

# Recommended settings for 10Gbps (add to /etc/sysctl.conf)
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
net.ipv4.tcp_window_scaling = 1
net.ipv4.tcp_timestamps = 1
net.ipv4.tcp_sack = 1
net.ipv4.tcp_no_metrics_save = 1
net.core.netdev_max_backlog = 30000
net.ipv4.tcp_congestion_control = cubic
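
These settings only take effect once sysctl reloads them; a quick way to apply and spot-check a couple of values:

# Apply /etc/sysctl.conf and confirm the kernel accepted the new limits
sysctl -p
sysctl net.core.rmem_max net.ipv4.tcp_rmem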

The rx_csum_offload_errors suggest potential hardware checksum issues. Let's verify and potentially disable problematic offloading:

# Check current offload settings
ethtool -k em1 | grep checksum

# Temporarily disable RX checksum offload
ethtool -K em1 rx off

# For a permanent change (RHEL 6); single quotes keep ${DEVICE} literal so the ifcfg script expands it:
echo 'ETHTOOL_OPTS="-K ${DEVICE} rx off"' >> /etc/sysconfig/network-scripts/ifcfg-em1

High interrupt counts indicate potential CPU saturation. We should optimize interrupt handling:

# Check current coalescing settings
ethtool -c em1

# Recommended coalescing settings for 10Gbps
ethtool -C em1 rx-usecs 100 tx-usecs 100 rx-frames 25 tx-frames 25

# Restrict which CPUs irqbalance may use (hex bitmask of banned CPUs)
IRQBALANCE_BANNED_CPUS="fffff000"  # Example: ban CPUs 12-31 so NIC IRQs stay on CPUs 0-11
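
On RHEL 6 this variable normally lives in /etc/sysconfig/irqbalance (path assumed here; adjust for your distribution), so a sketch of making it persistent looks like:

# Persist the ban mask for irqbalance and restart the daemon
echo 'IRQBALANCE_BANNED_CPUS="fffff000"' >> /etc/sysconfig/irqbalance
service irqbalance restart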

Beyond iperf, let's use more sophisticated tools to identify bottlenecks:

# Network stack profiling
perf stat -e 'net:*' -a sleep 10

# TCP diagnostics
ss -temoi
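
Retransmission counters are a quick proxy for loss on the slow direction; if the iproute package's nstat is available, something like this highlights them:

# Retransmit and loss-related counters since boot
nstat -az | grep -Ei 'retrans|loss'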

For the ixgbe (Server A) and bnx2x (Server B) drivers:

# ixgbe specific parameters (apply at module load time)
modprobe ixgbe RSS=8,8,8 LRO=0

# bnx2x specific parameters (num_queues is also a load-time option; e.g. one queue per core)
modprobe bnx2x num_queues=8
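
Driver options like these only apply when the module loads, so persist them in /etc/modprobe.d and reload the driver from the console (reloading drops the link, so don't do it over the interface you're tuning); the file names below are just examples:

# Persist the driver options across reboots
cat > /etc/modprobe.d/ixgbe.conf <<'EOF'
options ixgbe RSS=8,8,8 LRO=0
EOF
cat > /etc/modprobe.d/bnx2x.conf <<'EOF'
options bnx2x num_queues=8
EOF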

After implementing changes, monitor these metrics:

# Real-time monitoring
sar -n DEV 1
ethtool -S em1 | grep -E 'err|drop'
cat /proc/net/softnet_stat
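
In /proc/net/softnet_stat the second column is packets dropped because the backlog queue overflowed and the third is time_squeeze events; a rough per-CPU summary (values are hex, gawk assumed):

# Growing drop counts suggest raising net.core.netdev_max_backlog further
awk '{printf "CPU%-3d dropped=%d squeezed=%d\n", NR-1, strtonum("0x"$2), strtonum("0x"$3)}' /proc/net/softnet_stat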

When dealing with high-speed 10Gb fiber connections, achieving full throughput requires careful tuning of both hardware and software components. Our case reveals several interesting symptoms:

# Key observations from iperf tests
Server A → Server B: 6.67 Gb/s
Server B → Server A: 20 Mb/s (severe bottleneck)

The rx_csum_offload_errors on Server A (123049 errors) suggest potential checksum offloading problems. Let's examine the NIC configurations:

# Checking NIC ring buffers (Server A)
ethtool -g em1
Pre-set maximums:
RX:     4096
TX:     4096

# Checking NIC offloading features
ethtool -k em1 | grep checksum
rx-checksumming: on
tx-checksumming: on
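
If the current ring sizes (the second half of the ethtool -g output) are below those pre-set maximums, raising them reduces drops under bursty 10Gb load; the values here assume the 4096 maximum reported above:

# Raise RX/TX rings to the hardware maximum
ethtool -G em1 rx 4096 tx 4096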

For 10Gb performance, we need to adjust several TCP parameters in /etc/sysctl.conf:

# Recommended TCP tuning parameters
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
net.ipv4.tcp_window_scaling = 1
net.ipv4.tcp_timestamps = 1
net.ipv4.tcp_sack = 1
net.ipv4.tcp_low_latency = 0
net.core.netdev_max_backlog = 30000
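
As a sanity check on the 16 MB ceiling: the buffer must cover the path's bandwidth-delay product. Assuming, for example, a 10 ms RTT:

# BDP = bandwidth x RTT = 10 Gbit/s x 0.010 s / 8 = 12.5 MB,
# so the 16 MB (16777216 byte) maximum above covers RTTs up to roughly 13 ms
ping -c 10 serverB    # measure the actual RTT between the servers first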

The high interrupt counts (50M+) on Server A indicate potential CPU saturation. Let's implement IRQ balancing:

# Install irqbalance
yum install irqbalance -y
service irqbalance start

# Alternatively, manually set CPU affinity (stop irqbalance first or it will override these)
for irq in $(grep em1 /proc/interrupts | cut -d: -f1 | tr -d ' '); do
    echo "Setting IRQ $irq to CPUs 0-7"
    echo 0-7 > /proc/irq/$irq/smp_affinity_list
done
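
Whichever method you use, confirm that the queue interrupts actually land where intended while traffic is flowing; a quick check in the same style as the loop above:

# Show the configured affinity for each em1 queue, then watch the per-CPU counts
for irq in $(grep em1 /proc/interrupts | cut -d: -f1 | tr -d ' '); do
    echo "IRQ $irq -> CPUs $(cat /proc/irq/$irq/smp_affinity_list)"
done
watch -n 1 "grep em1 /proc/interrupts"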

For Intel NICs (ixgbe), we can further optimize:

# Enable multiple queues and RSS
ethtool -L em1 combined 16
ethtool -K em1 rxhash on

# Disable problematic offloading if needed
ethtool -K em1 rx off
ethtool -K em1 tso off gso off
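
The lowercase ethtool options report current state, so it's worth verifying that the channel and offload changes stuck:

# Current vs. maximum channel (queue) configuration
ethtool -l em1
# Offload flags should now read "off" where disabled
ethtool -k em1 | grep -E 'checksum|segmentation'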

Use these iperf3 commands for accurate testing:

# Server side
iperf3 -s -p 5201 -D

# Client side (with parallel streams)
iperf3 -c serverB -p 5201 -P 16 -t 30 -O 5 -R
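
Because the slow direction runs at only 20 Mb/s, it helps to separate raw packet loss from TCP tuning; a fixed-rate UDP run reports loss and jitter directly (the hostname serverA and the 1G rate are just examples):

# Run from Server B toward Server A and watch the reported loss percentage
iperf3 -c serverA -u -b 1G -t 30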

Essential commands for ongoing monitoring:

# Real-time statistics
sar -n DEV 1
ifstat -t -i em1 1
ethtool -S em1 | grep -E 'err|drop'

# TCP connection details
ss -t -i -p -m

If throughput remains asymmetric after these changes:

  • Verify hardware compatibility (NIC firmware/drivers); the link-level counter check below helps here
  • Check for network congestion with tcptraceroute
  • Consider the TCP congestion control algorithm (try cubic, or bbr on newer kernels)
  • Monitor CPU utilization during transfers with mpstat -P ALL 1
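
For the hardware check in the first bullet, link-level error counters help distinguish a bad optic, patch lead, or SFP from a driver or offload problem (counter names vary by driver):

# Physical-layer and driver-level error counters
ethtool -S em1 | grep -iE 'crc|align|symbol|phy'
ip -s link show em1    # RX/TX errors and drops as the kernel sees them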