Optimizing UDP Performance: Why Higher rmem_max Values Increase Packet Loss in Linux Networking

During recent performance testing between two KVM virtual machines (CentOS 7), I ran into a counterintuitive result: raising net.core.rmem_max above its default actually degraded UDP receive performance under high-throughput load.

# Test setup commands
sysctl -w net.core.rmem_max=131071  # Default
sysctl -w net.core.rmem_max=26214400 # JBoss recommended
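
For context on why this knob matters at all: net.core.rmem_max is the ceiling the kernel enforces on what an application may request via SO_RCVBUF; larger requests are silently clamped, and the kernel then stores roughly double the granted value to cover its own bookkeeping overhead. A minimal sketch to observe the clamping (the 64 MB request is an arbitrary value chosen to exceed any cap used in these tests):

import socket

REQUESTED = 64 * 1024 * 1024  # deliberately larger than any rmem_max used here

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, REQUESTED)

# getsockopt() reports what the kernel actually granted: the request is clamped
# to net.core.rmem_max and then doubled for kernel bookkeeping.
granted = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
print(f"requested {REQUESTED}, kernel granted {granted}")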

Using iperf3 with the following parameters:

# Receiver
iperf3 -s -u -p 5001

# Sender (adjust -b value for different bandwidths)
iperf3 -c [receiver_ip] -u -b 300M -t 30 -p 5001 --get-server-output

The data shows packet loss beginning at:

  • Default rmem_max (131071 bytes): ~320 MB/s
  • JBoss-recommended rmem_max (26214400 bytes, 25 MB): ~280 MB/s

Through kernel trace analysis (perf trace), we observed:

# Monitoring socket buffer behavior
perf trace -e 'skb:consume_skb,net:net_dev_xmit' -p $(pgrep -d, iperf3)

The larger buffer appears to:

  1. Increase DMA mapping overhead
  2. Delay interrupt coalescing thresholds
  3. Trigger more frequent buffer flushing
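
Whatever the dominant mechanism, the resulting drops are directly visible in the kernel's UDP counters while iperf3 is running: RcvbufErrors in /proc/net/snmp increments every time a datagram is discarded because a socket's receive queue is full. A small polling sketch (the field names come from the Udp: header line, so it adapts to whatever counters the running kernel exposes; stop it with Ctrl-C):

import time

def udp_counters():
    """Return the Udp: counters from /proc/net/snmp as a dict."""
    with open("/proc/net/snmp") as f:
        header, values = [line.split() for line in f if line.startswith("Udp:")]
    return dict(zip(header[1:], (int(v) for v in values[1:])))

prev = udp_counters()
while True:
    time.sleep(1)
    cur = udp_counters()
    print("InDatagrams/s={:>8}  RcvbufErrors/s={:>6}  InErrors/s={:>6}".format(
        cur["InDatagrams"] - prev["InDatagrams"],
        cur["RcvbufErrors"] - prev["RcvbufErrors"],
        cur["InErrors"] - prev["InErrors"]))
    prev = cur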

For most applications:

# Optimal settings for 1Gbps networks
sysctl -w net.core.rmem_default=212992
sysctl -w net.core.rmem_max=212992

For JBoss clusters specifically:

# Only apply when:
# 1. Using multicast UDP
# 2. Network latency > 5ms
# 3. Expected throughput > 500MB/s
sysctl -w net.core.rmem_max=16777216
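
One way to sanity-check the 16777216 (16 MiB) figure against those conditions is a bandwidth-delay-product estimate: the receive buffer mainly needs to absorb about one latency's worth of traffic plus some headroom for scheduling jitter on the receiver. A rough calculation using the thresholds above (the headroom factor of 4 is an arbitrary choice):

# Rough bandwidth-delay-product sizing from the thresholds listed above
throughput = 500e6   # bytes/s  (the ">500MB/s" condition)
latency    = 0.005   # seconds  (the ">5ms" condition)
headroom   = 4       # burst allowance while the receiving thread is descheduled

bdp = throughput * latency
print(f"BDP:               {bdp / 2**20:.1f} MiB")             # ~2.4 MiB
print(f"BDP with headroom: {bdp * headroom / 2**20:.1f} MiB")  # ~9.5 MiB
# Same order of magnitude as the 16 MiB cap above, and comfortably below the
# 25 MB (26214400) value that hurt in the tests.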

Complementary settings that actually help:

sysctl -w net.ipv4.udp_mem='1024000 8738000 16777216'
sysctl -w net.ipv4.udp_rmem_min=8192
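
A detail that is easy to miss here: udp_mem is a global budget expressed in pages, while udp_rmem_min is per-socket and in bytes. A quick sketch to see what the three udp_mem thresholds (min, pressure, max) actually amount to on a given machine:

import resource

page = resource.getpagesize()  # typically 4096 on x86_64
with open("/proc/sys/net/ipv4/udp_mem") as f:
    thresholds = [int(v) for v in f.read().split()]

for label, pages in zip(("min", "pressure", "max"), thresholds):
    print(f"udp_mem {label}: {pages} pages = {pages * page / 2**20:.0f} MiB")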

Digging deeper into the same behavior under heavier load confirms the pattern: increasing the receive buffer size (net.core.rmem_max) beyond a certain threshold actually increases the packet loss rate. This contradicts many traditional performance tuning guides, including the JBoss recommendation to set rmem_max=26214400.


# Test methodology (legacy iperf2 syntax):
# Server:
iperf -s -u -P 0 -i 1 -p 5001 -f M

# Client (with variable bandwidth):
iperf -c 172.29.157.3 -u -P 1 -i 1 -p 5001 -f M -b 300M -t 5 -d -L 5001 -T 1

At roughly 350 MB/s of offered traffic, the results show:

  • Default rmem_max (131071): ~0.5% packet loss
  • 25MB rmem_max (26214400): ~1.2% packet loss
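
To put those percentages in absolute terms, a quick back-of-the-envelope conversion (assuming ~1470-byte datagrams, a typical UDP payload on a 1500-byte MTU):

rate_bytes = 350e6   # ~350 MB/s offered load
datagram   = 1470    # assumed bytes per UDP datagram on a 1500-byte MTU
pps = rate_bytes / datagram

for label, loss in (("default rmem_max", 0.005), ("25MB rmem_max", 0.012)):
    print(f"{label}: ~{pps * loss:,.0f} of ~{pps:,.0f} datagrams lost per second")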

The performance degradation occurs because:

  1. Larger buffers increase socket queue latency
  2. Excessive buffering delays congestion feedback
  3. Bufferbloat causes packet drops deeper in the network stack
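
The latency and bufferbloat effects are easy to reproduce on a single host: flood a loopback UDP socket faster than the reader drains it and stamp every datagram with its send time. With a small receive buffer the queue stays shallow, so the datagrams that do get delivered are fresh; with a large buffer a deep standing queue builds up and every delivered datagram is correspondingly stale. A rough sketch along those lines (the port, payload size, and drain rate are arbitrary; run it with net.core.rmem_max raised, otherwise the larger request is clamped and both runs look alike):

import socket
import struct
import threading
import time

ADDR = ("127.0.0.1", 15001)   # arbitrary local test port

def flood(seconds=1.0):
    """Send timestamped datagrams as fast as possible for `seconds`."""
    tx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    pad = b"x" * 1024
    end = time.monotonic() + seconds
    while time.monotonic() < end:
        tx.sendto(struct.pack("d", time.monotonic()) + pad, ADDR)

def measure(rcvbuf_bytes):
    rx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    rx.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, rcvbuf_bytes)
    rx.bind(ADDR)
    rx.settimeout(0.5)

    sender = threading.Thread(target=flood)
    sender.start()

    worst, received = 0.0, 0
    try:
        while True:
            data, _ = rx.recvfrom(2048)
            sent_at, = struct.unpack("d", data[:8])
            worst = max(worst, time.monotonic() - sent_at)
            received += 1
            time.sleep(0.0001)   # drain deliberately slower than the flood
    except socket.timeout:
        pass
    sender.join()
    rx.close()
    print(f"SO_RCVBUF request {rcvbuf_bytes:>9}: {received} datagrams delivered, "
          f"worst queueing delay {worst * 1000:.1f} ms")

for size in (131071, 25 * 1024 * 1024):   # the two settings compared above
    measure(size)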

For most production environments:


# Optimal settings for 1-10Gbps networks
sysctl -w net.core.rmem_max=4194304
sysctl -w net.core.rmem_default=1048576

Exceptions where large buffers help:

  • High-latency satellite links
  • Inter-continental transfers
  • When using specialized congestion control algorithms

Use this Python snippet to check a socket's effective receive buffer, and the shell one-liner after it to watch per-socket queue depths live:


import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
# The kernel reports roughly double the configured size to cover its own
# bookkeeping overhead.
print(f"Current SO_RCVBUF: {sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)}")

# For live monitoring ($2 = local address:port, $5 = tx_queue:rx_queue in hex,
# last column = per-socket drop counter):
watch -n 1 "awk 'NR>1 {print \$2, \$5, \$NF}' /proc/net/udp"

The JBoss recommendation assumes:

  • Application-level retransmission mechanisms
  • Very high bandwidth (10Gbps+) environments
  • Specific JGroups configurations that manage packet loss differently