Optimizing Linux IP Routing for HAProxy: TCP Memory Management and Route Cache Tuning


When dealing with high-traffic HAProxy instances, the Linux kernel's default TCP/IP stack parameters often prove insufficient. The alarming "Out of socket memory" kernel messages indicate the system is hitting its TCP memory limits:

# Default tcp_mem values (pages)
cat /proc/sys/net/ipv4/tcp_mem
45984 61312 91968
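
Note that these values are denominated in pages, not bytes, so the ceiling is smaller than it looks. A quick conversion, assuming the typical 4 KiB page size:

# Convert the tcp_mem hard limit (91968 pages) to megabytes
# getconf PAGESIZE reports the actual page size on this system
echo "$(( 91968 * $(getconf PAGESIZE) / 1024 / 1024 )) MB"    # prints "359 MB" with 4 KiB pages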

For a production HAProxy server handling 10K+ connections, consider these adjustments:

# Permanent setting in /etc/sysctl.conf
net.ipv4.tcp_mem = 183936 245248 367872

# Temporary application
sysctl -w net.ipv4.tcp_mem="183936 245248 367872"

The subsequent "Route hash chain too long!" warning reveals another bottleneck. The route cache management parameters need careful tuning:

# Current route settings
sysctl -a | grep 'net.ipv4.route'

Key parameters for HAProxy environments:

# Recommended baseline for HAProxy (units in seconds unless noted)
net.ipv4.route.gc_elasticity = 4       # More aggressive cache pruning
net.ipv4.route.gc_interval = 30        # More frequent garbage collection
net.ipv4.route.secret_interval = 86400 # Daily flush instead of every 10 minutes
net.ipv4.route.gc_timeout = 120        # Shorter cache entry lifetime

For immediate application and persistence across reboots:

# /etc/sysctl.d/99-haproxy-tuning.conf
net.ipv4.tcp_mem = 183936 245248 367872
net.ipv4.route.gc_elasticity = 4
net.ipv4.route.gc_interval = 30
net.ipv4.route.secret_interval = 86400
net.ipv4.route.gc_timeout = 120

Apply with:

sysctl -p /etc/sysctl.d/99-haproxy-tuning.conf

Verify the changes with these diagnostic commands:

# Check current memory usage
cat /proc/net/sockstat

# Monitor route cache efficiency
cat /proc/net/rt_cache
ip route show cache

# Watch for hash chain warnings
dmesg | grep -i "route hash chain"
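
A rough count of live cache entries is a useful proxy for hash chain pressure (this assumes a pre-3.6 kernel that still has the route cache; entries can span multiple output lines, so treat the number as approximate):

# Approximate number of cached routes
ip route show cache | wc -l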

For extreme cases, consider kernel-level adjustments:

# Boot parameters (GRUB config)
rhash_entries=524288    # Force a larger route hash table (the default scales with RAM)

Always benchmark changes with realistic traffic patterns. A useful test command (note that siege caps concurrency at 255 users by default; raise the limit directive in its configuration before a run this large):

siege -c 10000 -t 2M http://haproxy-test-endpoint/
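
While the test runs, it helps to watch socket memory from a second terminal. A minimal polling loop:

# Sample TCP socket counters every 2 seconds during the benchmark
while true; do
    grep '^TCP:' /proc/net/sockstat
    sleep 2
done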

During a recent HAProxy failover incident, we encountered these concerning kernel messages:

Jan 26 07:41:45 haproxy2 kernel: [226818.070059] __ratelimit: 10 callbacks suppressed
Jan 26 07:41:45 haproxy2 kernel: [226818.070064] Out of socket memory
Jan 26 07:41:47 haproxy2 kernel: [226819.560048] Out of socket memory
Jan 26 07:41:49 haproxy2 kernel: [226822.030044] Out of socket memory

The immediate solution was to increase net.ipv4.tcp_mem values, which were set too conservatively by default. Here's what we changed:

# Original values
cat /proc/sys/net/ipv4/tcp_mem
45984 61312 91968

# New values (4x increase)
echo "183936 245248 367872" > /proc/sys/net/ipv4/tcp_mem

# To make permanent (Ubuntu example)
echo "net.ipv4.tcp_mem=183936 245248 367872" >> /etc/sysctl.conf
sysctl -p
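
To confirm the new thresholds took effect:

# Should print the three new values
sysctl net.ipv4.tcp_mem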

After this adjustment, we started seeing a new warning:

Jan 26 08:18:49 haproxy1 kernel: [ 2291.579726] Route hash chain too long!
Jan 26 08:18:49 haproxy1 kernel: [ 2291.579732] Adjust your secret_interval!

This led us down the rabbit hole of Linux route cache management parameters.

The Linux kernel maintains several mechanisms for managing its routing cache:

  • secret_interval: Periodic full cache flush (default 600 seconds)
  • gc_elasticity: Average bucket depth before expiration (default 8)
  • gc_interval: How often garbage collection runs (default 60 seconds)
  • gc_timeout: How long an unused route stays cached (default 300 seconds)
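
On kernels that still have the IPv4 route cache (it was removed in 3.6), you can read the live values directly. Note that the cache's effective capacity is roughly rhash_entries × gc_elasticity, which is why lowering gc_elasticity relieves chain-length pressure:

# Read the current route cache knobs
sysctl net.ipv4.route.secret_interval
sysctl net.ipv4.route.gc_elasticity
sysctl net.ipv4.route.gc_interval
sysctl net.ipv4.route.gc_timeout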

For HAProxy servers handling high traffic, we settled on these values during the incident (the baseline summarized at the top of this post is a slightly more aggressive variant):

# More aggressive route cache management
echo 1200 > /proc/sys/net/ipv4/route/secret_interval  # Less frequent full flushes
echo 4 > /proc/sys/net/ipv4/route/gc_elasticity       # More aggressive pruning
echo 30 > /proc/sys/net/ipv4/route/gc_interval        # More frequent GC passes
echo 180 > /proc/sys/net/ipv4/route/gc_timeout        # Shorter cache lifetime

# Permanent configuration
cat >> /etc/sysctl.conf <<'EOF'
net.ipv4.route.secret_interval = 1200
net.ipv4.route.gc_elasticity = 4
net.ipv4.route.gc_interval = 30
net.ipv4.route.gc_timeout = 180
EOF
sysctl -p

To verify your changes are effective, monitor these metrics:

# Check route cache statistics
cat /proc/net/stat/rt_cache

# Monitor socket usage
ss -s

# Watch for route hash warnings
dmesg | grep "Route hash chain"

If you prefer not to be aggressive with cache management, you can increase the hash table size at boot:

# Add to GRUB_CMDLINE_LINUX in /etc/default/grub
GRUB_CMDLINE_LINUX="rhash_entries=65536"

# Update GRUB and reboot
update-grub
reboot
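
After the reboot, confirm the parameter actually reached the kernel:

# The kernel command line shows whether the option was passed
grep -o 'rhash_entries=[0-9]*' /proc/cmdline

# Older kernels also log the resulting table size at boot
dmesg | grep -i 'IP route cache hash table'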

For most HAProxy deployments, we recommend:

  1. First increase tcp_mem as we initially did
  2. Adjust gc_elasticity to be more aggressive (4-6)
  3. Increase secret_interval to reduce disruptive full flushes
  4. Only consider rhash_entries if other adjustments prove insufficient
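
A small sanity-check sketch that puts the live numbers side by side (the awk field position assumes the usual /proc/net/sockstat layout, where the "mem" value is the eleventh field on the TCP: line; all figures are in pages):

#!/bin/sh
# Compare live TCP page usage against the tcp_mem thresholds
used=$(awk '/^TCP:/ {print $11}' /proc/net/sockstat)
read low pressure high < /proc/sys/net/ipv4/tcp_mem
echo "TCP pages in use: $used"
echo "tcp_mem low/pressure/high: $low $pressure $high"
[ "$used" -ge "$pressure" ] && echo "WARNING: above the pressure threshold"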

Remember that optimal values depend on your specific traffic patterns and server resources.