Zero-Downtime HAProxy Reload: Achieving 100% Packet Retention During Configuration Updates


When managing high-traffic load balancing setups, even a 0.24% packet loss during HAProxy reloads translates to significant service interruptions. The standard reload methods simply don't cut it for mission-critical applications where five-nines reliability is expected.

The common approach is a soft-stop reload with -sf:

haproxy -D -f /etc/haproxy/haproxy.cfg -p /var/run/haproxy.pid -sf $(cat /var/run/haproxy.pid)

This still causes packet loss because:

  • TCP connections aren't properly drained before termination
  • New process initialization isn't perfectly synchronized with the old process releasing its listeners
  • The Linux kernel doesn't instantly move listening sockets between processes, so connections queued on the old sockets can be dropped
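You can watch this race yourself: during a plain -sf reload, listeners on the service port are briefly owned by both the old and the new haproxy processes. A quick check (run as root so -p can resolve process names; port 80 is assumed here):

watch -n 0.1 "ss -tlnp 'sport = :80'"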

Here's the complete zero-downtime reload procedure we've battle-tested at 50K req/sec:

1. Enable Seamless Reload Support

global
    stats socket /run/haproxy/admin.sock mode 660 level admin
    stats timeout 30s
    daemon
    nbproc 1
    nbthread 4
    maxconn 500000
    tune.ssl.default-dh-param 2048
    server-state-base /var/lib/haproxy/
    server-state-file global
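For the saved state to actually be loaded after a reload, each backend (or the defaults section) also needs load-server-state-from-file, and the state file must be dumped from the stats socket right before the reload (shown in step 3). A minimal backend sketch, matching the backend/web1 and backend/web2 names used in the drain commands below; the addresses are placeholders:

backend backend
    load-server-state-from-file global
    server web1 10.0.0.11:80 check
    server web2 10.0.0.12:80 check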

2. Implement Connection Draining

First set the old instance to drain mode:

echo "set server backend/web1 state drain" | socat /run/haproxy/admin.sock stdio
echo "set server backend/web2 state drain" | socat /run/haproxy/admin.sock stdio

3. The Perfect Reload Command

This sequence maintains all connections:

haproxy -f /etc/haproxy/haproxy.cfg -p /var/run/haproxy.pid \
        -x /run/haproxy/admin.sock \
        -sf $(cat /var/run/haproxy.pid) \
        -L $HOSTNAME

Key parameters:

  • -x: Passes the listening socket file descriptors from the old process to the new one over the stats socket, so nothing has to rebind
  • -L: Sets the local peer name, so stick-table data is synchronized to the new process (requires a peers section)
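Before running it, validate the new configuration and dump the current server states so the incoming process starts with the same up/down/drain view of the backends. A sketch using the socket and state-file paths from step 1:

# Fail fast if the new configuration is invalid
haproxy -c -f /etc/haproxy/haproxy.cfg

# Write the state file consumed by server-state-file / load-server-state-from-file
echo "show servers state" | socat /run/haproxy/admin.sock stdio > /var/lib/haproxy/global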

Add these kernel parameters to /etc/sysctl.conf:

net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_fin_timeout = 10
net.ipv4.tcp_keepalive_time = 600
net.ipv4.ip_local_port_range = 1024 65000
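These only take effect once sysctl reloads them, for example:

# Apply /etc/sysctl.conf without rebooting, then spot-check one value
sysctl -p
sysctl net.ipv4.tcp_tw_reuse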

Test your reload with this traffic simulation:

while true; do
    # Log an epoch timestamp plus the status line (or FAILED) for every attempt
    echo "$(date +%s.%N) $(curl -sI --max-time 1 http://lb.example.com/healthcheck | grep 'HTTP/1.1' || echo FAILED)" | tee -a reload_test.log
    sleep 0.001
done

Then check for FAILED lines and for gaps in the log timestamps around the reload.
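With the timestamped log above, the largest gap between consecutive successful responses is a quick proxy for dropped traffic. This assumes the first field of each line is the epoch timestamp written by the loop:

awk '/HTTP\/1\.1/ { if (prev) { gap = $1 - prev; if (gap > max) max = gap } prev = $1 } END { printf "max gap between successes: %.3fs\n", max }' reload_test.log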

For bulletproof setups, consider:

  • Using keepalived for VIP failover during reloads (a minimal config sketch follows this list)
  • Implementing DNS-based load balancing as backup
  • Setting up BGP anycast for true zero-downtime
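Keepalived can move a virtual IP to a standby node if HAProxy dies mid-reload. A minimal sketch, assuming interface eth0, an example VIP of 192.0.2.10, and pgrep at /usr/bin/pgrep; tune priorities and checks for your environment:

vrrp_script chk_haproxy {
    script "/usr/bin/pgrep -x haproxy"   # succeeds while any haproxy process is alive
    interval 2
    weight -60                           # drop priority so the BACKUP node claims the VIP on failure
}

vrrp_instance VI_1 {
    state MASTER
    interface eth0
    virtual_router_id 51
    priority 150
    advert_int 1
    virtual_ipaddress {
        192.0.2.10/24
    }
    track_script {
        chk_haproxy
    }
}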

To quantify the problem: on production instances handling more than ~1000 req/sec, the standard reload's brief connection interruptions add up quickly. In testing I've observed approximately 0.24% packet loss during standard reloads - equivalent to 12 dropped requests per 5000 at peak load.

HAProxy's -sf flag initiates a "soft stop" by:

1. Launching a new process with the updated config
2. Passing listening sockets via fd inheritance
3. Signaling the old process to stop accepting new connections
4. Maintaining established connections until completion

The gap occurs during steps 2-3 where Linux's TCP stack may still queue packets for the old process. Here's the improved reload sequence:

haproxy -D -f /etc/haproxy/haproxy.cfg -p /var/run/haproxy.pid \
        -x /var/run/haproxy.sock \
        -sf $(cat /var/run/haproxy.pid)

Socket Transfer (-x): Enables seamless fd passing, so the new process takes over the existing listening sockets instead of rebinding

Soft Finish (-sf): Lets the old process finish serving established connections before exiting. (The separate -st flag sends SIGTERM to the old PIDs and terminates them immediately - avoid it when connections must drain.)

For Kubernetes environments using ConfigMaps:

#!/bin/bash
# Reload wrapper script
OLD_PID=$(cat /var/run/haproxy.pid)
CONFIG=/etc/haproxy/haproxy.cfg

# Validate config first
haproxy -c -f $CONFIG || exit 1

# Execute atomic reload
exec haproxy -D -f $CONFIG -p /var/run/haproxy.pid \
-sf $OLD_PID -x /var/run/haproxy.sock
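In a pod, something has to call this wrapper when the mounted ConfigMap changes. A hypothetical sidecar loop, assuming inotify-tools is in the image, the ConfigMap is mounted at /etc/haproxy, and the wrapper above is saved as /usr/local/bin/haproxy-reload.sh:

#!/bin/bash
# kubelet updates a ConfigMap volume by swapping a ..data symlink, which shows
# up as create/moved_to events on the mount directory
while inotifywait -e create,modify,moved_to /etc/haproxy/; do
    /usr/local/bin/haproxy-reload.sh
done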

Packet Capture Test:

tcpdump -i eth0 'port 80' -w reload.pcap &
CAP_PID=$!
# Execute reload command here
kill $CAP_PID   # stop the capture before analyzing it
# Analyze with:
tshark -r reload.pcap -q -z io,stat,0,"COUNT(tcp.analysis.retransmission) tcp.analysis.retransmission"

HAProxy Stats Validation:

echo "show info;show stat" | socat /var/run/haproxy.sock stdio | \
grep -E 'idle_pct|process_num|uptime|^svname'

For financial-grade availability (99.9999%):

  1. Implement DNS-based failover with 1s TTL
  2. Use ECMP routing with multiple HAProxy nodes
  3. Deploy configuration changes in canary stages