Debugging TCP RST Packets Under High Load: Nginx Optimization and Kernel Tuning Guide


During recent load testing on my 2GB Linode VPS running Nginx on Ubuntu 14.04, I encountered an intriguing bottleneck: TCP connection resets (RST flags) appearing at approximately 2000 concurrent connections, despite having sufficient CPU, memory, and file descriptor headroom. Here's my deep dive into solving this performance mystery.

When facing TCP resets under load, we need to examine multiple layers:

# Basic system limits check
ulimit -n
cat /proc/sys/fs/file-nr

# Kernel connection tracking
sysctl net.netfilter.nf_conntrack_max
dmesg | grep nf_conntrack
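
If connection tracking is the limiting factor, the live entry count will sit near the maximum and dmesg will log "nf_conntrack: table full, dropping packet". Comparing the two directly (assuming the nf_conntrack module is loaded):

# Live conntrack entries versus the configured ceiling
cat /proc/sys/net/netfilter/nf_conntrack_count
cat /proc/sys/net/netfilter/nf_conntrack_max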

Beyond the standard somaxconn and tcp_max_syn_backlog, these parameters proved crucial:

# Add to /etc/sysctl.conf
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_fin_timeout = 30
net.ipv4.ip_local_port_range = 1024 65000
net.ipv4.tcp_max_orphans = 65536
net.ipv4.tcp_abort_on_overflow = 0
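
These settings only take effect once sysctl reloads its configuration; a quick way to apply them and spot-check one value:

# Apply /etc/sysctl.conf and confirm one of the new values
sudo sysctl -p
sysctl net.ipv4.tcp_fin_timeout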

The listen directive's queue depth needs explicit configuration - nginx defaults to a backlog of 511 on Linux:

server {
    listen 80 backlog=4096 reuseport;
    # rest of config...
}

The reuseport option enables the SO_REUSEPORT socket option, giving each worker process its own listening socket and accept queue. Note that it requires nginx 1.9.1 or newer, so the stock Ubuntu 14.04 package (1.4.x) has to be replaced with a build from the nginx.org repository before this option will parse.
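
The kernel also clamps the effective accept queue at net.core.somaxconn regardless of the backlog argument, so it's worth reading back what a listening socket actually got; for sockets in LISTEN state, ss shows the configured backlog in the Send-Q column:

# Send-Q for a LISTEN socket = its accept-queue limit; Recv-Q = current occupancy
ss -lnt 'sport = :80'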

Alongside the tcpdump capture, these commands made it easy to track the connection-state distribution and port usage during the test:

# Monitor connection states
ss -s
netstat -ant | awk '{print $6}' | sort | uniq -c

# Track port allocation
cat /proc/sys/net/ipv4/ip_local_port_range
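
Port exhaustion tends to show up as a TIME_WAIT count creeping toward the size of the local port range (roughly 64k with the range above); a rough check to run during the test:

# Sockets parked in TIME_WAIT, to compare against the ephemeral port range
ss -tan state time-wait | wc -l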

Cloud VPS instances often need additional tuning due to virtualization overhead:

# Virtual NIC optimizations
ethtool -K eth0 tso on gso on gro on
# Receive Flow Steering needs the global flow table sized as well as the per-queue one
echo 32768 > /proc/sys/net/core/rps_sock_flow_entries
echo 2048 > /sys/class/net/eth0/queues/rx-0/rps_flow_cnt
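
It's also worth confirming which offloads the virtual NIC actually accepted, since virtio and Xen interfaces often report some features as fixed:

# List offload features and whether each is currently enabled
ethtool -k eth0 | grep -E 'segmentation|offload'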

The packet capture revealed an important insight - some RST packets originated from the testing service's network. This suggests potential middlebox interference. Testing from multiple geographic locations helped confirm this behavior pattern.
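
To separate locally generated resets from injected ones, capturing only RST segments on the server (and, if possible, at a second vantage point) and comparing TTLs and sequence numbers works well; a sketch assuming the interface is eth0:

# Capture only RST segments for later comparison in Wireshark/tcpdump
sudo tcpdump -ni eth0 -w rst-capture.pcap 'tcp[tcpflags] & (tcp-rst) != 0'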

Here's the complete set of changes that resolved my specific case:

# /etc/sysctl.conf additions
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
net.ipv4.tcp_mem = 8388608 12582912 16777216
# NOTE: on older kernels (including Ubuntu 14.04's 3.13), backlog values above
# 65535 are silently truncated, so 65535 is the practical ceiling here
net.ipv4.tcp_max_syn_backlog = 65535
net.core.somaxconn = 65535
net.core.netdev_max_backlog = 500000
net.ipv4.tcp_slow_start_after_idle = 0
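
After reloading with sysctl -p, read the values back, since a silently truncated somaxconn is easy to miss:

# Confirm what the kernel actually accepted
sudo sysctl -p
sysctl net.core.somaxconn net.ipv4.tcp_max_syn_backlog net.core.netdev_max_backlog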

For Nginx workers:

worker_processes auto;
worker_rlimit_nofile 100000;
events {
    worker_connections 50000;
    multi_accept on;
    use epoll;
}
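
After editing, validate and reload, then confirm that a worker actually picked up the higher descriptor limit (process title assumed to be the stock "nginx: worker process"):

# Validate config, reload gracefully, then check a worker's open-files limit
sudo nginx -t && sudo service nginx reload
grep 'open files' /proc/$(pgrep -f 'nginx: worker' | head -n1)/limits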

When pushing web servers to their limits, few things are more frustrating than unexplained TCP resets during high-traffic scenarios. Symptoms like these - connections failing around 2000 concurrent requests despite available system resources - point to either a kernel-level network limitation or infrastructure-level throttling. Recapping the diagnostic checks in a more systematic order:

# Check current connection tracking limits
sysctl net.netfilter.nf_conntrack_max
sysctl net.nf_conntrack_max

# Verify ephemeral port range
sysctl net.ipv4.ip_local_port_range

# Check socket buffers
sysctl net.core.rmem_max
sysctl net.core.wmem_max

If the aggressive values above feel like overkill, a more conservative set of sysctl adjustments for high-concurrency scenarios looks like this:

# Add these to /etc/sysctl.conf
net.ipv4.tcp_max_syn_backlog = 4096
net.core.somaxconn = 4096
net.core.netdev_max_backlog = 5000
net.ipv4.tcp_max_tw_buckets = 1440000
net.ipv4.tcp_tw_reuse = 1
net.ipv4.ip_local_port_range = 1024 65535

The matching, more conservatively sized nginx directives for handling the connection surge:

worker_processes auto;
worker_rlimit_nofile 100000;
events {
    worker_connections 4000;
    multi_accept on;
    use epoll;
}

When the TCP dumps show RST packets originating from the client side (as they did here), the path between the endpoints needs examining (see the traceroute sketch after this list):

  • Intermediate network devices (load balancers, firewalls)
  • Cloud provider network policies
  • Client-side connection limitations
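
A TCP-mode traceroute from the load-testing location toward the listening port is a quick way to see the path a middlebox would sit on (the hostname below is a placeholder; mtr-tiny provides the tool on Ubuntu):

# TCP traceroute to port 80; some middleboxes only show up in TCP mode
sudo apt-get install -y mtr-tiny
sudo mtr --report --tcp --port 80 yourserver.example.com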

Consider adding these monitoring commands during your next test:

# Real-time connection tracking
watch -n 1 "ss -s | grep -i total"

# Socket state monitoring
watch -n 1 "netstat -ant | awk '{print \$6}' | sort | uniq -c"

# Nginx active connections
watch -n 1 "curl -s http://localhost/nginx_status"

On Linode/VPS environments, be aware of:

  • Virtual network interface throughput limits
  • Hypervisor-level network throttling
  • Account-level connection rate limits

For more reliable testing, consider these tools:

# wrk isn't packaged for Ubuntu 14.04; build it from source
sudo apt-get install -y build-essential libssl-dev git
git clone https://github.com/wg/wrk.git && cd wrk && make

# Sample test command
wrk -t12 -c4000 -d30s http://yourserver/testfile.txt
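
The test client can be the bottleneck too; before blaming the server, raise the generator's descriptor limit and watch the "Socket errors" line that wrk prints when connections fail on its side:

# On the load-generating machine, not the server
ulimit -n 65535   # needs a matching hard limit; run as root or adjust limits.conf otherwise
wrk -t12 -c4000 -d30s --latency http://yourserver/testfile.txt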

Before your next load test, verify the following (the script sketch after this list bundles the checks):

  1. All sysctl changes are applied (sysctl -p)
  2. Nginx worker limits match system limits
  3. No intermediate devices are rate-limiting
  4. Testing tool isn't hitting its own limits
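
A minimal pre-flight script covering those checks (nothing here is specific to my setup beyond the sysctl names used in this guide):

#!/bin/bash
# Quick pre-flight sanity check before starting a load test
echo "open files limit for this shell: $(ulimit -n)"
sysctl net.core.somaxconn net.ipv4.tcp_max_syn_backlog net.ipv4.ip_local_port_range
sysctl net.netfilter.nf_conntrack_max 2>/dev/null || echo "conntrack module not loaded"
ss -s | grep -i total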