When aiming for 500K requests per second (RPS), several factors come into play:
Current hardware limitations:
- 100Mbps network interface (max ~12.5MB/s theoretical throughput)
- 4-core CPU with hyperthreading
- Software RAID 1 configuration
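Back-of-the-envelope, that NIC caps throughput well below the target (treating 1 KiB as a whole response, headers included):
# 100 Mbps ≈ 12.5 MB/s on the wire, before TCP/IP overhead
echo $(( 100 * 1000 * 1000 / 8 ))        # 12500000 bytes/s
echo $(( 100 * 1000 * 1000 / 8 / 1024 )) # ~12207 one-KiB responses/s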
Here's an optimized nginx.conf for high throughput:
worker_processes auto;           # one worker per CPU core
worker_rlimit_nofile 1000000;    # raise the per-worker open-file limit
events {
    worker_connections 65536;
    multi_accept on;             # accept as many pending connections as possible per wakeup
    use epoll;
}
http {
    access_log off;              # logging is a major per-request cost at this rate
    error_log /var/log/nginx/error.log crit;
    sendfile on;
    tcp_nopush on;
    tcp_nodelay on;
    keepalive_timeout 10;
    keepalive_requests 100000;   # avoid re-handshaking between benchmark requests
    open_file_cache max=200000 inactive=20s;
    open_file_cache_valid 30s;
    open_file_cache_min_uses 2;
    open_file_cache_errors on;
    # Disable all features that aren't needed for static content
    server_tokens off;
    gzip off;
    server {
        listen 80 reuseport;     # per-worker listening sockets (Linux 3.9+, nginx 1.9.1+)
        location / {
            root /var/www/html;
            try_files $uri =404;
        }
    }
}
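After editing, validate the syntax and reload without dropping connections (standard nginx CLI):
nginx -t && nginx -s reload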
Add these to /etc/sysctl.conf:
net.ipv4.ip_local_port_range = 1024 65535
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_fin_timeout = 15
net.core.somaxconn = 65535
net.core.netdev_max_backlog = 4096
net.ipv4.tcp_max_syn_backlog = 4096
net.ipv4.tcp_syncookies = 1
fs.file-max = 2097152
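Apply the settings without a reboot, and spot-check that a value took effect:
sysctl -p /etc/sysctl.conf
sysctl net.core.somaxconn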
Different tools yield different results:
# Apache Benchmark (ab)
ab -n 1000000 -c 500 http://localhost/test.txt
# wrk (more modern tool)
wrk -t8 -c1000 -d30s http://localhost/test.txt
# h2load (for HTTP/2 testing; the listener needs http2 enabled)
h2load -n 1000000 -c 1000 -m 100 http://localhost/test.txt
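These runs assume the test file exists; a simple way to create a 1KB one under the root from the config above:
# 1KB of random data at the path the benchmarks request
dd if=/dev/urandom of=/var/www/html/test.txt bs=1024 count=1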
When testing locally:
- Disable all logging
- Serve test files from a RAM disk (example below)
- Pin processes to cores with taskset (example below)
- Consider using multiple client IP addresses, since a single source IP exhausts its ephemeral ports at roughly 64K concurrent connections to one destination
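A minimal sketch of the RAM disk and pinning steps (mount point, size, and core list are illustrative):
# Serve test files from RAM; the mount hides existing files,
# so re-create test.txt afterwards
mount -t tmpfs -o size=64M tmpfs /var/www/html
# Keep the load generator off the cores nginx is using
taskset -c 4-7 wrk -t4 -c1000 -d30s http://localhost/test.txt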
For production environments nearing 500K RPS:
# Multiple worker processes pinned to CPUs
worker_processes 8;
worker_cpu_affinity auto;
# TCP optimizations (reuseport needs Linux 3.9+ and nginx 1.9.1+)
listen 80 reuseport so_keepalive=on backlog=65535;
# Zero-copy transfer for small files; files over 4m bypass the page cache via O_DIRECT
sendfile on;
directio 4m;
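With reuseport, each worker gets its own listening socket, so you can sanity-check that it took effect by counting the LISTEN entries on port 80:
ss -ltn | grep ':80 '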
Essential monitoring commands:
# CPU usage
mpstat -P ALL 1
# Network throughput
iftop -i eth0 -n
# Process limits (use the master PID; pgrep alone returns several)
cat /proc/$(pgrep -o nginx)/limits
# Socket statistics
ss -s
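For request counts from nginx itself, the stock stub_status module is enough (the location name here is arbitrary):
location /nginx_status {
    stub_status;
    allow 127.0.0.1;
    deny all;
}
# sample it during a run: curl -s http://localhost/nginx_status
Diffing the "requests" counter between two samples gives the achieved RPS.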
When pushing a web server to its absolute limits, every configuration parameter matters. Let's take a closer look at how far an Xeon E3-1270 running Nginx can be pushed toward 500,000 requests per second of static content. To recap, the setup has decent specs:
Intel® Xeon® E3-1270, 4 cores (8 HT threads) @ 3.4 GHz
24GB DDR3 ECC RAM
100Mbps network (bottleneck warning)
As calculated earlier, the 100Mbps NIC theoretically maxes out around 12,500 requests/second for ~1KB responses, so for local benchmarking we'll ignore this limitation.
Your current config shows good practices, but for benchmark runs a few settings change:
worker_processes auto;           # Match CPU cores
worker_rlimit_nofile 1000000;    # Increase open file limit
events {
    worker_connections 65536;
    multi_accept on;
    use epoll;
}
http {
    sendfile on;
    tcp_nopush on;
    tcp_nodelay on;
    keepalive_requests 100000;
    keepalive_timeout 65;
    access_log off;              # Disable during benchmarks
    error_log /dev/null crit;    # Minimal error logging
    open_file_cache max=200000 inactive=20s;
    open_file_cache_valid 30s;
    open_file_cache_min_uses 2;
    open_file_cache_errors off;
}
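Note that worker_rlimit_nofile only helps if the OS lets nginx open that many descriptors; on systemd-based distros that usually means a unit override (paths are the usual defaults, adjust to your system):
# /etc/systemd/system/nginx.service.d/limits.conf
[Service]
LimitNOFILE=1048576
# then: systemctl daemon-reload && systemctl restart nginx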
For heavy benchmark runs, extend the earlier sysctl block with larger backlogs and more TIME_WAIT headroom (keys already set above are omitted):
net.ipv4.tcp_max_syn_backlog = 40000
net.ipv4.tcp_max_tw_buckets = 2000000
net.core.netdev_max_backlog = 65536
net.ipv4.tcp_no_metrics_save = 1
# Do not set net.ipv4.tcp_tw_recycle: it broke clients behind NAT
# and was removed entirely in Linux 4.12.
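During a run it's worth watching how close you get to the TIME_WAIT bucket limit:
ss -tan state time-wait | wc -l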
As the benchmark commands above suggest, different tools yield very different numbers on the same box:
- ab: Simple but limited (~35K RPS)
- wrk: More efficient (~165K RPS)
- vegeta: Distributed testing capability
- k6: Modern load testing tool
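Of these, vegeta can be driven entirely from the shell; a one-liner looks roughly like this (rate and duration are illustrative):
echo "GET http://localhost/test.txt" | vegeta attack -rate=10000 -duration=30s | vegeta report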
For extreme performance:
# Use memory-backed filesystem for temporary files
mount -t tmpfs -o size=512M tmpfs /var/lib/nginx/tmp
# CPU affinity binding (nginx.conf, main context)
worker_cpu_affinity auto;
# Cap each sendfile() call so one fast client can't monopolize a worker
sendfile_max_chunk 512k;
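To keep the tmpfs mount across reboots, add a matching /etc/fstab entry:
tmpfs /var/lib/nginx/tmp tmpfs size=512M 0 0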
Run these side by side, in separate terminals, during a load test:
dstat -cmdn --top-cpu --top-mem --top-io
perf top -p $(pgrep -d, nginx)
iftop -nNP
Through these optimizations, you should see significant improvements in request handling capacity. The exact numbers will depend on your specific workload and environment.