When your low-traffic Nginx server suddenly complains about worker_connections
being exhausted despite minimal actual load, you're likely dealing with connection leaks. The smoking gun appears when you check for lingering connections:
# lsof | grep nginx | grep CLOSE_WAIT | wc -l
1271
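Bare lsof walks every open file on the box, so on a busy host a state-filtered ss query is a much faster cross-check:
# Same count, without scanning every open file
ss -tnp state close-wait | grep -c nginx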
A CLOSE_WAIT state indicates the remote end has closed the connection, but your Nginx process has not yet closed its side of the socket. Common causes include:
- Application backends not properly closing connections
- Keepalive misconfigurations
- Upstream timeouts not being enforced
To investigate connection states in real-time:
# Active connections breakdown
ss -tanp | grep nginx | awk '{print $1}' | sort | uniq -c
# Detailed connection tracking (replace PID)
strace -p [nginx_worker_pid] -e trace=network -s 10000
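To see which peers the leaked sockets point at, and therefore which upstream or client is responsible, group them by remote address; note that ss spells the state CLOSE-WAIT, with a hyphen:
# Leaked sockets grouped by peer address
ss -tanp | grep nginx | grep CLOSE-WAIT | awk '{print $5}' | sort | uniq -c | sort -rn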
These settings often contribute to connection leaks:
# Bad: HTTP/1.0 to the upstream cannot reuse connections
proxy_http_version 1.0;
proxy_set_header Connection "close";  # nginx's default for proxied requests; forces a new upstream connection each time

# Good: enable upstream keepalive
proxy_http_version 1.1;
proxy_set_header Connection "";       # strip the client's Connection header
proxy_set_header Keep-Alive "";       # and its Keep-Alive header
keepalive_timeout 75s;
keepalive_requests 100;
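These directives only pay off if the upstream block itself caches idle connections; a minimal sketch, assuming an upstream named backend on a placeholder address:
upstream backend {
    server 127.0.0.1:8080;   # placeholder backend address
    keepalive 16;            # idle connections cached per worker process
}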
For PHP/Python/Node backends, pair the connection headers above with sensible timeouts and buffering in the proxy location:
location @proxy {
    proxy_pass http://backend;
    proxy_connect_timeout 5s;
    proxy_read_timeout 30s;
    proxy_send_timeout 30s;
    proxy_next_upstream error timeout invalid_header;
    proxy_buffer_size 4k;
    proxy_buffers 8 16k;
    reset_timedout_connection on;  # Critical: resets connections from timed-out clients instead of letting them linger
}
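With that in place, you can confirm connections to the backend are being reused rather than piling up (the backend port 8080 below is only an example):
# A small, stable count of established sockets toward the backend means reuse is working
watch -n 1 "ss -tn state established '( dport = :8080 )' | tail -n +2 | wc -l"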
Create a simple monitoring script (/usr/local/bin/nginx_conn_check):
#!/bin/bash
THRESHOLD=50
# Note: ss reports the state as CLOSE-WAIT (hyphen), unlike netstat's CLOSE_WAIT
COUNT=$(ss -tanp | grep nginx | grep -c CLOSE-WAIT)
if [ "$COUNT" -gt "$THRESHOLD" ]; then
    echo "$(date) - Found $COUNT CLOSE_WAIT connections" >> /var/log/nginx_conn.log
    # Optionally trigger a soft reload
    nginx -s reload
fi
Add to cron:
*/5 * * * * /usr/local/bin/nginx_conn_check
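Make the script executable, or the cron job will fail silently:
chmod +x /usr/local/bin/nginx_conn_check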
A count in the four digits (1271 in the lsof output above) points to a systemic connection-handling problem rather than a transient blip, so it is worth widening the investigation beyond a single location block.
To fully understand the situation, run these commands:
# Check overall connection states
ss -antp | grep nginx
# Monitor active connections in real-time
watch -n 1 "netstat -anp | grep nginx"
# Detailed analysis of one worker (the master process handles no traffic,
# so "pgrep nginx | head -1" would usually pick the wrong PID)
strace -p "$(pgrep -f 'nginx: worker' | head -1)" -e trace=network
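To trace every worker at once rather than a single one (strace accepts a comma-separated PID list):
# -f follows forks; pgrep -d, joins the worker PIDs with commas
strace -f -p "$(pgrep -d, -f 'nginx: worker')" -e trace=network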
The most frequent causes include:
- Improper keepalive_timeout settings
- Upstream servers not closing connections properly
- Missing or incorrect proxy settings
- Client-side network issues
Add these to your nginx.conf:
http {
    # Send RST to timed-out clients so their sockets are freed immediately
    reset_timedout_connection on;

    # Client keepalive
    keepalive_timeout 30s;
    keepalive_requests 100;

    # Upstream connection management
    proxy_connect_timeout 5s;
    proxy_send_timeout 10s;
    proxy_read_timeout 30s;
    proxy_next_upstream_timeout 0;
    proxy_next_upstream error timeout invalid_header;

    # Buffer management
    proxy_buffers 16 16k;
    proxy_buffer_size 16k;
}
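After editing, validate the configuration before reloading so a typo doesn't take the server down:
nginx -t && nginx -s reload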
Kernel tuning will not clear CLOSE_WAIT sockets by itself (only closing them in user space does that), but it prevents the related port and backlog exhaustion. Adjust these parameters in /etc/sysctl.conf:
# Allow reuse of TIME_WAIT sockets for new outgoing connections
net.ipv4.tcp_tw_reuse = 1
# Raise the maximum listen backlog
net.core.somaxconn = 65535
# Widen the ephemeral port range
net.ipv4.ip_local_port_range = 1024 65535
Then apply the changes:
sysctl -p
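Confirm the new values are active:
sysctl net.ipv4.tcp_tw_reuse net.core.somaxconn net.ipv4.ip_local_port_range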
A system-wide variant of the monitoring script above (same path, /usr/local/bin/nginx_conn_check) uses a higher threshold and reloads via systemd:
#!/bin/bash
THRESHOLD=500
# System-wide count; again, ss spells the state CLOSE-WAIT
COUNT=$(ss -ant | grep -c CLOSE-WAIT)
if [ "$COUNT" -gt "$THRESHOLD" ]; then
    echo "$(date) - CLOSE_WAIT connections ($COUNT) exceeded threshold" >> /var/log/nginx_conn.log
    systemctl reload nginx
fi
Add to cron: * * * * * /usr/local/bin/nginx_conn_check