You're seeing intermittent 502 Bad Gateway errors with the specific error message "no live upstreams while connecting to upstream". This occurs:
- During page transitions between site sections
- On the homepage when accessed via internal redirects
- Particularly affecting JavaScript file delivery
Your current load balancing setup shows several potential issues:
upstream example.com {
    # ip_hash;
    server php01 max_fails=3 fail_timeout=15s;
    server php02 max_fails=3 fail_timeout=15s;
}
server {
    listen IP:80;
    server_name example.com;
    location / {
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_pass http://$server_name/$uri;
        # ... other proxy settings ...
    }
}
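After analyzing similar cases, these are the most likely causes:
1. Upstream Health Checks
The max_fails and fail_timeout parameters control passive health checks: after 3 failed attempts within 15 seconds, a server is marked unavailable for the next 15 seconds. These values might be too aggressive. When both servers are marked unavailable at the same time, Nginx has nowhere to route traffic and reports "no live upstreams".
2. DNS Resolution
Using $server_name in proxy_pass makes Nginx evaluate the target at request time: the name is first looked up among the defined upstream groups and otherwise needs a resolver. The trailing /$uri also produces a doubled slash and drops the query string. Pointing proxy_pass at a named upstream with fixed addresses avoids both problems.
3. Session Persistence
The commented-out ip_hash suggests you considered session stickiness, which might be necessary for your application.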
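To see whether the outages line up with moments when both servers were marked unavailable, it can help to count the "no live upstreams" errors per minute. The following is a minimal sketch, assuming the default nginx error log format (date and time in the first two fields); the function name no_live_upstreams_per_minute is hypothetical.

```shell
# Count "no live upstreams" errors per minute in an nginx error log.
# Assumes the default error log format: "YYYY/MM/DD HH:MM:SS [level] ...".
# Usage: no_live_upstreams_per_minute /var/log/nginx/error.log
no_live_upstreams_per_minute() {
    awk '/no live upstreams/ {
        # $1 is the date (YYYY/MM/DD), $2 the time (HH:MM:SS)
        count[$1 " " substr($2, 1, 5)]++
    }
    END { for (m in count) print m, count[m] }' "$1" | sort
}
```

Bursts concentrated in a few minutes would point at both backends failing together rather than at a steady trickle of individual errors.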
Here's an improved configuration that addresses these issues:
# Assumes a cache zone named my_js_cache is declared in the http context, e.g.
# proxy_cache_path /var/cache/nginx/js keys_zone=my_js_cache:10m;  (example path)
upstream backend_cluster {
    ip_hash;
    server 192.168.1.10:80 max_fails=3 fail_timeout=30s;
    server 192.168.1.11:80 max_fails=3 fail_timeout=30s;
    keepalive 32;
}
server {
    listen 80;
    server_name example.com;
    # Declared at server level so that both locations inherit them
    proxy_http_version 1.1;
    proxy_set_header Connection "";
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
    location / {
        proxy_pass http://backend_cluster;
        # Passive failover: retry the next server on these failures
        proxy_next_upstream error timeout http_502 http_503;
        # Kept above proxy_connect_timeout so a connect timeout can still be retried
        proxy_next_upstream_timeout 10s;
        proxy_next_upstream_tries 3;
        # Timeout configurations
        proxy_connect_timeout 5s;
        proxy_send_timeout 10s;
        proxy_read_timeout 30s;
    }
    # Special handling for JS files
    location ~* \.js$ {
        proxy_cache my_js_cache;
        proxy_cache_valid 200 302 1h;
        proxy_cache_bypass $http_cache_control;
        expires 1h;
        add_header Cache-Control "public";
        proxy_pass http://backend_cluster;
    }
}
Key changes:
- Explicit IP addresses instead of hostnames
- Added connection keepalive
- Passive failover through proxy_next_upstream
- Special handling for JavaScript assets
- More realistic timeout values
- Enabled ip_hash for session consistency
After implementing changes:
- Test with curl -I http://example.com to check headers
- Monitor error logs: tail -f /var/log/nginx/error.log
- Verify the upstream definition that was loaded: nginx -T | grep -A10 upstream
If issues persist:
# Check active connections on port 80
ss -ant | grep ':80'
# Verify upstream reachability
for ip in 192.168.1.10 192.168.1.11; do
    curl -v --connect-timeout 3 "http://$ip/health-check"
done
# Nginx debug logging (nginx.conf directives, not shell commands;
# the debug level requires a binary built with --with-debug)
error_log /var/log/nginx/debug.log debug;
rewrite_log on;
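Because the errors are intermittent, a single curl per backend can easily miss them. The reachability loop above can be extended to probe each backend several times, sketched here under the assumption that the backends answer on the /health-check path used above; the function name probe_backends is hypothetical.

```shell
# Probe each backend several times and print the HTTP status per attempt.
# Usage: probe_backends 5 192.168.1.10 192.168.1.11
probe_backends() {
    attempts=$1
    shift
    for host in "$@"; do
        i=1
        while [ "$i" -le "$attempts" ]; do
            # -o /dev/null discards the body; -w prints only the status code.
            # curl reports 000 when no HTTP response was received.
            code=$(curl -s -o /dev/null -w '%{http_code}' \
                --connect-timeout 3 "http://$host/health-check")
            echo "$host attempt=$i status=$code"
            i=$((i + 1))
        done
    done
}
```

A backend that returns 000 or 5xx on some attempts is a likely candidate for being marked unavailable by max_fails.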
When dealing with intermittent 502 errors in an Nginx load balancing setup, the pattern of failures narrows down the likely cause:
- First homepage request succeeds (indicates basic connectivity works)
- Subsequent page transitions fail (suggestive of connection handling issues)
- JavaScript files occasionally fail (points to timeout/keepalive problems)
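Whether the failures really cluster on JavaScript files can be checked against the error log instead of being inferred from the browser. This is a rough sketch, assuming the default nginx error log format, in which each entry carries a request: "METHOD /path HTTP/x.y" field; the function name failed_paths is hypothetical.

```shell
# Tally request paths (query string stripped) that hit "no live upstreams".
# Output: "<count> <path>", most frequent first.
# Usage: failed_paths /var/log/nginx/error.log
failed_paths() {
    grep 'no live upstreams' "$1" |
        sed -n 's/.*request: "[A-Z]* \([^ ?"]*\).*/\1/p' |
        sort | uniq -c |
        awk '{print $1, $2}' |
        sort -k1,1nr -k2,2
}
```

If .js paths dominate the output, the asset-specific handling is worth prioritizing; an even spread suggests the problem sits at the upstream level.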
The current setup has several problematic areas:
upstream example.com {
    # ip_hash;
    server php01 max_fails=3 fail_timeout=15s;
    server php02 max_fails=3 fail_timeout=15s;
}
Key problems in this configuration:
- Missing resolve parameter for dynamic DNS updates
- No health check configuration
- Basic round-robin without session persistence
Add these directives to your nginx configuration:
proxy_connect_timeout 5s;
proxy_send_timeout 10s;
proxy_read_timeout 30s;
keepalive_timeout 60s;
keepalive_requests 100;
Here's the optimized version with all necessary fixes:
upstream example.com {
    zone backend 64k;
    # resolve needs a resolver directive and the shared-memory zone above;
    # in open-source nginx it is available from 1.27.3 (earlier: NGINX Plus only)
    server php01:80 max_fails=3 fail_timeout=15s resolve;
    server php02:80 max_fails=3 fail_timeout=15s resolve;
    keepalive 32;
    keepalive_timeout 60s;
}
server {
    listen IP:80;
    server_name example.com;
    # Connection handling
    proxy_http_version 1.1;
    proxy_set_header Connection "";
    # Timeout settings
    proxy_connect_timeout 5s;
    proxy_send_timeout 10s;
    proxy_read_timeout 30s;
    # Standard proxy headers
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Proto $scheme;
    location / {
        proxy_pass http://example.com;
        proxy_next_upstream error timeout invalid_header http_500 http_502 http_503 http_504;
        proxy_next_upstream_timeout 5s;
        proxy_next_upstream_tries 3;
    }
    # Static asset handling
    location ~* \.(js|css|png|jpg|jpeg|gif|ico)$ {
        expires 30d;
        add_header Cache-Control "public, no-transform";
        proxy_pass http://example.com;
    }
}
Verify your configuration with these commands:
# Check nginx syntax
sudo nginx -t
# Check connection counters (requires a stub_status location at /nginx_status;
# it does not report per-upstream state)
curl http://localhost/nginx_status
# Monitor established TCP connections on port 80
ss -ant | grep ESTAB | grep ':80'
# Check error patterns
tail -f /var/log/nginx/example.com.error | grep -E '502|upstream'
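For more reliable upstream monitoring, active health checks are available in NGINX Plus (not in open-source nginx):
match server_ok {
    status 200-399;
    header Content-Type ~ "text/html";
    body !~ "maintenance";
}
upstream example.com {
    zone backend 64k;
    server php01:80;
    server php02:80;
}
server {
    location / {
        proxy_pass http://example.com;
        # Probe every 5s; 2 passes mark a server healthy, 3 failures unhealthy
        health_check interval=5 passes=2 fails=3 match=server_ok;
    }
}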
Common pitfalls:
- Not reusing keepalive connections between Nginx and backend
- Setting proxy timeouts too low for PHP applications
- Missing proxy_http_version 1.1 directive
- Overlooking DNS caching issues (use resolve when backends are addressed by hostname)
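The keepalive-related items in the list above can be spot-checked with a rough text match over the loaded configuration. This is only a heuristic sketch: it looks for the directives anywhere in the text and does not understand nginx contexts or inheritance, so treat its verdict as a hint. The function name check_keepalive_config is hypothetical.

```shell
# Read nginx configuration text on stdin and report whether upstream
# keepalive has the companion directives it needs to take effect.
# Usage: nginx -T 2>/dev/null | check_keepalive_config
check_keepalive_config() {
    conf=$(cat)
    if ! printf '%s\n' "$conf" |
        grep -Eq '^[[:space:]]*keepalive[[:space:]]+[0-9]+;'; then
        echo "no upstream keepalive directive found"
    elif ! printf '%s\n' "$conf" |
        grep -Eq '^[[:space:]]*proxy_http_version[[:space:]]+1\.1;'; then
        echo "keepalive set, but proxy_http_version 1.1 is missing"
    elif ! printf '%s\n' "$conf" |
        grep -Eq '^[[:space:]]*proxy_set_header[[:space:]]+Connection[[:space:]]+"";'; then
        echo "keepalive set, but the Connection header is not cleared"
    else
        echo "keepalive configuration looks consistent"
    fi
}
```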