How to Configure Nginx Proxy Retry with Delay for Backend Restarts



When your backend service restarts (e.g., during deployments or crashes), Nginx's default behavior is to immediately return a 502 Bad Gateway error if it can't establish a connection. This creates poor user experience and unnecessary failures for transient issues.

Nginx's proxy_next_upstream directive combined with retry parameters solves this elegantly:


location / {
    proxy_pass http://backend;
    proxy_next_upstream error timeout http_502 http_503 http_504;
    proxy_next_upstream_tries 3;
    proxy_next_upstream_timeout 10s;
    proxy_connect_timeout 2s;
    proxy_read_timeout 10s;
}
  • proxy_next_upstream: Specifies which conditions warrant a retry (error, timeout, or specific HTTP status codes)
  • proxy_next_upstream_tries: Limits the total number of attempts, including the first one (N in your question)
  • proxy_next_upstream_timeout: Total time limit for all retry attempts
  • proxy_connect_timeout: Individual connection attempt timeout
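These values interact: if proxy_next_upstream_timeout is shorter than the attempts need, retries get cut off early. A minimal Python sanity check for the worst case, where every attempt fails at the connect stage (the helper names are illustrative, not part of Nginx):

```python
# Sanity-check the retry budget: when failures happen at the connect
# stage, each attempt can burn up to proxy_connect_timeout seconds.

def worst_case_retry_seconds(tries: int, connect_timeout: float) -> float:
    """Upper bound on time spent when every attempt fails to connect."""
    return tries * connect_timeout

def budget_is_sufficient(tries: int, connect_timeout: float,
                         next_upstream_timeout: float) -> bool:
    """True if proxy_next_upstream_timeout covers all attempts."""
    return worst_case_retry_seconds(tries, connect_timeout) <= next_upstream_timeout

# Values from the config above: 3 tries x 2s connect timeout = 6s <= 10s
print(budget_is_sufficient(3, 2.0, 10.0))  # True
```

If attempts fail slowly (read timeouts rather than connect failures), substitute proxy_read_timeout for the per-attempt cost.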

Nginx has no native way to pause between retries. For the M seconds delay requirement, one workaround is a bit of Lua scripting (requires ngx_http_lua_module, e.g. via OpenResty): track failed attempts in a cookie and delay the client's follow-up requests, rather than delaying Nginx's internal retries:


location / {
    access_by_lua_block {
        -- Delay repeat attempts from the same client (M = 2 seconds here)
        local retry_count = tonumber(ngx.var.cookie_retry_count) or 0
        if retry_count > 0 then
            ngx.sleep(2)  -- M seconds delay before re-proxying
        end
    }

    proxy_pass http://backend;
    proxy_next_upstream error timeout http_502 http_503 http_504;
    proxy_next_upstream_tries 3;
    proxy_intercept_errors on;
    error_page 502 = @retry;
}

location @retry {
    internal;
    content_by_lua_block {
        -- Bump the retry counter only on failure; the client's next
        -- request will then be delayed by the access phase above.
        local retry_count = tonumber(ngx.var.cookie_retry_count) or 0
        ngx.header["Set-Cookie"] = "retry_count=" .. (retry_count + 1) .. "; Path=/"
        ngx.exit(ngx.HTTP_BAD_GATEWAY)
    }
}
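The decision the access-phase Lua makes can be summarized in plain Python (a hypothetical model of the logic, not code Nginx runs): read the retry counter from a cookie, delay repeat attempts, and report the updated counter value:

```python
def handle_attempt(cookies, delay_seconds=2.0):
    """Model of the Lua access phase: returns (seconds_to_sleep,
    new_retry_count_cookie_value) for an incoming request."""
    count = int(cookies.get("retry_count", 0))
    sleep_for = delay_seconds if count > 0 else 0.0
    return sleep_for, str(count + 1)

# First attempt: no delay; a repeat attempt carrying retry_count=1 waits.
print(handle_attempt({}))                    # (0.0, '1')
print(handle_attempt({"retry_count": "1"}))  # (2.0, '2')
```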

Nginx Plus does not add a per-retry delay either, but its active health checks detect a failing backend within seconds and take it out of rotation, so retried requests go straight to a healthy peer:


upstream backend {
    zone backend 64k;  # shared-memory zone, required by health_check
    server 127.0.0.1:5000;
}

location / {
    proxy_pass http://backend;
    proxy_next_upstream error timeout http_502 http_503 http_504;
    proxy_next_upstream_tries 3;
    proxy_next_upstream_timeout 10s;
    health_check interval=2s fails=1 passes=1;  # Nginx Plus only
}

Use this simple Python script to simulate a restarting backend:


from flask import Flask
import time

app = Flask(__name__)
start_time = time.time()

@app.route('/')
def hello():
    # Return 503 for first 5 seconds to simulate restart
    if time.time() - start_time < 5:
        return "Backend restarting", 503
    return "Backend available"

if __name__ == '__main__':
    app.run(port=5000)

To recap, the baseline retry configuration with inline comments:

location / {
    proxy_pass http://backend;
    proxy_next_upstream error timeout invalid_header http_502 http_503 http_504;
    proxy_next_upstream_timeout 60s;  # Total retry window
    proxy_next_upstream_tries 3;      # Max retry attempts
    proxy_connect_timeout 5s;         # Per-attempt connection timeout
}

For true resiliency, we should add delays between retries. While Nginx doesn't have built-in retry delay, we can approximate it:

server {
    # Custom error page that shifts the retry to the client
    proxy_intercept_errors on;
    error_page 502 = @retry_backend;

    location @retry_backend {
        # Return 503 with a Retry-After hint; "always" is required
        # so add_header applies to error responses
        add_header Retry-After 5 always;
        return 503;
    }

    location / {
        proxy_pass http://backend;
        proxy_next_upstream error timeout http_502;
        proxy_read_timeout 10s;
    }
}
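This approach only works if clients honor the Retry-After header. A sketch of a cooperating client in Python (do_request here is a placeholder for whatever HTTP call you actually make, injected so the loop stays transport-agnostic):

```python
import time

def fetch_with_retry_after(do_request, max_attempts=4, sleeper=time.sleep):
    """do_request() -> (status, headers). Retries 503 responses,
    waiting the server-advertised Retry-After between attempts."""
    status, headers = do_request()
    for _ in range(max_attempts - 1):
        if status != 503:
            break
        # Fall back to 1 second if the header is missing
        sleeper(float(headers.get("Retry-After", 1)))
        status, headers = do_request()
    return status
```

Injecting the sleep function also makes the loop easy to unit-test without real delays.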

For exact retry timing control, use OpenResty with Lua:

location / {
    content_by_lua_block {
        local max_retries = 3
        local retry_delay = 2  -- seconds between attempts

        for i = 1, max_retries do
            local res = ngx.location.capture("/proxy-pass")

            if res.status < 500 then
                -- Relay the successful response to the client
                ngx.status = res.status
                ngx.print(res.body)
                return
            end

            if i < max_retries then
                ngx.sleep(retry_delay)  -- delay only between attempts
            end
        end

        ngx.exit(ngx.HTTP_BAD_GATEWAY)
    }
}

location /proxy-pass {
    internal;
    proxy_pass http://backend;
}
  • Always set reasonable timeout values (connect_timeout, read_timeout)
  • Monitor retry metrics using Nginx status modules
  • Combine with active health checks (Nginx Plus health_check, or passive max_fails/fail_timeout in open-source Nginx) for optimal backend selection
  • Consider circuit breakers for prolonged outages
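The last point can be sketched briefly. A minimal circuit breaker in Python (illustrative, not a production library): after a threshold of consecutive failures it stops sending requests, then allows a trial request once a cooldown elapses:

```python
import time

class CircuitBreaker:
    """Open the circuit after `threshold` consecutive failures;
    allow a trial request again after `reset_seconds`."""

    def __init__(self, threshold=3, reset_seconds=30.0, clock=time.monotonic):
        self.threshold = threshold
        self.reset_seconds = reset_seconds
        self.clock = clock          # injectable for testing
        self.failures = 0
        self.opened_at = None       # None means the circuit is closed

    def allow_request(self):
        if self.opened_at is None:
            return True
        # Half-open: permit a trial once the cooldown has elapsed
        return self.clock() - self.opened_at >= self.reset_seconds

    def record_success(self):
        self.failures = 0
        self.opened_at = None

    def record_failure(self):
        self.failures += 1
        if self.failures >= self.threshold:
            self.opened_at = self.clock()
```

Wrapping backend calls in allow_request()/record_* guards keeps a prolonged outage from tying up workers in retry loops.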