When working with computationally intensive backend services, unrestricted parallel requests can quickly overwhelm your servers. Unlike simple rate limiting (requests per second), we need to control the actual number of simultaneous active connections to each backend instance.
While nginx doesn't have a direct equivalent to Varnish's .max_connections, we can achieve similar functionality through the upstream module:
upstream myservice {
    zone myservice 64k;  # shared memory zone so max_conns is enforced across all workers
    server 127.0.0.1:80 max_conns=10;
    server 123.123.123.123:80 max_conns=10;
    queue 100 timeout=60s;  # NGINX Plus only
}
The max_conns parameter (available in open source nginx since 1.11.5; before that it was NGINX Plus only) limits active connections to each backend server, while the queue directive (still NGINX Plus only) handles request overflow:
- max_conns: maximum concurrent connections per backend server; without a shared memory zone the limit is enforced per worker process
- queue: number of requests to hold when all backends are saturated
- timeout: how long a request may wait in the queue before it is rejected
Here's a production-tested configuration that implements proper concurrency limiting:
http {
    proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=my_cache:10m inactive=60m;

    upstream computational_backend {
        zone backend_zone 64k;
        server 10.0.0.1:8000 max_conns=15;
        server 10.0.0.2:8000 max_conns=15;
        server 10.0.0.3:8000 max_conns=15;
        queue 500 timeout=30s;  # NGINX Plus only
    }

    server {
        listen 80;

        location / {
            proxy_pass http://computational_backend;
            proxy_cache my_cache;
            proxy_cache_lock on;  # collapse concurrent misses for the same key into one upstream request
            proxy_cache_valid 200 5m;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;

            # Retry the next server on failure instead of dropping the request
            proxy_next_upstream error timeout http_502 http_503;
        }
    }
}
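One adjacent setting worth checking for computational workloads: nginx's proxy_read_timeout defaults to 60s, so responses that take longer are cut off and counted as upstream failures. A minimal sketch, with illustrative values:

location / {
    proxy_pass http://computational_backend;
    proxy_read_timeout 300s;   # give long-running computations time to finish
    proxy_connect_timeout 5s;  # but fail over quickly when a backend is unreachable
}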
Check your nginx status page for a first view of connection load (note that stub_status reports totals for the whole instance, not per-upstream queues):
server {
    listen 8080;

    location /status {
        stub_status;
    }
}
Key metrics to watch:
- Active connections to each backend
- Queue size and timeout rates
- Cache hit ratios
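The first two of those are only exposed per upstream through the NGINX Plus API module; stub_status cannot show them. A minimal sketch, assuming NGINX Plus and its api directive:

server {
    listen 8080;

    location /api {
        api;  # NGINX Plus REST API
        allow 127.0.0.1;
        deny all;
    }
}

A request such as GET /api/9/http/upstreams/computational_backend then returns per-server active connections and, when queue is configured, queue depth and overflow counters (the version segment in the path depends on your Plus release).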
For comprehensive protection, combine concurrency limiting with traditional rate limiting:
limit_req_zone $binary_remote_addr zone=api_limit:10m rate=100r/s;

location /api/ {
    limit_req zone=api_limit burst=50;
    proxy_pass http://computational_backend;
}
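With burst=50, nginx queues up to 50 requests per client IP above the steady 100r/s rate and rejects anything beyond that with 503; add nodelay if queued requests should be forwarded immediately instead of being paced out.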
If you're using NGINX Plus, you get additional controls. Note that health_check is configured in the location that proxies to the upstream, not inside the upstream block:

upstream computational_backend {
    zone backend_zone 64k;
    server 10.0.0.1:8000 max_conns=15 slow_start=30s;
    server 10.0.0.2:8000 max_conns=15;
    queue 500 timeout=30s;
}

server {
    location / {
        proxy_pass http://computational_backend;

        # Advanced health checks (NGINX Plus only)
        health_check interval=5s fails=1 passes=1 uri=/health;
    }
}
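The slow_start=30s parameter ramps a recovered server's share of traffic up gradually, which keeps a cold backend (empty caches, warm-up work still pending) from being handed max_conns worth of heavy requests the moment it passes a health check.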
When dealing with computationally intensive backend services behind an nginx reverse proxy, uncontrolled parallel connections can overwhelm your servers. Unlike simple rate limiting, we need to control the actual number of simultaneous active connections to each backend.
The ngx_http_upstream_module provides the max_conns parameter exactly for this purpose. Here's how to implement it:
upstream heavy_backend {
    server backend1.example.com max_conns=10;
    server backend2.example.com max_conns=10;
    keepalive 20;
}
server {
    location / {
        proxy_pass http://heavy_backend;
        proxy_set_header Host $host;
        proxy_http_version 1.1;
        proxy_set_header Connection "";
    }
}
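The last two lines are what make upstream keepalive work: by default nginx speaks HTTP/1.0 to backends and sends Connection: close, so without proxy_http_version 1.1 and an empty Connection header every request opens and closes its own backend connection, and keepalive 20 (up to 20 idle connections cached per worker process) has no effect.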
For better performance with limited connections, combine with caching:
proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=my_cache:10m inactive=60m;

server {
    location / {
        proxy_cache my_cache;
        proxy_cache_valid 200 302 10m;
        proxy_cache_valid 404 1m;
        proxy_pass http://heavy_backend;
        proxy_cache_lock on;
        proxy_cache_lock_timeout 5s;
    }
}
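To shed even more load from a saturated backend, you can serve stale cache entries while a refresh happens in the background. A short sketch using proxy_cache_use_stale and proxy_cache_background_update (the latter needs nginx 1.11.10+):

location / {
    proxy_cache my_cache;
    proxy_cache_use_stale error timeout updating http_502 http_503;
    proxy_cache_background_update on;  # refresh expired entries without blocking clients
    proxy_pass http://heavy_backend;
}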
Open source nginx gained max_conns in 1.11.5, but queueing requests once max_conns is reached still requires the NGINX Plus queue directive, which is a standalone directive inside the upstream block rather than a per-server parameter:

upstream backend {
    zone backend 64k;
    server backend1.example.com max_conns=10;
    server backend2.example.com max_conns=10;
    queue 100 timeout=60s;  # NGINX Plus only
}
Track active connections with Nginx status module:
location /nginx_status {
    stub_status;
    allow 127.0.0.1;
    deny all;
}
The status page shows active connections in each state (reading, writing, waiting).
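For reference, its output looks like this (numbers illustrative):

Active connections: 291
server accepts handled requests
 16630948 16630948 31070465
Reading: 6 Writing: 179 Waiting: 106

These counters describe client connections to nginx itself; per-backend figures still require the NGINX Plus API or access-log analysis.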