Fixing Gunicorn Worker Timeout and Nginx 502 Errors for Long-Running Django Requests

When a Django view performs computationally intensive work, Gunicorn can kill the worker well before the timeout you configured. Nginx then loses its upstream connection and returns a 502 Bad Gateway to the client, even though your Gunicorn timeout setting appears adequate.

The key indicators in your setup:

2012-01-20 17:30:13 [23128] [DEBUG] GET /results/
2012-01-20 17:30:43 [23125] [ERROR] WORKER TIMEOUT (pid:23128)
Traceback (most recent call last):
  File "/home/demo/python_envs/frontend/lib/python2.6/site-packages/gunicorn/app/base.py", line 111, in run
    os.setpgrp()
OSError: [Errno 1] Operation not permitted

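Notice the 30-second gap between the request start (17:30:13) and the timeout (17:30:43), despite the 10-minute timeout setting. Thirty seconds is Gunicorn's default timeout, which suggests the -t 600 option is not reaching the process that actually serves requests.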
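
The gap can be checked directly from the two log lines. This is a minimal sketch that assumes the timestamp format shown in the log above; `log_timestamp` and `seconds_between` are hypothetical helpers, not part of Gunicorn:

```python
from datetime import datetime

LOG_TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def log_timestamp(line):
    """Parse the leading 'YYYY-MM-DD HH:MM:SS' timestamp of a log line."""
    return datetime.strptime(line[:19], LOG_TIME_FORMAT)


def seconds_between(start_line, end_line):
    """Return the elapsed seconds between two log lines."""
    return (log_timestamp(end_line) - log_timestamp(start_line)).total_seconds()


request_line = "2012-01-20 17:30:13 [23128] [DEBUG] GET /results/"
timeout_line = "2012-01-20 17:30:43 [23125] [ERROR] WORKER TIMEOUT (pid:23128)"

gap = seconds_between(request_line, timeout_line)
print(gap)  # 30.0
```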

The current Upstart command launches Gunicorn through sudo, daemonizes it, and requests a 600-second timeout:

exec sudo -u demo $VENV/bin/gunicorn_django --preload --daemon \
-w 4 -t 600 \
--log-level debug \
--log-file $VENV/run/gunicorn-8080.log \
-p $VENV/run/gunicorn-8080.pid \
-b localhost:8080 /path/to/settings.py

The complete request flow involves multiple components with their own timeout settings:

Gunicorn worker timeout (600s in your case)
Nginx proxy timeouts (proxy_read_timeout and proxy_send_timeout default to 60s)
Linux system-level constraints (sudo limitations)
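
Because every layer enforces its own limit, the smallest one decides when a request is cut off. The following sketch only models that idea; the layer names are labels chosen for illustration, and the values are the ones listed above:

```python
def first_limit_to_fire(limits):
    """Return (layer, seconds) for the smallest timeout in the chain."""
    layer = min(limits, key=limits.get)
    return layer, limits[layer]


limits = {
    "gunicorn_worker_timeout": 600,   # -t 600 from the command above
    "nginx_proxy_read_timeout": 60,   # Nginx default when not configured
}

print(first_limit_to_fire(limits))  # ('nginx_proxy_read_timeout', 60)
```

Raising only one of these values leaves the other as the effective limit, which is why the fixes below touch both Nginx and Gunicorn.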

Here's how to properly configure all components:

1. Nginx Configuration

Add these parameters to your Nginx server block:

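location / {
    proxy_pass http://localhost:8080;
    # Connection setup should be fast; the Nginx docs note that this
    # value usually cannot exceed 75 seconds.
    proxy_connect_timeout 75s;
    proxy_send_timeout 600s;
    proxy_read_timeout 600s;   # time allowed between two reads from Gunicorn
    send_timeout 600s;         # time allowed between two writes to the client
}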

2. Gunicorn Systemd/Upstart Fix

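Launch Gunicorn with start-stop-daemon instead of sudo, and drop --daemon so that Upstart supervises the foreground process directly. This is intended to avoid the os.setpgrp() failure shown in the traceback: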

script
    exec start-stop-daemon --start --chuid demo \
    --exec $VENV/bin/gunicorn_django \
    -- --preload \
    -w 4 -t 600 \
    --log-level debug \
    --log-file $VENV/run/gunicorn-8080.log \
    -p $VENV/run/gunicorn-8080.pid \
    -b localhost:8080 /path/to/settings.py
end script

3. Alternative: Async Workers

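If the request spends most of its time waiting on I/O (database, network, disk), consider async workers. This requires the gevent package to be installed, and it does not help with CPU-bound work, which still blocks the worker: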

exec $VENV/bin/gunicorn_django \
--worker-class gevent \
--worker-connections 1000 \
-w 4 -t 600 \
--log-level debug \
-b localhost:8080 /path/to/settings.py

To verify your configuration is working:

# Confirm Gunicorn is listening, then look for timeout entries in the Nginx error log
sudo netstat -tnlp | grep gunicorn
sudo grep -i timeout /var/log/nginx/error.log

# Test with a known long request
curl -v http://yourserver.com/long-task-endpoint/
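
To see how long the request actually survives and which status comes back, the curl test can be scripted. This is a sketch using only the standard library; `time_request` is a hypothetical helper, and the 660-second client timeout is an assumption chosen to sit above the 600-second server limits:

```python
import time
import urllib.error
import urllib.request


def time_request(url, client_timeout=660):
    """Fetch url and return (status, elapsed_seconds).

    status is the HTTP status code, or None if the connection failed
    or the client-side timeout expired first.
    """
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=client_timeout) as response:
            response.read()
            status = response.status
    except urllib.error.HTTPError as exc:
        status = exc.code  # e.g. a 502 or 504 generated by Nginx
    except (urllib.error.URLError, OSError):
        status = None  # connection refused, DNS failure, or client timeout
    return status, time.monotonic() - start
```

Calling `time_request("http://yourserver.com/long-task-endpoint/")` with the placeholder URL from the curl example returns the status and the elapsed time. An error after roughly 30 or 60 seconds points at a default that is still in effect somewhere in the chain.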

For production environments with long-running requests:

Implement proper task queueing (Celery/RQ)
Use websockets or polling for progress updates
Consider breaking down large operations
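
The queue-and-poll pattern can be sketched without any extra dependencies. The example below uses a standard-library thread pool as a stand-in for Celery or RQ; `submit_job`, `job_status`, and `slow_square` are hypothetical names, and the in-memory job table is an assumption that only holds for a single process:

```python
import time
import uuid
from concurrent.futures import ThreadPoolExecutor

_executor = ThreadPoolExecutor(max_workers=2)
_jobs = {}  # job_id -> Future; in-memory only, lost on restart


def submit_job(func, *args):
    """Start func in the background and return an id the client can poll."""
    job_id = uuid.uuid4().hex
    _jobs[job_id] = _executor.submit(func, *args)
    return job_id


def job_status(job_id):
    """Return a JSON-serializable status dict for a submitted job."""
    future = _jobs.get(job_id)
    if future is None:
        return {"state": "unknown"}
    if not future.done():
        return {"state": "running"}
    if future.exception() is not None:
        return {"state": "failed", "error": str(future.exception())}
    return {"state": "complete", "result": future.result()}


def slow_square(n):
    time.sleep(0.2)  # stands in for the expensive computation
    return n * n


job_id = submit_job(slow_square, 7)  # a view would return this id at once
print(job_status(job_id)["state"])   # a second view would serve this to pollers
```

With several Gunicorn workers the job table is not shared between processes, which is one reason a real deployment would use Celery or RQ with an external broker instead of this sketch.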

Several factors contribute to the premature WORKER TIMEOUT shown in the log above:

Multiple timeout thresholds exist in the stack (Gunicorn, Nginx, OS)
The --preload flag in Gunicorn can cause unexpected behavior with long requests
System-level process management (Upstart/systemd) may enforce additional limits

The same settings can live in a Gunicorn configuration file (gunicorn.conf.py), loaded with the -c option:

import multiprocessing

bind = "127.0.0.1:8080"
workers = multiprocessing.cpu_count() * 2 + 1
timeout = 600  # 10 minutes
keepalive = 5
worker_class = "sync"  # or "gevent" for async workloads
preload_app = False  # Disable preloading for long-running requests
max_requests = 1000
max_requests_jitter = 50

Ensure proper timeout settings in Nginx:

location / {
    proxy_pass http://localhost:8080;
    proxy_connect_timeout 75s;  # the Nginx docs note this usually cannot exceed 75s
    proxy_send_timeout 600s;
    proxy_read_timeout 600s;
    proxy_buffer_size 128k;
    proxy_buffers 4 256k;
    proxy_busy_buffers_size 256k;
}
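
A configuration like the one above can be compared against the Gunicorn timeout with a small script. This is a rough sketch, not a full Nginx parser: it only understands single-unit values such as `600s` or `2m`, and `nginx_timeouts` and `too_short` are hypothetical helpers:

```python
import re

_UNITS = {"ms": 0.001, "s": 1, "m": 60, "h": 3600, "d": 86400}
_DIRECTIVE = re.compile(
    r"^\s*(proxy_\w+_timeout|send_timeout)\s+(\d+)(ms|[smhd])?\s*;",
    re.MULTILINE,
)


def nginx_timeouts(config_text):
    """Map each timeout directive found in config_text to seconds."""
    return {
        name: int(value) * _UNITS[unit or "s"]  # no suffix means seconds
        for name, value, unit in _DIRECTIVE.findall(config_text)
    }


def too_short(config_text, gunicorn_timeout,
              names=("proxy_read_timeout", "proxy_send_timeout")):
    """List directives whose value (or 60s default) is below the Gunicorn timeout."""
    found = nginx_timeouts(config_text)
    return sorted(n for n in names if found.get(n, 60) < gunicorn_timeout)


sample = """
location / {
    proxy_pass http://localhost:8080;
    proxy_send_timeout 600s;
    proxy_read_timeout 2m;
}
"""

print(too_short(sample, 600))  # ['proxy_read_timeout']
```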

For truly long operations (10+ minutes), consider:

Celery task queue with periodic status checks
WebSocket implementation for progress updates
Async views (Django 3.1+) served by an ASGI server, or Django Channels

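Example async view implementation (this needs an ASGI server; under a WSGI sync worker the request still occupies the worker for its full duration):

import time

from asgiref.sync import sync_to_async
from django.http import JsonResponse

@sync_to_async
def heavy_operation():
    # Runs in a separate thread so the event loop stays free for other requests.
    time.sleep(300)  # stand-in for a 5-minute operation
    return "Result"

async def async_view(request):
    # The HTTP response is still held open until heavy_operation() finishes,
    # so the proxy timeouts configured in Nginx still apply.
    result = await heavy_operation()
    return JsonResponse({"status": "complete", "result": result})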

Check these OS-level settings:

/proc/sys/net/ipv4/tcp_keepalive_time
ulimit -n (file descriptor limits)
Systemd/Upstart service timeouts (if applicable)
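
The first two values can be read from Python on the server itself. This is a sketch assuming a Linux host; both helpers are hypothetical and return None where the information is not available:

```python
try:
    import resource  # Unix only
except ImportError:
    resource = None


def tcp_keepalive_time(path="/proc/sys/net/ipv4/tcp_keepalive_time"):
    """Return the kernel's TCP keepalive idle time in seconds, or None."""
    try:
        with open(path) as handle:
            return int(handle.read().strip())
    except (OSError, ValueError):
        return None


def open_file_limits():
    """Return (soft, hard) RLIMIT_NOFILE for this process, or None off Unix."""
    if resource is None:
        return None
    return resource.getrlimit(resource.RLIMIT_NOFILE)


print("tcp_keepalive_time:", tcp_keepalive_time())
print("open files (soft, hard):", open_file_limits())
```

Run it as the same user that runs Gunicorn, since the file descriptor limit is per process and can differ between users and init systems.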
