Troubleshooting “Resource temporarily unavailable” Errors in PHP-FPM/Nginx Under High Concurrent Connections (10K+)

When dealing with high-traffic PHP applications, the dreaded "Resource temporarily unavailable while connecting to upstream" error can be particularly frustrating - especially when server resources appear underutilized. Here's a deep dive into the issue and actionable solutions.

The error typically manifests when:

Concurrent connections exceed 10K (visible via netstat -ant | grep -c ':80 '; a bare grep 80 also counts unrelated ports and addresses that merely contain "80")
Available memory remains high (>10GB free)
Server load stays reasonable (<3)
Configuration tweaks to PHP-FPM child processes show no improvement

This can happen even when the current configuration looks adequate:

# /etc/php5/fpm/php.ini
memory_limit = 1024M
default_socket_timeout = 120

# /etc/php5/fpm/pool.d/www.conf
pm = dynamic
pm.max_children = 30
pm.start_servers = 5
pm.min_spare_servers = 2
pm.max_spare_servers = 5
request_terminate_timeout = 120s

# /etc/nginx/nginx.conf
worker_connections 4024;
keepalive_timeout 10;

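The real issue often lies in the listen backlog of the socket between Nginx and PHP-FPM. When Nginx opens connections faster than the FPM workers can accept them, the socket's accept queue fills up and further connect() calls fail with EAGAIN, which Nginx logs as "(11: Resource temporarily unavailable) while connecting to upstream". This is a queue limit rather than a memory or CPU limit, which is why the server can look idle while requests fail.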
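To see whether a listen queue is actually full, the output of ss for listening sockets can be filtered. The function below is a minimal sketch: full_listen_queues is a hypothetical helper name, and it assumes the usual iproute2 layout in which the two columns after LISTEN are Recv-Q (connections currently waiting) and Send-Q (the backlog limit).

```shell
#!/bin/bash
# full_listen_queues: read `ss -ln...` output on stdin and print every
# listening socket whose accept queue (Recv-Q) has reached its backlog
# limit (Send-Q), as "address current/limit".
full_listen_queues() {
  awk '{
    for (i = 1; i <= NF; i++) {
      if ($i == "LISTEN") {
        recvq = $(i + 1); sendq = $(i + 2); addr = $(i + 3)
        # Linux may report one more than the limit, hence ">=".
        if (sendq + 0 > 0 && recvq + 0 >= sendq + 0) {
          print addr, recvq "/" sendq
        }
        break
      }
    }
  }'
}

# Typical use on a live host (Unix and TCP listeners respectively):
#   ss -lnx | full_listen_queues
#   ss -lnt | full_listen_queues
```

If the PHP-FPM socket shows up in this output while the errors occur, the backlog is the bottleneck and the steps below apply.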

1. Tune PHP-FPM's Process Management

The dynamic pool settings need recalculating. A common rule of thumb (a starting point to refine by measurement, not an exact formula):

pm.max_children = (Total RAM - (MySQL + Other Services)) / (Memory per PHP Process)
pm.start_servers = Number of CPU cores × 4
pm.min_spare_servers = Number of CPU cores × 2
pm.max_spare_servers = Number of CPU cores × 6

Example for a 32GB server with 16 cores, reserving 6144MB for MySQL and 1024MB for other services, at roughly 128MB per PHP process:

pm.max_children = (32768 - (6144 + 1024)) / 128 = 200
pm.start_servers = 16 × 4 = 64
pm.min_spare_servers = 16 × 2 = 32
pm.max_spare_servers = 16 × 6 = 96
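The arithmetic above can be scripted so it is easy to redo when the memory estimates change. This is a sketch only: fpm_pool_sizes is a hypothetical helper, every input is an estimate you supply, and the clamping step is an added safeguard (not part of the formula) so that spare counts never exceed max_children on small machines.

```shell
#!/bin/bash
# fpm_pool_sizes TOTAL_MB RESERVED_MB PER_PROCESS_MB CPU_CORES
# Prints pool settings derived from the rule of thumb above.
fpm_pool_sizes() {
  local total_mb=$1 reserved_mb=$2 per_proc_mb=$3 cores=$4
  local max_children=$(( (total_mb - reserved_mb) / per_proc_mb ))
  local start=$(( cores * 4 ))
  local min_spare=$(( cores * 2 ))
  local max_spare=$(( cores * 6 ))

  # Clamp so that min_spare <= start <= max_spare <= max_children.
  if (( max_spare > max_children )); then max_spare=$max_children; fi
  if (( start > max_spare )); then start=$max_spare; fi
  if (( min_spare > start )); then min_spare=$start; fi

  echo "pm.max_children = $max_children"
  echo "pm.start_servers = $start"
  echo "pm.min_spare_servers = $min_spare"
  echo "pm.max_spare_servers = $max_spare"
}

# The 32GB / 16-core example: 6144MB + 1024MB reserved, 128MB per process.
fpm_pool_sizes 32768 7168 128 16
```

The per-process figure matters most here; measuring the resident size of real workers under load gives a better input than a guess.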

2. Optimize Nginx-FPM Communication

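Add these buffer and timeout settings to your Nginx vhost config. They do not enlarge the backlog, but they help keep large or slow responses from holding connections longer than necessary: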

fastcgi_buffers 16 16k;
fastcgi_buffer_size 32k;
fastcgi_connect_timeout 60s;
fastcgi_send_timeout 60s;
fastcgi_read_timeout 60s;
fastcgi_busy_buffers_size 64k;
fastcgi_temp_file_write_size 256k;

3. Kernel-Level Tuning

Modify these sysctl parameters:

net.core.somaxconn = 65535
net.ipv4.tcp_max_syn_backlog = 65535
net.core.netdev_max_backlog = 65535

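Apply with sysctl -p after adding to /etc/sysctl.conf. The backlog of a socket is fixed when the service calls listen(), so restart PHP-FPM and Nginx afterwards for the new limits to take effect.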
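Raising net.core.somaxconn matters because the kernel silently caps whatever backlog a service requests at that value. The sketch below shows the rule; effective_backlog is a hypothetical helper, and the commented example assumes the pool file sets PHP-FPM's listen.backlog directive explicitly, which the configs in this article do not show.

```shell
#!/bin/bash
# effective_backlog REQUESTED SOMAXCONN
# Prints the backlog a listening socket really gets: the smaller of what
# the service asked for and the net.core.somaxconn limit.
effective_backlog() {
  local requested=$1 somaxconn=$2
  if (( requested > somaxconn )); then
    echo "$somaxconn"
  else
    echo "$requested"
  fi
}

# A service asking for 65535 on a host where somaxconn is still 128:
echo "effective backlog: $(effective_backlog 65535 128)"

# On a live host (assumes an uncommented listen.backlog line in the pool file):
#   requested=$(awk -F= '/^listen\.backlog/ {gsub(/ /, "", $2); print $2}' /etc/php5/fpm/pool.d/www.conf)
#   effective_backlog "$requested" "$(cat /proc/sys/net/core/somaxconn)"
```

This is why raising only one of the two values often appears to change nothing.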

4. Implement Queue Monitoring

This bash script helps monitor connection queues:

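#!/bin/bash
# Print connection, worker, and accept-queue figures every 5 seconds.
POOL_CONF=/etc/php5/fpm/pool.d/www.conf
# Read the configured limit, skipping commented-out lines.
MAX_CHILDREN=$(awk -F= '/^pm\.max_children/ {gsub(/ /, "", $2); print $2}' "$POOL_CONF")
while true; do
  echo "--- $(date) ---"
  # ':80 ' (with the trailing space) avoids matching ports such as 8080.
  echo "NGINX active connections: $(netstat -ant | grep ':80 ' | grep -c ESTABLISHED)"
  # Count pool workers only; the [p] pattern keeps grep from matching itself.
  echo "PHP-FPM workers: $(ps -ef | grep -c '[p]hp-fpm: pool')/$MAX_CHILDREN"
  # For listening sockets, ss reports the current accept queue in Recv-Q and its limit in Send-Q.
  echo "TCP accept queue (current/limit): $(ss -lnt | awk '$4 ~ /:80$/ {print $2 "/" $3}')"
  sleep 5
done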

For extreme cases (20K+ connections), consider:

Implementing PHP-FPM status page monitoring
Setting up multiple FPM pools for different request types
Offloading static content to CDN
Enabling OPcache with aggressive settings

Remember to test changes incrementally and monitor with tools like htop, Nginx's stub_status page, and php-fpm-exporter for Prometheus.

When dealing with the "Resource temporarily unavailable" error despite having sufficient memory and low server load, we need to examine multiple layers of the stack:

# Check current FPM processes (the [p] pattern keeps grep from counting itself)
ps aux | grep -c '[p]hp-fpm'

# Monitor socket connections
ss -s | grep "Total:"

Your current PHP-FPM settings might be creating artificial bottlenecks. Let's optimize them:

# /etc/php5/fpm/pool.d/www.conf
pm = dynamic
; (Total RAM - (MySQL + other services)) / average PHP process size
pm.max_children = 150
pm.start_servers = 20
pm.min_spare_servers = 10
pm.max_spare_servers = 30
; Only used when pm = ondemand; has no effect with pm = dynamic
pm.process_idle_timeout = 10s
; Recycle workers periodically to contain memory leaks
pm.max_requests = 500
; Keep this lower than the Nginx fastcgi timeouts
request_terminate_timeout = 30s
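PHP-FPM refuses to start a dynamic pool whose numbers contradict each other, so it can help to check a new set of values before reloading. The following is a sketch of those ordering rules as I understand them; check_fpm_pool is a hypothetical helper, and php-fpm -t remains the authoritative configuration test.

```shell
#!/bin/bash
# check_fpm_pool MAX_CHILDREN START_SERVERS MIN_SPARE MAX_SPARE
# Prints one line per violated rule and returns non-zero if any rule fails.
check_fpm_pool() {
  local max_children=$1 start=$2 min_spare=$3 max_spare=$4 status=0
  if (( min_spare > max_spare )); then
    echo "pm.min_spare_servers ($min_spare) exceeds pm.max_spare_servers ($max_spare)"
    status=1
  fi
  if (( start < min_spare || start > max_spare )); then
    echo "pm.start_servers ($start) is outside [$min_spare, $max_spare]"
    status=1
  fi
  if (( max_spare > max_children )); then
    echo "pm.max_spare_servers ($max_spare) exceeds pm.max_children ($max_children)"
    status=1
  fi
  return "$status"
}

# The values from the pool file above pass all three checks.
check_fpm_pool 150 20 10 30 && echo "pool settings are consistent"
```

A typical failure is raising the spare-server counts for a larger machine while leaving pm.max_children at its old value.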

The worker connections and upstream timeouts need synchronization:

# /etc/nginx/nginx.conf
worker_processes auto;
events {
    worker_connections 16384;
    multi_accept on;
    use epoll;
}

http {
    fastcgi_buffers 16 16k;
    fastcgi_buffer_size 32k;
    keepalive_timeout 15;
    keepalive_requests 1000;
    
    upstream php {
        server unix:/var/run/php5-fpm.sock;
        keepalive 20;  # Takes effect only with "fastcgi_keep_conn on;" where fastcgi_pass php; is used
    }
}
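The worker_connections value counts upstream connections as well as client connections, so a request handed to PHP-FPM occupies two slots. The sketch below turns that into a rough capacity estimate; nginx_fastcgi_capacity is a hypothetical helper, the 4-worker figure is an assumption for illustration, and idle keepalive clients (which hold only one slot) are ignored.

```shell
#!/bin/bash
# nginx_fastcgi_capacity WORKER_PROCESSES WORKER_CONNECTIONS
# Rough upper bound on simultaneous requests when each one holds a client
# connection plus an upstream connection to PHP-FPM.
nginx_fastcgi_capacity() {
  local workers=$1 connections=$2
  echo $(( workers * connections / 2 ))
}

# Hypothetical 4-worker host, original versus raised worker_connections.
echo "worker_connections 4024:  $(nginx_fastcgi_capacity 4 4024) requests"
echo "worker_connections 16384: $(nginx_fastcgi_capacity 4 16384) requests"
```

Under these assumptions the original setting tops out below 10K concurrent requests, independently of anything PHP-FPM does. Each connection also needs a file descriptor, so the process limit has to allow for the larger value.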

Add these to /etc/sysctl.conf for better TCP stack handling:

net.core.somaxconn = 65535
net.ipv4.tcp_max_syn_backlog = 65535
net.ipv4.tcp_tw_reuse = 1
net.ipv4.ip_local_port_range = 1024 65535
fs.file-max = 2097152

Apply with sysctl -p

Implement real-time monitoring to identify patterns:

; PHP-FPM pool config (www.conf): enable the status and ping endpoints
pm.status_path = /status
ping.path = /ping

# Nginx location block
location ~ ^/(status|ping)$ {
    access_log off;
    allow 127.0.0.1;
    deny all;
    include fastcgi_params;
    fastcgi_pass unix:/var/run/php5-fpm.sock;
}
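Once the status page is reachable, two of its fields point directly at this problem: "listen queue" (connections waiting to be accepted right now) and "max children reached" (how often the pool hit pm.max_children). The function below is a minimal sketch that reads the plain-text status output; fpm_status_alerts is a hypothetical name, and the curl URL in the comment assumes the location block above is served on localhost.

```shell
#!/bin/bash
# fpm_status_alerts: read the plain-text PHP-FPM status page on stdin and
# print a warning for each sign that the pool is saturated.
fpm_status_alerts() {
  awk -F': *' '
    $1 == "listen queue" && $2 + 0 > 0 {
      print "WARN: " $2 " connections waiting in listen queue"
    }
    $1 == "max children reached" && $2 + 0 > 0 {
      print "WARN: max_children limit hit " $2 " times"
    }
  '
}

# Typical use on the server itself:
#   curl -s http://127.0.0.1/status | fpm_status_alerts
```

Running this from cron or a monitoring agent gives an early signal before Nginx starts logging upstream errors.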

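For extreme cases, consider rate limiting per client IP with Nginx's limit_req, so that excess requests are rejected at Nginx instead of piling up in the PHP-FPM backlog. With nodelay, requests within the burst are passed on immediately and anything beyond it is rejected rather than queued. The limit_req_zone line belongs in the http block: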

limit_req_zone $binary_remote_addr zone=phpfpm:10m rate=30r/s;

location ~ \.php$ {
    limit_req zone=phpfpm burst=50 nodelay;
    # ... existing PHP handling config
}
