Troubleshooting “Resource temporarily unavailable” Errors in PHP-FPM/Nginx Under High Concurrent Connections (10K+)



On high-traffic PHP applications, the "Resource temporarily unavailable while connecting to upstream" error is especially frustrating when server resources appear underutilized. Here's a closer look at the cause, followed by practical fixes.

The error typically manifests when:

  • Concurrent connections exceed 10K (visible via netstat -an | grep ':80 ' | wc -l; a bare grep 80 also counts unrelated lines that merely contain "80")
  • Available memory remains high (>10GB free)
  • Server load stays reasonable (<3)
  • Configuration tweaks to PHP-FPM child processes show no improvement

Your current configs may look adequate at first glance:

# /etc/php5/fpm/php.ini
memory_limit = 1024M
default_socket_timeout = 120

# /etc/php5/fpm/pool.d/www.conf
pm = dynamic
pm.max_children = 30
pm.start_servers = 5
pm.min_spare_servers = 2
pm.max_spare_servers = 5
request_terminate_timeout = 120s

# /etc/nginx/nginx.conf
worker_connections 4024;
keepalive_timeout 10;

The real issue often lies in the listen backlog between Nginx and PHP-FPM. When connections arrive faster than the FPM workers can accept them, the socket's queue fills up and Nginx's connect() call fails with EAGAIN (errno 11), which the error log reports as "Resource temporarily unavailable".
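
One way to confirm that the backlog is the bottleneck is to compare the current depth of the FPM socket's accept queue with its limit. The sketch below assumes the column layout that `ss -xl` prints on recent Linux systems (Netid, State, Recv-Q, Send-Q, local address), where for a listening socket Recv-Q is the number of connections waiting to be accepted and Send-Q is the backlog limit. The helper name `fpm_backlog` and the default socket path are illustrative, not part of any tool.

```bash
#!/usr/bin/env bash
# Report how full the listen queue of a Unix socket is.
# Reads `ss -xl`-style lines on stdin; assumed columns:
#   Netid State Recv-Q Send-Q Local-Address ...
fpm_backlog() {
  # $1: socket path to look for (the default is an assumption)
  awk -v sock="${1:-/var/run/php5-fpm.sock}" '
    $2 == "LISTEN" && $5 == sock {
      printf "%s queued=%d limit=%d\n", sock, $3, $4
      if ($3 >= $4) print "WARNING: listen queue is full"
    }'
}

# Live usage (requires iproute2):
#   ss -xl | fpm_backlog /var/run/php5-fpm.sock
```

If the queued figure sits at or near the limit while the error is being logged, the backlog is the likely culprit rather than memory or CPU.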

1. Tune PHP-FPM's Process Management

Your dynamic pool settings need recalculating from available memory and core count. Try this rule of thumb (memory values in MB):

pm.max_children = (Total RAM - (MySQL + Other Services)) / (Memory per PHP Process)
pm.start_servers = Number of CPU cores × 4
pm.min_spare_servers = Number of CPU cores × 2
pm.max_spare_servers = Number of CPU cores × 6

Example for a 32GB server with 16 cores running MySQL (6GB reserved for MySQL, 1GB for other services, 128MB per PHP process):

pm.max_children = (32768 - (6144 + 1024)) / 128 = 200
pm.start_servers = 16 × 4 = 64
pm.min_spare_servers = 16 × 2 = 32
pm.max_spare_servers = 16 × 6 = 96
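
The arithmetic above can be scripted so it is easy to rerun with measured values. This is a minimal sketch: the function names `max_children` and `avg_rss_mb` are made up for illustration, all sizes are in MB, and the `php5-fpm` process name in the comment is an assumption about a Debian-style PHP 5 install.

```bash
#!/usr/bin/env bash
# Pool sizing arithmetic from the formula above (all sizes in MB).
max_children() {
  # $1 total RAM, $2 RAM reserved for other services, $3 RAM per PHP process
  echo $(( ($1 - $2) / $3 ))
}

# Average resident size in MB from RSS values in KB on stdin, for example
# the output of: ps -o rss= -C php5-fpm   (process name is an assumption)
avg_rss_mb() {
  awk '{ sum += $1; n++ } END { if (n) printf "%d\n", sum / n / 1024 }'
}

cores=16
echo "pm.max_children = $(max_children 32768 $(( 6144 + 1024 )) 128)"
echo "pm.start_servers = $(( cores * 4 ))"
echo "pm.min_spare_servers = $(( cores * 2 ))"
echo "pm.max_spare_servers = $(( cores * 6 ))"
```

Measuring the per-process figure on the live server, rather than guessing 128MB, is what makes the result trustworthy.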

2. Optimize Nginx-FPM Communication

Add these critical parameters to your Nginx vhost config:

fastcgi_buffers 16 16k;
fastcgi_buffer_size 32k;
fastcgi_connect_timeout 60s;
fastcgi_send_timeout 60s;
fastcgi_read_timeout 60s;
fastcgi_busy_buffers_size 64k;
fastcgi_temp_file_write_size 256k;

3. Kernel-Level Tuning

Modify these sysctl parameters:

net.core.somaxconn = 65535
net.ipv4.tcp_max_syn_backlog = 65535
net.core.netdev_max_backlog = 65535

Add them to /etc/sysctl.conf and apply with sysctl -p. Restart PHP-FPM afterwards, since the new limit only applies to sockets opened after the change.
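
Raising somaxconn matters because the kernel silently caps whatever backlog a process requests in listen() at that value. In PHP-FPM the requested value comes from the pool's `listen.backlog` directive, so both may need raising. The sketch below shows the relationship; `effective_backlog` is a hypothetical helper, and the fallback of 128 is an assumed value used only so the example runs on systems without /proc.

```bash
#!/usr/bin/env bash
# The kernel caps the backlog a process asks for in listen() at
# net.core.somaxconn, so the effective queue is the smaller of the two.
effective_backlog() {
  # $1 backlog requested by the application, $2 value of net.core.somaxconn
  if [ "$1" -lt "$2" ]; then echo "$1"; else echo "$2"; fi
}

# Read the live limit where /proc is available; fall back to 128 (an
# assumed value, used here only so the example runs anywhere).
somaxconn=$(cat /proc/sys/net/core/somaxconn 2>/dev/null || echo 128)
echo "requested 65535, effective $(effective_backlog 65535 "$somaxconn")"
```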

4. Implement Queue Monitoring

This bash script helps monitor connection queues:

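#!/bin/bash
# Print connection, worker, and accept-queue figures every 5 seconds.
POOL_CONF=/etc/php5/fpm/pool.d/www.conf
while true; do
  echo "--- $(date) ---"
  # Established connections whose local port is exactly 80
  echo "NGINX active connections: $(netstat -ant | awk '$4 ~ /:80$/ && $6 == "ESTABLISHED"' | wc -l)"
  # Running FPM processes (master included) versus the configured ceiling
  echo "PHP-FPM processes: $(ps -ef | grep -c '[p]hp-fpm')/$(awk -F'= *' '/^pm\.max_children/ {print $2}' "$POOL_CONF")"
  # Listening sockets on port 80: current accept queue / backlog limit
  echo "TCP backlog: $(ss -lnt | awk '$4 ~ /:80$/ {print $2 "/" $3}')"
  sleep 5
done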

For extreme cases (20K+ connections), consider:

  • Implementing PHP-FPM status page monitoring
  • Setting up multiple FPM pools for different request types
  • Offloading static content to CDN
  • Enabling OPcache and tuning it aggressively (for example, a larger opcache.memory_consumption)

Remember to test changes incrementally and monitor with tools like htop, nginx-status, and php-fpm-exporter for Prometheus.


When the "Resource temporarily unavailable" error appears despite sufficient memory and low server load, we need to examine each layer of the stack, starting with the FPM process count and the socket totals:

# Check current FPM processes
ps aux | grep '[p]hp-fpm' | wc -l   # the [p] keeps grep from counting itself

# Monitor socket connections
ss -s | grep "Total:"

Your current PHP-FPM settings might be creating artificial bottlenecks. Let's optimize them:

# /etc/php5/fpm/pool.d/www.conf
pm = dynamic
; (Total RAM - (MySQL + other services)) / average PHP process size
pm.max_children = 150
pm.start_servers = 20
pm.min_spare_servers = 10
pm.max_spare_servers = 30
; Only used when pm = ondemand; ignored for dynamic pools
pm.process_idle_timeout = 10s
; Recycle workers periodically to contain memory leaks
pm.max_requests = 500
; Keep this lower than the Nginx FastCGI timeouts
request_terminate_timeout = 30s
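
Before reloading FPM it is worth checking that the four process counts are mutually consistent. The sketch below encodes the ordering that, to the best of my knowledge, PHP-FPM expects for a dynamic pool (min spare ≤ start ≤ max spare ≤ max children). `check_pool` is a hypothetical helper, not part of PHP-FPM.

```bash
#!/usr/bin/env bash
# Sanity-check dynamic pool values before reloading FPM.
check_pool() {
  # $1 max_children, $2 start_servers, $3 min_spare, $4 max_spare
  if [ "$3" -gt "$4" ]; then
    echo "min_spare_servers exceeds max_spare_servers"; return 1
  fi
  if [ "$4" -gt "$1" ]; then
    echo "max_spare_servers exceeds max_children"; return 1
  fi
  if [ "$2" -lt "$3" ] || [ "$2" -gt "$4" ]; then
    echo "start_servers is outside the spare-server range"; return 1
  fi
  echo "ok"
}

# The values from the pool config above pass the check.
check_pool 150 20 10 30
```

FPM's own configuration test (the -t flag of the php-fpm binary) remains the authoritative check; this only catches the common ordering mistakes early.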

Nginx's worker connection limits and upstream settings need to match that capacity:

# /etc/nginx/nginx.conf
worker_processes auto;
events {
    worker_connections 16384;
    multi_accept on;
    use epoll;
}

http {
    fastcgi_buffers 16 16k;
    fastcgi_buffer_size 32k;
    keepalive_timeout 15;
    keepalive_requests 1000;
    
    upstream php {
        server unix:/var/run/php5-fpm.sock;
        # Idle connections cached per worker; only takes effect with
        # "fastcgi_keep_conn on;" and "fastcgi_pass php;" in the PHP location
        keepalive 20;
    }
}
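
As a rough capacity check, the number of clients Nginx can serve at once is bounded by worker_processes × worker_connections, and a request handed to an upstream ties up two connections (one to the client, one to FPM). The sketch below works through that estimate. `max_proxied_clients` is a hypothetical helper, and the figure of 16 workers assumes `worker_processes auto` on a 16-core host.

```bash
#!/usr/bin/env bash
# Rough ceiling on simultaneous clients when every request is handed to an
# upstream: each client ties up two connections (client side + upstream side).
max_proxied_clients() {
  # $1 worker_processes, $2 worker_connections
  echo $(( $1 * $2 / 2 ))
}

# 16 workers is an assumption (worker_processes auto on a 16-core host).
echo "approx. max clients: $(max_proxied_clients 16 16384)"
```

Each connection also consumes a file descriptor, so the workers' open-file limit has to be at least as large as worker_connections for the setting to be usable.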

Add these to /etc/sysctl.conf for better TCP stack handling:

net.core.somaxconn = 65535
net.ipv4.tcp_max_syn_backlog = 65535
net.ipv4.tcp_tw_reuse = 1
net.ipv4.ip_local_port_range = 1024 65535
fs.file-max = 2097152

Apply with sysctl -p

Implement real-time monitoring to identify patterns:

# /etc/php5/fpm/pool.d/www.conf: enable the status and ping endpoints
pm.status_path = /status
ping.path = /ping

# Nginx location block
location ~ ^/(status|ping)$ {
    access_log off;
    allow 127.0.0.1;
    deny all;
    include fastcgi_params;
    fastcgi_pass unix:/var/run/php5-fpm.sock;
}
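
Once the status endpoint responds, the fields most relevant to this error are "listen queue" (connections currently waiting) and "max children reached". The sketch below pulls a single field out of the plain-text status output. `fpm_status_field` is a hypothetical helper, and the sample values are made up for illustration.

```bash
#!/usr/bin/env bash
# Extract one field from PHP-FPM's plain-text status output on stdin.
fpm_status_field() {
  # $1: field label exactly as printed, e.g. "listen queue"
  awk -F': *' -v key="$1" '$1 == key { print $2 }'
}

# Sample output in the layout the status page prints (values made up).
sample='pool:                 www
process manager:      dynamic
accepted conn:        12073
listen queue:         7
max listen queue:     128
listen queue len:     128
idle processes:       0
active processes:     30
total processes:      30
max children reached: 4'

printf '%s\n' "$sample" | fpm_status_field "listen queue"
printf '%s\n' "$sample" | fpm_status_field "max children reached"

# Live usage, assuming the location block above and curl installed:
#   curl -s http://127.0.0.1/status | fpm_status_field "listen queue"
```

A non-zero listen queue together with a growing "max children reached" counter suggests the pool is too small for the incoming load.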

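For extreme cases, consider rate limiting with Nginx's limit_req, so that excess requests are rejected early instead of piling up in front of FPM:

# http context
limit_req_zone $binary_remote_addr zone=phpfpm:10m rate=30r/s;

# server context
location ~ \.php$ {
    # Up to 50 requests above the rate are served immediately; the rest are rejected (503 by default)
    limit_req zone=phpfpm burst=50 nodelay;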
    # ... existing PHP handling config
}
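
How rate and burst interact can be illustrated with a small leaky-bucket simulation. This is a simplified model written for illustration, not Nginx's implementation: `simulate_limit` is a hypothetical helper, arrival times are given in milliseconds, and boundary behavior on a real server may differ slightly.

```bash
#!/usr/bin/env bash
# Simplified leaky-bucket model of a rate limit with a burst allowance.
# "excess" is tracked in thousandths of a request; times are in ms.
simulate_limit() {
  # $1 rate in requests/second, $2 burst, remaining args: arrival times (ms)
  local rate=$1 burst=$2 excess=0 last=-1 accepted=0 rejected=0 t new
  shift 2
  for t in "$@"; do
    if [ "$last" -lt 0 ]; then
      new=0                                   # first request starts empty
    else
      # Drain rate*elapsed, then add this request (1000 thousandths)
      new=$(( excess - rate * (t - last) + 1000 ))
      if [ "$new" -lt 0 ]; then new=0; fi
    fi
    if [ "$new" -gt $(( burst * 1000 )) ]; then
      rejected=$(( rejected + 1 ))            # over the burst: would be refused
    else
      accepted=$(( accepted + 1 ))
      excess=$new
      last=$t
    fi
  done
  echo "accepted=$accepted rejected=$rejected"
}

# Five requests spaced 100 ms apart stay within 30 r/s.
simulate_limit 30 50 0 100 200 300 400
```

In this model, 60 requests arriving in the same millisecond with burst=50 give accepted=51 rejected=9: the first request plus the 50 burst slots, with the remainder turned away immediately, which corresponds to the nodelay behavior.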