When dealing with intermittent 504 errors in an Nginx+PHP-FPM environment, the key diagnostic artifacts appear in two places:
# Nginx error logs show:
[error] upstream timed out (110: Connection timed out) while reading response header from upstream
[error] recv() failed (104: Connection reset by peer) while reading response header from upstream
[error] connect() failed (111: Connection refused) while connecting to upstream
The critical insight comes from examining TCP socket states:
netstat -tnp | grep 9000
# Reveals numerous CLOSE_WAIT and FIN_WAIT2 pairs
tcp 9 0 localhost:9000 localhost:36094 CLOSE_WAIT 14269/php5-fpm
tcp 0 0 localhost:46664 localhost:9000 FIN_WAIT2 -
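The FIN_WAIT2/CLOSE_WAIT pairs are half-closed connections: Nginx has already closed its side and sits in FIN_WAIT2 waiting for the peer's FIN, while the PHP-FPM process that owns the other end has not yet closed its socket (CLOSE_WAIT). The non-zero Recv-Q on the CLOSE_WAIT line suggests the worker never read the pending data, which usually means it is busy or hung. As these sockets accumulate, workers and file descriptors run out and new requests start to fail.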
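To see how lopsided the states are, it can help to tally them per port instead of eyeballing the netstat output. The following is a minimal sketch: the function name count_port_states is made up for illustration, and it assumes the usual netstat -tn column layout (local address in column 4, remote address in column 5, state in column 6).

```shell
#!/bin/bash
# count_port_states PORT (hypothetical helper): read `netstat -tn` style
# lines on stdin and print how many sockets are in each TCP state, counting
# only sockets whose local or remote port equals PORT.
count_port_states() {
    awk -v port="$1" '
        $1 ~ /^tcp/ {
            # The port is the last colon-separated piece of each address.
            n = split($4, l, ":"); m = split($5, r, ":")
            if (l[n] == port || r[m] == port) count[$6]++
        }
        END { for (s in count) print s, count[s] }
    ' | sort
}

# Sample input copied from the netstat output above. On a live host you
# would run: netstat -tn | count_port_states 9000
sample='tcp 9 0 localhost:9000 localhost:36094 CLOSE_WAIT 14269/php5-fpm
tcp 0 0 localhost:46664 localhost:9000 FIN_WAIT2 -'

printf '%s\n' "$sample" | count_port_states 9000
```

For the two sample lines this prints one CLOSE_WAIT and one FIN_WAIT2. Matching on the parsed port rather than a plain grep 9000 avoids counting unrelated ports or PIDs that happen to contain the same digits.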
Three primary factors contribute to this:
- PHP-FPM's process manager (static/dynamic/ondemand) not recycling workers properly
- Timeouts mismatched between Nginx (fastcgi_read_timeout) and PHP-FPM (request_terminate_timeout)
- PHP scripts hanging during execution (database queries, external API calls)
1. PHP-FPM Configuration Tuning
; /etc/php-fpm.d/www.conf
pm = dynamic
pm.max_children = 50
pm.start_servers = 5
pm.min_spare_servers = 2
pm.max_spare_servers = 8
pm.max_requests = 500 ; Recycle each worker after 500 requests to contain memory leaks
request_terminate_timeout = 30s ; Force kill hanging scripts
catch_workers_output = yes ; For debugging
2. Nginx FastCGI Timeout Adjustments
location ~ \.php$ {
fastcgi_read_timeout 300;
fastcgi_send_timeout 300;
fastcgi_connect_timeout 60;
fastcgi_buffer_size 128k;
fastcgi_buffers 4 256k;
fastcgi_busy_buffers_size 256k;
keepalive_timeout 15; # Client-facing HTTP keep-alive only; it does not govern the FastCGI connection to PHP-FPM
}
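One way to sanity-check the two timeout settings against each other is sketched below. If Nginx gives up before PHP-FPM kills the script, Nginx closes its side while the worker is still running, which is exactly the FIN_WAIT2/CLOSE_WAIT pattern described earlier. The helper names to_seconds and check_timeouts are hypothetical, and the unit parsing is an assumption that only covers plain seconds and the s, m, and h suffixes (not ms or d).

```shell
#!/bin/bash
# to_seconds VALUE (hypothetical helper): convert "30", "30s", "5m" or "1h"
# to whole seconds. Other units such as "ms" or "d" are not handled.
to_seconds() {
    case "$1" in
        *h) echo $(( ${1%h} * 3600 )) ;;
        *m) echo $(( ${1%m} * 60 )) ;;
        *s) echo "${1%s}" ;;
        *)  echo "$1" ;;
    esac
}

# check_timeouts NGINX_READ_TIMEOUT FPM_TERMINATE_TIMEOUT
# Compares fastcgi_read_timeout with request_terminate_timeout.
check_timeouts() {
    nginx_s=$(to_seconds "$1")
    fpm_s=$(to_seconds "$2")
    if [ "$fpm_s" -eq 0 ]; then
        # 0 means the PHP-FPM limit is switched off.
        echo "WARN: request_terminate_timeout is disabled; hung scripts are never killed"
    elif [ "$nginx_s" -lt "$fpm_s" ]; then
        echo "WARN: Nginx gives up after ${nginx_s}s but PHP-FPM keeps working for up to ${fpm_s}s"
    else
        echo "OK: PHP-FPM terminates requests (${fpm_s}s) before Nginx times out (${nginx_s}s)"
    fi
}

# Values taken from the two configuration snippets above.
check_timeouts 300 30s
```

With the values from the snippets above (300 for Nginx, 30s for PHP-FPM) the check reports OK, because the worker is killed long before Nginx stops waiting.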
3. Kernel-Level TCP Tweaks
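# /etc/sysctl.conf
# These settings ease pressure from TIME_WAIT and orphaned FIN_WAIT2 sockets.
# CLOSE_WAIT has no kernel timeout; it clears only when PHP-FPM closes the socket.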
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_fin_timeout = 30
net.core.somaxconn = 65535
Implement this status check script to catch issues early:
#!/bin/bash
# monitor_fpm_sockets.sh
WARN_THRESHOLD=50
CRIT_THRESHOLD=100
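# Match the port as ":9000 " so that PIDs or ports such as 19000 are not counted
CLOSE_WAIT=$(netstat -tn | grep ':9000 ' | grep -c CLOSE_WAIT)
FIN_WAIT=$(netstat -tn | grep ':9000 ' | grep -c FIN_WAIT)
if [ "$CLOSE_WAIT" -ge "$CRIT_THRESHOLD" ]; then
    echo "CRITICAL: $CLOSE_WAIT CLOSE_WAIT sockets | sockets=$CLOSE_WAIT fin_wait=$FIN_WAIT"
    # Restarting drops in-flight requests; adjust the service name to your distribution
    service php-fpm restart
elif [ "$CLOSE_WAIT" -ge "$WARN_THRESHOLD" ]; then
    echo "WARNING: $CLOSE_WAIT CLOSE_WAIT sockets | sockets=$CLOSE_WAIT fin_wait=$FIN_WAIT"
fi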
When the issue persists, use these forensic tools:
# Show which PHP processes are stuck
strace -p $(pgrep -d, php-fpm) -s 1024 -f
# Monitor TCP connections in real-time (loopback interface, since Nginx and PHP-FPM talk over localhost)
tcptrack -i lo port 9000
# Detailed PHP-FPM status (requires pm.status_path = /status and a matching Nginx location)
curl -s 'http://localhost/status?json' | jq .
When examining these timeout errors in your Nginx + PHP-FPM setup, several key patterns emerge from the logs:
# Typical error sequence observed
[error] upstream timed out (110: Connection timed out)
[error] recv() failed (104: Connection reset by peer)
[error] connect() failed (111: Connection refused)
The netstat output reveals a critical issue with TCP connection states:
tcp 0 0 localhost:46680 localhost:9000 FIN_WAIT2
tcp 1337 0 localhost:9000 localhost:46680 CLOSE_WAIT
This persistent CLOSE_WAIT/FIN_WAIT2 pairing indicates that PHP-FPM is not closing its end of connections that Nginx has already closed. Each such socket ties up a worker or a file descriptor, and eventually none are left to accept new requests.
After extensive testing, these are the most effective configuration changes:
Nginx Configuration
location ~ \.php$ {
fastcgi_read_timeout 300;
fastcgi_send_timeout 300;
fastcgi_connect_timeout 75s;
fastcgi_buffer_size 128k;
fastcgi_buffers 4 256k;
fastcgi_busy_buffers_size 256k;
fastcgi_temp_file_write_size 256k;
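# Keeps the FastCGI connection open after a response; connections are only reused when fastcgi_pass points to an upstream block that sets keepalive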
fastcgi_keep_conn on;
}
PHP-FPM Pool Adjustments
[www]
pm = dynamic
pm.max_children = 50
pm.start_servers = 5
pm.min_spare_servers = 5
pm.max_spare_servers = 10
pm.max_requests = 500
; Socket-specific fixes for TCP mode
listen = 127.0.0.1:9000
listen.backlog = 65535
listen.allowed_clients = 127.0.0.1
request_terminate_timeout = 300s
request_slowlog_timeout = 60s ; requires the slowlog directive to point to a log file
Add these to /etc/sysctl.conf:
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_fin_timeout = 30
net.core.somaxconn = 65535
net.ipv4.tcp_max_syn_backlog = 65535
Apply with sysctl -p
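The listen.backlog value in the pool configuration and net.core.somaxconn interact: on Linux, a backlog larger than somaxconn is silently capped to somaxconn, so raising only one of the two has no effect. A small sketch of that rule follows; the function name effective_backlog is made up for illustration.

```shell
#!/bin/bash
# effective_backlog REQUESTED SOMAXCONN (hypothetical helper): print the
# accept-queue length the kernel will actually use, which is the smaller of
# the requested listen backlog and net.core.somaxconn.
effective_backlog() {
    if [ "$1" -gt "$2" ]; then
        echo "$2"
    else
        echo "$1"
    fi
}

# listen.backlog = 65535 with an untuned somaxconn of 128 is still only 128.
effective_backlog 65535 128

# On a live host, the second argument could come from the running kernel:
#   effective_backlog 65535 "$(sysctl -n net.core.somaxconn)"
```

This is why the pool setting and the sysctl change above are applied together.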
Create this bash script to monitor connection states:
#!/bin/bash
watch -n 2 "netstat -tnpa | grep -E '9000|php-fpm' | awk '{print \$6}' | sort | uniq -c"
And for real-time PHP-FPM status:
watch -n 2 "curl -s 127.0.0.1/status | grep -E 'active|listen'"
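If you want to alert on the status page rather than just watch it, individual fields can be pulled out of the plain-text output. The sketch below assumes the default "name: value" layout of the PHP-FPM status page; the function name fpm_status_field and the sample numbers are invented for illustration and are not taken from a real server.

```shell
#!/bin/bash
# fpm_status_field NAME (hypothetical helper): read the plain-text PHP-FPM
# status page on stdin and print the value of the field called NAME.
fpm_status_field() {
    awk -F': *' -v key="$1" '$1 == key { print $2 }'
}

# Illustrative sample in the status page layout (values are made up).
status_sample='pool:                 www
process manager:      dynamic
listen queue:         3
idle processes:       0
active processes:     50
max children reached: 12'

# A non-zero listen queue means requests are waiting for a free worker.
queue=$(printf '%s\n' "$status_sample" | fpm_status_field 'listen queue')
if [ "$queue" -gt 0 ]; then
    echo "WARN: $queue requests waiting in the listen queue"
fi

# On a live host the input would come from the status endpoint, e.g.:
#   curl -s 127.0.0.1/status | fpm_status_field 'listen queue'
```

A growing listen queue together with zero idle processes is a sign that pm.max_children is exhausted, which tends to precede the connect() failures shown in the Nginx log.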
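- Verify that timeouts are consistent between Nginx (fastcgi_read_timeout) and PHP-FPM (request_terminate_timeout)
- Enable FastCGI connection reuse only with matching settings on both sides (fastcgi_keep_conn plus an upstream keepalive pool)
- Monitor TCP connection states after applying the fix
- Consider switching to Unix sockets if Nginx and PHP-FPM run on the same host
- Recycle workers regularly with pm.max_requests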