When examining the server-status output during freeze incidents, the scoreboard shows an abnormal pattern dominated by "_" (waiting for connection) and "K" (Keepalive read) states. The netstat output reveals:
# During freeze:
109 CLOSE_WAIT
2652 ESTABLISHED
91 SYN_RECV
# Normal operation:
108 ESTABLISHED
50 SYN_RECV
11276 TIME_WAIT
The elevated SYN_RECV count during incidents, together with the roughly 25x jump in ESTABLISHED connections, points at the TCP SYN/accept queue backing up because workers are not accepting connections fast enough. Try these sysctl adjustments:
# Increase SYN backlog and connection tracking
sysctl -w net.ipv4.tcp_max_syn_backlog=8192
sysctl -w net.core.somaxconn=4096
sysctl -w net.ipv4.tcp_syncookies=1
# Faster connection recycling (for TIME_WAIT)
sysctl -w net.ipv4.tcp_tw_reuse=1
sysctl -w net.ipv4.tcp_fin_timeout=30
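Note that sysctl -w changes are lost at the next reboot. A minimal sketch for persisting them, assuming a distro that reads drop-in files from /etc/sysctl.d (the filename is just an example):
# Persist the settings across reboots via a drop-in file
cat <<'EOF' > /etc/sysctl.d/90-apache-tuning.conf
net.ipv4.tcp_max_syn_backlog = 8192
net.core.somaxconn = 4096
net.ipv4.tcp_syncookies = 1
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_fin_timeout = 30
EOF
sysctl --system    # reload all sysctl config files (older systems: sysctl -p <file>)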
The current prefork configuration appears problematic for modern workloads:
# Problematic settings:
KeepAlive On
KeepAliveTimeout 1 # Too short for clients to actually reuse connections, yet still ties up prefork workers
MaxClients 920 # Likely exceeding available memory
Recommended adjustments:
KeepAlive Off # Or increase timeout to 3-5 seconds
StartServers 20
MinSpareServers 20
MaxSpareServers 40
MaxClients 400 # Based on 8GB RAM
MaxRequestsPerChild 1000
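To sanity-check a figure like MaxClients 400 against what the boxes can really hold, a rough sketch: average the resident size of the current Apache children and divide the RAM you can spare for Apache by that number (this assumes the processes are named apache2; on RHEL-style distros use httpd, and leave headroom for MySQL, caches and the OS):
# Average resident size (RSS) of current Apache children, in MB
ps -o rss= -C apache2 | awk '{sum+=$1; n++} END {if (n) printf "avg child RSS: %.0f MB over %d procs\n", sum/n/1024, n}'
# Total / available memory to divide by the average above
free -m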
Create a real-time monitoring script to catch connection buildup:
#!/bin/bash
# Show a per-state count of TCP connections, refreshed every 5 seconds.
# The header is printed outside the pipeline so sort doesn't scramble it,
# and $6 is escaped so it reaches awk instead of being expanded by the shell.
watch -n 5 "echo 'HTTP States Monitoring'; \
echo '====================='; \
netstat -ant | awk 'NR>2 {s[\$6]++} END {for (i in s) print i, s[i]}' | sort -n -k2"
With mod_php, each Apache child carries full PHP memory overhead. Consider switching to:
- PHP-FPM with mod_proxy_fcgi
- Event MPM instead of prefork
- OPcache with proper memory settings
# Example PHP-FPM pool config
pm = dynamic
pm.max_children = 50
pm.start_servers = 10
pm.min_spare_servers = 5
pm.max_spare_servers = 20
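For reference, the Apache side of the PHP-FPM handoff can be as small as this. A sketch only: it assumes Apache 2.4.10 or later with mod_proxy and mod_proxy_fcgi enabled, and a pool listening on a unix socket at /run/php/php-fpm.sock (adjust to whatever your pool's listen directive says):
# In the relevant <VirtualHost> or server config
<FilesMatch "\.php$">
    SetHandler "proxy:unix:/run/php/php-fpm.sock|fcgi://localhost"
</FilesMatch>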
When incidents occur, capture tcpdump data:
tcpdump -ni eth0 'tcp port 80 and tcp[tcpflags] & (tcp-syn|tcp-ack) != 0' -w /tmp/http_debug.pcap
Analyze with Wireshark for:
- SYN flood patterns
- Retransmission rates
- Keepalive negotiation
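If you want numbers before opening the GUI, a quick sketch assuming tshark (Wireshark's CLI) is installed:
# Count retransmitted segments in the capture
tshark -r /tmp/http_debug.pcap -Y tcp.analysis.retransmission | wc -l
# Count raw SYNs; a large gap between SYNs and completed handshakes hints at backlog drops or a flood
tshark -r /tmp/http_debug.pcap -Y 'tcp.flags.syn==1 && tcp.flags.ack==0' | wc -l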
When your Apache cluster suddenly becomes unresponsive, with every worker process stuck in the "_" (Waiting for Connection) state while netstat shows:
netstat -an|awk '/tcp/ {print $6}'|sort|uniq -c
109 CLOSE_WAIT
2652 ESTABLISHED
2 FIN_WAIT1
11 LAST_ACK
91 SYN_RECV
This typically indicates a TCP connection handling issue rather than pure Apache misconfiguration. Let me share my troubleshooting journey and solution.
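Before changing anything, it is worth confirming that the listen queue really is under pressure; the kernel keeps counters for exactly this (a sketch, assuming net-tools and iproute2 are installed):
# Non-zero and growing counters here confirm SYN backlog / accept queue drops
netstat -s | grep -iE 'overflow|listen'
# For LISTEN sockets, Recv-Q is the current accept-queue fill and Send-Q is its limit
ss -ltn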
First, we need to check kernel-level TCP parameters that might be causing connection buildup:
# Check current TCP settings
sysctl net.ipv4.tcp_fin_timeout
sysctl net.ipv4.tcp_keepalive_time
sysctl net.ipv4.tcp_max_syn_backlog
sysctl net.ipv4.tcp_tw_reuse
# Temporary solution during crisis
echo 30 > /proc/sys/net/ipv4/tcp_fin_timeout
echo 1 > /proc/sys/net/ipv4/tcp_tw_reuse
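After applying the temporary fix, check that the SYN_RECV pile-up is actually shrinking (assuming iproute2's ss is available):
# Count sockets currently in SYN_RECV, refreshed every 2s (output includes one header line)
watch -n 2 "ss -tn state syn-recv | wc -l"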
The current configuration has:
KeepAlive On
MaxKeepAliveRequests 20
KeepAliveTimeout 1
For high-traffic servers, try this optimized version:
KeepAlive Off # Or keep it On with the short timeout below if clients need connection reuse
MaxKeepAliveRequests 100
KeepAliveTimeout 2
TimeOut 30
ServerLimit 600 # Reduce from 920 to prevent overcommit
StartServers 50
MinSpareServers 50
MaxSpareServers 100
MaxClients 600
MaxRequestsPerChild 1000
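Whichever values you settle on, run a syntax check and a graceful reload so in-flight requests are not dropped (the control binary may be apachectl, apache2ctl or httpd depending on the distro):
apache2ctl configtest && apache2ctl graceful   # validate config, then reload without killing active requests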
Create this monitoring script (/usr/local/bin/conn_monitor.sh):
#!/bin/bash
watch -n 5 "date; \
echo '------ Apache Status ------'; \
apache2ctl status | grep 'Waiting'; \
echo '------ TCP Connections ------'; \
netstat -tn | awk '{print \$6}' | sort | uniq -c; \
echo '------ Socket Summary ------'; \
ss -s | grep 'estab'; \
echo '------ Memory Usage ------'; \
free -m"
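Then make it executable and leave it running somewhere that survives your SSH session:
chmod +x /usr/local/bin/conn_monitor.sh
/usr/local/bin/conn_monitor.sh    # run inside screen/tmux so it keeps collecting during an incident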
For mod_php setups, add these php.ini tweaks:
max_execution_time = 30
memory_limit = 128M
realpath_cache_size = 256k
opcache.enable=1
opcache.memory_consumption=128
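To confirm the opcache settings took effect after restarting Apache (with mod_php the web and CLI SAPIs share php.ini, but it's still worth checking the right one):
# Checks the CLI SAPI; for the Apache-served SAPI, verify on a phpinfo() page instead
php -i | grep -i 'opcache\.enable'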
- Size ServerLimit/MaxClients so that average child RSS x MaxClients stays around 80% of available RAM
- Set KeepAliveTimeout between 1-3 seconds max
- Enable TCP reuse and faster FIN timeouts
- Monitor with the connection tracking script
- Consider switching to event MPM if possible