Apache Performance: Diagnosing and Fixing Requests Stuck in ‘Reading’ State (MaxClients Reached)



When examining the server-status dump, we observe numerous connections stuck in the 'R' (Reading Request) state with unusually high SS (Seconds since beginning of most recent request) values. This indicates that clients are taking far too long to deliver their requests, so workers stay tied up in the initial read phase.

Current Time: Monday, 29-Apr-2013 11:46:00 PDT
Restart Time: Monday, 29-Apr-2013 11:03:48 PDT
Server uptime: 42 minutes 12 seconds
CPU Usage: u188.25 s345.65 cu2601.11 cs0 - 124% CPU load
244 requests currently being processed, 56 idle workers
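The last line of that summary already shows how close the server is to its worker limit. A minimal sketch, assuming the summary line keeps exactly the wording shown above (the function name `busy_percent` is made up for illustration):

```shell
#!/bin/bash
# busy_percent: read server-status text on stdin and print the share of
# workers that are busy, as a whole-number percentage (truncated).
# Assumes a summary line of the form
#   "244 requests currently being processed, 56 idle workers"
busy_percent() {
    awk '/requests currently being processed/ {
        busy = $1    # first field: busy workers
        idle = $6    # sixth field: idle workers
        if (busy + idle > 0) printf "%d\n", (busy * 100) / (busy + idle)
    }'
}
```

With the figures above, `apachectl fullstatus | busy_percent` would print 81, since 244 of 300 workers are busy.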

The lsof output shows the established client connections held by a single httpd process:

httpd      4092   nobody   25u     IPv4          711278095         0t0        TCP 198-57-162-52.unifiedlayer.com:http->4.sub-70-193-66.myvzw.com:12471 (ESTABLISHED)
httpd      4092   nobody   26u     IPv4          711350284         0t0        TCP 198-57-162-52.unifiedlayer.com:http->c75-111-15-253.amrlcmta01.tx.dh.suddenlink.net:51298 (ESTABLISHED)
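The remote hostnames in these lines hint at what kind of networks the clients are on, so it can help to tally them. A rough sketch, assuming lsof output in the layout shown above; `remote_hosts` is a hypothetical helper name:

```shell
#!/bin/bash
# remote_hosts: read lsof lines on stdin and print a count per remote host.
# Assumes the second-to-last field has the form local:port->remote:port
# and the last field is "(ESTABLISHED)", as in the sample lines.
remote_hosts() {
    awk '/\(ESTABLISHED\)/ {
        split($(NF-1), ends, "->")    # ends[1] = local side, ends[2] = remote side
        remote = ends[2]
        sub(/:[^:]*$/, "", remote)    # drop the trailing :port
        print remote
    }' | sort | uniq -c | sort -rn
}
```

One possible invocation is `lsof -a -c httpd -i TCP:80 | remote_hosts`; adjust the lsof selection to match your own process name and port.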

Key Apache tuning parameters:

ServerLimit 300
MaxClients 300
KeepAlive On
KeepAliveTimeout 1
MaxKeepAliveRequests 100

Possible causes of workers lingering in the Reading state:

  • Slow client connections (mobile networks, international traffic)
  • Insufficient TCP buffer sizes
  • Mod_security or similar filtering modules
  • DNS lookup delays
  • SSL/TLS handshake issues

To further investigate:

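# Check for slow requests: workers in state R with SS of 100 seconds or more
# (assumes the column order of the extended status rows: M is field 4, SS is field 6)
apachectl fullstatus | awk '$4 == "R" && $6 + 0 >= 100'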

# Monitor network connections in real-time
tcpdump -i eth0 -nn -s0 -w /tmp/apache_debug.pcap port 80

Add these directives to httpd.conf:

# Timeout settings
Timeout 30
RequestReadTimeout header=20-40,MinRate=500 body=20,MinRate=500

# Buffer optimizations
ReceiveBufferSize 87380
SendBufferSize 16384

# Worker MPM tuning
# (MaxRequestWorkers is the Apache 2.4 name; on 2.2 the directive is MaxClients)
ThreadsPerChild 25
MaxRequestWorkers 300
ServerLimit 16
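With the worker MPM, the worker count is only reachable if ServerLimit multiplied by ThreadsPerChild is at least as large as the requested maximum. A small sketch of that arithmetic; the function name `worker_capacity_ok` and its argument order are illustrative assumptions:

```shell
#!/bin/bash
# worker_capacity_ok SERVER_LIMIT THREADS_PER_CHILD MAX_WORKERS
# Prints the computed capacity and succeeds (exit 0) only when
# SERVER_LIMIT * THREADS_PER_CHILD can supply MAX_WORKERS threads.
worker_capacity_ok() {
    local server_limit=$1 threads_per_child=$2 max_workers=$3
    local capacity=$(( server_limit * threads_per_child ))
    echo "capacity=${capacity} requested=${max_workers}"
    [ "$capacity" -ge "$max_workers" ]
}
```

For the values above, `worker_capacity_ok 16 25 300` prints `capacity=400 requested=300` and succeeds, so 16 processes of 25 threads are enough for 300 workers.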

Create a monitoring script:

#!/bin/bash
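# Monitor reading states: alert when more than 50 workers are in state R
while true; do
    # Count status rows whose mode column (field 4) is R
    reading_count=$(apachectl fullstatus | awk '$4 == "R" { n++ } END { print n + 0 }')
    if [ "$reading_count" -gt 50 ]; then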
        echo "Warning: $reading_count requests stuck reading" | mail -s "Apache Alert" admin@example.com
    fi
    sleep 60
done

When examining Apache server-status output, we're seeing an abnormal concentration of worker processes stuck in the "R" (Reading) state with suspiciously high SS (Seconds since beginning of most recent request) values:

0-0 10320   0/548/548   R   331.52  558 45  0.0 0.12    0.12    ?   ?   ..reading..
0-0 10320   0/460/460   _   399.05  0   42  0.0 0.08    0.08    98.129.101.123  mysite.com  GET /home.php
0-0 10320   0/473/473   R   364.89  301 48  0.0 0.09    0.09    ?   ?   ..reading..

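Contrary to initial assumptions, the SS column is not a pure measure of time spent reading request data. For a worker in the R state it is closer to:

  • Time since the worker accepted the connection and started waiting for a request
  • Idle time on a connection where the client has sent little or nothing so far
  • On a fresh connection, roughly the age of the TCP connection itself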
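The sample rows above can be filtered mechanically to list only the workers that have sat in R for a long time. A minimal sketch, assuming the column order seen in those rows (PID in field 2, M in field 4, SS in field 6); `slow_readers` is a hypothetical helper name:

```shell
#!/bin/bash
# slow_readers [THRESHOLD]: read extended-status worker rows on stdin and
# print "PID SS" for every worker in state R whose SS is at least
# THRESHOLD seconds (default 60).
slow_readers() {
    local threshold=${1:-60}
    awk -v min="$threshold" '$4 == "R" && $6 + 0 >= min { print $2, $6 }'
}
```

Run against the three sample rows with `slow_readers 300`, this prints `10320 558` and `10320 301`; the idle row is skipped because its mode is `_`.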

To properly diagnose this, we need multiple perspectives:

# Network-level analysis
netstat -tnp | grep :80 | awk '{print $6}' | sort | uniq -c

# Process-level inspection
strace -p [PID] -s 1024 -e trace=network,read,write

# Apache module debugging
apachectl -M | grep -E 'proxy|timeout'

From our analysis, several patterns emerge:

Pattern                 Indicators                                 Solution approach
Slow clients            Mobile clients, international connections  Adjust Timeout directives
Proxy misconfiguration  Reverse proxy timeouts mismatched          Synchronize proxy timeouts
Application hangs       Blocking I/O operations                    Implement async processing

For the worker MPM shown in the config, consider these adjustments:

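# In httpd.conf
# (comments must be on their own line; Apache does not allow them after a directive)
# Reduce from the default (300 in Apache 2.2, 60 in 2.4)
Timeout 30
KeepAliveTimeout 5
MaxKeepAliveRequests 50
# 1 MB maximum request body
LimitRequestBody 1048576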


<IfModule mpm_worker_module>
    ThreadLimit         100
    ServerLimit         30
    StartServers        10
    MaxClients          300
    MinSpareThreads     25
    MaxSpareThreads     75
    ThreadsPerChild     25
    MaxRequestsPerChild 10000
</IfModule>

Implement real-time monitoring with this shell snippet:

#!/bin/bash
while true; do
  echo "==== $(date) ===="
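  # Histogram of SS values for workers in state R (mode is field 4, SS is field 6)
  apachectl fullstatus | awk '$4 == "R" {print $6}' | sort -n | uniq -c
  # Connections per client IP on local port 80
  netstat -tn | awk '$4 ~ /:80$/ {print $5}' | cut -d: -f1 | sort | uniq -c | sort -n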
  sleep 5
done

For persistent cases, enable core dumps and analyze:

# Configure core dumps
ulimit -c unlimited
echo "/tmp/core.%e.%p" > /proc/sys/kernel/core_pattern

# After reproduction
gdb /usr/sbin/httpd /tmp/core.httpd.1234
(gdb) thread apply all bt full