How to Diagnose High Memory Apache Processes Stuck in Uninterruptible Sleep (D State)


When Apache processes consume excessive memory and get stuck in "D" state (uninterruptible sleep), the process is blocked inside the kernel, typically on an I/O operation; it cannot be interrupted by signals, which is why even kill -9 has no effect until the operation completes. Common causes include (a quick first check follows the list):

  • Long-running database queries (especially in Ruby/PHP applications)
  • File system operations on slow/NFS storage
  • External API calls that hang
  • Content management systems processing large uploads
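
A quick first check, using nothing more than standard procps, is to list every D-state process together with the kernel function it is blocked in (the wchan column):

# List D-state processes and their kernel wait channel
ps -eo pid,state,wchan:32,cmd | awk '$2 == "D"'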

First, get detailed process information:

# Show extended process info
ps auxf | grep apache

# Alternative with thread view
top -H -p $(pgrep -d',' apache2)

# Memory usage breakdown
pmap -x PID | sort -n -k3
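
For a process already stuck in D state, /proc often reveals more than ps; the entries below are standard Linux procfs paths (root access may be required):

# Kernel stack of the blocked task - usually names the subsystem (NFS, ext4, ...)
cat /proc/PID/stack

# Resident and swapped memory for the process
grep -E 'VmRSS|VmSwap' /proc/PID/status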

Use strace to see what the process is doing:

# Attach strace to running process
strace -p PID -f -s 80 -o /tmp/apache_trace.log

# Alternative for quick check
strace -p PID -c
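
Keep in mind that strace relies on ptrace, so a task in D state produces no output until the blocked syscall returns. As a non-intrusive alternative, procfs exposes the syscall the task is blocked in:

# Blocked syscall number, arguments, and stack/instruction pointers (root required)
cat /proc/PID/syscall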

Identify what files/sockets the process is accessing:

lsof -p PID

# Focus on network connections
lsof -p PID -i

# Check Ruby-specific files if suspected
lsof -p PID | grep -E '\.rb|\.so'
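
To resolve a specific descriptor number seen in strace output, /proc answers directly:

# Show what descriptor 12 points at (12 matches the fd in the example output below)
ls -l /proc/PID/fd/12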

For Ruby apps like Redmine, rack-contrib's Rack::Deflect middleware can be added to config.ru. Note that Deflect rate-limits and blocks abusive clients (one possible source of long-busy workers) rather than tracing blocking operations, and its logging option is :log, which takes an IO object:

require 'rack/contrib'
use Rack::Deflect, :log => File.open('/var/log/apache2/rack_deflect.log', 'a')

Here's what you might find when debugging:

# From strace output
[pid 1526] read(8,  
[pid 1527] poll([{fd=12, events=POLLIN}], 1, 5000 

# From lsof
ruby    1526 www-data 12u IPv4 123456789 0t0 TCP server:34152->database:3306 (ESTABLISHED)

This shows the process is waiting on MySQL database response.
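
If you can reach the database, confirm the stalled query from the server side; this assumes a MySQL client with sufficient privileges:

# List currently executing statements, filtering out idle connections
mysql -e "SHOW FULL PROCESSLIST" | grep -v Sleep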

Add these to your Apache configuration:

# Limit how long Apache waits for I/O on a request (the core directive is Timeout)
Timeout 30
# Soft/hard limits for processes forked by Apache children (e.g. CGI scripts)
RLimitCPU 60 120
RLimitMEM 512000000 1024000000

# For mod_passenger (common with Ruby); note both directives are Passenger Enterprise features
PassengerMaxRequestTime 30
PassengerMemoryLimit 512
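
After editing the configuration, verify the syntax before reloading so a typo does not take the server down:

# Validate, then apply without dropping active connections
apachectl configtest && apachectl graceful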

Create a cron job with this check:

#!/bin/bash
# Flag Apache workers whose resident set exceeds the threshold, then capture
# a short syscall trace and a native stack dump for each offender.
THRESHOLD=500000 # KB

for pid in $(pgrep apache2); do
  MEM=$(ps -p "$pid" -o rss=)
  if [ "$MEM" -gt "$THRESHOLD" ]; then
    echo "$(date) - Apache PID $pid using $MEM KB" >> /var/log/apache_memory.log
    # Only one ptrace-based tool can attach at a time, so trace for 10 seconds,
    # detach, and only then take the gdb backtrace
    timeout 10 strace -p "$pid" -f -s 80 -o "/var/log/apache_strace_$pid.log"
    gdb -p "$pid" -batch -ex "thread apply all bt" &>> "/var/log/apache_gdb_$pid.log"
  fi
done
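
To schedule the check, save the script and add a crontab entry (both paths below are examples):

# /etc/cron.d/apache-mem-check - run every 5 minutes as root
*/5 * * * * root /usr/local/bin/apache_mem_check.sh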

When Apache starts consuming excessive memory and swap space, the "D" state (uninterruptible sleep) in ps output indicates the process is blocked on I/O inside the kernel. Here's a comprehensive approach to diagnosing such issues:

# First identify the problematic Apache process
ps auxf | grep apache | grep -v grep

# Sample output showing the culprit:
www-data  1526  0.1 78.9 14928852 3191628 ?    D    Oct17   6:45 /usr/sbin/apache2 -k start
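
Reading the fields above: VSZ 14928852 KB is roughly 14 GB of virtual address space, RSS 3191628 KB is about 3 GB resident (78.9% of RAM), and the state column shows D. To watch a single suspect over time:

# Poll the process every 5 seconds (1526 is the PID from the sample output)
watch -n 5 'ps -o pid,state,vsz,rss,etime,cmd -p 1526'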

Use these Linux commands to gather more context about the problematic process:

# Check open files and connections
lsof -p PID

# Examine memory map
pmap -x PID

# View stack traces
gdb -p PID
(gdb) thread apply all bt
(gdb) detach

For Apache processes, we need to trace back to the actual web application causing the issue:

# Check Apache scoreboard to see active requests
apachectl fullstatus

# Alternative method using mod_status
lynx -dump http://localhost/server-status
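
If the status page is unavailable, the module is probably not loaded; on Debian/Ubuntu it can be enabled as follows (add ExtendedStatus On for per-request detail):

# Enable mod_status and reload Apache
sudo a2enmod status
sudo systemctl reload apache2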

For deeper investigation, SystemTap can help profile Apache's behavior:

# Install SystemTap plus debug symbols (APR symbols are needed for pool probes)
sudo apt-get install systemtap apache2-dbg libapr1-dbg

# Sample script to trace Apache memory allocation; Apache allocates through APR
# pools, so probe apr_palloc in the APR library (adjust the path for your system)
probe process("/usr/lib/x86_64-linux-gnu/libapr-1.so.0").function("apr_palloc") {
    printf("apr_palloc: %d bytes in pid %d\n", $size, pid());
}
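
Assuming the probe above is saved as apache_pool.stp (the filename is arbitrary), run it while reproducing the problem:

# Attach the probe; stop with Ctrl-C
sudo stap apache_pool.stp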

When suspecting Ruby applications (like Redmine), additional tools are helpful:

# Check Passenger status if used
passenger-memory-stats
passenger-status

# For RVM-based setups, dump the interpreter and environment details
rvm debug
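
passenger-status can also print per-request detail, which maps a busy worker back to the URL it is serving:

# Show currently processing requests per application process
passenger-status --show=requests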

Implement these proactive monitoring solutions:

# Simple monitoring script
while true; do
    date >> apache_monitor.log
    ps -eo pid,user,pcpu,pmem,vsz,rss,state,command --sort=-rss | head -20 >> apache_monitor.log
    sleep 60
done

# Alternative using atop
atop -a -P CPU,MEM,DSK,NET 10 60
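
If the loop script is kept long-term, run it detached so it survives logout (the path is an example):

# Keep the monitor running in the background
nohup /usr/local/bin/apache_monitor.sh >/dev/null 2>&1 &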