How to Diagnose High Memory Apache Processes Stuck in Uninterruptible Sleep (D State)


When Apache processes consume excessive memory and get stuck in "D" state (uninterruptible sleep), they are blocked inside the kernel, usually on I/O, and cannot be interrupted or killed until the operation completes. Common causes include:

  • Long-running database queries (especially in Ruby/PHP applications)
  • File system operations on slow/NFS storage
  • External API calls that hang
  • Content management systems processing large uploads
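
A quick way to confirm which processes are stuck, and which kernel routine they are blocked in, is to list D-state processes with their wait channel:

# List D-state processes along with their kernel wait channel (wchan)
ps -eo pid,user,stat,wchan:32,cmd | awk 'NR==1 || $3 ~ /^D/'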

First, get detailed process information:

# Show extended process info
ps auxf | grep apache

# Alternative with thread view
top -H -p $(pgrep -d',' apache2)

# Memory usage breakdown
pmap -x PID | sort -n -k3
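
To gauge the footprint of the whole pool, sum resident memory across all Apache workers (a rough figure, since shared pages are counted once per process):

# Total RSS across all apache2 processes, in KB
ps -C apache2 -o rss= | awk '{sum += $1} END {print sum " KB"}'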

Use strace to see what the process is doing:

# Attach strace to running process
strace -p PID -f -s 80 -o /tmp/apache_trace.log

# Alternative for quick check
strace -p PID -c
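
Note that strace can itself hang when the target is stuck in D state, because ptrace needs the process to respond. In that case, read the kernel's view of the block directly:

# Kernel stack and wait channel of the blocked process (run as root)
cat /proc/PID/stack
cat /proc/PID/wchan; echo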

Identify what files/sockets the process is accessing:

lsof -p PID

# Focus on network connections
lsof -p PID -i

# Check Ruby-specific files if suspected
lsof -p PID | grep -E '\.rb|\.so'
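
You can cross-check the sockets lsof reports against the kernel's socket table:

# Established TCP connections belonging to the process
ss -tnp | grep "pid=PID"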

For Ruby apps like Redmine, logging each request's duration makes slow or stuck requests easy to spot. Rack::CommonLogger (bundled with Rack) records the processing time as the last field of each log line; add it to config.ru:

# config.ru -- log every request; the final field is the duration in seconds
use Rack::CommonLogger, File.open("/var/log/apache2/ruby_blocking_operations.log", "a")
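
With that in place, slow requests stand out; for example, to show anything that took longer than five seconds (the duration is the last field of each line):

# Requests slower than 5 seconds
awk '$NF > 5' /var/log/apache2/ruby_blocking_operations.log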

Here's what you might find when debugging:

# From strace output
[pid 1526] read(8,  <unfinished ...>
[pid 1527] poll([{fd=12, events=POLLIN}], 1, 5000 <unfinished ...>

# From lsof
ruby    1526 www-data 12u IPv4 123456789 0t0 TCP server:34152->database:3306 (ESTABLISHED)

This shows the process is waiting on a response from MySQL: the poll on fd 12 in the strace output matches the established connection to the database's port 3306 in the lsof output.
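
From there, confirm on the database side which query is hanging:

# On the MySQL server, list running queries and how long they have been executing
mysql -e 'SHOW FULL PROCESSLIST;'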

Add these to your Apache configuration:

# Limit request processing time
Timeout 30

# Note: RLimit* apply to processes forked by Apache (e.g. CGI), not to the workers themselves
RLimitCPU 60 120
RLimitMEM 512000000 1024000000

# For mod_passenger (these two directives require Passenger Enterprise)
PassengerMaxRequestTime 30
PassengerMemoryLimit 512
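
After editing the configuration, validate and apply it:

# Check syntax, then reload without dropping active connections
apachectl configtest && apachectl graceful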

Create a cron job with this check:

#!/bin/bash
# Flag Apache workers whose resident memory exceeds the threshold,
# then capture diagnostics for each offender.
THRESHOLD=500000 # KB

for pid in $(pgrep apache2); do
  MEM=$(ps -p "$pid" -o rss= | tr -d ' ')
  if [ -n "$MEM" ] && [ "$MEM" -gt "$THRESHOLD" ]; then
    echo "$(date) - Apache PID $pid using $MEM KB" >> /var/log/apache_memory.log
    # Bound the trace so it cannot hang forever on a D-state process
    timeout 10 strace -p "$pid" -f -s 80 -o "/var/log/apache_strace_$pid.log" &
    gdb -p "$pid" -batch -ex "thread apply all bt" &>> "/var/log/apache_gdb_$pid.log"
  fi
done
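
Schedule the script via cron, for example every five minutes (the path below is illustrative):

# /etc/crontab entry
*/5 * * * * root /usr/local/bin/check_apache_mem.sh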

Pulling the workflow together: when Apache starts consuming excessive memory and swap space, and the "D" state (uninterruptible sleep) shows up in ps output, that points to I/O waits. Here's a comprehensive approach to diagnosing such issues:

# First identify the problematic Apache process
ps auxf | grep apache | grep -v grep

# Sample output showing the culprit (VSZ/RSS are in KB: ~14 GB virtual, ~3 GB resident, state D):
www-data  1526  0.1 78.9 14928852 3191628 ?    D    Oct17   6:45 /usr/sbin/apache2 -k start

Use these Linux commands to gather more context about the problematic process:

# Check open files and connections
lsof -p PID

# Examine memory map
pmap -x PID

# View stack traces
gdb -p PID
(gdb) thread apply all bt
(gdb) detach

For Apache processes, we need to trace back to the actual web application causing the issue:

# Check Apache scoreboard to see active requests
apachectl fullstatus

# Alternative method using mod_status
lynx -dump http://localhost/server-status
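
The scoreboard only shows per-request detail (client, vhost, request line) when extended status is enabled:

# In the mod_status configuration
ExtendedStatus On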

For deeper investigation, SystemTap can help profile Apache's behavior:

# Install SystemTap plus debug symbols (package may be apache2-dbg or apache2-dbgsym, depending on release)
sudo apt-get install systemtap apache2-dbg

# Sample script tracing calls into APR's pool allocator
# (adjust the libapr path for your distro; requires libapr debug symbols)
probe process("/usr/lib/x86_64-linux-gnu/libapr-1.so.0").function("apr_palloc") {
    printf("apr_palloc in %s (pid %d)\n", execname(), pid())
}
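
Save the probe to a file (the name below is arbitrary) and run it with stap:

# Compile and run the probe; Ctrl-C stops it
sudo stap -v trace_pools.stp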

When suspecting Ruby applications (like Redmine), additional tools are helpful:

# Check Passenger status if used
passenger-memory-stats
passenger-status

# For a live Ruby process, dump the Ruby-level backtrace with gdb
# (rb_backtrace() prints to the target's stderr, which under Passenger usually ends up in the Apache error log)
gdb -p PID -batch -ex 'call rb_backtrace()'

Implement these proactive monitoring solutions:

# Simple monitoring script
while true; do
    date >> apache_monitor.log
    ps -eo pid,user,pcpu,pmem,vsz,rss,state,command --sort=-rss | head -20 >> apache_monitor.log
    sleep 60
done

# Alternative using atop
atop -a -P CPU,MEM,DSK,NET 10 60