When ps aux hangs partway through listing processes on a Linux system that still has plenty of memory (over 1 GB of it free in your case), the problem is usually not RAM but a deeper system issue. The top output reveals crucial details:
top - 11:00:29 up 3:53, 2 users, load average: 51.75, 50.52, 45.38
Tasks: 79 total, 1 running, 77 sleeping, 0 stopped, 1 zombie
Cpu(s): 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 1747660k total, 603572k used, 1144088k free, 12644k buffers
Swap: 917496k total, 0k used, 917496k free, 97732k cached
Zombie processes (state "Z" in process listings) are terminated processes whose parent has not yet read their exit status with wait(). They consume almost nothing beyond a process-table entry, but a large number of them usually points to a programming error in the parent. A single zombie usually isn't problematic, but combined with your high load averages (51.75, 50.52, 45.38), it suggests deeper issues.
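If you want a harmless zombie to experiment with, this throwaway one-liner (purely illustrative, not part of any fix) creates one for about a minute:
# The inner "sleep 2" exits, but its parent has been replaced by "sleep 60" via exec
# and never calls wait(), so the child lingers in state Z until "sleep 60" exits
bash -c 'sleep 2 & exec sleep 60' &
sleep 3
ps -o pid,ppid,stat,cmd --ppid "$!"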
Several factors can cause ps aux to hang:
- Kernel process table issues: check the fork count and PID limit (a fuller check is sketched after this list):
cat /proc/stat | grep processes
cat /proc/sys/kernel/pid_max
- I/O wait problems: check with:
iostat -x 1 5
dmesg | grep -i "i/o"
- Mount point issues: a hung or stale mount (NFS, FUSE) reached through /proc file descriptors can make reads block:
mount | grep proc
ls -la /proc/[0-9]*/fd
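For the process-table check mentioned in the first bullet, here is a small sketch that compares the current process count against pid_max; the 90% threshold is an arbitrary illustration, not a kernel rule:
# Rough check: compare visible process count with the kernel PID limit
# (threads also consume PIDs, so this undercounts actual usage)
current=$(ls -d /proc/[0-9]* 2>/dev/null | wc -l)
limit=$(cat /proc/sys/kernel/pid_max)
echo "processes: $current  pid_max: $limit"
# 90% is an arbitrary warning threshold for illustration only
if [ "$current" -gt $((limit * 9 / 10)) ]; then
    echo "WARNING: PID space nearly exhausted (possible fork bomb)"
fi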
When basic checks don't reveal the cause, try these advanced techniques:
1. Strace the ps command:
strace -o ps_debug.log ps aux
This reveals where exactly the command gets stuck.
2. Check for hung NFS mounts (a recovery sketch follows this list):
mount | grep nfs
timeout 5 ls /mnt/nfs_share || echo "NFS timeout"
3. Alternative process viewing:
# Use proc directly (cmdline is NUL-separated, so convert NULs to spaces)
for f in /proc/[0-9]*/cmdline; do tr '\0' ' ' < "$f"; echo; done
# Try different ps formats
ps -eo pid,ppid,cmd
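If the NFS check from step 2 does confirm a hung mount, detaching it usually unblocks /proc reads; the path below is just the example share used above:
# Lazy unmount detaches the mount point even while requests are outstanding
umount -l /mnt/nfs_share
# Or, for NFS, force the unmount if the lazy detach is not enough
umount -f /mnt/nfs_share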
To handle the zombie process identified in your top output:
# Find zombie processes (STAT may be "Z" plus extra flags, so match the prefix)
ps aux | awk '$8 ~ /^Z/ {print $2,$11}'
# Identify the parent (PPID) of the zombie
ps -o ppid= -p [PID_of_zombie]
# Kill parent process (if safe)
kill -HUP [parent_PID]
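Before killing anything, it can help to see each zombie next to its parent's command so you can judge whether restarting that parent is safe; a quick sketch using standard procps output:
# List every zombie together with the command of its parent
ps -eo pid,ppid,stat,comm | awk '$3 ~ /^Z/' | while read pid ppid stat comm; do
    echo "zombie $pid ($comm) <- parent $ppid ($(ps -o comm= -p "$ppid"))"
done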
Your extremely high load averages (51.75) with a mostly idle CPU suggest processes stuck waiting rather than doing work; on Linux, tasks in uninterruptible sleep (state D) count toward the load average even though they consume no CPU. Likely culprits (a D-state check is sketched below):
- Disk I/O bottlenecks (iostat -x 1)
- Memory pressure despite free RAM (vmstat 1 5)
- Process scheduler issues (perf sched record)
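The D-state check mentioned above is a one-liner:
# Processes in uninterruptible sleep, with the kernel function they are blocked in (wchan)
ps -eo stat,pid,wchan:20,cmd | awk '$1 ~ /^D/'
echo "D-state count: $(ps -eo stat= | grep -c '^D')"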
To avoid recurrence, a periodic cleanup script can help:
#!/bin/bash
# Regular process cleanup script
# Zombies cannot be killed directly; instead kill their parents (never PID 1)
# so init adopts and reaps the defunct entries
ps -ef | grep defunct | grep -v grep | awk '$3 > 1 {print $3}' | sort -u | xargs -r kill -9 2>/dev/null
# Restart failed services (--plain --no-legend keeps the output parseable)
systemctl list-units --state=failed --plain --no-legend | awk '{print $1}' | xargs -r systemctl restart
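If you adopt something like this, schedule it conservatively; the path and interval below are only an example:
# Run the cleanup every 15 minutes via root's crontab (crontab -e); script path is hypothetical
*/15 * * * * /usr/local/sbin/process_cleanup.sh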
When ps aux freezes while other monitoring tools like top remain functional, you are typically dealing with one of the scenarios discussed below. The symptom looks something like this:
# Sample output showing the problematic state
$ ps aux
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1 0.0 0.1 33608 2164 ? Ss Apr08 0:02 /sbin/init
...
[freezes at this point]
The zombie process (marked 'Z' in process status) shown in your top output indicates a process that has completed execution but hasn't been properly reaped by its parent. While a single zombie isn't inherently dangerous, it can signal deeper issues.
# Identifying zombie processes
$ ps -A -o stat,pid,ppid | grep -e '^[Zz]'
Z 1234 5678
Your load averages (51.75, 50.52, 45.38) are extremely concerning: they mean dozens of tasks are waiting or blocked at any given moment. This also explains why ps aux hangs while top keeps working: ps aux reads /proc/<pid>/cmdline for every process, and that read can block if a process is stuck in uninterruptible sleep (for example on a dead NFS mount or a failing disk), whereas top can mostly get by with /proc/<pid>/stat, which does not block the same way.
# Check for I/O wait (alternative to frozen ps)
$ vmstat 1 5
# Identify processes causing high load
$ pidstat 1 5
# Check for disk saturation
$ iostat -x 1 5
Common culprits behind this pattern (quick checks for two of them follow the list):
- Runaway process spawning (fork bombs)
- Disk I/O contention (check %wa in top)
- Memory pressure despite free RAM (check swappiness)
- Kernel thread deadlock
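Two of these scenarios can be checked in seconds:
# Memory pressure tuning: how eagerly the kernel swaps (0-100)
$ cat /proc/sys/vm/swappiness
# Fork-bomb hint: one parent PID with an unusually large number of children
$ ps -eo ppid= | sort | uniq -c | sort -rn | head -5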
When basic tools fail, consider these alternatives:
# Use procfs directly
$ ls -l /proc/[0-9]*/exe
# Check for uninterruptible sleep (D state)
$ ps -eo stat,pid,cmd | grep "^D"
# Alternative process viewer
$ htop --tree
To properly handle the zombie process:
# Option 1: Ask the parent to reap its children
$ kill -CHLD [parent_pid]
# Option 2: If the parent won't reap, terminate it (when safe) so init adopts and reaps the zombie
$ kill -HUP [parent_pid]
- Implement process monitoring with systemd or supervisor
- Set proper ulimits for process count
- Regularly audit crontabs and service units
- Consider using cgroups for process containment (a sketch covering this and the ulimit point follows)
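As a concrete illustration of the ulimit and cgroup points, the limits, unit name, and script path below are made up for the example:
# Cap the number of processes the current shell (and its children) may create
$ ulimit -u 500
# Run a batch job in a transient systemd scope whose task count is capped via cgroups
# (unit name and script path are hypothetical)
$ systemd-run --unit=batch-limited --scope -p TasksMax=200 /usr/local/bin/batch_job.sh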
For persistent issues, collect kernel diagnostics:
# Capture kernel ring buffer
$ dmesg > kernel_log.txt
# Check for OOM killer activity
$ grep -i kill /var/log/messages*
# Capture system state (requires sysrq)
$ echo t > /proc/sysrq-trigger
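The resulting task dump lands in the kernel ring buffer rather than on stdout:
# Read back the dump written by the sysrq trigger and look for tasks reported in state D
$ dmesg | less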