When a child process terminates but hasn't been waited for by its parent, it becomes a zombie (defunct process). While technically harmless, these entries can cause practical issues:
- Applications checking process tables may see them as active processes
- System monitoring tools may flag them as anomalies
- In extreme cases, they can exhaust PID limits (though this is rare)
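The usual advice of kill -9 won't work, because a zombie has already exited and there is nothing left to signal. The signal is accepted, but the entry stays:
    $ ps aux | grep defunct
    user 12345 0.0 0.0 0 0 ? Z 12:34 0:00 [sshfs] <defunct>
    $ sudo kill -9 12345
    $ ps aux | grep defunct
    user 12345 0.0 0.0 0 0 ? Z 12:34 0:00 [sshfs] <defunct>
Even the init process (PID 1) can't kill zombies; it can only reap the ones it has inherited, by calling wait().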
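To see this behaviour first-hand, here is a minimal Python sketch, assuming Linux with /proc mounted. It creates a short-lived zombie, sends it SIGKILL, and then reaps it with waitpid(). The helper names proc_state, wait_for_state and demo are made up for this illustration.

```python
import os
import signal
import time


def proc_state(pid):
    """Return the one-letter state from /proc/<pid>/stat, or None if the PID is gone."""
    try:
        with open(f"/proc/{pid}/stat") as f:
            data = f.read()
    except (FileNotFoundError, ProcessLookupError):
        return None
    # The state field is the first field after the parenthesised command name.
    return data.rsplit(")", 1)[1].split()[0]


def wait_for_state(pid, wanted, timeout=2.0):
    """Poll until the process reaches the wanted state or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if proc_state(pid) == wanted:
            return True
        time.sleep(0.01)
    return False


def demo():
    pid = os.fork()
    if pid == 0:
        os._exit(0)                  # child exits at once; the parent has not waited yet

    wait_for_state(pid, "Z")
    before = proc_state(pid)         # exited but not yet reaped
    os.kill(pid, signal.SIGKILL)     # accepted, but there is nothing left to kill
    time.sleep(0.05)
    after_kill = proc_state(pid)
    os.waitpid(pid, 0)               # the parent's wait is what removes the entry
    after_wait = proc_state(pid)
    return before, after_kill, after_wait


if __name__ == "__main__":
    print(demo())
```

On a typical Linux system the state should read Z both before and after the SIGKILL, and the entry should disappear only after waitpid() returns.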
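Here's a C program that tries to attach to the zombie with ptrace() and collect it with waitpid(). Treat it as an experiment rather than a guaranteed fix: Linux normally refuses PTRACE_ATTACH for a process that has already exited (the call fails with EPERM), and a zombie's exit status is normally delivered only to its real parent.
    #include <sys/ptrace.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    int main(int argc, char *argv[]) {
        if (argc != 2) {
            fprintf(stderr, "Usage: %s <pid>\n", argv[0]);
            exit(EXIT_FAILURE);
        }
        pid_t target_pid = (pid_t)atoi(argv[1]);
        pid_t child_pid = fork();
        if (child_pid == -1) {
            perror("fork failed");
            exit(EXIT_FAILURE);
        }
        if (child_pid == 0) {
            // Helper child just sleeps briefly; it has no relationship to the target
            sleep(1);
            _exit(EXIT_SUCCESS);
        } else {
            // Try to attach to the target; an already-exited process is usually rejected here
            if (ptrace(PTRACE_ATTACH, target_pid, NULL, NULL) == -1) {
                perror("ptrace attach failed");
                waitpid(child_pid, NULL, 0);  // still reap our own helper
                exit(EXIT_FAILURE);
            }
            if (waitpid(target_pid, NULL, 0) == -1)
                perror("waitpid on target failed");
            ptrace(PTRACE_DETACH, target_pid, NULL, NULL);
            // Wait for our own child to prevent new zombies
            waitpid(child_pid, NULL, 0);
        }
        return EXIT_SUCCESS;
    }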
- Compile the program:
    gcc -o reap_zombie reap_zombie.c
- Identify the zombie PID:
    ps aux | grep defunct
- Run the reaper:
    sudo ./reap_zombie ZOMBIE_PID
- Verify removal:
    ps aux | grep ZOMBIE_PID
For environments where C compilation isn't possible, here's a Python alternative using ctypes:
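    import ctypes
    import os
    import sys

    # use_errno=True lets us read errno after a failed libc call
    LIBC = ctypes.CDLL(None, use_errno=True)
    LIBC.ptrace.argtypes = [ctypes.c_int, ctypes.c_int, ctypes.c_void_p, ctypes.c_void_p]
    LIBC.ptrace.restype = ctypes.c_long
    PTRACE_ATTACH = 16  # Linux request numbers from <sys/ptrace.h>
    PTRACE_DETACH = 17

    def reap_zombie(pid):
        child = os.fork()
        if child == 0:
            os._exit(0)
        else:
            # Attaching usually fails for a process that has already exited
            if LIBC.ptrace(PTRACE_ATTACH, pid, None, None) == -1:
                err = ctypes.get_errno()
                os.waitpid(child, 0)  # still reap our own helper
                raise OSError(err, os.strerror(err))
            os.waitpid(pid, 0)
            LIBC.ptrace(PTRACE_DETACH, pid, None, None)
            os.waitpid(child, 0)

    if __name__ == "__main__":
        if len(sys.argv) != 2:
            print(f"Usage: {sys.argv[0]} <pid>")
            sys.exit(1)
        reap_zombie(int(sys.argv[1]))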
To prevent zombies in the first place:
- Handle SIGCHLD in parent processes and reap exited children there
- Use waitpid() with the WNOHANG option in event loops
- Consider double-forking for daemon processes
- For SSHFS specifically, use the -o reconnect option
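As a rough illustration of the waitpid()-with-WNOHANG advice, the sketch below reaps finished children on every tick of a toy event loop without ever blocking. The names reap_children, spawn and event_loop are hypothetical, and the sleep stands in for real work. The same reap_children() function could also be called from a SIGCHLD handler.

```python
import os
import time

reaped = {}  # pid -> exit code (or negative signal number), filled in as children are collected


def reap_children():
    """Collect every child that has already exited, without blocking."""
    while True:
        try:
            pid, status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            return                      # no children left at all
        if pid == 0:
            return                      # children exist, but none has exited yet
        if os.WIFEXITED(status):
            reaped[pid] = os.WEXITSTATUS(status)
        else:
            reaped[pid] = -os.WTERMSIG(status)


def spawn(exit_code):
    """Fork a child that exits immediately with the given code."""
    pid = os.fork()
    if pid == 0:
        os._exit(exit_code)
    return pid


def event_loop(pids, tick=0.01, timeout=2.0):
    """Toy event loop: reap on every tick until all given PIDs are collected."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline and not all(p in reaped for p in pids):
        reap_children()                 # never blocks, so the loop stays responsive
        time.sleep(tick)                # stand-in for real work
    return {p: reaped.get(p) for p in pids}


if __name__ == "__main__":
    children = [spawn(0), spawn(3)]
    print(event_loop(children))
```

Because waitpid(-1, WNOHANG) collects any exited child, this pattern fits programs that own all of their children; code that shares a process with other child-spawning libraries may prefer to wait on specific PIDs.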
We've all encountered those stubborn zombie processes (defunct processes) that cling to the process table because their parent never calls wait(). While they hold no memory or CPU, each one still occupies a process table slot and a PID, and they can cause real problems:
- Application startup checks that rely on process table scanning
- Monitoring systems triggering false alerts
- Process ID (PID) exhaustion in long-running systems
The common advice of "just reboot" or "ignore them" isn't practical for production systems. Let's examine why normal approaches don't work:
    # These WON'T work:
    kill -9 [zombie_pid]      # Zombies are already dead
    pkill -f "process_name"   # Can't signal a terminated process
A zombie disappears only when its parent calls wait(), or when the parent exits and init (or a subreaper) inherits and reaps it; a newly forked process does not adopt an existing zombie. The C function below therefore leaves zombie_pid itself untouched and shows the fork-then-waitpid() pattern a parent needs so that it never leaves zombies behind:
    #include <stdlib.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <unistd.h>

    void reap_zombie(pid_t zombie_pid) {
        (void)zombie_pid;  // unused: a forked child cannot take over another process's zombie
        pid_t temp_parent = fork();
        if (temp_parent == 0) {
            // Short-lived child
            sleep(1);
            _exit(0);
        } else if (temp_parent > 0) {
            // Reap the temporary child so it does not become a zombie itself
            waitpid(temp_parent, NULL, 0);
        }
    }
For those who prefer Python, here's the same fork-and-wait pattern:
    import os
    import time

    def remove_zombie(zombie_pid):
        # zombie_pid is unused: the forked child cannot inherit another process's zombie
        pid = os.fork()
        if pid == 0:
            # Short-lived child
            time.sleep(0.5)
            os._exit(0)
        else:
            # Reap exactly the child we created
            os.waitpid(pid, 0)
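For systems that frequently generate zombies, consider this bash script. It detects each zombie and sends SIGCHLD to its parent, which prompts a parent with a SIGCHLD handler to reap it; a parent that ignores the signal has to be fixed or restarted instead:
    #!/bin/bash
    # Find zombie processes together with their parent PIDs
    ps -eo pid=,ppid=,stat= | awk '$3 ~ /^Z/ {print $1, $2}' |
    while read -r pid ppid; do
        echo "Zombie PID $pid: sending SIGCHLD to parent $ppid"
        # Only the parent can reap; SIGCHLD is harmless if it has no handler
        kill -s CHLD "$ppid"
    done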
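A scan like this can also be done from Python by reading /proc directly, which makes it easy to report each zombie's parent, the one process able to reap it. This is a minimal sketch assuming Linux; list_zombies and demo are illustrative names, and the demo creates its own zombie so there is something to find.

```python
import os
import time


def list_zombies():
    """Return (pid, ppid, name) for every visible process currently in state Z."""
    zombies = []
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        try:
            with open(f"/proc/{entry}/stat") as f:
                data = f.read()
        except (FileNotFoundError, ProcessLookupError, PermissionError):
            continue                     # process vanished or is hidden from us
        # Format: pid (comm) state ppid ...; comm may itself contain spaces or ')'
        head, tail = data.rsplit(")", 1)
        fields = tail.split()
        state, ppid = fields[0], int(fields[1])
        if state == "Z":
            zombies.append((int(entry), ppid, head.split("(", 1)[1]))
    return zombies


def demo():
    """Create one zombie, look it up in the scan, then reap it."""
    pid = os.fork()
    if pid == 0:
        os._exit(0)                      # child exits; the parent has not waited yet
    found = None
    deadline = time.monotonic() + 2.0
    while found is None and time.monotonic() < deadline:
        found = next((z for z in list_zombies() if z[0] == pid), None)
        time.sleep(0.01)
    os.waitpid(pid, 0)                   # reap it so the demo leaves nothing behind
    still_listed = any(z[0] == pid for z in list_zombies())
    return pid, found, still_listed


if __name__ == "__main__":
    print(demo())
```

The parent PID in each tuple is the place to look next: either that program needs to call wait(), or restarting it will hand its zombies to init for reaping.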
Whichever cleanup you try, preventing zombies is always better:
- Handle SIGCHLD in parent processes and call waitpid() for every child
- Use double-forking for daemon processes
- Consider using process supervisors like systemd or supervisord
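The double-fork idea can be sketched roughly as follows: the caller forks an intermediate child, which starts a new session, forks the real worker and exits at once. The caller reaps the intermediate child immediately, and the orphaned worker is re-parented (normally to init or a subreaper), which reaps it later. The names spawn_detached and demo are hypothetical, and the pipes exist only so the demo can observe the re-parenting.

```python
import os


def spawn_detached(work):
    """Run work() in a grandchild that the caller never has to reap."""
    pid = os.fork()
    if pid == 0:
        os.setsid()                     # intermediate child: leave the caller's session
        if os.fork() == 0:
            code = 0
            try:
                work()                  # grandchild does the real work
            except Exception:
                code = 1
            os._exit(code)
        os._exit(0)                     # intermediate child exits straight away
    os.waitpid(pid, 0)                  # returns quickly; no zombie is left behind
    return pid                          # PID of the (now gone) intermediate child


def demo():
    """Return (intermediate_pid, parent PID seen by the grandchild afterwards)."""
    go_r, go_w = os.pipe()              # caller -> grandchild: "report now"
    out_r, out_w = os.pipe()            # grandchild -> caller: its parent PID

    def work():
        os.close(go_w)
        os.close(out_r)
        os.read(go_r, 1)                # block until the caller says go
        os.write(out_w, str(os.getppid()).encode())

    mid_pid = spawn_detached(work)      # intermediate child is already reaped here
    os.close(go_r)
    os.close(out_w)
    os.write(go_w, b"x")
    os.close(go_w)
    reported = int(os.read(out_r, 32).decode())
    os.close(out_r)
    return mid_pid, reported


if __name__ == "__main__":
    print(demo())
```

By the time the grandchild reports, its parent is no longer the intermediate child, so whatever process adopted it is now responsible for reaping it.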
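For particularly stubborn cases you may also see advice to force a "process table refresh":
    # Does NOT refresh the process table or remove zombies
    echo 1 > /proc/sys/kernel/ns_last_pid
Note: ns_last_pid only sets the PID from which the kernel continues allocating in the current PID namespace (checkpoint/restore tools use it). Writing to it has system-wide implications and leaves existing zombies in place, so avoid it.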