When running critical applications on Linux servers, process crashes can cause significant downtime. Unlike Windows, which centralizes supervision in a single built-in service manager, Linux requires you to choose and configure a supervision mechanism yourself.
Linux offers several mechanisms for process monitoring, ranging from the init system itself to installable supervisors:
1. systemd Service Management:
[Unit]
Description=My Critical Application
After=network.target
[Service]
Type=simple
ExecStart=/path/to/your/application
Restart=always
RestartSec=5s
[Install]
WantedBy=multi-user.target
This configuration automatically restarts the application if it exits for any reason, with a 5-second delay between restart attempts.
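Assuming the unit is saved as /etc/systemd/system/myapp.service (a placeholder name), you would load and enable it like this:
# Make systemd pick up the new unit file
sudo systemctl daemon-reload
# Start the service now and enable it at boot
sudo systemctl enable --now myapp.service
# Confirm it is running and will be restarted on exit
systemctl status myapp.service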
2. Supervisor Daemon (supervisord):
[program:myapp]
command=/path/to/your/application
autostart=true
autorestart=true
startretries=3
startsecs=10
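Assuming the block above lives in a file such as /etc/supervisor/conf.d/myapp.conf (the exact path varies by distribution), tell supervisord to pick it up:
# Re-read config files and apply any changes
sudo supervisorctl reread
sudo supervisorctl update
# Check the program's state
sudo supervisorctl status myapp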
For more control, consider a bash monitoring script:
#!/bin/bash
APP_PATH="/path/to/your/application"
LOG_FILE="/var/log/app_monitor.log"

while true; do
    # pgrep -x matches the exact process name (the kernel truncates it to 15 chars)
    if ! pgrep -x "$(basename "$APP_PATH")" > /dev/null; then
        echo "$(date): Application not running. Restarting..." >> "$LOG_FILE"
        "$APP_PATH" &
    fi
    sleep 10
done
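Note that the watchdog itself is a single point of failure, so launch it from something persistent. One minimal sketch, assuming the script is saved as /usr/local/bin/app_monitor.sh (a placeholder) and your cron implementation supports @reboot:
# crontab -e: start the monitor at every boot
@reboot /usr/local/bin/app_monitor.sh >/dev/null 2>&1 &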
For containerized applications, Docker provides built-in restart policies:
docker run -d --restart unless-stopped your_image
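The policy can also be changed without recreating the container, and Docker keeps a restart counter that helps spot crash loops; my_container below is a placeholder:
# Apply a restart policy to an existing container
docker update --restart unless-stopped my_container
# See how many times Docker has restarted it
docker inspect -f '{{ .RestartCount }}' my_container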
For production environments, consider dedicated process managers:
- PM2 (for Node.js applications; see the example after this list)
- Runit
- Monit (with comprehensive monitoring capabilities)
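As a quick illustration of the first option, PM2 can daemonize a process and persist it across reboots; the app path and name below are placeholders:
# Supervise the app under a friendly name
pm2 start /path/to/app.js --name myapp
# Generate an init script so PM2 itself starts at boot
pm2 startup
# Persist the current process list for restoration at boot
pm2 save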
When implementing process monitoring:
- Set appropriate restart limits to prevent thrashing (see the systemd sketch after this list)
- Configure proper logging for crash analysis
- Consider resource constraints (memory, CPU)
- Implement proper signal handling in your application
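For instance, systemd expresses restart limits with two directives; a sketch with illustrative values (on systemd older than v230 these belong in the [Service] section instead):
[Unit]
# Give up if the service restarts 5 times within 300 seconds
StartLimitIntervalSec=300
StartLimitBurst=5
[Service]
Restart=always
RestartSec=5s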
In production environments, critical processes must remain available even after unexpected failures. While Linux processes can terminate for various reasons (segfaults, OOM kills, manual termination), we need reliable mechanisms to restart them automatically.
Linux offers several native approaches for process supervision:
Systemd Service Units
The most modern solution on current distributions (RHEL 7+, Ubuntu 15.04+). Example unit file at /etc/systemd/system/myapp.service:
[Unit]
Description=My Critical Application
[Service]
ExecStart=/usr/local/bin/myapp
Restart=always
RestartSec=5s
[Install]
WantedBy=multi-user.target
Key parameters:
- Restart=always: Restarts regardless of exit code
- RestartSec: Delay between restart attempts
Supervisor Daemons
For older systems without systemd, consider these alternatives:
# Using inittab (SysV init)
myapp:2345:respawn:/usr/local/bin/myapp
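After editing /etc/inittab, make init re-read it; the respawn action then relaunches the process whenever it dies:
# Tell SysV init to re-examine /etc/inittab
telinit q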
For more sophisticated monitoring (process plus resource checks), Monit can supervise the process and run health checks:
# /etc/monitrc
check process myapp matching "myapp"
    start program = "/usr/local/bin/myapp"
    stop program = "/usr/bin/pkill myapp"
    if failed port 8080 protocol http then restart
    if cpu > 80% for 5 cycles then alert
    if 5 restarts within 5 cycles then timeout
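After editing the control file, Monit can validate the syntax and reload it:
# Check /etc/monitrc for errors, then apply it
monit -t
monit reload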
For Docker deployments, use restart policies:
docker run --restart unless-stopped -d myapp-image
Kubernetes provides even more robust options through pod restart policies and liveness probes.
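A minimal sketch of the latter, assuming the app serves an HTTP health endpoint at /healthz on port 8080 (both placeholders); the kubelet restarts the container whenever the probe fails:
spec:
  restartPolicy: Always
  containers:
  - name: myapp
    image: myapp-image
    livenessProbe:
      httpGet:
        path: /healthz
        port: 8080
      initialDelaySeconds: 10
      periodSeconds: 5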
When you need custom logic, a simple bash watchdog works:
#!/bin/bash
while true; do
    /usr/local/bin/myapp || {
        echo "Process crashed with status $?. Restarting..." >&2
        sleep 5
    }
done
Consider these factors when selecting a solution:
- System architecture (legacy vs modern)
- Required monitoring granularity
- Infrastructure constraints
- Available operational expertise