When managing system services through traditional init.d scripts, keeping critical daemons alive becomes crucial. The fundamental issue is that a daemon (like dtnd in your case) must keep running continuously, yet a plain init script only starts and stops it; nothing restarts it after a crash.
For systems using SysVinit, the standard method involves creating a proper init script that handles start, stop, and status operations. Here's a basic template for your dtnd service:
#!/bin/sh
### BEGIN INIT INFO
# Provides: dtnd
# Required-Start: $remote_fs $syslog
# Required-Stop: $remote_fs $syslog
# Default-Start: 2 3 4 5
# Default-Stop: 0 1 6
# Short-Description: Start DTN daemon
# Description: Daemon for monitoring and restarting processes
### END INIT INFO
PATH=/sbin:/usr/sbin:/bin:/usr/bin
NAME=dtnd
DAEMON=/usr/sbin/$NAME
DAEMON_ARGS=""
PIDFILE=/var/run/$NAME.pid
USER=dtnduser
# LSB helper functions; status_of_proc is defined here
. /lib/lsb/init-functions
case "$1" in
    start)
        echo "Starting $NAME"
        # Assumes dtnd detaches on its own and writes $PIDFILE
        start-stop-daemon --start --quiet --pidfile "$PIDFILE" \
            --chuid "$USER" --exec "$DAEMON" -- $DAEMON_ARGS
        ;;
    stop)
        echo "Stopping $NAME"
        start-stop-daemon --stop --quiet --pidfile "$PIDFILE"
        ;;
    restart)
        "$0" stop
        sleep 1
        "$0" start
        ;;
    status)
        status_of_proc -p "$PIDFILE" "$DAEMON" "$NAME" && exit 0 || exit $?
        ;;
    *)
        echo "Usage: /etc/init.d/$NAME {start|stop|restart|status}"
        exit 1
        ;;
esac
exit 0
For ensuring your daemon stays alive, you have several technical approaches:
1. Using a Wrapper Script
Create a simple shell wrapper that restarts your daemon if it fails:
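#!/bin/sh
# Restart dtnd whenever it exits; dtnd must stay in the foreground here
while true; do
    /usr/sbin/dtnd
    sleep 10
done
Then modify your init script to launch this wrapper instead of the dtnd binary, using start-stop-daemon's --background and --make-pidfile options so that the PID file tracks the wrapper. This only works if dtnd runs in the foreground: a daemon that forks and detaches returns immediately, and the loop would then try to start a new copy every 10 seconds.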
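One gap in the simple loop is that stopping the wrapper from the init script leaves the dtnd child running. The following is a minimal sketch of a wrapper that stops its child when it receives TERM or INT. The function name run_with_respawn and its arguments are illustrative, not part of any existing tool, and the sketch assumes the supervised command stays in the foreground.

```shell
#!/bin/sh
# Sketch only: run_with_respawn is a hypothetical helper name.
# Usage: run_with_respawn DELAY_SECONDS COMMAND [ARGS...]
run_with_respawn() {
    delay=$1
    shift
    child=
    # On TERM/INT, stop and reap the child before exiting, so that
    # stopping the wrapper does not leave the daemon running.
    trap 'if [ -n "$child" ]; then kill "$child" 2>/dev/null; wait "$child" 2>/dev/null; fi; exit 0' TERM INT
    while true; do
        "$@" &                      # start the command as a child process
        child=$!
        wait "$child" || true       # block until it exits; restart regardless of status
        sleep "$delay"              # pause before the next start
    done
}
```

A wrapper script would end with a call such as run_with_respawn 10 /usr/sbin/dtnd. A signal that arrives during the sleep is handled once the sleep finishes, so a stop can take up to DELAY_SECONDS to complete.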
2. Leveraging inittab (if available)
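For systems with /etc/inittab, add an entry like the following and run "telinit q" so that init rereads the file. The respawn action expects the process to stay in the foreground: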
dtnd:2345:respawn:/usr/sbin/dtnd
3. Cron-based Monitoring
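Add an entry to /etc/crontab (the sixth field is the user to run as) that checks for the process and restarts the service when it is missing. The -x option makes pgrep match the exact process name, and its output is discarded so cron does not mail the PID list:
*/5 * * * * root pgrep -x dtnd >/dev/null || /etc/init.d/dtnd restart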
While the above solutions work, consider these more robust approaches if system modifications are possible:
- systemd: Create a proper service unit with Restart=always
- supervisord: Dedicated process control system
- monit: Advanced monitoring and restart capability
When implementing process resurrection:
- Ensure proper logging to detect restart loops
- Implement maximum restart attempts to prevent thrashing
- Consider resource limits (for example via ulimit) to contain the damage from memory leaks
- Add proper exit codes to distinguish between normal and error exits
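The points above can be sketched as a bounded restart loop. This is a minimal illustration, not a definitive implementation: the function name supervise and its argument order are assumptions made for the example, it counts total failures rather than consecutive ones, and it treats exit status 0 as an intentional shutdown.

```shell
#!/bin/sh
# Sketch only: supervise is a hypothetical helper name.
# Usage: supervise MAX_RETRIES DELAY_SECONDS COMMAND [ARGS...]
supervise() {
    max_retries=$1
    delay=$2
    shift 2
    failures=0
    while [ "$failures" -lt "$max_retries" ]; do
        status=0
        "$@" || status=$?           # run in the foreground and capture the exit status
        if [ "$status" -eq 0 ]; then
            # Exit status 0 is treated as a normal shutdown: do not restart.
            echo "$(date) $1 exited normally" >&2
            return 0
        fi
        failures=$((failures + 1))
        echo "$(date) $1 exited with status $status (failure $failures of $max_retries)" >&2
        sleep "$delay"
    done
    # Stop retrying so a broken daemon cannot thrash the system.
    echo "$(date) giving up on $1 after $max_retries failures" >&2
    return 1
}
```

For example, supervise 5 10 /usr/sbin/dtnd would allow five failed runs with a 10 second pause after each, writing timestamped messages to standard error, which the caller can redirect to a log file.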
For your specific constraints (init.d only), the wrapper script approach combined with proper init.d integration represents the most robust solution. The cron-based monitoring provides additional redundancy. Remember to:
- Test failure scenarios thoroughly
- Implement proper logging
- Document the resurrection mechanism
When managing persistent daemons through init.d scripts, the fundamental requirement is ensuring continuous operation even after unexpected crashes. The traditional approach using start-stop-daemon alone doesn't provide resurrection capabilities.
For systems limited to System-V init scripts, we can implement watchdog functionality directly in the init script. Here's a robust template:
#!/bin/sh
### BEGIN INIT INFO
# Provides: dtnd-watchdog
# Required-Start: $local_fs $network
# Required-Stop: $local_fs $network
# Default-Start: 2 3 4 5
# Default-Stop: 0 1 6
# Short-Description: DTN daemon with auto-respawn
### END INIT INFO
DAEMON=/usr/sbin/dtnd
NAME=dtnd
USER=dtnuser
PIDFILE=/var/run/$NAME.pid
LOGFILE=/var/log/$NAME.log
MAX_RETRIES=5
RETRY_DELAY=5
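WATCHDOG_PIDFILE=/var/run/$NAME-watchdog.pid

# Launch the daemon in the background and record its PID.
# --background detaches stdout/stderr, so dtnd must do its own logging.
launch() {
    start-stop-daemon --start --quiet --background --make-pidfile \
        --pidfile "$PIDFILE" --chuid "$USER" --exec "$DAEMON"
}

start() {
    # A PID file may be stale, so also check that the process exists
    if [ -f "$PIDFILE" ] && kill -0 "$(cat "$PIDFILE")" 2>/dev/null; then
        echo "$NAME is already running"
        return 1
    fi
    echo "Starting $NAME watchdog"
    launch
    # Monitor loop: restart the daemon when it dies, up to MAX_RETRIES in a row
    (
        retries=0
        while [ "$retries" -lt "$MAX_RETRIES" ]; do
            if ! pgrep -F "$PIDFILE" >/dev/null 2>&1; then
                retries=$((retries + 1))   # POSIX arithmetic; ((retries++)) needs bash
                echo "$(date) Process crashed, restart attempt $retries" >> "$LOGFILE"
                launch
                sleep "$RETRY_DELAY"
            else
                retries=0
                sleep 10
            fi
        done
        echo "$(date) Max retries reached for $NAME" >> "$LOGFILE"
        rm -f "$WATCHDOG_PIDFILE"
    ) >/dev/null 2>&1 &
    echo $! > "$WATCHDOG_PIDFILE"
}

stop() {
    echo "Stopping $NAME and watchdog"
    # Stop the watchdog first so it does not respawn the daemon
    if [ -f "$WATCHDOG_PIDFILE" ]; then
        kill "$(cat "$WATCHDOG_PIDFILE")" 2>/dev/null
        rm -f "$WATCHDOG_PIDFILE"
    fi
    start-stop-daemon --stop --quiet --oknodo --retry 5 --pidfile "$PIDFILE"
    rm -f "$PIDFILE"
}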
case "$1" in
    start)
        start
        ;;
    stop)
        stop
        ;;
    restart)
        stop
        start
        ;;
    *)
        echo "Usage: $0 {start|stop|restart}"
        exit 1
        ;;
esac
For systems with minimal process supervision requirements, we can leverage cron as a fallback mechanism:
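# In /etc/crontab
# Check that the PID in the file belongs to a live process; the file alone may be stale after a crash
* * * * * root pgrep -F /var/run/dtnd.pid >/dev/null 2>&1 || /etc/init.d/dtnd start >/dev/null 2>&1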
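Any PID-file-based check is only as good as its liveness test, because the file can remain after the process it names has died. The following is a minimal sketch of such a test in POSIX shell, for environments that lack pgrep -F. The function name pid_alive is an assumption made for this example.

```shell
#!/bin/sh
# Sketch only: pid_alive is a hypothetical helper name.
# Usage: pid_alive PIDFILE
# Succeeds only if PIDFILE contains the PID of a process that currently exists.
pid_alive() {
    [ -r "$1" ] || return 1          # missing or unreadable PID file
    pid=$(cat "$1")
    case "$pid" in
        ''|*[!0-9]*) return 1 ;;     # empty or non-numeric content
    esac
    # Signal 0 sends nothing; it only tests whether the PID can be signalled.
    # Run as root (as the crontab entry does), otherwise a live process
    # owned by another user is reported as dead.
    kill -0 "$pid" 2>/dev/null
}
```

A helper script called from cron could then use pid_alive /var/run/dtnd.pid || /etc/init.d/dtnd start. The check does not guard against the PID having been reused by an unrelated process.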
When implementing resurrection logic:
- Implement maximum restart attempts to prevent thrashing
- Include exponential backoff between restarts
- Log all restart events with timestamps
- Consider resource limits (memory, CPU) that might cause crashes
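The exponential backoff mentioned in the list can be sketched as a small helper that doubles the delay on each attempt and caps it. This is an illustration under stated assumptions: the function name backoff_delay is hypothetical, the default base of 5 seconds mirrors RETRY_DELAY in the script, and the 300 second cap is an arbitrary choice.

```shell
#!/bin/sh
# Sketch only: backoff_delay is a hypothetical helper name.
# Usage: backoff_delay ATTEMPT [BASE_SECONDS] [CAP_SECONDS]
# Prints the delay to wait before restart number ATTEMPT (1-based).
backoff_delay() {
    attempt=$1
    base=${2:-5}      # first delay, in seconds
    cap=${3:-300}     # upper bound, in seconds
    delay=$base
    i=1
    # Double the delay once per earlier attempt, stopping once the cap is reached
    while [ "$i" -lt "$attempt" ] && [ "$delay" -lt "$cap" ]; do
        delay=$((delay * 2))
        i=$((i + 1))
    done
    if [ "$delay" -gt "$cap" ]; then
        delay=$cap
    fi
    echo "$delay"
}
```

In a monitor loop, sleep "$(backoff_delay "$retries")" would replace the fixed delay. With the defaults, the first four attempts wait 5, 10, 20 and 40 seconds.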
On systems where systemd is available, you can define a native unit for the service and keep the init.d script for compatibility:
# /etc/systemd/system/dtnd.service
[Unit]
Description=DTN Daemon with Auto-Restart
[Service]
User=dtnuser
ExecStart=/usr/sbin/dtnd
Restart=always
RestartSec=5s
[Install]
WantedBy=multi-user.target
The init.d script can then delegate to systemd when present:
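# /run/systemd/system exists only when systemd is the running init system
if [ -d /run/systemd/system ] && type systemctl >/dev/null 2>&1; then
    exec systemctl "$1" dtnd
else
    : # Original init.d logic goes here (an else branch cannot be empty)
fi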