On several CentOS 6.6 servers running cronie-1.4.4, we've observed that crond occasionally skips specific jobs while executing others scheduled at the same time. The backup script /pg_backup.sh, scheduled for daily execution at 21:00, is sometimes missing from /var/log/cron.log without any error message.
OS: CentOS 6.6
Packages:
crontabs-1.10-33.el6.noarch
cronie-1.4.4-12.el6.x86_64
cronie-anacron-1.4.4-12.el6.x86_64
kernel-2.6.32-504.3.3.el6.x86_64
The failing job appears as the last entry in the crontab:
# tail -2 /var/spool/cron/postgres
* * * * * OTHERJOB
0 21 * * * /pg_backup.sh
Logs show inconsistent execution patterns:
Mar 31 21:00:02 SERVERNAME [cron.info] CROND[19394]: (root) CMD (OTHERJOB)
Mar 31 21:00:02 SERVERNAME [cron.info] CROND[19418]: (postgres) CMD (/pg_backup.sh)
Apr 1 21:00:02 SERVERNAME [cron.info] CROND[31349]: (root) CMD (OTHERJOB)
# Missing pg_backup.sh execution on Apr 1
Immediate Fix: Adding a blank line after the last cron job often resolves the issue:
# Before
0 21 * * * /pg_backup.sh
# After
0 21 * * * /pg_backup.sh

# The blank line above matters!
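To find out which crontabs are affected before touching anything, a small audit along these lines may help. This is a minimal sketch: `ends_with_newline` and `audit_crontabs` are hypothetical helper names of my own, not part of cronie, and the spool path in the example is the one used in this post.

```shell
#!/bin/bash
# Sketch: report crontab files whose last byte is not a newline.
# ends_with_newline and audit_crontabs are hypothetical helper names.

ends_with_newline() {
    local file="$1"
    # An empty file has no unterminated last line, so treat it as fine.
    [ -s "$file" ] || return 0
    # Command substitution strips a trailing newline, so the result is
    # empty exactly when the final byte is a newline.
    [ -z "$(tail -c 1 "$file")" ]
}

audit_crontabs() {
    local dir="$1" file
    for file in "$dir"/*; do
        [ -f "$file" ] || continue
        ends_with_newline "$file" || echo "missing trailing newline: $file"
    done
}

# Example (run as root): audit_crontabs /var/spool/cron
```

The check only reads the files, so it should be safe to run on a production host.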
Alternative Approaches:
- Create a wrapper script that retries a failed run (note that it cannot help when cron never starts the job at all):
#!/bin/bash
# /usr/local/bin/run_with_fallback.sh
# Try running the original command
if ! "$@"; then
logger -t cronwrapper "Primary execution failed for: $*"
# Second attempt after 60 seconds
sleep 60
"$@" || logger -t cronwrapper "Fallback execution failed for: $*"
fi
- Modify your crontab to use the wrapper:
0 21 * * * /usr/local/bin/run_with_fallback.sh /pg_backup.sh
- Implement cron monitoring with a dead man's switch, which does catch a job that cron skipped:
#!/bin/bash
# /etc/cron.hourly/check_cron_execution
MARKER_FILE="/var/run/last_backup.marker"
# Run the backup if the marker is missing or older than 24 hours
if [ ! -f "$MARKER_FILE" ] || [ "$(find "$MARKER_FILE" -mtime +0)" ]; then
/pg_backup.sh && touch "$MARKER_FILE"
fi
While the exact cause remains unclear, several factors may contribute:
- Cron's line parsing implementation in older versions of cronie
- Race conditions when multiple jobs are scheduled simultaneously
- Memory management issues during job queue processing
For mission-critical cron jobs in legacy environments:
# Implement a two-layer verification system
0 21 * * * /pg_backup.sh
30 21 * * * /verify_backup_ran.sh || /pg_backup.sh
# Where verify_backup_ran.sh contains:
#!/bin/bash
LOG="/var/log/backup.log"
[ -s "$LOG" ] && grep -q "Backup completed" "$LOG"
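One caveat with the grep above: once any backup has ever logged "Backup completed", the check keeps passing on later days. A date-aware variant could look like the sketch below. It assumes, hypothetically, that /pg_backup.sh prefixes each log line with a `YYYY-MM-DD HH:MM:SS` timestamp; `backup_completed_on` is a name I made up for illustration.

```shell
#!/bin/bash
# Sketch: succeed only if the log has a completion line stamped with a given day.
# Assumption (not from the original script): log lines look like
#   2015-04-01 21:04:17 Backup completed

backup_completed_on() {
    local day="$1" log="$2"
    # Require a non-empty log and a completion line that starts with the date.
    [ -s "$log" ] && grep -q "^${day} .*Backup completed" "$log"
}

# Example body for verify_backup_ran.sh:
#   backup_completed_on "$(date +%F)" /var/log/backup.log
```

If the backup script logs in a different format, only the grep pattern needs to change.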
After months of running backup scripts on CentOS 6.6 servers, I noticed our /pg_backup.sh would occasionally vanish from the execution logs without any error messages. The cron daemon simply skipped it while other jobs at the same timestamp (like OTHERJOB) executed normally.
# Typical log when working
Mar 31 21:00:02 SERVERNAME [cron.info] CROND[19394]: (root) CMD (OTHERJOB)
Mar 31 21:00:02 SERVERNAME [cron.info] CROND[19418]: (postgres) CMD (/pg_backup.sh)
# Failure case - backup script missing
Apr 1 21:00:02 SERVERNAME [cron.info] CROND[31349]: (root) CMD (OTHERJOB)
Our environment runs:
- cronie-1.4.4-12.el6.x86_64
- cronie-anacron-1.4.4-12.el6.x86_64
- kernel-2.6.32-504.3.3.el6.x86_64
The failing job was consistently the last entry in the crontab:
# /var/spool/cron/postgres contents
* * * * * OTHERJOB
0 21 * * * /pg_backup.sh
# No trailing newline - potential red flag
Through packet captures and strace logging, we found that cron's crontab parsing can become unreliable when:
- The crontab lacks a terminating newline
- Multiple jobs trigger simultaneously
- System load peaks during execution
Solution 1: Enforce Newline Termination
# Fix crontab formatting
echo "" >> /var/spool/cron/postgres
service crond restart
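Because `echo "" >>` appends unconditionally, running it repeatedly piles up blank lines. An idempotent version might look like this sketch; `ensure_trailing_newline` is a hypothetical helper name, and it only appends when the last byte is not already a newline.

```shell
#!/bin/bash
# Sketch: append a newline only when the file does not already end with one.
# ensure_trailing_newline is a hypothetical helper name.

ensure_trailing_newline() {
    local file="$1"
    # Nothing to do for a missing or empty file.
    [ -s "$file" ] || return 0
    # $(tail -c 1 ...) is empty when the last byte is already a newline.
    if [ -n "$(tail -c 1 "$file")" ]; then
        printf '\n' >> "$file"
    fi
}

# Example (run as root): ensure_trailing_newline /var/spool/cron/postgres
```

Running it twice leaves the file unchanged after the first call, so it can go into a configuration-management step without side effects.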
Solution 2: Implement Lockfile Guarding
#!/bin/bash
# /pg_backup.sh modified version
LOCKFILE=/tmp/pg_backup.lock
if [ -f "$LOCKFILE" ]; then
echo "Backup already running" >> /var/log/pg_backup.log
exit 1
fi
trap 'rm -f "$LOCKFILE"' EXIT
touch "$LOCKFILE"
# Actual backup logic here
pg_dumpall | gzip > /backups/pg_$(date +%Y%m%d).sql.gz
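The check-then-touch sequence above leaves a small window in which two instances can both pass the `-f` test. As an alternative (a swapped-in technique, not what the script above does), `mkdir` can serve as an atomic lock because it either creates the directory or fails. In this sketch `run_with_lock` and the lock path are hypothetical.

```shell
#!/bin/bash
# Sketch: atomic lock using mkdir instead of [ -f ] followed by touch.
# run_with_lock and the lock directory path are hypothetical.

run_with_lock() {
    local lockdir="$1"
    shift
    # mkdir fails if the directory already exists, so testing and taking
    # the lock happen in a single step.
    if ! mkdir "$lockdir" 2>/dev/null; then
        echo "lock held: $lockdir" >&2
        return 1
    fi
    local status=0
    "$@" || status=$?
    # Release the lock and pass through the command's exit status.
    rmdir "$lockdir"
    return "$status"
}

# Example: run_with_lock /tmp/pg_backup.lock.d /pg_backup.sh
```

A process killed with SIGKILL would still leave the lock directory behind, so a stale-lock check may be worth adding.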
Add this health check script to run hourly:
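#!/bin/bash
# check_backup_execution.sh
# Timestamp of the last logged run, truncated to the minute (drops the :SS)
LAST_RUN=$(grep "pg_backup.sh" /var/log/cron | tail -1 | awk '{print $1, $2, substr($3, 1, 5)}')
# Most recent scheduled run: today's 21:00 once it has passed, else yesterday's
if [ "$(date +%-H)" -ge 21 ]; then DAY="today"; else DAY="yesterday"; fi
EXPECTED=$(date -d "$DAY 21:00" +"%b %-d %H:%M")
if [ "$LAST_RUN" != "$EXPECTED" ]; then
echo "WARNING: Backup missed execution on ${EXPECTED}" | mail -s "Cron Alert" admin@example.com
fi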
For environments stuck on cronie-1.4.4:
- Set up secondary monitoring through systemd timers where available (CentOS 6 itself uses Upstart, so this applies only to newer hosts)
- Consider wrapping cron jobs in supervisor scripts
- Log all cron executions to a dedicated file:
# /etc/rsyslog.d/cron.conf
cron.* /var/log/cron_audit.log
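The rsyslog rule only records what crond itself reports. To also capture each job's exit status, a per-job logging wrapper could be sketched as follows; `run_logged`, the log path, and the script location in the example are my own assumptions, not something from the original setup.

```shell
#!/bin/bash
# Sketch: log the start, end, and exit status of a command to a dedicated file.
# run_logged and the paths in the example are hypothetical.

run_logged() {
    local logfile="$1"
    shift
    local status=0
    echo "$(date '+%F %T') START $*" >> "$logfile"
    "$@" || status=$?
    echo "$(date '+%F %T') END status=$status $*" >> "$logfile"
    return "$status"
}

# To use it from cron, save this file as a script whose last line is
#   run_logged "$@"
# and schedule it like:
#   0 21 * * * /usr/local/bin/run_logged.sh /var/log/cron_jobs.log /pg_backup.sh
```

A START line with no matching END points at a job that was killed or hung, while a missing START line means cron never launched it.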