Understanding and Troubleshooting “holdoff time over” in systemd Services for PuppetDB


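When you see the message "holdoff time over" in systemd logs, it means the restart delay configured by RestartSec has elapsed and systemd is about to start the service again. The delay is part of systemd's restart handling, which prevents rapid-fire restart cycles. The message itself is informational; what needs investigating is why the service stopped in the first place.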

# Example log entry:
Sep 03 20:50:16 l-pm1 systemd[1]: pe-puppetdb.service holdoff time over, scheduling restart.
Sep 03 20:50:16 l-pm1 systemd[1]: Starting pe-puppetdb Service...
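
To get a quick sense of how often this is happening, the restart messages can be counted. The following is a minimal sketch, not a definitive tool: the function name count_restart_events is hypothetical, and the two extra patterns are assumed wordings used by newer systemd releases for the same event, so check them against your own journal.

```shell
#!/usr/bin/env bash
# Count restart-scheduling messages in journal text read from stdin.
# Pattern 1 is the wording shown in the log above; patterns 2 and 3 are
# ASSUMED wordings from newer systemd releases.
count_restart_events() {
    # grep -c prints 0 and exits non-zero when nothing matches,
    # so "|| true" keeps the function's exit status clean.
    grep -cE 'holdoff time over, scheduling restart|hold-off time over, scheduling restart|Scheduled restart job' || true
}
```

Typical use would be to pipe journal output into it: journalctl -u pe-puppetdb.service --since "1 hour ago" --no-pager | count_restart_events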

Puppet Enterprise services like PuppetDB may trigger this behavior when:

  • Dependent services aren't fully initialized
  • Database connections timeout during startup
  • Resource constraints prevent immediate startup

Check your service unit file for these critical directives:

# View the complete service configuration
systemctl cat pe-puppetdb.service

# Key parameters to examine:
[Service]
Restart=on-failure
# Delay before each restart attempt (the holdoff period); the systemd default is 100ms
RestartSec=5s
StartLimitInterval=100s
StartLimitBurst=5
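
These values interact: RestartSec spaces out the attempts, while StartLimitBurst and StartLimitInterval cap how many starts are allowed in a window. As a rough, idealized sketch (assuming the service fails immediately on every start, and ignoring real start-up time), the time until systemd gives up can be estimated as below. The function name seconds_until_start_limit is hypothetical and all arguments are whole seconds.

```shell
#!/usr/bin/env bash
# Idealized estimate of when systemd refuses further starts.
# ASSUMPTION: every start attempt fails instantly, so attempts are
# spaced exactly RestartSec apart.
seconds_until_start_limit() {
    local restart_sec=$1 burst=$2 interval=$3
    # The attempt after the last allowed start comes at burst * RestartSec.
    local elapsed=$(( restart_sec * burst ))
    if [ "$elapsed" -lt "$interval" ]; then
        echo "$elapsed"
    else
        # The window expires before the burst is used up.
        echo "never"
    fi
}
```

With the values above, seconds_until_start_limit 5 5 100 prints 25: five starts are allowed, and the sixth attempt, about 25 seconds in, is refused. In practice each cycle also includes however long the service runs before it fails, so treat the result as a lower bound.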

For Puppet Enterprise specifically, try these adjustments:

# Create override file
sudo systemctl edit pe-puppetdb.service

# Add these customizations:
[Service]
RestartSec=10s
TimeoutStartSec=300
# Heap size is an example value; size it for your host
Environment="JAVA_ARGS=-Xmx2g"
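
If you prefer not to use the interactive editor, the same kind of override can be written as a drop-in file. This is a sketch under stated assumptions: write_restart_dropin is a hypothetical helper, and the target directory is a parameter so the function can be tried in a scratch directory before touching /etc.

```shell
#!/usr/bin/env bash
# Write a drop-in containing the restart settings into the given directory.
# The helper name and its parameters are illustrative, not part of systemd.
write_restart_dropin() {
    local dropin_dir=$1 restart_sec=$2 timeout_start_sec=$3
    mkdir -p "$dropin_dir"
    cat > "$dropin_dir/override.conf" <<EOF
[Service]
RestartSec=$restart_sec
TimeoutStartSec=$timeout_start_sec
EOF
}
```

On a real host, run it as root against /etc/systemd/system/pe-puppetdb.service.d (the directory systemctl edit uses), then run systemctl daemon-reload and restart the service so the change takes effect.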

Use these commands to investigate deeper issues:

# Check service status with details
systemctl status pe-puppetdb.service -l

# Examine journal logs with timestamps
journalctl -u pe-puppetdb.service --since "1 hour ago" --no-pager

# Verify dependencies
systemctl list-dependencies pe-puppetdb.service

For production environments, consider this complete unit file override:

[Unit]
After=pe-postgresql.service
Requires=pe-postgresql.service
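# On older systemd (such as version 219 on RHEL/CentOS 7), put these two
# settings under [Service] and use the name StartLimitInterval instead
StartLimitIntervalSec=300
StartLimitBurst=10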

[Service]
Restart=always
RestartSec=15s
TimeoutStartSec=300
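# Confirm this subcommand exists in your PuppetDB version before enabling it;
# a failing ExecStartPre command prevents the service from starting
ExecStartPre=/opt/puppetlabs/bin/puppetdb check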
EnvironmentFile=/etc/sysconfig/pe-puppetdb

Create a monitoring script to track restart frequency:

#!/bin/bash
# Log a status snapshot whenever the service restarts too often.
SERVICE="pe-puppetdb.service"
LOG_FILE="/var/log/puppetdb_restarts.log"

while true; do
    # Count restart messages from the last 5 minutes
    restarts=$(journalctl -u "$SERVICE" --since "5 minutes ago" --no-pager | grep -c "holdoff time over")
    if [ "$restarts" -gt 3 ]; then
        echo "$(date) - Excessive restarts detected ($restarts)" >> "$LOG_FILE"
        systemctl status "$SERVICE" --no-pager >> "$LOG_FILE"
    fi
    sleep 300
done

The message pe-puppetdb.service holdoff time over, scheduling restart comes from systemd's automatic restart logic. Unless RestartSec is set, systemd waits 100ms between restart attempts so that a failing service does not cycle continuously.

# View the current restart settings
systemctl show pe-puppetdb.service -p Restart -p RestartUSec
Restart=on-failure
RestartUSec=100ms

Common reasons PuppetDB fails at startup and is restarted repeatedly:

  • Dependency services (PostgreSQL) not ready
  • Insufficient JVM heap memory allocation
  • Port conflicts (8080/8081 commonly used)
  • Certificate authentication failures
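
The port-conflict case is easy to rule out. The sketch below assumes the listening sockets are listed with ss -ltn and that the local address appears in the fourth column of its output; port_state is a hypothetical helper that reads that output from stdin.

```shell
#!/usr/bin/env bash
# Print "in use" or "free" for a TCP port, given `ss -ltn` output on stdin.
# ASSUMPTION: column 4 holds the local address in the form ADDRESS:PORT.
port_state() {
    local port=$1
    # Skip the header row, then look for a local address ending in :PORT.
    if awk -v p="$port" 'NR > 1 && $4 ~ (":" p "$") { found = 1 } END { exit !found }'; then
        echo "in use"
    else
        echo "free"
    fi
}
```

For example, with PuppetDB stopped: ss -ltn | port_state 8081. If the port is reported as in use while the service is down, another process is holding it.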

Check detailed logs with journalctl:

journalctl -u pe-puppetdb --no-pager -n 50
# Or for time-filtered logs:
journalctl -u pe-puppetdb --since "2023-09-03 20:50:00" --until "2023-09-03 20:51:00"

For services that need longer initialization:

# Create override file
sudo systemctl edit pe-puppetdb.service

[Service]
# Increase restart delay to 5 seconds
RestartSec=5s
# Extend timeout for slow-starting services
TimeoutStartSec=300

For Puppet Enterprise services, consider:

# Reconfigure PuppetDB with proper memory settings
puppet infrastructure configure --no-recover
# Or for manual config:
vi /etc/sysconfig/pe-puppetdb
# Set JAVA_ARGS with proper -Xmx values
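
Before changing the heap size, it helps to see what is currently configured. This is a minimal sketch that assumes the file holds a line of the form JAVA_ARGS="-Xmx... ..."; the helper name heap_max_from_sysconfig is hypothetical.

```shell
#!/usr/bin/env bash
# Print the -Xmx value (for example 2g) found on the JAVA_ARGS line of a
# sysconfig-style file. Prints nothing if no -Xmx option is present.
# ASSUMPTION: the setting is written as JAVA_ARGS="..." on a single line.
heap_max_from_sysconfig() {
    local file=$1
    grep -E '^[[:space:]]*JAVA_ARGS=' "$file" \
        | grep -oE -- '-Xmx[0-9]+[kKmMgG]?' \
        | tail -n 1 \
        | sed 's/^-Xmx//'
}
```

Usage: heap_max_from_sysconfig /etc/sysconfig/pe-puppetdb. Compare the result with the memory actually available on the host.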

When standard fixes don't work:

# Check service dependencies
systemctl list-dependencies pe-puppetdb.service

# Check socket activation status (only relevant if a pe-puppetdb.socket unit exists on your system)
systemctl status pe-puppetdb.socket

# Test manual startup in the foreground with debug logging
sudo -u pe-puppetdb /opt/puppetlabs/server/bin/puppetdb foreground --debug
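
To reduce the chance of recurring restart loops:

  • Declare service ordering (After=, Requires=) in the systemd unit files
  • Configure health checks in Puppet Enterprise
  • Monitor for Java memory issues
  • Set an appropriate StartLimitInterval (StartLimitIntervalSec on newer systemd) for the unit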