When working with critical services in Linux, we often need them to automatically recover from failures. The standard systemd approach of using Restart=always
with rate limiting works well for immediate restarts, but presents a gap when we want services to resume automatically after the rate limit window expires.
The standard configuration you've implemented:
[Service]
Restart=always
StartLimitInterval=90
StartLimitBurst=3
correctly limits restarts to 3 attempts within 90 seconds. However, once that limit is hit the unit is marked failed and systemd makes no further restart attempts, even after the interval expires - this is by design, but it is not always the desired behavior.
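Until that failed state is cleared (or the unit is started by hand), nothing will bring the service back. A quick manual recovery looks like this, assuming your unit is named your-service.service:
# Clear the failed state and the start-limit counter
sudo systemctl reset-failed your-service.service
# Start the service again
sudo systemctl start your-service.service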
To achieve persistent automatic restarts even after rate limiting, we need to modify our approach:
[Unit]
StartLimitIntervalSec=90
StartLimitBurst=3
[Service]
Restart=always
RestartSec=10
The key differences are moving the rate-limiting parameters to the [Unit] section, where current systemd versions expect them (the older [Service] spellings are still accepted as aliases), and adding RestartSec to control the delay between restart attempts.
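If you would rather not edit the packaged unit file directly, the same settings can go into a drop-in override; the unit name here is just a placeholder:
# Creates and opens /etc/systemd/system/your-service.service.d/override.conf
sudo systemctl edit your-service.service
Then add the following and save:
[Unit]
StartLimitIntervalSec=90
StartLimitBurst=3
[Service]
Restart=always
RestartSec=10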
For more sophisticated control, consider this enhanced configuration:
[Unit]
Description=My Resilient Service
StartLimitIntervalSec=90
StartLimitBurst=3
[Service]
ExecStart=/usr/bin/my-service
Restart=always
RestartSec=10
# Allow up to 30 seconds for startup before systemd treats it as failed
TimeoutStartSec=30
# Allow 10 seconds for normal shutdown
TimeoutStopSec=10
This configuration adds explicit start and stop timeouts, so a hung startup or shutdown is detected and treated as a failure rather than left to the longer defaults.
After implementing these changes, verify the behavior:
# Reload systemd configuration
sudo systemctl daemon-reload
# Restart your service
sudo systemctl restart your-service.service
# Monitor the service journal
journalctl -u your-service.service -f
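To exercise the restart path, you can kill the service's main process so that the exit looks like a crash rather than a clean stop; the unit name is again a placeholder:
# Send SIGKILL to the service's processes and watch systemd restart it
sudo systemctl kill -s SIGKILL your-service.service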
For even more control, you could implement a watchdog timer:
[Unit]
Description=Service with Watchdog
ConditionACPower=true
[Service]
ExecStart=/usr/bin/my-service
Restart=on-failure
WatchdogSec=30
[Install]
WantedBy=multi-user.target
This approach gives you additional monitoring capabilities beyond simple restart logic.
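Note that WatchdogSec only helps if the service itself sends periodic keep-alive notifications. As a rough sketch of what that looks like from a shell script (this assumes Type=notify in the unit and, because systemd-notify sends from a short-lived child process, NotifyAccess=all):
#!/bin/bash
# Hypothetical /usr/bin/my-service: feeds the systemd watchdog while it works
systemd-notify --ready
while true; do
    # Keep-alive ping; must arrive more often than WatchdogSec
    systemd-notify WATCHDOG=1
    # ... do one unit of real work here ...
    sleep 10
done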
If you encounter problems:
- Check systemctl status your-service for the current state
- Verify that systemctl show your-service | grep StartLimit shows the expected values
- Ensure your service properly implements systemd's notification protocol if you rely on features that need it, such as Type=notify or WatchdogSec
In production environments, consider:
- Implementing proper logging for restart events
- Setting up monitoring for repeated restart cycles (one lightweight option is shown below)
- Considering higher-level orchestration tools if restart logic becomes too complex
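One lightweight way to watch for repeated restart cycles is systemd's own restart counter, which a cron job or monitoring agent can poll (the NRestarts property is available on reasonably recent systemd versions):
# How many times systemd has restarted the service
systemctl show your-service -p NRestarts
# When the unit last changed state
systemctl show your-service -p StateChangeTimestamp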
When configuring resilient services with systemd, we often face a tricky situation where the service stops attempting restarts after hitting the StartLimitBurst
threshold. While this prevents runaway restart loops, sometimes we need the service to automatically resume attempts after the cooldown period.
With this configuration:
[Service]
Restart=always
StartLimitInterval=90
StartLimitBurst=3
The service will:
- Restart automatically on failure (good)
- Stop after 3 quick failures (expected)
- Not automatically resume after 90 seconds (problematic; you can confirm this state as shown below)
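You can confirm the last of these from the unit's Result property; once the limit has tripped, it reports a start-limit condition rather than the underlying crash:
# Show why the unit last entered the failed state
systemctl show your-service -p Result
# Typically prints Result=start-limit-hit in this situation (exact wording can vary by systemd version)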
To achieve true automatic recovery after the rate limit window, we need to combine several directives:
[Unit]
StartLimitIntervalSec=90
StartLimitBurst=3
[Service]
Restart=always
RestartSec=5s
StartLimitInterval=90
StartLimitBurst=3
StartLimitAction=none
The notable addition is StartLimitAction=none, which tells systemd to take no further action (such as rebooting or powering off) when the limit is hit - it is in fact the default, but stating it explicitly documents the intent. Combined with the unit-level rate-limit settings and a sensible RestartSec, this allows the service to resume restart attempts after the interval expires.
Here's a complete service file for a Node.js application:
[Unit]
Description=Node API Service
After=network.target
StartLimitIntervalSec=90
StartLimitBurst=3
[Service]
Type=simple
User=nodeuser
WorkingDirectory=/opt/node-app
ExecStart=/usr/bin/node index.js
Restart=always
RestartSec=10s
StartLimitInterval=90
StartLimitBurst=3
StartLimitAction=none
Environment=NODE_ENV=production
[Install]
WantedBy=multi-user.target
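Assuming this is saved as /etc/systemd/system/node-api.service (the file name is just an example), load and start it with:
# Pick up the new unit file
sudo systemctl daemon-reload
# Enable at boot and start immediately
sudo systemctl enable --now node-api.service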
To test this configuration:
- Force your service to crash 3 times in quick succession (a sketch of one way to do this follows this list)
- Check the status with systemctl status yourservice
- Wait 90+ seconds
- Verify renewed restart attempts with journalctl -u yourservice -f
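One way to force those three quick failures is to kill the main process, wait slightly longer than RestartSec so the replacement instance comes up, and repeat; the unit name is a placeholder:
# Crash the service three times within the 90-second window
for i in 1 2 3; do
    sudo systemctl kill -s SIGKILL yourservice.service
    sleep 15   # a bit longer than RestartSec=10s so the next instance is running
done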
For more complex scenarios, consider using a watchdog timer:
[Service]
WatchdogSec=30
Restart=on-watchdog
This restarts the service whenever it stops sending its periodic WATCHDOG=1 keep-alive, giving you liveness monitoring on top of ordinary crash-triggered restarts.