How to Configure Windows Service Recovery Actions for Automatic Restart on Failure



Many developers find that their Windows services fail but do not restart automatically, even with recovery actions configured. By default, the Service Control Manager (SCM) runs recovery actions only when the service process terminates without reporting a stopped state. A service that hits an error and stops itself simply stays stopped, leaving the applications that depend on it broken.

The often-missed solution is the "Enable Actions For Stops With Errors" checkbox in the service properties, available on Windows Vista, Server 2008, and later. With it enabled, recovery actions also run when the service reports that it stopped with a nonzero exit code, which is the most common failure scenario.

Here's how to properly set up service recovery through both GUI and programmatic methods:

GUI Configuration

  1. Open Services.msc
  2. Right-click your service → Properties
  3. Navigate to the Recovery tab
  4. Check "Enable Actions For Stops With Errors"
  5. Set First failure: Restart the Service
  6. Set Second failure: Restart the Service
  7. Set Subsequent failures: Restart the Service
  8. Set restart delay (recommended: 1 minute)

Programmatic Configuration (C# Example)

using System;
using System.Diagnostics;

public static class ServiceRecoveryConfigurator
{
    // Win32_Service.Change has no recovery parameters, so this example
    // calls sc.exe instead. It must run from an elevated process.
    public static void ConfigureServiceRecovery(string serviceName)
    {
        // Restart after 60000 ms on the first, second, and subsequent failures;
        // reset the failure count after 86400 seconds (1 day) without failures
        RunSc($"failure \"{serviceName}\" reset= 86400 " +
              "actions= restart/60000/restart/60000/restart/60000");

        // Enable actions for stops with errors
        RunSc($"failureflag \"{serviceName}\" 1");
    }

    private static void RunSc(string arguments)
    {
        var startInfo = new ProcessStartInfo("sc.exe", arguments)
        {
            UseShellExecute = false,
            CreateNoWindow = true
        };

        using (var process = Process.Start(startInfo))
        {
            process.WaitForExit();
            if (process.ExitCode != 0)
            {
                throw new InvalidOperationException(
                    $"sc.exe {arguments} failed with exit code {process.ExitCode}");
            }
        }
    }
}

For automation scenarios, note that Set-Service and the WMI Win32_Service.Change method do not expose recovery settings, so PowerShell scripts call sc.exe from an elevated session:

$service = "YourServiceName"

# Restart after 60000 ms on the first, second, and subsequent failures;
# reset the failure count after 86400 seconds (1 day) without failures
sc.exe failure $service reset= 86400 actions= restart/60000/restart/60000/restart/60000

# Enable actions for stops with errors
sc.exe failureflag $service 1

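After setting up recovery actions, test them by:

  1. Killing the service process from Task Manager
  2. Simulating a crash (throw an unhandled exception in code)
  3. Stopping the service manually (net stop YourService) as a negative check: a clean stop should not trigger recovery

Keep these caveats in mind:

  • Services that stop too quickly (less than 1 second) may not trigger recovery
  • Repeated failures within the reset period step through the first, second, and subsequent actions; if the last action is "Take No Action", the service stays stopped
  • Account permissions may prevent a proper restart (test with the Local System account to rule this out)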

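The SCM logs its own recovery activity in the System log. To mark recovery attempts from your own code as well, write an event log entry, for example when the service starts again:

EventLog.WriteEntry("Application",      // event source
    "Service recovery triggered",       // message
    EventLogEntryType.Information,
    7036);                              // application-defined event ID (the SCM uses 7036 for its own state-change events)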
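
The SCM's own entries are often the quickest way to confirm that a recovery action ran. They land in the System log under the "Service Control Manager" provider, commonly as event IDs 7031 and 7034 (service terminated unexpectedly). The following is a minimal sketch for pulling recent ones with Get-WinEvent; the 24-hour window and the variable name are arbitrary choices for illustration.

```powershell
# Filter for SCM "service terminated unexpectedly" events from the last 24 hours.
$filter = @{
    LogName      = 'System'
    ProviderName = 'Service Control Manager'
    Id           = 7031, 7034
    StartTime    = (Get-Date).AddHours(-24)
}

# Get-WinEvent raises an error when nothing matches, so suppress that case.
Get-WinEvent -FilterHashtable $filter -ErrorAction SilentlyContinue |
    Select-Object TimeCreated, Id, Message |
    Format-List
```

An empty result after a deliberate crash suggests the SCM never saw the stop as a failure.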

For years, I've deployed countless custom Windows services across various OS versions (XP through Server editions), always configuring the recovery options to restart on failures. Yet services that hit an error and stopped would simply remain stopped. The solution turned out to be hiding in plain sight: the often-overlooked "Enable Actions For Stops With Errors" checkbox.

Windows provides three levels of recovery actions that trigger sequentially:

First failure: Restart the Service
Second failure: Restart the Service
Subsequent failures: Take no action

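But this sequence only executes when:

  • The service terminates unexpectedly (a graceful stop with exit code 0 never counts as a failure)
  • The checkbox for error actions is enabled, if the service reports a stop with a nonzero exit code instead of crashing outright
  • The failure count is where you expect it: it resets only after the configured period without failures (1 day in the command below)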
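
As a rough model of that sequence, the SCM picks the action for the Nth failure from position N in the configured list and repeats the last entry once the count runs past the end of the list. The sketch below only simulates that lookup. Get-RecoveryAction is a hypothetical helper name, and nothing here touches a real service.

```powershell
# Hypothetical helper: models which configured action applies to the Nth failure.
function Get-RecoveryAction {
    param(
        [Parameter(Mandatory)] [int] $FailureCount,   # 1 = first failure since the last reset
        [Parameter(Mandatory)] [string[]] $Actions    # e.g. 'restart', 'restart', 'none'
    )
    if ($FailureCount -lt 1) { throw "FailureCount must be at least 1" }

    # Failure N maps to element N-1; past the end of the list, the last action repeats.
    $index = [Math]::Min($FailureCount, $Actions.Count) - 1
    return $Actions[$index]
}

$configured = 'restart', 'restart', 'none'
1..4 | ForEach-Object {
    'Failure {0}: {1}' -f $_, (Get-RecoveryAction -FailureCount $_ -Actions $configured)
}
```

With this configuration the third and every later failure map to "none", which is why a service can stay down after two successful restarts until the reset period elapses.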

Here's how to properly set this up programmatically using PowerShell:

# Configure recovery for MyCustomService (run from an elevated session)
$service = "MyCustomService"

# First and second failure: restart after 60000 ms (1 minute); subsequent failures: no action.
# reset= is the number of seconds without failures before the failure count returns to zero.
sc.exe failure $service reset= 86400 actions= restart/60000/restart/60000/none/0

# Critical: enable error action processing
sc.exe failureflag $service 1

# Verify that the settings were stored as intended
# (Win32_Service.Change cannot set them, so there is no WMI alternative)
sc.exe qfailure $service
sc.exe qfailureflag $service
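
The actions= argument is a slash-separated list of action/delay pairs in failure order, with delays in milliseconds. A small helper can assemble it from structured data, which makes the intent easier to review. ConvertTo-ScActionString is a hypothetical name, and this sketch only builds the string; it does not call sc.exe.

```powershell
# Hypothetical helper: turns a list of action/delay pairs into an sc.exe actions= value.
function ConvertTo-ScActionString {
    param(
        [Parameter(Mandatory)] [hashtable[]] $Actions
    )
    $pairs = foreach ($a in $Actions) {
        '{0}/{1}' -f $a.Type, [int]$a.DelayMs
    }
    return ($pairs -join '/')
}

$actions = ConvertTo-ScActionString -Actions @(
    @{ Type = 'restart'; DelayMs = 60000 },
    @{ Type = 'restart'; DelayMs = 60000 },
    @{ Type = 'restart'; DelayMs = 60000 }
)
$actions   # restart/60000/restart/60000/restart/60000

# Usage (elevated session):
# sc.exe failure MyCustomService reset= 86400 actions= $actions
```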

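If recovery still does not run, check these common causes:

1. Permission Issues: The service account needs appropriate rights. Add to Local Security Policy:

SeServiceLogonRight - Required for the service account to log on as a service
SeIncreaseQuotaPrivilege - Sometimes required, depending on what the service does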

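2. Error Reporting Conflicts: Windows Error Reporting (WerSvc) may delay recovery while it handles a crashed process. If it interferes, disable it, keeping in mind that this turns off crash reporting for the whole machine: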

Stop-Service WerSvc -Force
Set-Service WerSvc -StartupType Disabled

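3. Custom Error Handling: For .NET services, log unhandled exceptions and exit with a nonzero code so the SCM sees a failure rather than a clean stop:

protected override void OnStart(string[] args)
{
    AppDomain.CurrentDomain.UnhandledException += (sender, e) =>
    {
        // EventLog is the ServiceBase instance property, so no source name is needed here
        EventLog.WriteEntry("Service crashed: " + e.ExceptionObject.ToString(),
                          EventLogEntryType.Error);
        Environment.Exit(1); // Nonzero exit without a clean stop, so the SCM treats it as a failure
    };
    // Main service logic
}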

For mission-critical services, combine with external monitoring:

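# Sample watchdog script
$name = "MyCustomService"
while ($true) {
    $svc = Get-Service -Name $name
    if ($svc.Status -ne "Running") {
        try {
            # -ErrorAction Stop turns a failed start into a catchable exception
            # ($LASTEXITCODE is only set by native executables, not by cmdlets)
            Start-Service -Name $name -ErrorAction Stop
        }
        catch {
            # Placeholder addresses and SMTP server; replace with real values
            Send-MailMessage -To "admin@domain.com" -From "watchdog@domain.com" `
                -SmtpServer "smtp.domain.com" -Subject "Service recovery failed" `
                -Body $_.Exception.Message
        }
    }
    Start-Sleep -Seconds 30
}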

To test your configuration:

  1. Forcefully terminate the service process (taskkill /f /pid [PID])
  2. Check Event Viewer (System log, source "Service Control Manager") for SCM entries
  3. Repeat the kill and verify that each failure triggers the next configured action
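
The kill-and-verify part of this test can be scripted. The sketch below looks up the service's process ID through the Win32_Service CIM class, kills the process, waits past the restart delay, and reports the service status. Test-ServiceRecovery is a hypothetical function name, and the sketch assumes an elevated session and a service that runs in its own process. Run it only on a test machine.

```powershell
# Hypothetical helper: crash a service's process and report its status after the restart delay.
function Test-ServiceRecovery {
    param(
        [Parameter(Mandatory)] [string] $ServiceName,
        [int] $RestartDelaySeconds = 60   # should match the configured restart delay
    )

    $svc = Get-CimInstance -ClassName Win32_Service -Filter "Name='$ServiceName'"
    if (-not $svc -or $svc.ProcessId -eq 0) {
        throw "Service '$ServiceName' is not running"
    }

    # Simulate a crash: the process dies without the service reporting a stop.
    Stop-Process -Id $svc.ProcessId -Force

    # Wait slightly longer than the restart delay, then see whether the SCM restarted it.
    Start-Sleep -Seconds ($RestartDelaySeconds + 15)
    return (Get-Service -Name $ServiceName).Status
}

# Usage (elevated session, test machine only):
# Test-ServiceRecovery -ServiceName 'MyCustomService'
```

A result of Running indicates the recovery action fired; Stopped means the SCM either did not treat the kill as a failure or had already reached a "Take No Action" step.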

The key insight? That obscure checkbox tells Windows to treat a stop with a nonzero exit code as a failure rather than an intentional stop. Now our services reliably resurrect themselves after failures.