We've encountered a critical Windows Time-Service issue affecting multiple server versions (2016/2019/2022) where system clocks suddenly jump forward 55+ days, then partially correct themselves through chaotic rollback sequences. This creates cascading failures in:
- Kerberos authentication (typical error: KRB_AP_ERR_SKEW)
- Database transaction timestamps
- Log aggregation systems
- SSL certificate validation
// Event sequence from our monitoring:
2023-04-15T14:32:18Z - Clock jumps to 2023-06-09
2023-04-15T14:32:33Z - Time-Service attempts -4454176s correction (fails)
2023-04-15T14:47:18Z - Secondary jump to +12h26m43s offset
2023-04-15T14:47:33Z - Successful correction within threshold
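For reference, the numbers in that timeline can be sanity-checked with a few lines of Python. This is only arithmetic on the values logged above; the variable names are ours and nothing here talks to the time service.

```python
from datetime import date, timedelta

# Calendar distance between the real date and the date the clock jumped to.
jump = date(2023, 6, 9) - date(2023, 4, 15)
print(jump.days)  # 55

# The correction the time service attempted, as logged (-4454176 s), as a duration.
attempted = timedelta(seconds=4454176)
print(attempted)  # 51 days, 13:16:16

# The secondary offset (+12h26m43s) expressed in seconds.
secondary = timedelta(hours=12, minutes=26, seconds=43)
print(int(secondary.total_seconds()))  # 44803
```

Both offsets are far beyond the 5-minute skew that Kerberos tolerates by default, which matches the authentication failures we saw.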
We have eliminated the common suspects:
- VMware Tools: disabled time synchronization (tools.syncTime = "0")
- NTP configuration: verified with w32tm /query /configuration
- Hardware clocks: cross-verified with Get-WmiObject Win32_UTCTime
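# PowerShell watchdog (run as scheduled task every 5 minutes)
$maxDelta = New-TimeSpan -Minutes 5
# The last stripchart line looks like "14:32:18, +00.0012345s"; the part after the comma is the offset
$sample = (w32tm /stripchart /computer:pool.ntp.org /dataonly /samples:1)[-1]
# If the query fails, no offset is parsed and the watchdog takes no action
if ($sample -match ',\s*([+-]\d+\.\d+)s\s*$') {
    # Use the absolute value so a clock that is ahead of NTP is caught as well
    $offset = [timespan]::FromSeconds([math]::Abs([double]$Matches[1]))
    if ($offset -gt $maxDelta) {
        Stop-Service w32time
        w32tm /unregister
        w32tm /register
        Start-Service w32time
        w32tm /resync
        # The "TimeService" source must already be registered (New-EventLog) or this call fails
        Write-EventLog -LogName System -Source "TimeService" -EntryType Error -EventId 1001 -Message "Clock drift detected and reset"
    }
}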
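The fragile part of a watchdog like this is pulling the offset out of the w32tm /stripchart output. Below is a minimal Python sketch of that parsing step, assuming each sample line has the form "HH:MM:SS, +SS.sssssss" followed by a trailing "s" (time of day, comma, signed offset in seconds). The names parse_offset and needs_reset are hypothetical, and the sample lines are made up for illustration.

```python
import re
from typing import Optional

# Assumed sample format: "14:32:18, +00.0012345s" (local time, then signed offset).
_SAMPLE = re.compile(r"^\d{2}:\d{2}:\d{2},\s*([+-]\d+\.\d+)s$")

def parse_offset(line: str) -> Optional[float]:
    """Return the offset in seconds from one stripchart sample line, or None."""
    match = _SAMPLE.match(line.strip())
    return float(match.group(1)) if match else None

def needs_reset(line: str, max_delta_s: float = 300.0) -> bool:
    """True when the sample parses and its absolute offset exceeds the limit."""
    offset = parse_offset(line)
    return offset is not None and abs(offset) > max_delta_s

print(parse_offset("14:32:18, +00.0012345s"))      # 0.0012345
print(needs_reset("14:32:33, -4454176.0000000s"))  # True
print(needs_reset("14:32:18, error: 0x800705B4"))  # False
```

Comparing the absolute value matters here: a clock that has jumped forward shows up as a large offset in one direction only, so a signed comparison can miss it depending on the sign convention.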
From what we have observed, the bug seems to surface in the Windows Time service's drift compensation when:
- Processing large time differences (> 2^32 microseconds)
- Interacting with virtualized hardware clocks
- During daylight saving transitions
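To give a sense of scale for the first condition, here is the 2^32 microsecond boundary worked out in Python, next to the correction that was logged during our incident. This is plain arithmetic and says nothing about how the service stores offsets internally, which we do not know.

```python
MICROSECONDS_PER_SECOND = 1_000_000

limit_us = 2 ** 32                            # 4294967296 microseconds
limit_s = limit_us / MICROSECONDS_PER_SECOND  # about 4295 seconds
print(round(limit_s / 60, 2))                 # 71.58 (minutes)

# The correction logged during the incident (4454176 s), in microseconds.
attempted_us = 4454176 * MICROSECONDS_PER_SECOND
print(attempted_us > limit_us)                # True
print(attempted_us // limit_us)               # 1037
```

So any offset beyond roughly 72 minutes already crosses that boundary, and a 50+ day jump exceeds it by three orders of magnitude.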
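Set these registry values to limit aggressive corrections:
Windows Registry Editor Version 5.00
[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\W32Time\Config]
; 0x12c = 300 s: corrections larger than this are refused and logged instead
"MaxNegPhaseCorrection"=dword:0000012c
"MaxPosPhaseCorrection"=dword:0000012c
; 0xf = 15 s: larger offsets are corrected by setting the clock directly rather than slewing
"MaxAllowedPhaseOffset"=dword:0000000f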
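Sample Prometheus alert rule for time drift detection:
groups:
  - name: time_monitoring
    rules:
      - alert: WindowsTimeDrift
        # The metric is already an offset in seconds, so its absolute value is compared with the threshold directly.
        # Adjust the metric name to what your exporter exposes (windows_exporter's time collector publishes windows_time_computed_time_offset_seconds).
        expr: abs(windows_time_ntp_offset_seconds{job="windows"}) > 300
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "{{ $labels.instance }} has clock drift >5 minutes"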
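The intent of that rule is "absolute offset above 300 seconds, sustained for 2 minutes". A small Python sketch of that condition is below, as a way to reason about which incidents would page. It is a simplified model of the threshold plus the "for" clause, not how Prometheus evaluates rules internally; should_fire and the sample data are hypothetical.

```python
from typing import Iterable, Tuple

def should_fire(samples: Iterable[Tuple[float, float]],
                threshold_s: float = 300.0,
                hold_s: float = 120.0) -> bool:
    """samples: (unix_timestamp, offset_seconds) pairs in chronological order."""
    breach_start = None
    for ts, offset in samples:
        if abs(offset) > threshold_s:
            if breach_start is None:
                breach_start = ts
            if ts - breach_start >= hold_s:
                return True
        else:
            breach_start = None  # condition cleared, so the hold timer resets
    return False

# Offset stays above the threshold for two minutes: the alert fires.
print(should_fire([(0, 400), (60, 410), (120, 405)]))  # True
# Offset recovers in between: the hold timer resets and nothing fires.
print(should_fire([(0, 400), (60, 2), (120, 405)]))    # False
```

Note that a jump which is corrected within seconds, like the first one in our timeline, would not trip a 2-minute hold, so a shorter hold may be worth considering.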
We've encountered a particularly nasty Windows Time Service issue where domain-joined servers suddenly jump forward in time (55 days in our case), then partially correct themselves in erratic patterns. This behavior was observed across:
- Windows Server 2016 (twice on same machine)
- Windows Server 2019 (initial observation)
- Independent reports of Server 2022 exhibiting similar behavior
Here's what happens during an incident (based on our logging):
1. [T+0] Clock suddenly jumps forward (e.g., 2023-01-01 when actual date is 2022-08-10)
2. [T+15s] Time Service detects discrepancy with DC, attempts correction
3. [T+15m] Second time jump occurs (smaller delta)
4. [T+15m15s] Final correction within acceptable threshold
We've ruled out several potential causes through extensive testing:
# Verify NTP sources
w32tm /query /peers
# Check time service configuration
w32tm /query /configuration
# Examine time service debug logs (requires registry tweak)
reg add HKLM\SYSTEM\CurrentControlSet\Services\W32Time\Config /v FileLogEntries /t REG_SZ /d 0-300 /f
reg add HKLM\SYSTEM\CurrentControlSet\Services\W32Time\Config /v FileLogName /t REG_SZ /d "C:\time.log" /f
We confirmed this occurs on physical hardware too. For VMware environments, we checked the following:
# Ensure proper time sync configuration
vim-cmd hostsvc/advopt/view CpxUseHvTimer
# Critical VMware tools settings
<vmx>
tools.syncTime = "0"
time.synchronize.continue = "FALSE"
time.synchronize.restore = "FALSE"
time.synchronize.resume.disk = "FALSE"
</vmx>
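When many VMs are involved, it helps to check these settings mechanically rather than by eye. The sketch below parses .vmx-style key = "value" lines and reports which of the four settings listed above are missing or still enabled. The helper name and the sample text are hypothetical, and it assumes keys are written exactly as shown (no case-insensitive matching).

```python
import re

REQUIRED_OFF = (
    "tools.syncTime",
    "time.synchronize.continue",
    "time.synchronize.restore",
    "time.synchronize.resume.disk",
)

_LINE = re.compile(r'^\s*([\w.]+)\s*=\s*"([^"]*)"\s*$')

def sync_settings_not_disabled(vmx_text: str) -> list:
    """Return the required keys that are missing or not set to "0"/"FALSE"."""
    found = {}
    for line in vmx_text.splitlines():
        m = _LINE.match(line)
        if m:
            found[m.group(1)] = m.group(2)
    return [key for key in REQUIRED_OFF
            if found.get(key, "").upper() not in ("0", "FALSE")]

sample = '''tools.syncTime = "0"
time.synchronize.continue = "FALSE"
time.synchronize.restore = "TRUE"
'''
print(sync_settings_not_disabled(sample))
# ['time.synchronize.restore', 'time.synchronize.resume.disk']
```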
Until Microsoft provides a proper fix, we've implemented these measures:
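# PowerShell watchdog script: compares wall-clock time with a monotonic stopwatch between polls
$maxJump = New-TimeSpan -Minutes 5
while ($true) {
    # UTC avoids false alarms at daylight saving transitions
    $wallBefore = [datetime]::UtcNow
    $stopwatch = [System.Diagnostics.Stopwatch]::StartNew()
    Start-Sleep -Seconds 300
    # Wall-clock elapsed minus stopwatch elapsed stays close to zero unless the clock jumped
    $delta = ([datetime]::UtcNow - $wallBefore - $stopwatch.Elapsed).Duration()
    if ($delta -gt $maxJump) {
        Restart-Service W32Time
        w32tm /resync
        # Send-MailMessage also needs -From and an SMTP server (-SmtpServer or $PSEmailServer)
        Send-MailMessage -To admin@domain.com -Subject "Time drift detected" -Body "Delta: $($delta.TotalDays) days"
    }
}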
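One way to detect a jump without relying on any external time source is to compare wall-clock elapsed time against a monotonic clock, which is not affected by clock adjustments. A minimal Python sketch of that idea follows; the function names and the 300-second limit are our own choices for illustration.

```python
import time

def clock_jump_seconds(wall_before: float, wall_after: float,
                       mono_before: float, mono_after: float) -> float:
    """Wall-clock elapsed minus monotonic elapsed; near zero when no jump occurred."""
    return (wall_after - wall_before) - (mono_after - mono_before)

def jumped(jump_s: float, max_jump_s: float = 300.0) -> bool:
    """True when the wall clock moved more than max_jump_s relative to real elapsed time."""
    return abs(jump_s) > max_jump_s

# Live usage: take a reading, wait, take another.
w0, m0 = time.time(), time.monotonic()
time.sleep(0.1)
w1, m1 = time.time(), time.monotonic()
print(jumped(clock_jump_seconds(w0, w1, m0, m1)))  # expected False unless the clock really jumped

# Simulated incident: the wall clock moves 55 days ahead during a 300 s interval.
print(jumped(clock_jump_seconds(0.0, 300.0 + 55 * 86400, 0.0, 300.0)))  # True
```

The advantage over polling an NTP server is that this still works when the network or the time source is the thing that is misbehaving.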
These settings have reduced (but not eliminated) occurrences:
Windows Registry Editor Version 5.00
[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\W32Time\Config]
"MaxNegPhaseCorrection"=dword:ffffffff
"MaxPosPhaseCorrection"=dword:ffffffff
"PhaseCorrectRate"=dword:00000001
"PollAdjustFactor"=dword:00000005
"SpikeWatchPeriod"=dword:00000384
[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\W32Time\TimeProviders\NtpClient]
"SpecialPollInterval"=dword:00000e10
Our case history with Microsoft Support:
- 2022-09-15: Initial case opened (Reference#: SRX14589234)
- 2022-10-03: Acknowledged as known issue, no ETA for fix
- 2022-11-17: Suggested workarounds (registry tweaks above)
- 2023-01-05: Case still open, no root cause identified