When a DC crashes in a multi-controller environment, follow this immediate action checklist:
# PowerShell: Verify DC replication status
Get-ADReplicationPartnerMetadata -Target * |
Select-Object Server, LastReplicationAttempt, LastReplicationResult |
Format-Table -AutoSize
# Check FSMO roles status
netdom query fsmo
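To see at a glance which roles are stranded on the crashed DC, the role holders can also be read with the ActiveDirectory module. This is a minimal sketch: `$failedDc` is a hypothetical placeholder for the crashed DC's host name, and it assumes the RSAT ActiveDirectory module is available on the machine where it runs.

```powershell
Import-Module ActiveDirectory

# Hypothetical placeholder: replace with the crashed DC's host name.
$failedDc = 'FAILED_DC_NAME'

$domain = Get-ADDomain
$forest = Get-ADForest

# Map each FSMO role to the DC that currently holds it.
$roles = [ordered]@{
    SchemaMaster         = $forest.SchemaMaster
    DomainNamingMaster   = $forest.DomainNamingMaster
    PDCEmulator          = $domain.PDCEmulator
    RIDMaster            = $domain.RIDMaster
    InfrastructureMaster = $domain.InfrastructureMaster
}

# Holders are returned as FQDNs, so compare only the host-name label.
$roles.GetEnumerator() |
    Where-Object { ($_.Value -split '\.')[0] -eq $failedDc } |
    ForEach-Object { "{0} is held by the failed DC ({1})" -f $_.Key, $_.Value }
```

No output means none of the five roles is on the failed DC, so no seizure is needed.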
If the crashed DC held any FSMO roles, you'll need to handle them appropriately:
# PowerShell: Seize roles when original DC is unrecoverable (-Force allows a seizure when a graceful transfer is not possible)
Move-ADDirectoryServerOperationMasterRole -Identity "NEW_DC_NAME" -OperationMasterRole SchemaMaster, DomainNamingMaster, PDCEmulator, RIDMaster, InfrastructureMaster -Force
After confirming the DC won't be restored, perform metadata cleanup:
# Remove the failed DC's server object (run ntdsutil from an elevated prompt on a healthy DC)
NTDSUTIL
metadata cleanup
connections
connect to server [GOOD_DC_NAME]
quit
select operation target
list domains
select domain [N]
list sites
select site [N]
list servers in site
select server [N]
quit
remove selected server
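Once the cleanup has run, it may be worth confirming that nothing still references the removed DC. The following is a rough sketch under the same assumptions as before: `$failedDc` is a hypothetical placeholder and the ActiveDirectory module is installed.

```powershell
Import-Module ActiveDirectory

# Hypothetical placeholder: replace with the removed DC's host name.
$failedDc = 'FAILED_DC_NAME'

# Look for a leftover server object under Sites in the configuration partition.
$configNc = (Get-ADRootDSE).configurationNamingContext
Get-ADObject -SearchBase "CN=Sites,$configNc" `
    -Filter "objectClass -eq 'server' -and name -eq '$failedDc'"

# Look for a leftover computer account in the domain.
Get-ADComputer -Filter "Name -eq '$failedDc'"
```

If both queries return nothing, the server object and computer account are gone. Anything they do return can be reviewed and removed manually.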
Ensure proper DNS configuration and replication:
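# Check DNS SRV records (-RRType takes a record type; the record name goes in -Name)
# If _msdcs is hosted as its own zone, use -ZoneName "_msdcs.yourdomain.com" -Name "_ldap._tcp.dc" instead
Get-DnsServerResourceRecord -ZoneName "yourdomain.com" -Name "_ldap._tcp.dc._msdcs" -RRType Srv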
# Force immediate replication
Repadmin /syncall /AdeP
If the DC hosted Certificate Services:
certutil -dcinfo
certutil -getreg ca\CRLPublicationURLs
# Verify SYSVOL replication
dcdiag /test:sysvolcheck /test:advertising
# Check overall AD health
dcdiag /v /c /e
When a Domain Controller (DC) crashes in a multi-DC environment, the first critical step is determining whether this was a graceful shutdown or a catastrophic failure. Check the Directory Service event log on the surviving DCs for replication errors (Event IDs 1566, 1586, or 1988). These events point to replication problems that were already present before the crash.
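One way to collect those events from every DC in one pass is sketched below. The seven-day window is an arbitrary assumption, and the script assumes the ActiveDirectory module is installed and that remote event log access is permitted.

```powershell
Import-Module ActiveDirectory

# Look back 7 days; the window is an assumption, adjust as needed.
$since = (Get-Date).AddDays(-7)

foreach ($dc in (Get-ADDomainController -Filter *)) {
    try {
        Get-WinEvent -ComputerName $dc.HostName -ErrorAction Stop -FilterHashtable @{
            LogName   = 'Directory Service'
            Id        = 1566, 1586, 1988
            StartTime = $since
        } | Select-Object MachineName, TimeCreated, Id, LevelDisplayName
    }
    catch {
        # Raised when a DC is unreachable (such as the crashed one)
        # or when it has no matching events in the window.
        Write-Warning "$($dc.HostName): $($_.Exception.Message)"
    }
}
```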
Example PowerShell command to check replication status:
Get-ADReplicationFailure -Target * | Format-Table -AutoSize
For a proper cleanup of the failed DC, follow these steps:
- Confirm the DC is permanently unrecoverable
- Seize FSMO roles if necessary (using ntdsutil)
- Remove the DC object from Active Directory Sites and Services
Sample ntdsutil commands for FSMO role seizure:
ntdsutil
roles
connections
connect to server <surviving_DC>
q
seize infrastructure master
seize naming master
seize PDC
seize RID master
seize schema master
q
q
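Failed DCs often leave stale DNS records that can cause authentication issues. The example below removes only the host (A) records, so also check for leftover SRV, CNAME, and NS records that reference the failed DC. Verify and clean up: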
Get-DnsServerResourceRecord -ZoneName "yourdomain.com" -RRType "A" |
Where-Object {$_.HostName -like "FailedDC*"} |
Remove-DnsServerResourceRecord -ZoneName "yourdomain.com" -Force
If the failed DC was a Global Catalog server, ensure at least one remaining DC has this role. Check with:
Get-ADDomainController -Filter * | Select-Object Name,IsGlobalCatalog
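Global Catalog placement matters per site as well as per domain, so a per-site view can help. This is a minimal sketch, with `$failedDc` as a hypothetical placeholder so that the crashed DC is ignored if its metadata has not been cleaned up yet.

```powershell
Import-Module ActiveDirectory

# Hypothetical placeholder: replace with the crashed DC's host name.
$failedDc = 'FAILED_DC_NAME'

# Surviving DCs that are Global Catalog servers.
$gcs = @(Get-ADDomainController -Filter * |
    Where-Object { $_.IsGlobalCatalog -and $_.Name -ne $failedDc })

if ($gcs.Count -eq 0) {
    Write-Warning 'No surviving Global Catalog server found.'
}
else {
    # Show how many GCs remain in each site.
    $gcs | Group-Object Site | Select-Object Name, Count
}
```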
After removing the failed DC, verify the replication topology:
repadmin /replsummary
repadmin /showrepl
repadmin /syncall /AdeP
If the DC hosted Certificate Services, additional steps are required:
certutil -getconfig
certutil -viewstore My
Before considering the recovery complete, run these checks:
- DCDIAG /v /e /c on all remaining DCs
- Test user authentication across sites
- Verify Group Policy application
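The last two checks can be spot-checked from a domain-joined client or member server with built-in tools. The sketch below assumes the same placeholder domain name, `yourdomain.com`, used elsewhere in this answer, and should be run from an elevated prompt.

```powershell
# Locate a DC for the domain, bypassing the locator cache.
nltest /dsgetdc:yourdomain.com /force

# Confirm this machine's secure channel to the domain is healthy (returns True or False).
Test-ComputerSecureChannel

# Refresh policy, then summarize which GPOs applied to the computer.
gpupdate /force
gpresult /r /scope:computer
```

Repeating this from at least one machine in each site gives a quick indication that clients are finding a surviving DC and not the removed one.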