When monitoring our HP ProLiant DL380 G7 with dual Xeon X5650 CPUs (6 cores + HT each = 24 logical processors), Windows Task Manager shows moderate overall CPU utilization (~30-40%) but one logical processor consistently maxed out at 100%. Standard monitoring tools like PerfMon identify only the System process as the apparent culprit.
Traditional tools often miss kernel-level contention. Let's use Event Tracing for Windows (ETW) to capture detailed processor usage:
# PowerShell: capture a 60-second kernel trace (the built-in CPU profile covers scheduler, DPC, and ISR activity)
wpr -start CPU -filemode -recordtempto C:\Traces
Start-Sleep -Seconds 60
wpr -stop C:\Traces\KernelTrace.etl
Process the ETL file with Windows Performance Analyzer (WPA). Look for:
- DPC/ISR activity spikes on the saturated core (the counter check after this list gives a quick confirmation)
- Spin lock contention in kernel stacks
- Processor affinity settings forcing work to one core
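Before opening the trace in WPA, a quick per-core counter sample (a minimal sketch using the standard Processor counters) can confirm whether DPC/ISR time is concentrated on the hot logical processor:
# Per-core DPC and interrupt time: five samples at 2-second intervals,
# showing the five busiest instances per sample
Get-Counter -Counter '\Processor(*)\% DPC Time','\Processor(*)\% Interrupt Time' `
    -SampleInterval 2 -MaxSamples 5 |
    ForEach-Object { $_.CounterSamples |
        Sort-Object CookedValue -Descending |
        Select-Object -First 5 Path, CookedValue }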
In application code (virtualized workloads included), hard-coded affinity is a common cause:
// Example of problematic CPU affinity in C# (needs using System.Diagnostics;)
Process.GetCurrentProcess().ProcessorAffinity = (IntPtr)0x00000001;
// Mask bit 0 set: every thread in this process is pinned to the first logical processor
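From PowerShell you can inspect, and widen, a suspect process's affinity mask; "LegacyApp" below is a hypothetical process name:
# Read and rewrite a process's affinity mask ("LegacyApp" is a placeholder)
$p = Get-Process -Name LegacyApp
'{0:X}' -f $p.ProcessorAffinity.ToInt64()   # current mask in hex
$p.ProcessorAffinity = [IntPtr]0xFFFFFF     # allow all 24 logical processors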
Other frequent offenders:
- NT Kernel & System handling asymmetric interrupts
- Storage drivers with poor multi-queue support
- Legacy applications that hard-pin threads via SetThreadAffinityMask/SetProcessAffinityMask
Use perfmon to monitor these counters (or sample them from PowerShell, as sketched after the list):
- \System\Processor Queue Length
- \Processor(*)\% Privileged Time
- \System\Context Switches/sec
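A minimal PowerShell sketch for sampling all three (counter paths assume an English-localized system):
# Sample the three counters every 5 seconds for one minute
Get-Counter -Counter '\System\Processor Queue Length',
    '\Processor(*)\% Privileged Time',
    '\System\Context Switches/sec' -SampleInterval 5 -MaxSamples 12 |
    ForEach-Object { $_.CounterSamples | Format-Table Path, CookedValue -AutoSize }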
For a SQL Server instance exhibiting this behavior, we reverted a manually configured affinity64 mask to automatic:
-- T-SQL to spread load across NUMA nodes
ALTER SERVER CONFIGURATION
SET PROCESS AFFINITY CPU = AUTO;
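To verify how schedulers map to CPUs after the change, query sys.dm_os_schedulers; the sketch below assumes the SqlServer PowerShell module and a default local instance:
# Per-scheduler load; one ONLINE scheduler with an outsized load_factor points at the hot CPU
Invoke-Sqlcmd -ServerInstance '.' -Query @"
SELECT scheduler_id, cpu_id, current_tasks_count, runnable_tasks_count, load_factor
FROM sys.dm_os_schedulers
WHERE status = 'VISIBLE ONLINE';
"@ | Format-Table -AutoSize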
Some valid cases for single-core saturation:
- Hardware interrupts routed to specific cores (common with NICs; see the RSS check after this list)
- Real-time audio processing threads
- Certain cryptographic operations with sequential dependencies
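For the NIC case, RSS settings show which logical processors service receive interrupts; note that Get-NetAdapterRss requires Server 2012 or later, which may not apply to an older OS on this G7:
# Which processors each NIC's receive queues are bound to (Server 2012+)
Get-NetAdapterRss | Format-List Name, BaseProcessorNumber, MaxProcessors, RssProcessorArray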
On a dual-socket, HT-enabled box like this DL380 G7, a single logical CPU hitting 100% while others remain idle typically indicates one of these architectural suspects:
1. Hardware interrupt routing (IRQ affinity)
2. Kernel-mode driver spinlock contention
3. NUMA node memory access patterns
4. Scheduled task/core affinity misconfiguration
Standard monitoring tools often fail to reveal the true offender. Try these PowerShell commands for deeper inspection:
# Real-time process-level sampling: flag any process consuming >90% of one logical CPU
Get-Counter '\Process(*)\% Processor Time' -Continuous |
    ForEach-Object { $_.CounterSamples | Where-Object { $_.CookedValue -gt 90 } }
# Kernel stack sampling for 10 seconds (requires an elevated prompt)
wpr -start CPU -filemode; Start-Sleep -Seconds 10; wpr -stop C:\kernel_trace.etl
# Check interrupt and DPC distribution per logical processor
Get-WmiObject Win32_PerfFormattedData_PerfOS_Processor |
    Format-Table Name, InterruptsPersec, DPCsQueuedPersec, PercentDPCTime -AutoSize
For Hyper-V hosts (common on ProLiants), review the virtual processor configuration:
# Disable processor compatibility mode so guests can use all host CPU features
# (this masks CPU feature flags; it does not change hypervisor scheduling)
Get-VM | Set-VMProcessor -CompatibilityForOlderOperatingSystemsEnabled $false
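It is also worth dumping the current per-VM processor settings, since reserve, limit, and relative weight all influence how the host distributes load (a sketch assuming the Hyper-V PowerShell module):
# Review per-VM virtual processor configuration
Get-VM | Get-VMProcessor |
    Format-Table VMName, Count, Reserve, Maximum, RelativeWeight -AutoSize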
BIOS-level fixes:
1. Disable Intel Turbo Boost in the RBSU (F9 system setup); on these Westmere Xeons the feature is Turbo Boost, not "Turbo Core", and it lives in the system BIOS rather than iLO
2. Set "NUMA Group Size Optimization" to Flat, if this BIOS generation exposes the option
3. Update to the most recent Service Pack for ProLiant (SPP) that still supports G7-series hardware (current SPP releases no longer cover G7)
A recent client had identical symptoms due to:
-- Bad query plan stuck in a parallelized loop. Note: trace flag 8649 is an
-- undocumented flag that FORCES parallel plans; it does not disable parallelism.
-- The durable fix was capping the degree of parallelism instead:
ALTER DATABASE SCOPED CONFIGURATION
SET MAXDOP = 12;  -- logical processors per NUMA node here (requires SQL Server 2016+)
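To confirm the setting took effect (a sketch using Invoke-Sqlcmd; 'YourDb' is a placeholder database name):
# Read back the database-scoped MAXDOP value (SQL Server 2016+)
Invoke-Sqlcmd -ServerInstance '.' -Database 'YourDb' -Query @"
SELECT name, value FROM sys.database_scoped_configurations WHERE name = 'MAXDOP';
"@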
Capture a snapshot of the overloaded CPU's kernel state with a local kernel debugging session (requires Debugging Tools for Windows; local kernel debugging must be enabled beforehand with bcdedit /debug on and a reboot):
cd "C:\Program Files (x86)\Windows Kits\10\Debuggers\x64"
kd -kl -c "!running -it; !irql; !dpcs; q" -logo C:\dump.txt
Key things to analyze in the output (and in the earlier WPA trace):
- DPC/ISR counts per processor
- Thread migration history
- Spinlock acquisition attempts