When dealing with nonpaged pool memory leaks on Windows servers, the challenge often lies in identifying which specific application or driver is responsible. In this case, we've observed HTTP 503 errors and connection refusals traced back to nonpaged pool exhaustion, with Poolmon.exe indicating significant allocations under the 'Even' tag.
Beyond basic Poolmon usage, here are more sophisticated approaches:
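# PowerShell: list processes charged with more than 1 MB of nonpaged pool
# Note: this counter reflects pool charged to processes; allocations made by drivers are generally not attributed here
Get-Counter '\Process(*)\Pool Nonpaged Bytes' -Continuous |
ForEach-Object { $_.CounterSamples | Where-Object { $_.CookedValue -gt 1MB } } |
Select-Object Timestamp, InstanceName, CookedValue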
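For a tag like 'Even', which pooltag.txt describes only as "<unknown> - Event objects" rather than mapping it to a named driver, we can inspect the pool with WinDbg (the Debugging Tools for Windows package alone is enough):
$$ Show pool usage per tag, sorted by nonpaged pool usage
!poolused 2
$$ Find individual allocations carrying our tag (this scan can be slow)
!poolfind Even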
Capture a kernel ETW trace of pool allocations with xperf (Windows Performance Toolkit), including call stacks for each allocation and free, then open the merged trace in WPA:
xperf -on PROC_THREAD+LOADER+POOL -stackwalk PoolAlloc+PoolFree -BufferSize 1024 -MinBuffers 128 -MaxBuffers 256
xperf -d PoolTrace.etl
wpa PoolTrace.etl
Consider these specialized tools:
- RAMMap from Sysinternals
- WPA (Windows Performance Analyzer)
- Driver Verifier with pool tracking enabled (special pool targets corruption rather than leaks)
For live servers where process termination isn't an option:
- Set up performance counters to log pool usage over time
- Use kernel debugging remotely
- Implement controlled service restarts during maintenance windows
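Once pool usage is being logged over time, the leak rate can be estimated from the samples. The following is a minimal Python sketch, assuming the counter log has already been reduced to (elapsed hours, nonpaged pool bytes) pairs. The sample values and the 256 MiB limit are hypothetical and chosen only for illustration; they are not measurements from the server discussed here.

```python
def leak_rate_bytes_per_hour(samples):
    """Least-squares slope of (elapsed_hours, pool_bytes) samples, in bytes per hour."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    xs = [float(hours) for hours, _ in samples]
    ys = [float(pool_bytes) for _, pool_bytes in samples]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("samples must cover more than one point in time")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def hours_until_limit(limit_bytes, current_bytes, rate):
    """Hours until the pool reaches limit_bytes at the given rate, or None if not growing."""
    if rate <= 0:
        return None
    return max(0.0, (limit_bytes - current_bytes) / rate)


MIB = 2 ** 20

# Hypothetical samples: (hours since logging started, Pool Nonpaged Bytes)
samples = [(0, 100 * MIB), (12, 112 * MIB), (24, 124 * MIB), (36, 136 * MIB)]

rate = leak_rate_bytes_per_hour(samples)
remaining = hours_until_limit(256 * MIB, samples[-1][1], rate)  # 256 MiB is an assumed limit
print(f"growth: {rate / MIB:.2f} MiB/hour, hours until limit: {remaining:.1f}")
```

A steadily positive slope across several days points to a leak rather than normal load variation, and the time-to-limit figure helps decide how soon a maintenance-window restart is needed.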
When analyzing the 'Even' tag specifically:
| Metric | Value | Interpretation |
|---|---|---|
| Allocs | 51,231,806 | Very high cumulative allocation count |
| Diff | 684,922 | Allocations still outstanding (not yet freed) |
| Per Alloc | 48 bytes | Small allocations (average size) |
When your production server starts throwing 503 errors due to nonpaged pool exhaustion, you know you're in for some serious debugging. The process isn't straightforward, but through systematic investigation we can identify the culprit.
In our case, the smoking gun was found in httperr.log showing numerous Connections_Refused errors. Poolmon.exe revealed the problematic memory allocation tag:
Memory Tag Analysis:
| Tag | Type | Allocs | Frees | Diff | Bytes | Per Alloc |
|---|---|---|---|---|---|---|
| Even | Nonp | 51,231,806 | 50,633,533 | 684,922 | 32,878,688 | 48 |
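To make the Poolmon figures easier to work with, the row can be parsed and the derived values recomputed. This is a rough Python sketch, assuming a whitespace-separated row with exactly the seven columns shown (Tag, Type, Allocs, Frees, Diff, Bytes, Per Alloc); real Poolmon output may differ in layout, and the function name is only illustrative. The Diff column is used as reported rather than recomputed from Allocs and Frees.

```python
def parse_poolmon_row(line):
    """Parse one Poolmon row in the seven-column layout assumed above."""
    fields = line.split()
    if len(fields) != 7:
        raise ValueError(f"expected 7 columns, got {len(fields)}")
    tag, pool_type = fields[0], fields[1]
    # Numeric columns may contain thousands separators
    allocs, frees, diff, total_bytes, per_alloc = (
        int(value.replace(",", "")) for value in fields[2:]
    )
    return {
        "tag": tag,
        "type": pool_type,
        "allocs": allocs,
        "frees": frees,
        "diff": diff,
        "bytes": total_bytes,
        "per_alloc": per_alloc,
    }


row = parse_poolmon_row("Even Nonp 51,231,806 50,633,533 684,922 32,878,688 48")

# Average size of an outstanding allocation, and total held by the tag in MiB
avg_size = row["bytes"] / row["diff"]
held_mib = row["bytes"] / 2 ** 20
print(f"{row['tag']}: {avg_size:.1f} bytes per outstanding allocation, {held_mib:.1f} MiB held")
```

The average works out to roughly 48 bytes, matching the Per Alloc column, which suggests many small allocations accumulating rather than a few large ones.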
Beyond Poolmon, these tools provide crucial insights:
# PowerShell command to track nonpaged pool usage
Get-Counter '\Memory\Pool Nonpaged Bytes' -Continuous
# Process Explorer alternative approach:
# 1. Download from Sysinternals
# 2. Add "Nonpaged Pool" column
# 3. Sort processes by the "Nonpaged Pool" column
When basic tools aren't enough, these methods can help:
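$$ Kernel debugger approach (WinDbg from the Debugging Tools for Windows, part of the Windows SDK)
$$ Per-tag pool usage, sorted by nonpaged pool usage
!poolused 2
$$ Find allocations with a specific tag
!poolfind Even
REM Driver Verifier with pool tracking (flag 0x8) on a suspected driver; takes effect after a reboot
REM suspect.sys is a placeholder: replace it with the driver under suspicion
verifier /flags 0x8 /driver suspect.sys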
Based on our experience resolving this issue:
- Monitor Pool Nonpaged Bytes over time (minimum 48 hours)
- Check for driver updates, especially network-related
- Review recent system changes/updates
- Consider third-party driver memory analyzers
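The monitoring step in the checklist above can also be applied per tag by comparing two Poolmon snapshots taken some hours apart. Below is a minimal Python sketch, assuming each snapshot has already been reduced to a mapping of tag to nonpaged bytes. Apart from the 'Even' figure from our Poolmon output, every tag name and value here is hypothetical.

```python
def rank_tag_growth(before, after):
    """Return (tag, byte_growth) pairs for tags that grew, largest growth first."""
    growth = []
    for tag, after_bytes in after.items():
        # Tags missing from the earlier snapshot are treated as starting at zero
        delta = after_bytes - before.get(tag, 0)
        if delta > 0:
            growth.append((tag, delta))
    growth.sort(key=lambda item: item[1], reverse=True)
    return growth


# Hypothetical earlier snapshot (tag -> nonpaged bytes)
before = {"Even": 30_000_000, "TagA": 12_000_000, "TagB": 4_000_000}
# Later snapshot; only the 'Even' value comes from the Poolmon output quoted earlier
after = {"Even": 32_878_688, "TagA": 12_050_000, "TagB": 3_900_000}

ranked = rank_tag_growth(before, after)
for tag, delta in ranked:
    print(f"{tag}: +{delta:,} bytes")
```

A tag that stays at the top of this ranking across several comparisons is a stronger leak candidate than one that is merely large in a single snapshot.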
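Raising the pool limit does not fix a leak; at best it delays exhaustion while the root cause is tracked down:
# Registry tweak to increase nonpaged pool size (stopgap only; takes effect after a reboot)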
Set-ItemProperty -Path "HKLM:\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management" -Name "NonPagedPoolSize" -Value 0xFFFFFFFF
Remember that rebooting (as we initially did) only provides temporary relief. The real solution requires identifying and fixing the underlying leak source.