When diagnosing memory pressure on Linux systems, several key metrics reveal thrashing behavior:
# Basic vmstat output (1 second intervals)
vmstat 1
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
1 2 24580 102340 45288 654320 120 340 1024 880 1234 5678 12 8 70 10 0
Critical columns to monitor:
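- si/so: Memory swapped in from and out to disk per second, in KiB by default (sustained nonzero values in both columns indicate thrashing)
- wa: Percentage of CPU time spent waiting for I/O (high values suggest disk contention)
- bi/bo: Blocks received from and sent to block devices per second (shows disk activity spikes)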
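As a rough illustration of reading those columns programmatically, the sketch below flags vmstat samples where memory is being swapped in and out at the same time. The function name `flag_swap_activity` is made up for this example, and it assumes the standard vmstat layout shown above, with a header row beginning `r b`.

```shell
# Flag vmstat samples where memory is swapped both in and out.
# Column positions are taken from the header row rather than hard-coded,
# because the set of columns can vary between vmstat versions.
flag_swap_activity() {
  awk '
    # Header row (starts with "r b"): map each column name to its position.
    $1 == "r" && $2 == "b" {
      for (i = 1; i <= NF; i++) col[$i] = i
      next
    }
    # Data rows start with a number; the dashed group header is skipped.
    ("si" in col) && $1 ~ /^[0-9]+$/ {
      si = $(col["si"]); so = $(col["so"]); wa = $(col["wa"])
      if (si > 0 && so > 0)
        printf "swap in+out: si=%d so=%d wa=%d\n", si, so, wa
    }
  '
}
```

Typical use would be `vmstat 1 10 | flag_swap_activity`; no output means no sample showed simultaneous swap-in and swap-out.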
For deeper investigation, combine these tools:
# Comprehensive memory snapshot
grep -E '^(SwapTotal|SwapFree|MemFree|MemTotal|Buffers|Cached):' /proc/meminfo
# Per-process swap usage
sudo smem -t -k -s swap
# Disk I/O pressure metrics
iostat -xmt 1
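If smem is not installed, per-process swap usage can also be read from the `VmSwap` field of `/proc/<pid>/status`. The following is a minimal sketch of that approach, not a replacement for smem. The function name `swap_by_process` and its optional first argument (an alternate proc root, there only to make testing easy) are assumptions of this example.

```shell
# List processes by swap usage, largest first, as "kB pid name".
# $1: proc root (default /proc), $2: number of rows to show (default 10).
swap_by_process() {
  for f in "${1:-/proc}"/[0-9]*/status; do
    [ -r "$f" ] || continue
    # Kernel threads have no VmSwap line and are skipped by the swap > 0 check.
    awk '
      $1 == "Name:"   { name = $2 }
      $1 == "Pid:"    { pid = $2 }
      $1 == "VmSwap:" { swap = $2 }
      END { if (swap > 0) printf "%d %s %s\n", swap, pid, name }
    ' "$f" 2>/dev/null
  done | sort -rn | head -n "${2:-10}"
}
```

Run it as root (`sudo`) to see every process; only the first word of the process name is printed.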
This Bash script logs an alert when swap usage crosses a threshold and captures vmstat samples for later analysis:
#!/bin/bash
THRESHOLD=50 # % of swap usage triggering alert
while true; do
    # Percentage of swap in use; prints 0 when no swap is configured
    swap_used=$(free | awk '/^Swap:/ { if ($2 > 0) printf "%.0f", $3/$2*100; else print 0 }')
    if [ "$swap_used" -ge "$THRESHOLD" ]; then
        echo "[$(date)] High swap usage: ${swap_used}% (possible thrashing)" >> /var/log/thrash_monitor.log
    fi
    # Capture vmstat snapshot
    vmstat 1 5 >> /var/log/vmstat_snapshots.log
    sleep 30
done
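Swap occupancy can stay high on a system that is otherwise healthy, so the rate of swap traffic is a useful complementary signal. The sketch below computes pages swapped in and out per second from two saved copies of `/proc/vmstat`, using its `pswpin` and `pswpout` counters. The function name `swap_rate` and the snapshot-file interface are assumptions made for illustration.

```shell
# Print pages swapped in and out per second between two /proc/vmstat snapshots.
# Usage: swap_rate BEFORE_FILE AFTER_FILE SECONDS
swap_rate() {
  awk -v secs="$3" '
    # First file: remember the starting counter values.
    FNR == NR { if ($1 == "pswpin" || $1 == "pswpout") before[$1] = $2; next }
    # Second file: remember the ending counter values.
    $1 == "pswpin" || $1 == "pswpout" { after[$1] = $2 }
    END {
      printf "in=%.1f out=%.1f pages/s\n",
        (after["pswpin"]  - before["pswpin"])  / secs,
        (after["pswpout"] - before["pswpout"]) / secs
    }
  ' "$1" "$2"
}
```

For example: `cat /proc/vmstat > /tmp/a; sleep 5; cat /proc/vmstat > /tmp/b; swap_rate /tmp/a /tmp/b 5`.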
The sysstat package provides long-term trends:
# Generate memory usage report for today
# (RHEL/CentOS path; Debian/Ubuntu stores these files under /var/log/sysstat/)
sar -r -f /var/log/sa/sa$(date +%d)
# Sample output:
# Linux 5.4.0-135-generic (host) 01/15/2023 _x86_64_ (8 CPU)
# 12:00:01 AM kbmemfree kbmemused %memused kbbuffers kbcached kbcommit %commit
# 12:10:01 AM 324876 7903124 96.05 92308 2123456 8234567 102.34
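In the sample above, %commit is over 100, meaning the kernel has promised more memory than RAM plus swap can hold. One way to pull such samples out of a long report is sketched below. The function name `overcommitted_samples` is hypothetical, and the column is located by its header name because the set of `sar -r` columns differs between sysstat versions.

```shell
# Print sar -r samples whose %commit exceeds a limit (default 100).
overcommitted_samples() {
  awk -v limit="${1:-100}" '
    # Header row: remember which field holds %commit, then skip the row.
    { for (i = 1; i <= NF; i++) if ($i == "%commit") { c = i; next } }
    # Data rows: a numeric value in that field above the limit.
    c && $1 != "Average:" && $c ~ /^[0-9.]+$/ && $c + 0 > limit {
      t = $1
      if ($2 == "AM" || $2 == "PM") t = t " " $2   # 12-hour timestamps
      print t, $c
    }
  '
}
```

Usage would look like `sar -r -f /var/log/sa/sa$(date +%d) | overcommitted_samples 100`.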
Key tuning parameters in /etc/sysctl.conf:
# Reduce tendency to swap
vm.swappiness = 10
# Balance between inode/dentry cache and page cache
vm.vfs_cache_pressure = 50
# Limit dirty pages before forcing writeback
vm.dirty_ratio = 20
vm.dirty_background_ratio = 10
For production systems running Prometheus node_exporter, this query tracks swap activity:
# PromQL: pages swapped in plus pages swapped out per second
rate(node_vmstat_pswpin[1m]) + rate(node_vmstat_pswpout[1m])
Create dashboards tracking:
- Swap usage % over time
- Major page fault rate
- Disk I/O queue length
- OOM killer events
When your Linux system starts thrashing, you'll typically notice:
- Severe performance degradation
- High disk I/O activity (constantly blinking disk LED)
- System becomes unresponsive to commands
- High CPU wait times (seen in top/htop as %wa)
The Linux ecosystem provides several powerful tools for memory analysis:
vmstat - The Classic Approach
Run vmstat with a sampling interval (in seconds):
vmstat 1 10 # Sample 10 times at 1-second intervals
Key columns to monitor:
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
Red flags:
- High 'b' (blocked processes)
- High 'si'/'so' (swap in/swap out)
- High 'wa' (I/O wait percentage)
sar - Historical Perspective
Install sysstat package for comprehensive historical data:
sudo apt install sysstat # Debian/Ubuntu
sudo yum install sysstat # RHEL/CentOS
View memory statistics:
sar -r 1 3 # Memory utilization
sar -B 1 3 # Paging statistics
sar -S 1 3 # Swap utilization
htop - Visual Monitoring
For a more interactive view:
sudo apt install htop
htop
Look for:
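- A Swp meter showing high swap usage
- Processes with high RES relative to VIRT (htop's resident and virtual size columns)
- Red segments in the meters (in the default color scheme, red marks used swap in the Swp bar)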
Using pidstat for Process-Level Analysis
Monitor individual process memory behavior:
pidstat -r -p ALL 1 # Memory statistics per process
pidstat -d 1 # Disk I/O per process
Custom Monitoring Script
Create a bash script for periodic checks:
#!/bin/bash
while true; do
    echo "===== $(date) ====="
    free -h
    echo "--- Top 5 memory consumers ---"
    ps -eo pid,ppid,cmd,%mem,%cpu --sort=-%mem | head -6
    echo "--- Swap activity ---"
    # Show only swap counters with nonzero values
    grep -i swap /proc/vmstat | grep -v " 0$"
    sleep 5
done
Thresholds indicating potential thrashing:
- Swap usage > 30% of total memory
- si/so values consistently > 1000 per second in vmstat (reported in KiB/s by default)
- wa (I/O wait) > 20% for extended periods
- More than 10% of processes in 'D' state (uninterruptible sleep)
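The D-state threshold is the one item in this list that none of the tools above report directly. A minimal sketch for computing it follows. It reads one process state per line on stdin, as printed by `ps -eo stat=`; the function name `d_state_percent` is an assumption of this example.

```shell
# Report the share of processes in uninterruptible sleep (state D).
# Expects one process state per line, e.g. from: ps -eo stat=
d_state_percent() {
  awk '
    # Count every non-empty line; states beginning with D are blocked.
    NF { total++; if ($1 ~ /^D/) blocked++ }
    END {
      if (total)
        printf "%d/%d processes in D state (%.1f%%)\n",
          blocked, total, 100 * blocked / total
    }
  '
}
```

Run it as `ps -eo stat= | d_state_percent` and compare the percentage with the 10% rule of thumb.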
Once identified, consider these adjustments:
1. Lower swappiness (temporary fix, lost on reboot): sudo sysctl vm.swappiness=10
2. Identify and kill memory-hog processes
3. Add more physical memory
4. Optimize application memory usage
5. Consider using zswap or zram for compression
Add these to /etc/sysctl.conf for long-term stability:
vm.swappiness = 10
vm.vfs_cache_pressure = 50
vm.dirty_background_ratio = 5
vm.dirty_ratio = 10
Apply changes immediately:
sudo sysctl -p
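To confirm the settings actually took effect, the values in the config file can be compared with the live ones under `/proc/sys`. The sketch below is one way to do that. The function name `sysctl_drift` and its optional second argument (an alternate root directory, included only for testing) are assumptions. It handles single-value settings like the ones above and strips all whitespace before comparing, so multi-value settings would need more care.

```shell
# Compare "key = value" lines in a sysctl config file with live values.
# Usage: sysctl_drift CONF_FILE [SYS_ROOT]   (SYS_ROOT defaults to /proc/sys)
sysctl_drift() {
  awk -F= -v root="${2:-/proc/sys}" '
    /^[ \t]*([#;]|$)/ { next }                 # skip comments and blank lines
    NF >= 2 {
      key = $1; want = $2
      gsub(/[ \t]/, "", key)
      sub(/#.*/, "", want); gsub(/[ \t]/, "", want)
      # vm.swappiness -> <root>/vm/swappiness
      path = key; gsub(/\./, "/", path); path = root "/" path
      have = "unreadable"
      if ((getline line < path) > 0) { have = line; gsub(/[ \t]/, "", have) }
      close(path)
      if (have != want) printf "%s: want %s, have %s\n", key, want, have
    }
  ' "$1"
}
```

`sysctl_drift /etc/sysctl.conf` prints one line per setting whose live value differs, and nothing when everything matches.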