Understanding Linux Memory Paging Metrics: Analyzing sar -B Output for System Performance Optimization


The sar -B command provides crucial insights into Linux memory management behavior. These metrics help diagnose system performance bottlenecks related to memory pressure and disk I/O:

pgpgin/s   - KB/s paged in from disk (includes swap & file-backed pages)
pgpgout/s  - KB/s paged out to disk (includes swap & file-backed pages)  
fault/s    - Total page faults (major + minor) per second
majflt/s   - Major faults requiring disk I/O per second
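
To make the definitions above concrete, here is a minimal Python sketch of how per-second rates like these can be derived from two snapshots of the kernel's cumulative counters in /proc/vmstat. The mapping of counter names to sar columns (pgfault to fault/s, pgmajfault to majflt/s) reflects my understanding of how sysstat works; the helper names and the snapshot values are made up for illustration.

```python
def parse_vmstat(text):
    """Parse /proc/vmstat content ("name value" per line) into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def paging_rates(before, after, interval_s):
    """Turn two counter snapshots into per-second rates like sar -B reports."""
    def rate(name):
        return (after[name] - before[name]) / interval_s
    return {
        "pgpgin/s": rate("pgpgin"),      # KB read in per second
        "pgpgout/s": rate("pgpgout"),    # KB written out per second
        "fault/s": rate("pgfault"),      # all page faults per second
        "majflt/s": rate("pgmajfault"),  # faults that needed disk I/O
    }


# Two made-up snapshots taken 10 seconds apart (values are illustrative).
t0 = parse_vmstat("pgpgin 1000\npgpgout 5000\npgfault 90000\npgmajfault 40\n")
t1 = parse_vmstat("pgpgin 1400\npgpgout 7000\npgfault 130000\npgmajfault 90\n")
rates = paging_rates(t0, t1, 10)
print(rates)
```

On a live system the two text snapshots would come from reading /proc/vmstat twice with a sleep in between.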

While related, paging and swapping represent different memory operations:

  • Paging moves individual memory pages between RAM and disk (both swap and file-backed pages)
  • Swapping traditionally meant moving entire processes between RAM and swap space; on modern Linux the term usually refers to paging anonymous (non-file-backed) pages to and from swap

The sar -B output measures page-level activity, which includes both swap-related operations and regular file-backed memory operations.
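
Because the page-level counters mix swap and file traffic, one rough way to separate them is to compare against the swap-only counters (pswpin/pswpout, reported by sar -W). The sketch below assumes a 4096-byte page size and uses invented numbers; swap_share is a hypothetical helper, not part of any tool.

```python
def swap_share(pgpg_kb, pswp_pages, page_size=4096):
    """Return the fraction of paging traffic (in KB) attributable to swap.

    pgpg_kb    -- KB moved, as counted by pgpgin or pgpgout
    pswp_pages -- pages swapped, as counted by pswpin or pswpout
    page_size  -- bytes per page; 4096 is assumed here, check `getconf PAGESIZE`
    """
    if pgpg_kb <= 0:
        return 0.0
    swap_kb = pswp_pages * page_size / 1024
    return min(swap_kb / pgpg_kb, 1.0)


# Illustrative numbers: 2000 KB paged out in an interval, 100 pages of it to swap.
share = swap_share(pgpg_kb=2000, pswp_pages=100)
print(f"{share:.0%} of page-out traffic went to swap")
```

A share near zero suggests the traffic is mostly ordinary file I/O rather than memory pressure spilling into swap.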

Major faults occur when a process accesses a page that isn't in RAM and must be loaded from disk. Consistently high majflt/s values indicate that the working set no longer fits in memory. Compare a normal interval with a problematic one:

01:40:02 AM  pgpgin/s pgpgout/s   fault/s  majflt/s
01:40:02 AM     42.89    227.22   4319.88      0.02  # Normal
01:50:06 AM    214.46    441.33   4760.78      5.00  # Problematic
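
Spotting the problematic interval by eye works for two rows but not for a day of samples. As a sketch, the following Python parses sar -B text in the four-column layout shown above and flags rows whose majflt/s crosses a cutoff. The 1.0 threshold is an arbitrary choice for this example, and parse_sar_b is a hypothetical helper name.

```python
def parse_sar_b(text):
    """Yield (timestamp, metrics) for each data row of `sar -B` text output.

    Only the first four numeric columns are read, matching the sample above;
    newer sysstat versions print additional columns, which are ignored.
    """
    names = ("pgpgin/s", "pgpgout/s", "fault/s", "majflt/s")
    for line in text.splitlines():
        line = line.split("#", 1)[0]          # drop trailing annotations
        parts = line.split()
        if len(parts) < 5 or parts[0] == "Average:":
            continue
        # With a 12-hour clock the timestamp takes two tokens ("01:40:02 AM").
        skip = 2 if parts[1] in ("AM", "PM") else 1
        try:
            values = [float(v) for v in parts[skip:skip + 4]]
        except ValueError:
            continue                           # header row
        if len(values) == 4:
            yield " ".join(parts[:skip]), dict(zip(names, values))


SAMPLE = """\
01:40:02 AM  pgpgin/s pgpgout/s   fault/s  majflt/s
01:40:02 AM     42.89    227.22   4319.88      0.02  # Normal
01:50:06 AM    214.46    441.33   4760.78      5.00  # Problematic
"""

THRESHOLD = 1.0  # majflt/s; an arbitrary cutoff chosen for this example
flagged = [ts for ts, m in parse_sar_b(SAMPLE) if m["majflt/s"] >= THRESHOLD]
print(flagged)
```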

Reasons for excessive major faults include:

  • Insufficient physical memory (OOM situations)
  • Memory leaks causing constant paging
  • Improper swappiness settings
  • Disk I/O bottlenecks

When troubleshooting high paging activity:

# Check current memory usage
free -h

# Monitor processes generating most page faults
ps -eo pid,comm,min_flt,maj_flt --sort=-maj_flt | head

# Adjust swappiness (0-100; kernels 5.8 and later accept up to 200)
# Requires root; the change is lost on reboot
echo 10 > /proc/sys/vm/swappiness

For a server showing sustained high majflt/s:

# 1. Confirm that paging is sustained system-wide
$ sar -B 1 5 | grep -v "Average"
12:00:01 AM  pgpgin/s pgpgout/s   fault/s  majflt/s
12:00:02 AM   1204.55   5022.76  15109.80     15.01

# 2. Cross-reference with process stats
$ ps -eo pid,comm,min_flt,maj_flt --sort=-maj_flt | head -n 5
PID   COMMAND         MINFLT MAJFLT
1234  java            45023   142
5678  mysqld          32012    98

# 3. Check memory configuration
$ grep -i swappiness /etc/sysctl.conf
vm.swappiness = 60  # Consider reducing to 10-30 for database servers

Here's a bash script to monitor and alert on paging activity:

#!/bin/bash
THRESHOLD=5  # majflt/s threshold
INTERVAL=60  # seconds

while true; do
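    # The final "Average:" line has no timestamp, so majflt/s is field 5.
    # LC_ALL=C keeps the label and the decimal point locale-independent.
    MAJFLT=$(LC_ALL=C sar -B 1 1 | awk '/^Average/ {print $5}')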
    if (( $(echo "$MAJFLT > $THRESHOLD" | bc -l) )); then
        echo "$(date): High major faults detected - $MAJFLT majflt/s"
        ps -eo pid,comm,min_flt,maj_flt --sort=-maj_flt | head -n 5
    fi
    sleep $INTERVAL
done
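
A single spike in majflt/s is usually harmless; it is the sustained case that matters. One way to express that, sketched here in Python, is to alert only when several consecutive samples exceed the threshold. The class name, the window of 3 samples, and the readings are all assumptions made for the example.

```python
from collections import deque


class SustainedAlert:
    """Fire only when the last `window` samples all exceed `threshold`."""

    def __init__(self, threshold, window):
        self.threshold = threshold
        self.recent = deque(maxlen=window)

    def update(self, value):
        """Record one sample; return True if the alert condition holds."""
        self.recent.append(value)
        return (len(self.recent) == self.recent.maxlen
                and all(v > self.threshold for v in self.recent))


# Illustrative majflt/s readings: one isolated spike, then a sustained run.
alert = SustainedAlert(threshold=5, window=3)
readings = [0.02, 15.0, 0.1, 6.2, 7.5, 9.1, 8.0]
fired = [alert.update(r) for r in readings]
print(fired)
```

The isolated 15.0 reading never triggers the alert; only the run of three high samples at the end does.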

The pgpgin/s and pgpgout/s figures combine traditional paging and swap activity:

  • Regular paging: moving file-backed pages between RAM and regular (non-swap) filesystems
  • Swap activity: moving anonymous pages to and from swap partitions or swap files

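Example script to check whether pages are going to swap (the counters are cumulative since boot, not per-second rates):

#!/bin/bash
# Each /proc/vmstat line has two fields: a counter name and its value.
# pgpgin/pgpgout are in KB; pswpin/pswpout count pages moved to/from swap.
awk '$1 == "pgpgin"  {print "Paged in (KB):", $2}
     $1 == "pgpgout" {print "Paged out (KB):", $2}
     $1 == "pswpin"  {print "Swapped in (pages):", $2}
     $1 == "pswpout" {print "Swapped out (pages):", $2}' /proc/vmstat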

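Consistently high majflt/s values indicate performance problems because:

  • Each major fault requires disk I/O, which is orders of magnitude slower than RAM access (roughly 100 ns for DRAM versus about 100 µs for an SSD read and several milliseconds for a spinning disk)
  • A high fault frequency suggests insufficient physical memory
  • The faulting process blocks until the I/O completes, which shows up as I/O wait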
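
To get a feel for the cost, a back-of-the-envelope estimate multiplies the fault rate by an assumed per-fault I/O latency. The Python sketch below does this for the 15.01 majflt/s sample seen earlier; the latency figures are rough assumptions rather than measurements, and the model ignores readahead and parallel I/O.

```python
def stall_fraction(majflt_per_s, io_latency_ms):
    """Estimate the share of each second spent blocked on major faults.

    Assumes faults are serviced one at a time and each costs `io_latency_ms`,
    which ignores readahead, caching in the device, and parallel I/O.
    """
    return min(majflt_per_s * io_latency_ms / 1000.0, 1.0)


# Latencies below are rough assumed figures, not measurements.
for label, latency_ms in (("SSD", 0.1), ("HDD", 8.0)):
    pct = stall_fraction(15.01, latency_ms) * 100
    print(f"{label}: about {pct:.1f}% of wall-clock time waiting on page-ins")
```

The same fault rate that is negligible on fast storage can cost a noticeable share of wall-clock time on a spinning disk.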

When analyzing this sample output:

12:00:08 AM  pgpgin/s pgpgout/s   fault/s  majflt/s
12:10:05 AM    207.55   2522.76   5109.80      0.01

We can deduce:

  1. Moderate paging activity (about 208 KB/s in, 2.5 MB/s out)
  2. Nearly all faults are minor (only 0.01 majflt/s)
  3. System likely has sufficient RAM
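
The arithmetic behind that reading can be sketched in a few lines of Python: convert the KB/s figures to MB/s and express major faults as a percentage of all faults. The summarize helper is a hypothetical name used only here.

```python
def summarize(pgpgin_kb, pgpgout_kb, faults, majflt):
    """Reduce one sar -B row to the figures used in the reading above."""
    return {
        "in_mb_s": pgpgin_kb / 1024,
        "out_mb_s": pgpgout_kb / 1024,
        "major_pct": 100 * majflt / faults if faults else 0.0,
    }


row = summarize(207.55, 2522.76, 5109.80, 0.01)
print(f"in {row['in_mb_s']:.2f} MB/s, out {row['out_mb_s']:.2f} MB/s, "
      f"major faults {row['major_pct']:.4f}% of all faults")
```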

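Python script to list processes whose cumulative major fault count exceeds a threshold (it reads /proc directly, because psutil does not expose per-process major faults on Linux):

import os
import time

def read_major_faults(pid):
    # majflt is field 12 of /proc/<pid>/stat; parse after the ")" that ends the name
    with open(f"/proc/{pid}/stat") as f:
        data = f.read()
    name = data[data.index("(") + 1:data.rindex(")")]
    return name, int(data[data.rindex(")") + 2:].split()[9])

def check_major_faults(threshold=100):
    for pid in filter(str.isdigit, os.listdir("/proc")):
        try:
            name, majflt = read_major_faults(pid)
        except OSError:  # process exited during the scan
            continue
        if majflt > threshold:
            print(f"PID {pid} ({name}): {majflt} major faults")

while True:
    check_major_faults()
    time.sleep(5)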
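
Cumulative fault counts favor long-running processes, so a process that faulted heavily hours ago can outrank the one causing trouble right now. A possible refinement, sketched below, compares two snapshots and ranks processes by the faults accrued in between. The snapshot dictionaries here are hypothetical; in practice they would be filled from per-process counters sampled a few seconds apart.

```python
def fault_deltas(before, after):
    """Return (pid, new_major_faults) pairs, busiest first.

    `before` and `after` map pid -> cumulative major fault count. PIDs that
    appear only in `after` are treated as starting from zero; PIDs that
    vanished are dropped.
    """
    deltas = []
    for pid, count in after.items():
        delta = count - before.get(pid, 0)
        if delta > 0:
            deltas.append((pid, delta))
    return sorted(deltas, key=lambda item: item[1], reverse=True)


# Hypothetical snapshots taken a few seconds apart.
snap1 = {1234: 142, 5678: 98, 42: 7}
snap2 = {1234: 150, 5678: 160, 42: 7, 9001: 3}
print(fault_deltas(snap1, snap2))
```

In this made-up data, PID 5678 has the lower lifetime total at the first snapshot but is the one actively faulting.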

When observing high paging activity:

  • Increase vm.swappiness if swap is underutilized
  • Decrease vm.swappiness if swap is overused
  • Add physical RAM if major faults persist
  • Verify application memory usage patterns

Example sysctl adjustment:

# Check current swappiness
sysctl vm.swappiness

# Temporarily set to 10 (default 60)
sysctl -w vm.swappiness=10

# Make permanent (run as root; apply without rebooting via sysctl -p)
echo "vm.swappiness = 10" >> /etc/sysctl.conf
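
Appending with >> can leave an older vm.swappiness line in the file; since settings are applied in order, the last assignment is the one that takes effect. The Python sketch below shows a quick way to check which value wins in sysctl.conf-style text. It looks at a single file only, so drop-in files under /etc/sysctl.d are outside its scope, and effective_setting is a hypothetical helper.

```python
def effective_setting(conf_text, key="vm.swappiness"):
    """Return the last value assigned to `key` in sysctl.conf-style text, or None."""
    value = None
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#;":   # skip blanks and comment lines
            continue
        name, sep, rest = line.partition("=")
        if sep and name.strip() == key:
            value = rest.strip()
    return value


# Example file content after the append above was run on top of an old entry.
conf = """\
# kernel tuning
vm.swappiness = 60
net.ipv4.ip_forward = 1
vm.swappiness = 10
"""
print(effective_setting(conf))
```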