Linux OOM Prevention: How to Maintain System Responsiveness During Memory Exhaustion



When Linux encounters severe memory pressure, the kernel's Out-of-Memory (OOM) killer activates to terminate processes. In practice, however, it often fires only after the system has already spent minutes thrashing and unresponsive. Here's a simplified view of what happens:


struct task_struct *select_bad_process(void)
{
    /* Simplified from older kernels' mm/oom_kill.c; modern kernels
     * use oom_badness() and also honor oom_score_adj and cpusets. */
    struct task_struct *p, *chosen = NULL;
    unsigned long points, most_points = 0;

    for_each_process(p) {
        points = badness(p, uptime.tv_sec);  /* heuristic "kill me" score */
        if (points > most_points) {
            most_points = points;
            chosen = p;
        }
    }
    return chosen;
}
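
The same selection can be observed from userspace: the kernel exposes each task's current badness score via /proc/&lt;pid&gt;/oom_score. A minimal sketch (pick_victim and read_oom_scores are illustrative helpers, not kernel APIs):

```python
import os

def pick_victim(scores):
    """Given {pid: oom_score}, return the pid the OOM killer
    would most likely target (highest score wins)."""
    return max(scores, key=scores.get)

def read_oom_scores():
    """Collect oom_score for every visible process (Linux only)."""
    scores = {}
    for entry in os.listdir("/proc"):
        if entry.isdigit():
            try:
                with open(f"/proc/{entry}/oom_score") as f:
                    scores[int(entry)] = int(f.read())
            except (FileNotFoundError, PermissionError, ProcessLookupError):
                continue  # process exited or is inaccessible
    return scores
```

Running pick_victim(read_oom_scores()) on a live system shows which PID is currently first in line.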

Add these settings to /etc/sysctl.conf for better OOM handling:


vm.overcommit_memory = 2
vm.overcommit_ratio = 80
vm.panic_on_oom = 0
kernel.panic = 10

With overcommit_memory=2 the kernel stops overcommitting: allocations are capped at swap plus overcommit_ratio percent of RAM. panic_on_oom=0 lets the OOM killer run instead of panicking the whole machine, and kernel.panic=10 reboots automatically ten seconds after any panic that does occur.
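
Concretely, that cap is CommitLimit = swap + (overcommit_ratio / 100) × RAM, visible in /proc/meminfo. A quick sketch of the arithmetic, ignoring hugepage reservations (commit_limit_kb is a hypothetical helper):

```python
def commit_limit_kb(ram_kb, swap_kb, overcommit_ratio):
    """CommitLimit under vm.overcommit_memory=2: swap plus
    overcommit_ratio percent of physical RAM."""
    return swap_kb + ram_kb * overcommit_ratio // 100

# 16 GiB RAM + 4 GiB swap at ratio 80 -> roughly 16.8 GiB committable
limit = commit_limit_kb(16 * 1024**2, 4 * 1024**2, 80)
```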

Reserve resources for critical system processes using cgroups v2:


#!/bin/bash
# cgroup v2 must be mounted at /sys/fs/cgroup (the default with systemd)
mkdir /sys/fs/cgroup/rescue

# memory.low *protects* memory from reclaim; memory.max would cap it instead
echo "50M" > /sys/fs/cgroup/rescue/memory.low
echo "1000" > /sys/fs/cgroup/rescue/cpu.weight

# Add critical processes -- cgroup.procs accepts one PID per write
# (PID 1 is left alone: systemd manages its own cgroup placement)
for pid in $(pidof sshd); do
    echo "$pid" > /sys/fs/cgroup/rescue/cgroup.procs
done

Linux 4.20+ exposes Pressure Stall Information (PSI) metrics. Monitor them with:


# Optional: stress-ng can generate memory load for testing
apt install stress-ng

# Monitor memory pressure
cat /proc/pressure/memory

# Sample output showing pressure levels
some avg10=5.24 avg60=2.13 avg300=0.72 total=12456543
full avg10=1.45 avg60=0.67 avg300=0.22 total=7843251
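
In this output, "some" means at least one task stalled on memory during the window, while "full" means every non-idle task stalled at once; avg10/avg60/avg300 are percentages over 10, 60, and 300 seconds. These lines are easy to parse for alerting (parse_psi is a hypothetical helper):

```python
def parse_psi(text):
    """Parse /proc/pressure/memory output into nested dicts."""
    result = {}
    for line in text.strip().splitlines():
        kind, *fields = line.split()
        result[kind] = {k: float(v) for k, v in (f.split("=") for f in fields)}
    return result

sample = (
    "some avg10=5.24 avg60=2.13 avg300=0.72 total=12456543\n"
    "full avg10=1.45 avg60=0.67 avg300=0.22 total=7843251"
)
psi = parse_psi(sample)  # psi["some"]["avg10"] -> 5.24
```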

Create a watchdog script to terminate memory-hungry processes before OOM:


#!/usr/bin/python3
import os
import signal
import time

import psutil

MAX_MEM_PERCENT = 85   # system-wide trigger threshold
PROC_MEM_PERCENT = 5   # only consider processes using more than this

def check_memory():
    if psutil.virtual_memory().percent < MAX_MEM_PERCENT:
        return
    # Collect candidates, skipping this watchdog itself
    candidates = []
    for proc in psutil.process_iter(['pid', 'name', 'memory_percent']):
        try:
            if (proc.info['memory_percent'] or 0) > PROC_MEM_PERCENT \
                    and proc.info['pid'] != os.getpid():
                candidates.append(proc.info)
        except (psutil.NoSuchProcess, psutil.AccessDenied):
            continue
    # Kill only the single largest consumer, not everything over 5%
    if candidates:
        victim = max(candidates, key=lambda p: p['memory_percent'])
        os.kill(victim['pid'], signal.SIGKILL)

while True:
    check_memory()
    time.sleep(5)

Add these boot parameters to /etc/default/grub (the GRUB_CMDLINE_LINUX value must be a single line):


GRUB_CMDLINE_LINUX="sysrq_always_enabled=1 transparent_hugepage=never"

The sysrq_always_enabled=1 option keeps the magic SysRq keyboard shortcuts working even during freezes, and transparent_hugepage=never avoids THP-related latency stalls. Two options often suggested here do not belong: oom_score_adj is a per-process value written to /proc/&lt;pid&gt;/oom_score_adj, not a boot parameter, and mitigations=off disables CPU security mitigations without affecting OOM behavior at all.

For MySQL servers, combine these techniques:


# Create dedicated cgroup (memory.low/memory.high are cgroup v2
# attributes; cgset supports them with libcgroup 2.0+)
cgcreate -g memory:mysql_reserved
cgset -r memory.low=2G mysql_reserved
cgset -r memory.high=4G mysql_reserved

# Lower the OOM score; oom_score_adj ranges from -1000 to 1000
# (-17 belongs to the obsolete oom_adj scale)
echo -800 > /proc/$(pidof mysqld)/oom_score_adj

# Configure swappiness
sysctl vm.swappiness=10

Remember that the kernel OOM killer only triggers once both RAM and swap are fully exhausted, which is rarely soon enough to prevent a freeze. The sections below walk through the same defenses in more detail, starting with early detection.

Prevention starts with awareness. Implement these monitoring solutions:

# Install and configure earlyoom
sudo apt install earlyoom
sudo systemctl enable --now earlyoom

# Alternative: Use a simple monitoring script
while true; do
  free -h | awk '/Mem:/ {print $3 "/" $2}'
  sleep 5
done
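
The same check can read /proc/meminfo directly, which is also where earlyoom gets its numbers. A sketch with a hypothetical available_percent helper:

```python
def available_percent(meminfo_text):
    """Percentage of MemAvailable relative to MemTotal,
    given the text of /proc/meminfo."""
    fields = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        if rest.strip():
            fields[key] = int(rest.split()[0])  # values are in kB
    return 100.0 * fields["MemAvailable"] / fields["MemTotal"]

sample = "MemTotal:       16384000 kB\nMemAvailable:    4096000 kB"
pct = available_percent(sample)  # -> 25.0
```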

Adjust how aggressively your system uses swap space:

# Check current swappiness
cat /proc/sys/vm/swappiness

# Temporary change (recommended value: 10-60)
sudo sysctl vm.swappiness=30

# Permanent change
echo "vm.swappiness=30" | sudo tee -a /etc/sysctl.conf

Use cgroups to protect essential system processes:

# Create a cgroup for critical processes (cgroup v1 interface;
# on cgroup v2, protect with memory.low instead)
sudo cgcreate -g memory:reserved

# Cap the group at 512MB
echo "512M" | sudo tee /sys/fs/cgroup/memory/reserved/memory.limit_in_bytes

# Add critical processes -- cgroup.procs accepts one PID per write
for pid in $(pgrep sshd); do
  echo "$pid" | sudo tee /sys/fs/cgroup/memory/reserved/cgroup.procs
done

Manually adjust OOM scores to prioritize system stability:

# Make critical processes less likely to be killed
# (sudo does not survive shell redirection, so pipe through tee)
echo -1000 | sudo tee /proc/$(pgrep -o systemd)/oom_score_adj

# Make a specific process more likely to be killed
echo 1000 | sudo tee /proc/$(pgrep -o memory_hog)/oom_score_adj
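
To a first approximation, the score the kernel compares is the task's memory footprint normalized to 0-1000, shifted by oom_score_adj and clamped at zero. A simplified model of that arithmetic (not the exact kernel formula, which also counts swap and page tables):

```python
def effective_oom_score(rss_kb, total_kb, oom_score_adj):
    """Roughly how oom_score_adj shifts a task's badness:
    memory use scaled to 0..1000, plus the adjustment, floored at 0."""
    return max(rss_kb * 1000 // total_kb + oom_score_adj, 0)

# A task using half of RAM scores ~500; with adj=-1000 it is untouchable
half = effective_oom_score(8_000_000, 16_000_000, 0)       # -> 500
safe = effective_oom_score(8_000_000, 16_000_000, -1000)   # -> 0
```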

The earlyoom package provides a more responsive alternative to the kernel OOM killer:

# Install on Debian/Ubuntu
sudo apt install earlyoom

# Configure (edit /etc/default/earlyoom)
EARLYOOM_ARGS="-r 60 -m 10 -s 10"

# Where:
# -r 60 = print a memory report every 60 seconds
# -m 10 = intervene when available memory drops below 10%
# -s 10 = intervene when free swap drops below 10%
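
earlyoom's trigger condition boils down to: act only when both available memory and free swap are under their thresholds at the same time. A sketch of that logic (should_kill is a hypothetical helper mirroring the -m/-s semantics, not earlyoom's actual code):

```python
def should_kill(mem_avail_pct, swap_free_pct, min_mem_pct=10, min_swap_pct=10):
    """True when BOTH memory and swap availability are below their
    minimums, matching earlyoom's -m/-s behavior."""
    return mem_avail_pct < min_mem_pct and swap_free_pct < min_swap_pct
```

With plenty of swap left, low RAM alone does not trigger a kill, which is why -s matters on machines that swap heavily.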

Adjust these parameters in /etc/sysctl.conf:

# Reclaim dentry/inode caches more aggressively (100 is the default)
vm.vfs_cache_pressure=200
vm.dirty_ratio=10
vm.dirty_background_ratio=5

# Keep some memory free
vm.min_free_kbytes=65536
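
The dirty ratios are percentages of RAM, so it is worth sanity-checking the absolute thresholds they imply on your hardware. A quick estimate (dirty_thresholds_mb is a hypothetical helper; the kernel applies the ratios to reclaimable memory, so real numbers run slightly lower):

```python
def dirty_thresholds_mb(ram_mb, dirty_ratio=10, background_ratio=5):
    """Approximate write-back thresholds implied by the sysctl ratios:
    background flushing starts at background_ratio% of RAM, and
    writers start blocking at dirty_ratio%."""
    return ram_mb * background_ratio // 100, ram_mb * dirty_ratio // 100

# 16 GiB machine: background flush near 819 MiB, hard limit near 1638 MiB
bg_mb, hard_mb = dirty_thresholds_mb(16384)
```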

When all else fails, this script can help recover control:

#!/bin/bash
# Save as /usr/local/bin/panic_button

# Enable Magic SysRq, then run the classic sync/unmount/reboot sequence
echo 1 > /proc/sys/kernel/sysrq
echo s > /proc/sysrq-trigger  # Sync filesystems
echo u > /proc/sysrq-trigger  # Remount filesystems read-only
echo b > /proc/sysrq-trigger  # Immediate reboot -- nothing after this runs

# One-time setup (run outside the script):
#   chmod +x /usr/local/bin/panic_button
#   echo "kernel.sysrq=1" >> /etc/sysctl.conf   # keep SysRq enabled across boots