Analyzing Linux OOM Killer Events with kdump/crash: Debugging Memory Leaks in Kernel Modules


When facing an OOM situation in which virtually all of a 64GB machine's RAM is consumed (kmem -i reports 62.7GB in use), the first step is identifying allocation patterns:

crash> kmem -s
CACHE            NAME                 OBJSIZE  ALLOCATED     TOTAL  SLABS  SSIZE
ffff88083fc0e800 dentry                   192    1165216   1203200   9400     8k
ffff88083fc0e000 size-4096               4096      32768     32768    256    32k
ffff88083f413800 task_struct             2960      30240     30240    210    16k
ffff88083fc0d800 size-8192               8192      16384     16384    128    32k
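
A quick way to rank caches by the memory they hold is ALLOCATED x OBJSIZE; the dentry cache above, for example, accounts for roughly 1165216 x 192 bytes, or about 213 MB. A hedged one-liner for the whole table (field positions assume the CACHE NAME OBJSIZE ALLOCATED ... layout shown above; adjust for your crash version):

crash> kmem -s | awk '$3 ~ /^[0-9]+$/ { printf "%-20s %10.1f MB\n", $2, $3 * $4 / 1048576 }' | sort -k2 -nr | head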

The system log preserved in the dump reveals the OOM killer's victims over roughly 30 seconds. This timeline helps reconstruct how memory pressure built up:

crash> log | grep "Out of memory"
[  223.556616] Out of memory: Kill process 3189 (portreserve) score 1
[  223.787234] Out of memory: Kill process 3196 (rsyslogd) score 1
[  224.237119] Out of memory: Kill process 3728 (dbus-daemon) score 1
...
[  252.603324] Out of memory: Kill process 4855 (cmfileassistd) score 1
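
Each kill message is preceded by a full Mem-Info dump, so the free-memory figures between kills show how quickly the pressure built. The grep pattern below assumes the standard show_mem output of this kernel generation:

crash> log | grep -E "Out of memory|Normal free:"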

crash's ps output shows minimal userland memory usage (about 0.0039GB of total RSS, roughly 4MB). Attention therefore shifts to the kernel side, starting with the active task on each CPU:

crash> bt -a
PID: 4925   TASK: ffff880828a38ae0  CPU: 5   COMMAND: "kworker/u:3"
 #0 [ffff8808279e7c38] schedule at ffffffff814f8a3c
 #1 [ffff8808279e7cc0] worker_thread at ffffffff8108d7b6
 #2 [ffff8808279e7d60] kthread at ffffffff8108f0b6
 #3 [ffff8808279e7ea0] ret_from_fork at ffffffff8140b30c
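
To double-check the userland figure quoted above, the RSS column of crash's ps output can be summed directly. A rough, hedged calculation (RSS is reported in kilobytes, and active tasks carry a leading ">" marker that is stripped here):

crash> ps | awk '{ sub(/^> */, ""); if ($8 ~ /^[0-9]+$/) sum += $8 } END { printf "total RSS: %.4f GB\n", sum / 1024 / 1024 }'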

For suspected DRBD memory leaks, examine module allocations:

crash> mod -s drbd
MODULE       NAME                   SIZE  OBJECTS
ffffffffa01a6000  drbd              217344  3384
crash> sym drbd_alloc_pages
ffffffffa019a3c0 (t) drbd_alloc_pages

Detailed slab analysis reveals potential culprits:

crash> kmem -s | grep -E "NAME|drbd"
CACHE            NAME                 OBJSIZE  ALLOCATED     TOTAL  SLABS  SSIZE
ffff88083fc0a000 drbd_request          1032      30240     30240    210    16k
ffff88083fc0a800 drbd_peer_device      1088      32768     32768    256    32k

For bonding/network module issues, inspect socket buffers:

crash> net -s
Family               Protocol              RX      TX    Total
IPv4                 TCP               1048576 2097152 3145728
IPv4                 UDP                524288  524288 1048576
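
The sk_buff structures themselves come from dedicated slab caches, so their footprint also shows up in the slab statistics (cache names vary slightly between kernel versions):

crash> kmem -s | grep -i skbuff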

Finally, spot-check individual task address spaces; even the larger userland processes look unremarkable:

crash> vm -p
PID: 5079   TASK: ffff88082b882ae0  COMMAND: "bash"
    MM               PGD          RSS    TOTAL_VM
ffff88082a8d8000  ffff88082b0e3000    1324k    19348k
    VMA           START       END     FLAGS FILE
ffff88082a8d8158 00400000 004f1000 8000875 /bin/bash
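
Since userland looks innocent, it is also worth ruling out runaway vmalloc usage. A rough total from kmem -v (assuming its SIZE column, the last field, is in bytes):

crash> kmem -v | awk '$NF ~ /^[0-9]+$/ { total += $NF } END { printf "vmalloc total: %.1f MB\n", total / 1048576 }'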

The server's demise began with a cascade of 39 OOM kill messages before it finally succumbed to the dreaded "Kernel panic - not syncing: Out of memory". What makes this case particularly interesting is that standard process monitoring showed no obvious culprits in userspace: the memory hemorrhage appeared to originate from kernel territory.
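
The count of 39 kills is taken straight from the ring buffer preserved in the dump and is easy to reproduce from within crash:

crash> log | grep -c "Out of memory"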

Let's start with the essential crash utility commands that reveal the smoking gun:


# First, examine overall memory status
crash> kmem -i

# Then check slab allocations (often reveals kernel module leaks)
crash> kmem -s

# For DRBD/bonding module analysis
crash> mod -s drbd
crash> mod -s bonding

# Show kernel memory zones
crash> kmem -z

# Detailed per-slab inspection of a specific cache
crash> kmem -S <cache-name>

The real goldmine comes from analyzing slab allocations. Here's what to look for:


crash> kmem -s
CACHE            OBJSIZE  ALLOCATED     TOTAL  SLABS  SSIZE  NAME
ffff88083e3d3800     32    1733712    1734400   677     8k  kmalloc-32
ffff88083e3d3400     64     987632     988800  1236     8k  kmalloc-64
ffff88083e3d4000    256     512000     512000   500     8k  kmalloc-256

Compare these numbers against baseline values from a healthy system. Spikes in kmalloc-256 (size-256 on older SLAB kernels) might indicate DRBD buffer issues, while kmalloc-32 growth could point to network subsystem problems.
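
One way to make the baseline comparison systematic is to save kmem -s output from a healthy machine and diff it against the capture from the dump. A minimal sketch (the script name and the CACHE OBJSIZE ALLOCATED ... NAME column layout are assumptions; adjust the field numbers for your crash version):

#!/bin/bash
# slab_diff.sh <baseline-kmem-s.txt> <dump-kmem-s.txt>   (hypothetical helper)
# Prints the slab caches whose ALLOCATED count grew the most between the two captures.
baseline="$1"
dump="$2"

awk 'FNR == NR { if ($3 ~ /^[0-9]+$/) base[$7] = $3; next }
     $3 ~ /^[0-9]+$/ { d = $3 - base[$7]; if (d > 0) printf "%12d  %s\n", d, $7 }' \
    "$baseline" "$dump" | sort -nr | head -20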

Given the network interface manipulation preceding the crash, these commands prove invaluable:


# Show network device structures
crash> net

# Examine a task's sockets and socket buffers
crash> net -s <pid>

# List a task's open file descriptors (sockets appear here as well)
crash> files <pid>

# Check disk/DRBD device I/O statistics
crash> dev -d

# Specifically for mlx4/mlx5 devices
crash> pci | grep -i mellanox

For DRBD-related memory analysis, a DRBD-aware crash extension exposing a drbdshow command can help, if one is available (it is not part of stock crash; availability depends on your DRBD packages):


# Show DRBD resource structures
crash> drbdshow -r

# Check DRBD socket buffers
crash> drbdshow -s

# Examine DRBD metadata
crash> drbdshow -m

# Verify connection states
crash> drbdshow -c
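
If no such extension is available, stock crash can still get at DRBD state the hard way, by loading the module's debuginfo and walking its structures. A sketch (structure and symbol names depend on the DRBD version in the dump):

# Load the module's debug symbols
crash> mod -s drbd

# Find allocation-related functions exported by the module
crash> sym -m drbd | grep -i alloc

# Inspect structure layouts once debuginfo is loaded
crash> struct -o drbd_request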

For large memory dumps, automation is key. Here's a bash script to parse crash output:


#!/bin/bash
CRASH_BIN="/usr/bin/crash"
# Adjust this path if the dump was taken on a different kernel than the analysis host
VMLINUX="/usr/lib/debug/lib/modules/$(uname -r)/vmlinux"
COREFILE="$1"

analyze_memory() {
    $CRASH_BIN $VMLINUX $COREFILE <<-EOF
    set scroll off
    kmem -i > kmem_info.txt
    kmem -s > kmem_slab.txt
    ps -u > user_processes.txt
    mod > loaded_modules.txt
    log > kernel_log.txt
    bt -a > backtraces.txt
    exit
EOF
}

generate_report() {
    echo "### Memory Analysis Report ###"
    echo "Largest general-purpose slab caches (column 4 = ALLOCATED with this crash version's layout):"
    grep -E "kmalloc-|size-" kmem_slab.txt | sort -k4 -nr | head -20

    echo -e "\nSlab caches with more than one million allocated objects:"
    awk '$4 ~ /^[0-9]+$/ && $4 > 1000000' kmem_slab.txt

    echo -e "\nUser processes at crash time:"
    cat user_processes.txt

    echo -e "\nOOM killer activity from the kernel log:"
    grep -A10 "Out of memory" kernel_log.txt
}

analyze_memory
generate_report > oom_analysis_$(date +%Y%m%d).txt
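
Saved as, say, oom_analyze.sh (the name is arbitrary), a run against a saved dump looks like this:

chmod +x oom_analyze.sh
./oom_analyze.sh /var/crash/<host>-<timestamp>/vmcore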

Watch for these telltale signs in your analysis:

  • Unusually large slab caches associated with specific modules
  • Growing allocations between consecutive OOM events (see the snippet after this list)
  • Network-related structures (sk_buff) consuming excessive memory
  • DRBD buffer counts exceeding normal operational thresholds
  • Kernel thread backtraces that repeatedly sit in allocation paths (e.g. kmem_cache_alloc or __alloc_pages_nodemask)
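
The second point is easy to quantify: every OOM report logs the number of unreclaimable slab pages, so their growth across the 39 events can be read straight out of the preserved log:

crash> log | grep -o "slab_unreclaimable:[0-9]*"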

Based on this post-mortem, implement these safeguards:


# Add these to /etc/sysctl.conf
vm.panic_on_oom=1                # panic (and trigger kdump) instead of a drawn-out OOM-kill cascade
vm.oom_kill_allocating_task=1    # if the OOM killer does run, target the allocating task first
kernel.panic=10                  # reboot 10 seconds after a panic

# DRBD/network tuning (values are illustrative; tcp_mem takes three page counts: min pressure max)
echo "786432 1048576 1572864" > /proc/sys/net/ipv4/tcp_mem
echo 256 > /sys/block/drbd0/queue/max_sectors_kb
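
As an ongoing safeguard, total slab usage can also be sampled on the live system so the next leak is spotted before the OOM killer fires. A minimal sketch (interval and log path are arbitrary):

#!/bin/bash
# Append a timestamped sample of total slab usage (from /proc/meminfo) every minute.
while sleep 60; do
    echo "$(date '+%F %T') $(awk '/^Slab:/ {print $2}' /proc/meminfo) kB" >> /var/log/slab_growth.log
done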