Linux Server Performance Analysis: Decoding Memory Usage and Load Average for High-CPU Systems


When examining the top output, you'll notice an apparent contradiction: individual processes show minimal memory usage (0.0%-0.2% in the %MEM column) while the system reports nearly all memory as used. This occurs because Linux aggressively uses otherwise idle memory for disk caching and buffers to optimize performance, and releases it when applications need it.

The key metrics to analyze:

Mem:  130766620k total, 130161072k used,   605548k free,   919300k buffers
Swap: 63111312k total,   500556k used, 62610756k free, 124437752k cached

To get the true "available" memory, use this calculation:

# Calculate truly free memory
free_memory = Mem:free + buffers + cached
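
As a rough sketch of the calculation above, the function below reads /proc/meminfo-style text and adds the three fields. The helper name available_kb is hypothetical. It assumes that, where the kernel exposes a MemAvailable field (Linux 3.14 and later), that estimate is preferable to the manual sum, since not all cache is reclaimable.

```shell
#!/bin/bash
# Estimate available memory in kB from /proc/meminfo-style text on stdin.
# Prefers the kernel's MemAvailable field when present; otherwise falls back
# to MemFree + Buffers + Cached. available_kb is a hypothetical helper name.
available_kb() {
    awk '/^MemAvailable:/ {avail=$2; have_avail=1}
         /^MemFree:/      {free=$2}
         /^Buffers:/      {buffers=$2}
         /^Cached:/       {cached=$2}
         END {
             if (have_avail) print avail
             else            print free + buffers + cached
         }'
}

# Figures from the top output above, rewritten in /proc/meminfo format.
sample='MemTotal: 130766620 kB
MemFree: 605548 kB
Buffers: 919300 kB
Cached: 124437752 kB'

kb=$(printf '%s\n' "$sample" | available_kb)
echo "Available: ${kb} kB ($((kb / 1024)) MiB)"
```

On a live system the same function can be fed directly: available_kb < /proc/meminfo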

The load average (14.04, 14.02, 14.00) represents the system load over 1, 5, and 15 minute periods. For a 24-core system, these values indicate:

  • 14.04 means that, averaged over the last minute, about 14 processes were running or waiting to run (on Linux the figure also counts processes in uninterruptible I/O wait)
  • Healthy range is typically load ≤ number of cores
  • Sustained load > 2x core count suggests performance issues
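
The rules of thumb above can be sketched as a small classifier. The function name classify_load and the three labels are illustrative, not part of any standard tool; the thresholds are the ones listed above (load up to the core count, and more than twice the core count).

```shell
#!/bin/bash
# Classify a load average against a core count using the thresholds listed above.
# classify_load is a hypothetical helper; the labels are illustrative.
classify_load() {
    local load=$1 cores=$2
    # awk handles the floating-point comparison that bash arithmetic cannot
    awk -v l="$load" -v c="$cores" 'BEGIN {
        if (l <= c)          print "healthy"
        else if (l <= 2 * c) print "elevated"
        else                 print "overloaded"
    }'
}

classify_load 14.04 24   # prints: healthy
classify_load 30.00 24   # prints: elevated
classify_load 50.00 24   # prints: overloaded
```

For the live system, one option is: classify_load "$(cut -d' ' -f1 /proc/loadavg)" "$(nproc)"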

Here's a Bash script to monitor memory and CPU load:

#!/bin/bash

# Get core count
CORES=$(nproc)

# Get load averages (first three fields of /proc/loadavg)
read -r LOAD1 LOAD5 LOAD15 _ < /proc/loadavg

# Memory calculation, read from /proc/meminfo (values in kB).
# The column layout of `free` differs between procps versions, so looking
# fields up by name is more reliable than counting columns.
meminfo() { awk -v key="$1:" '$1 == key {print $2}' /proc/meminfo; }
TOTAL=$(meminfo MemTotal)
FREE=$(meminfo MemFree)
BUFFERS=$(meminfo Buffers)
CACHED=$(meminfo Cached)
USED=$((TOTAL - FREE))   # matches top's "used", which includes cache

TRUE_FREE=$((FREE + BUFFERS + CACHED))

echo "CPU Cores: $CORES"
echo "Load Average: $LOAD1 (1min), $LOAD5 (5min), $LOAD15 (15min)"
echo "Memory - Total: $((TOTAL/1024))MB, Used: $((USED/1024))MB"
echo "Actual Available: $((TRUE_FREE/1024))MB"

While swap extends available memory, it's not equivalent to RAM:

  • Swap usage indicates memory pressure
  • Frequent swapping causes performance degradation
  • Modern systems with sufficient RAM may disable swap entirely

To check swap activity (watch the si and so columns, which report memory swapped in and out per second):

vmstat 1 5
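
A minimal sketch of how that output could be checked automatically is shown below. It assumes the classic procps column order, in which si and so are the 7th and 8th fields; swap_activity is a hypothetical helper name, and the second sample row is made up purely for illustration.

```shell
#!/bin/bash
# Report whether any vmstat sample shows swap-in (si) or swap-out (so) traffic.
# Assumes the classic procps column order, where si and so are fields 7 and 8.
# swap_activity is a hypothetical helper name.
swap_activity() {
    # Only rows whose first field is numeric are data rows; header rows are skipped
    awk '$1 ~ /^[0-9]+$/ && ($7 > 0 || $8 > 0) {active++}
         END {
             if (active > 0) print "swapping in " active " sample(s)"
             else            print "no swap activity"
         }'
}

# Canned sample so the sketch runs without vmstat; the numbers are illustrative.
sample='procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
14  0 500556 605548 919300 124437752    0    0     1     2    0    0 58  1 41  0  0
14  0 500556 604100 919300 124437900   12   40     0    96 3100 2500 57  2 40  1  0'

printf '%s\n' "$sample" | swap_activity   # prints: swapping in 1 sample(s)
```

On a live system this becomes: vmstat 1 5 | swap_activity. Keep in mind that the first row vmstat prints reports averages since boot, so a nonzero value there does not necessarily mean the system is swapping right now.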

Warning signs for your 24-core, 128GB system:

Metric          Warning Level   Critical Level
Load Average    > 24            > 48
Memory Used     > 110GB         > 120GB
Swap Used       > 4GB           > 8GB
CPU Wait (wa)   > 5%            > 10%

Your top shows MATLAB processes consuming significant resources. Consider these optimizations:

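# Limit MATLAB CPU affinity (bind to cores 0-11)
taskset -c 0-11 matlab &

% Monitor MATLAB memory usage from the MATLAB prompt (not the shell).
% Note: these two commands have historically been supported only on Windows;
% on Linux, check the RES column in top for the MATLAB process instead.
feature('memstats')
memory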

The "used" memory includes:

  • Application memory (RES column in top)
  • Disk cache (shown as "cached" at the end of the Swap line in this top output)
  • Kernel buffers (shown as "buffers")

To get accurate memory usage, subtract buffers and cache from used memory:

# Calculate memory actually used by applications (excluding buffers and cache)
awk '/^MemTotal:/ {total=$2} /^MemFree:/ {free=$2} /^Buffers:/ {buffers=$2} /^Cached:/ {cached=$2} END {printf "%.0f MB\n", (total-free-buffers-cached)/1024}' /proc/meminfo

For your server:

130161072k used - 919300k buffers - 124437752k cached = 4804020k (about 4.6 GiB) actual usage

Your load averages (14.04, 14.02, 14.00) represent:

  • 1-minute average: 14.04
  • 5-minute average: 14.02
  • 15-minute average: 14.00

For a 24-core system:

# Calculate load/core ratio
echo "scale=2; 14 / 24" | bc
# Result: .58 (bc omits the leading zero)

Interpretation:

  • Load < core count: System can handle more
  • Load ≈ core count: System at capacity
  • Load > core count: System overloaded

Better alternatives to top:

# Install htop
sudo apt install htop  # Debian/Ubuntu
sudo yum install htop  # CentOS/RHEL (may require the EPEL repository)

Sample monitoring script:

#!/bin/bash
# Refresh a short status summary every 5 seconds; stop with Ctrl+C
while true; do
    clear
    echo "===== System Status ====="
    echo -n "Load: "
    awk '{print $1}' /proc/loadavg
    echo -n "Memory: "
    free -h | awk '/^Mem/ {print $3 "/" $2}'
    echo -n "Swap: "
    free -h | awk '/^Swap/ {print $3 "/" $2}'
    sleep 5
done

Warning signs:

  • Consistent load average > 2x core count
  • Swap usage > 20% of total swap space
  • Actual memory usage (minus cache) > 90% of total

Example alert script:

#!/bin/bash
# Requires a working mail command and bc
MAX_LOAD=$(nproc)
SWAP_WARNING=20 # percentage

# Integer part of the 1-minute load average
load=$(awk '{print int($1)}' /proc/loadavg)
if [ "$load" -gt "$MAX_LOAD" ]; then
    echo "High load detected: $load" | mail -s "Load Alert" admin@example.com
fi

# Swap used as a percentage; reports 0 when no swap is configured
swap_used=$(free | awk '/^Swap/ {if ($2 > 0) print $3 / $2 * 100; else print 0}')
if [ "$(echo "$swap_used > $SWAP_WARNING" | bc)" -eq 1 ]; then
    echo "High swap usage: ${swap_used}%" | mail -s "Swap Alert" admin@example.com
fi
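
The alert script covers load and swap. The third warning sign, application memory above 90% of total, could be checked along the same lines. The sketch below is one possible way to do it; app_mem_percent and MEM_WARNING are hypothetical names, and the sample figures are taken from the top output earlier in this article.

```shell
#!/bin/bash
# Percentage of RAM used by applications, i.e. excluding buffers and page cache.
# Reads /proc/meminfo-style text on stdin; app_mem_percent is a hypothetical helper name.
app_mem_percent() {
    awk '/^MemTotal:/ {total=$2}
         /^MemFree:/  {free=$2}
         /^Buffers:/  {buffers=$2}
         /^Cached:/   {cached=$2}
         END {
             if (total > 0) printf "%.1f\n", (total - free - buffers - cached) * 100 / total
         }'
}

MEM_WARNING=90 # percentage, matching the warning sign above

# Figures from the top output earlier in this article.
sample='MemTotal: 130766620 kB
MemFree: 605548 kB
Buffers: 919300 kB
Cached: 124437752 kB'

pct=$(printf '%s\n' "$sample" | app_mem_percent)
# awk exits 0 (success) only when pct exceeds the threshold
if awk -v p="$pct" -v w="$MEM_WARNING" 'BEGIN {exit !(p > w)}'; then
    echo "High memory usage: ${pct}%"
else
    echo "Memory OK: ${pct}%"
fi
```

With the sample figures this reports roughly 3.7%, well under the threshold. To check the running system, replace the canned sample with app_mem_percent < /proc/meminfo and send the message through mail as in the script above.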

For memory-intensive applications:

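# Adjust swappiness (default 60); requires root and lasts until reboot
sudo sysctl vm.swappiness=10

# Clear page cache (emergency only); requires root
sync
echo 1 | sudo tee /proc/sys/vm/drop_caches

# Limit process memory (example for MATLAB); applies to the current shell and its children
ulimit -v 4000000 # value is in KiB, so roughly 4GB of virtual memory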

For CPU-bound processes:

# Use taskset to limit CPU cores
taskset -c 0-11 matlab # Use only first 12 cores

# Set process priority
nice -n 10 ./cpu_intensive_script.sh