Optimal Linux Load Average Thresholds for Mail Servers: EXIM/SpamAssassin Performance Metrics and Benchmarking


In Linux systems, load average is the average number of tasks that are running, waiting to run, or blocked in uninterruptible I/O, reported over 1, 5, and 15-minute intervals. For mail servers running Exim with SpamAssassin, start by checking the current values:

# Sample command to check load average
uptime
# Output format: 12:34:56 up 1 day, 2:30, 1 user, load average: 1.30, 1.25, 1.20

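A 15-minute load average of 1.3 on a single-core system means:

  • The CPU is oversubscribed: on average 1.3 tasks want to run while only one can
  • Roughly 0.3 tasks are waiting at any given moment, so some process queuing occurs
  • There is no headroom left for spikes, so sustained values at this level deserve a closer look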
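To relate a load figure to capacity, it helps to divide it by the number of cores. The following is a minimal sketch, assuming a Linux host with `/proc/loadavg` and `nproc` available; `load_per_core` is a hypothetical helper name, not a standard tool.

```shell
#!/bin/bash
# load_per_core LOAD CORES (hypothetical helper)
# Prints the load average divided by the core count, to two decimals.
load_per_core() {
  awk -v load="$1" -v cores="$2" 'BEGIN { printf "%.2f\n", load / cores }'
}

# Live check: third field of /proc/loadavg is the 15-minute average.
if [ -r /proc/loadavg ] && command -v nproc >/dev/null; then
  read -r _ _ load15 _ < /proc/loadavg
  echo "15-min load per core: $(load_per_core "$load15" "$(nproc)")"
fi
```

A result above 1.00 means more tasks are competing for CPU or disk than there are cores to serve them.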

To properly assess mail server health, monitor these metrics:

# Mail queue monitoring
exim -bpc
# Process details
top -c -b -n 1 | grep -E 'exim|spamd'
# Disk I/O wait
iostat -x 1 5

Based on production experience:

Load Average   Interpretation           Action
0.0-1.0        Ideal for mail servers   Monitor
1.0-2.0        Acceptable               Investigate if sustained
2.0+           Concerning               Immediate investigation

Configuration tweaks for better load management:

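# In exim4.conf
# Abandon queue-run deliveries while the load average is above this value
deliver_queue_load_max = 2.0
# Above this load, accept incoming mail but queue it without immediate delivery
queue_only_load = 3.0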

# SpamAssassin prefork settings (in /etc/default/spamassassin)
OPTIONS="--max-children=5 --helper-home-dir"

Sample script to simulate mail load:

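#!/bin/bash
# Launch 1000 test messages in parallel against the local MTA.
# Use only on a test host: this starts 1000 swaks processes at once.
for i in {1..1000}; do
  swaks --to "test$i@example.com" \
        --from loadtest@example.com \
        --server localhost \
        --silent &
done
wait  # block until every background swaks process has finished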
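
Starting every sender at once mostly measures how the host copes with 1000 simultaneous processes. A gentler variant, sketched here under the assumption that `xargs` supports `-P` (the GNU and BSD versions do), caps the number of parallel senders. `send_batch` is a hypothetical helper name.

```shell
#!/bin/bash
# send_batch COUNT PARALLEL CMD [ARGS...] (hypothetical helper)
# Runs CMD once per message number, replacing {} in ARGS with that number,
# with at most PARALLEL copies running at the same time.
send_batch() {
  local count=$1 parallel=$2
  shift 2
  seq 1 "$count" | xargs -P "$parallel" -I{} "$@"
}

# Example (commented out so the sketch does not send mail when sourced):
# send_batch 1000 20 swaks --to 'test{}@example.com' \
#   --from loadtest@example.com --server localhost --silent
```

Raising or lowering the second argument lets you watch how the load average responds to a given level of concurrency.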

Consider these warning signs:

  • Consistent load > CPU cores + 1
  • Mail delivery latency > 5 seconds
  • Processes stuck in D state (disk wait)
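
Two of these warning signs can be checked from the shell. The sketch below assumes a Linux host with procps `ps`, `nproc`, and `/proc/loadavg`; the function names are hypothetical, and delivery latency is left out because it depends on how your logs are set up.

```shell
#!/bin/bash
# Count processes in uninterruptible sleep.
# Expects one ps state string per line on stdin (e.g. from `ps -eo stat=`).
count_d_state() {
  awk '$1 ~ /^D/ { n++ } END { print n + 0 }'
}

# Succeeds (exit status 0) when LOAD is greater than CORES + 1.
load_over_cores_plus_one() {
  awk -v load="$1" -v cores="$2" 'BEGIN { exit !(load > cores + 1) }'
}

if [ -r /proc/loadavg ] && command -v nproc >/dev/null; then
  read -r _ _ load15 _ < /proc/loadavg
  if load_over_cores_plus_one "$load15" "$(nproc)"; then
    echo "WARNING: 15-min load $load15 exceeds cores + 1"
  fi
  echo "Processes in D state: $(ps -eo stat= | count_d_state)"
fi
```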

Load average must be read relative to the number of cores. For a single-core system, 1.0 means the CPU is fully occupied with nothing waiting. Modern servers typically have multiple cores/threads, so the same figure represents a smaller share of capacity.


# Check CPU cores to understand your baseline
$ grep -c ^processor /proc/cpuinfo
# Typical output for a quad-core server: 4

For Exim/SpamAssassin setups, these components create unique workload patterns:

  • SpamAssassin: CPU-intensive during message scanning
  • Exim: I/O-bound during message queue processing
  • Concurrent connections create spikes in both CPU and RAM usage

# Monitor process-specific CPU usage (pgrep takes a single pattern; top -p accepts at most 20 PIDs)
$ top -c -p "$(pgrep -d',' 'exim|spamd' | cut -d',' -f1-20)"
# Sample output excerpt:
# PID USER   PR  NI  VIRT  RES  SHR S %CPU %MEM  TIME+  COMMAND
# 1234 exim   20   0  256m  48m  12m S 15.3  2.4  10:23.45 exim -q30m

Your 1.3 load average on a quad-core server (1.3 / 4, roughly 33% of capacity) sits well within the acceptable range:

Load Range   Interpretation
0.0-4.0      Generally acceptable for 4-core server
4.1-6.0      Monitor closely, potential contention
6.0+         Immediate investigation needed
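
The bands above are stated for four cores. As a rough sketch, and assuming they scale linearly with core count (1.0x cores for the "monitor" boundary, 1.5x for "investigate"), thresholds for another machine could be derived like this. That scaling rule is an assumption made for illustration, and `thresholds_for_cores` is a hypothetical helper name.

```shell
#!/bin/bash
# thresholds_for_cores CORES (hypothetical helper)
# Prints "WARN CRIT" load thresholds.
# Assumption: the 4.0 / 6.0 bands for a 4-core server scale as 1.0x and 1.5x cores.
thresholds_for_cores() {
  awk -v cores="$1" 'BEGIN { printf "%.1f %.1f\n", cores, cores * 1.5 }'
}

if command -v nproc >/dev/null; then
  echo "Thresholds for this host: $(thresholds_for_cores "$(nproc)")"
fi
```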

Even with acceptable load, these tweaks can improve Exim/SpamAssassin performance:


# Exim configuration adjustments (exim.conf):
deliver_queue_load_max = 4.0
queue_run_max = 20
smtp_accept_max = 100

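# SpamAssassin optimizations (local.cf):
# A score of 0 disables the rule
score ANY_BOUNCE_MESSAGE 0
# Skip DNS blocklist lookups: fewer network round trips, but lower detection accuracy
skip_rbl_checks 1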

Combine these tools for comprehensive monitoring:

  1. Prometheus + Grafana for historical trends
  2. Nagios for alert thresholds
  3. Perf for deep CPU analysis
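
For the alerting piece, Nagios-style plugins report state through their exit status: 0 for OK, 1 for WARNING, 2 for CRITICAL. A minimal sketch of a load check in that style follows. `check_load_avg` is a hypothetical name (Nagios ships its own load plugin, which this does not replace), and the 4 / 6 thresholds in the usage comment are taken from the quad-core table.

```shell
#!/bin/bash
# check_load_avg LOAD WARN CRIT (hypothetical helper)
# Prints a one-line status and returns a Nagios-style exit status:
# 0 = OK, 1 = WARNING, 2 = CRITICAL.
check_load_avg() {
  awk -v load="$1" -v warn="$2" -v crit="$3" 'BEGIN {
    if (load >= crit) { print "CRITICAL - load " load; exit 2 }
    if (load >= warn) { print "WARNING - load " load; exit 1 }
    print "OK - load " load
    exit 0
  }'
}

# Example for a quad-core host, using the 5-minute average:
# read -r _ load5 _ < /proc/loadavg
# check_load_avg "$load5" 4 6; exit $?
```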

#!/bin/bash
# Basic performance snapshot script
echo "==== $(date) ===="
echo "Load: $(cat /proc/loadavg)"
# CPU figure is the (user+system) share of time since boot, not current usage
echo "CPU: $(awk '/^cpu / {print ($2+$4)*100/($2+$4+$5)}' /proc/stat)%"
echo "Exim queue: $(exim -bpc)"
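
The counters in `/proc/stat` are cumulative since boot, so a single read gives a long-run average, not the current picture. To see recent usage, one option is to sample twice and work from the difference. The sketch below is one way to do that. `cpu_busy_pct` is a hypothetical helper name, and counting iowait as idle time is a choice made here, not the only convention.

```shell
#!/bin/bash
# cpu_busy_pct LINE1 LINE2 (hypothetical helper)
# Takes two aggregate "cpu ..." lines from /proc/stat and prints the
# percentage of non-idle time between them. Fields 2-9 are user, nice,
# system, idle, iowait, irq, softirq, steal; idle and iowait count as idle.
cpu_busy_pct() {
  printf '%s\n%s\n' "$1" "$2" | awk '
    {
      idle = $5 + $6
      total = 0
      for (i = 2; i <= 9 && i <= NF; i++) total += $i
      if (NR == 1) { idle0 = idle; total0 = total }
    }
    END {
      dt = total - total0
      di = idle - idle0
      if (dt > 0) printf "%.1f\n", (dt - di) * 100 / dt
      else print "0.0"
    }'
}

if [ -r /proc/stat ]; then
  first=$(grep '^cpu ' /proc/stat)
  sleep 1
  second=$(grep '^cpu ' /proc/stat)
  echo "CPU busy over the last second: $(cpu_busy_pct "$first" "$second")%"
fi
```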