Debugging and Fixing “Killed” Errors on Amazon EC2 Ubuntu Instances Due to OOM (Out of Memory) Issues



When running long-running commands on an Amazon EC2 Ubuntu 10.04 instance, you might encounter processes being abruptly terminated with just the word Killed appearing in the terminal. This typically indicates that the Linux Out-of-Memory (OOM) killer has intervened to prevent system instability.

The dmesg output clearly shows OOM killer activity:

May 14 20:29:15 ip-10-112-33-63 kernel: [11144050.184209] Call Trace:
May 14 20:29:15 ip-10-112-33-63 kernel: [11144050.184218]  [] dump_header+0x7a/0xb0
May 14 20:29:15 ip-10-112-33-63 kernel: [11144050.184221]  [] oom_kill_process+0x5c/0x160

Key indicators from the memory status:

May 14 20:29:15 ip-10-112-33-63 kernel: [11144050.272244] Total swap = 0kB
May 14 20:29:15 ip-10-112-33-63 kernel: [11144050.276842] 435199 pages RAM
May 14 20:29:15 ip-10-112-33-63 kernel: [11144050.276845] 249858 pages HighMem

Here's how to diagnose and fix it:

1. Check current memory usage:

free -m
top -o %MEM   # on older procps (e.g. Ubuntu 10.04's), run top and press Shift+M to sort by memory
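
To see at a glance which processes hold the most memory, one option (a minimal sketch using standard procps tools) is:

# Ten largest memory consumers, highest %MEM first (plus the header row)
ps aux --sort=-%mem | head -n 11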

2. Create swap space (if none exists):

sudo fallocate -l 2G /swapfile
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile

Make it permanent by adding to /etc/fstab:

/swapfile none swap sw 0 0
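
Afterwards, confirm the swap is active and, optionally, make the kernel less eager to use it (vm.swappiness=10 is a common starting point, not a requirement):

swapon -s                        # or: free -m  (the Swap line should now be non-zero)
sudo sysctl -w vm.swappiness=10  # persist by adding vm.swappiness=10 to /etc/sysctl.conf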

3. Optimize application memory usage:

For Python applications, consider using generators instead of lists:

# Instead of:
data = [process(x) for x in huge_dataset]

# Use:
def process_data():
    for x in huge_dataset:
        yield process(x)

4. Adjust OOM killer priorities:

# Check current OOM score
cat /proc/[PID]/oom_score

# Lower the OOM score adjustment for important processes
# ("sudo echo -100 > ..." fails because the redirection runs as the unprivileged user)
echo -100 | sudo tee /proc/[PID]/oom_score_adj
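
A quick way to apply and verify this on a running process (your_process is a placeholder for the actual command name):

PID=$(pidof your_process)
echo -100 | sudo tee /proc/$PID/oom_score_adj
cat /proc/$PID/oom_score        # a lower score means the OOM killer is less likely to pick it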

Set up monitoring with cron:

# Add to crontab -e
*/5 * * * * /usr/bin/free -m | awk '/^Mem/ {if ($4 < 100) system("echo \"Low memory!\" | mail -s \"Memory Alert\" admin@example.com")}'
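
If the instance has no mail transfer agent configured (common on minimal EC2 images), a variant that writes to syslog via logger avoids the mail dependency:

*/5 * * * * /usr/bin/free -m | awk '/^Mem/ {if ($4 < 100) system("logger -t memwatch free_mb=" $4)}'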

Consider using ulimit for memory constraints:

ulimit -v 500000  # value is in kilobytes, so this caps virtual memory at roughly 488 MB
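
Because ulimit applies to the current shell and everything it launches, a common pattern is to set it inside a subshell so only one command is constrained (your_command is a placeholder):

( ulimit -v 500000; ./your_command )   # only this command and its children inherit the limit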

For AWS EC2 instances:

# Check instance type and memory
curl http://169.254.169.254/latest/meta-data/instance-type

# Upgrade to a larger instance type if needed (the instance must be stopped first)
aws ec2 modify-instance-attribute --instance-id i-1234567890abcdef0 --instance-type "{\"Value\": \"m5.large\"}"
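
A rough end-to-end resize sequence with the AWS CLI (instance ID and target type are placeholders):

aws ec2 stop-instances --instance-ids i-1234567890abcdef0
aws ec2 wait instance-stopped --instance-ids i-1234567890abcdef0
aws ec2 modify-instance-attribute --instance-id i-1234567890abcdef0 --instance-type "{\"Value\": \"m5.large\"}"
aws ec2 start-instances --instance-ids i-1234567890abcdef0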

The abrupt termination of processes with just a "Killed" message typically indicates the Linux Out-of-Memory (OOM) killer has taken action. The dmesg output clearly shows memory pressure:

[11144050.184218] [] dump_header+0x7a/0xb0
[11144050.184221] [] oom_kill_process+0x5c/0x160
[11144050.272164] DMA per-cpu:
[11144050.272176] active_anon:204223 inactive_anon:204177
[11144050.272244] Total swap = 0kB

On AWS EC2 instances (especially older t1/t2 types), memory becomes critical because:

  • No swap space is configured by default (hence Total swap = 0kB)
  • Burstable instances can exhaust CPU credits, so work backs up and memory-hungry jobs run far longer than expected
  • Older Ubuntu releases such as 10.04 ship older kernels whose memory reclaim is less effective under pressure

Check current memory status:

free -m
grep -E 'MemTotal|MemFree|Swap' /proc/meminfo
dmesg | grep -i 'out of memory'
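
The kernel also logs which process it chose; on Ubuntu the report usually lands in /var/log/syslog (or kern.log), so a search like this shows the victim and its memory footprint (assuming default rsyslog settings):

grep -i 'killed process' /var/log/syslog | tail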

1. Create Swap Space:

sudo fallocate -l 1G /swapfile
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile
echo '/swapfile none swap sw 0 0' | sudo tee -a /etc/fstab
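
On older kernels and filesystems (such as the ext3 roots common on Ubuntu 10.04), fallocate may not be usable for swap files; dd is slower but more portable:

sudo dd if=/dev/zero of=/swapfile bs=1M count=1024
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile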

2. Adjust OOM Killer Priorities:

# Protect critical processes (use tee: "sudo echo ... >" redirects as the unprivileged user)
echo -17 | sudo tee /proc/$(pidof your_process)/oom_adj

# Or for newer kernels that use oom_score_adj:
echo -1000 | sudo tee /proc/$(pidof your_process)/oom_score_adj

3. Memory Limits via cgroups (Advanced):

sudo cgcreate -g memory:/limited_group
sudo cgset -r memory.limit_in_bytes=512M limited_group
cgexec -g memory:limited_group your_command
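
These commands come from libcgroup (packaged as cgroup-bin on older Ubuntu releases, cgroup-tools on newer ones); after setting the limit you can read it back from the cgroup filesystem, assuming the usual cgroup v1 layout under /sys/fs/cgroup:

sudo apt-get install cgroup-bin    # or cgroup-tools on newer releases
cat /sys/fs/cgroup/memory/limited_group/memory.limit_in_bytes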

Longer-term improvements:

  • Upgrade to a newer AWS instance family (e.g. t3/t4g) with more memory headroom
  • Migrate to Ubuntu 18.04+ or Amazon Linux 2
  • Implement process monitoring with smem or htop

For memory-intensive tasks in Python:

import resource
import numpy as np

# Cap this process's virtual address space at 512 MB (RLIMIT_AS is counted in bytes);
# allocations beyond the cap raise MemoryError instead of inviting the OOM killer
soft, hard = resource.getrlimit(resource.RLIMIT_AS)
resource.setrlimit(resource.RLIMIT_AS, (512 * 1024 * 1024, hard))

# Back large arrays with a file on disk so pages are loaded on demand
# instead of keeping the whole array resident in RAM
data = np.memmap('large_array.dat', dtype='float32', mode='w+', shape=(10000, 10000))