EC2 Instance Unexpected Reboot: Diagnosing Unauthorized CentOS 5 Restarts on AWS


2 views

When examining system logs on my CentOS 5 EC2 instance, I discovered an unexpected reboot occurred on April 6 that wasn't initiated by me. The last command output shows:

reboot   system boot  2.6.18-xenU-ec2-  Wed Apr  6 15:48         (1+05:27)

No user sessions were active at the time, ruling out manual intervention.

AWS EC2 can automatically reboot instances under specific circumstances:

  • Host system hardware failure or maintenance
  • Underlying hypervisor issues
  • Critical security updates requiring reboot (less common for CentOS 5)

Check AWS's event history with the CLI:

aws ec2 describe-instance-status --instance-id i-1234567890abcdef0 \
--include-all-instances \
--query 'InstanceStatuses[].Events[]'

First, verify system logs around the reboot time:

grep -i "reboot" /var/log/messages*
grep -i "shutdown" /var/log/messages*

For EC2-specific insights, check cloud-init logs:

cat /var/log/cloud-init.log
cat /var/log/cloud-init-output.log

To rule out malicious activity:

# Check cron jobs
cat /etc/crontab
ls -la /etc/cron.*

# Review sudo logs
grep -i sudo /var/log/secure*

# Verify authorized_keys
ls -la ~/.ssh/

For critical instances, consider:

  • Setting instance termination protection
  • Configuring CloudWatch alarms for unexpected state changes
  • Using EC2 Auto Recovery for supported instance types

Example CloudWatch alarm setup via CLI:

aws cloudwatch put-metric-alarm \
--alarm-name "EC2-Reboot-Alarm" \
--metric-name StatusCheckFailed_Instance \
--namespace AWS/EC2 \
--statistic Maximum \
--period 60 \
--threshold 1 \
--comparison-operator GreaterThanOrEqualToThreshold \
--evaluation-periods 1 \
--alarm-actions arn:aws:sns:us-east-1:123456789012:my-sns-topic \
--dimensions Name=InstanceId,Value=i-1234567890abcdef0



The first smoking gun appears in the last command output showing:
reboot   system boot  2.6.18-xenU-ec2- Wed Apr  6 15:48         (1+05:27)

No matching SSH session precedes this reboot event, ruling out manual intervention.

AWS documentation confirms several automatic reboot scenarios:

  1. Host Maintenance: Hardware failures or hypervisor updates
  2. Instance Health Check Failure: Failed status checks trigger automatic recovery
  3. Spot Instance Termination: Though this would terminate, not reboot

Retrieve the exact reason through AWS CLI:

aws ec2 describe-instance-status --instance-id i-1234567890abcdef0 \
--query 'InstanceStatuses[].SystemStatus.Details[]'

Sample response indicating maintenance:

{
  "Name": "reachability",
  "Status": "passed",
  "ImpairedSince": "2022-04-06T15:48:00Z"
}

Kernel Panic Analysis

Check for crashes in /var/log/messages:

grep -i "kernel panic" /var/log/messages

Memory Threshold Monitoring

CentOS 5's aggressive OOM killer settings:

# Current OOM killer configuration
cat /proc/sys/vm/panic_on_oom

SSH Key Audit

Verify authorized_keys integrity:

stat -c %Y /home/*/.ssh/authorized_keys
find / -name "authorized_keys" -mtime -2

Hidden Process Detection

Check for unlogged reboots via utmp:

apt-get install utmpdump
utmpdump /var/log/wtmp | grep reboot

Configure CloudWatch alarms for reboot events:

aws cloudwatch put-metric-alarm \
--alarm-name "InstanceReboot" \
--metric-name StatusCheckFailed_System \
--namespace AWS/EC2 \
--statistic Maximum \
--period 60 \
--threshold 1 \
--comparison-operator GreaterThanOrEqualToThreshold \
--evaluation-periods 1 \
--alarm-actions arn:aws:sns:us-east-1:123456789012:reboot-notify