How to Force Stop and Terminate an Unresponsive EC2 Instance Stuck in “Stopping” State After Failed AMI Creation


Recently, I ran into a frustrating situation with a Linux EC2 instance (a root volume plus one additional attached volume). After a routine reboot, SSH stopped responding entirely, even though the AWS console still showed the instance as "running". HTTP services were also unreachable. Attempting to create an AMI from the instance left the AMI in the "pending" state indefinitely.

Here's what didn't work:

  • Waiting for the AMI creation to complete (stuck in "pending" for hours)
  • Deregistering the incomplete AMI
  • Regular instance stop command
  • Force stop through AWS console
  • Volume detachment attempts (stuck in "detaching" state)

After several hours of frustration, I discovered this sequence of AWS CLI commands that finally resolved the issue:


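# First, try to force stop the instance
aws ec2 stop-instances --instance-ids i-1234567890abcdef0 --force

# If it is still "stopping" after about 10 minutes, terminate it.
# Terminating deletes every volume flagged DeleteOnTermination (the root volume by default).
# If your CLI version rejects --force on terminate-instances, run it without the flag.
aws ec2 terminate-instances --instance-ids i-1234567890abcdef0 --force

# For stubborn volumes that won't detach.
# A forced detach skips the clean unmount and can corrupt the filesystem, so treat it as a last resort.
aws ec2 detach-volume --volume-id vol-1234567890abcdef0 --force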
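
Rather than watching the clock between the force stop and the terminate, the wait can be scripted. The following is a minimal sketch, not a definitive tool: `wait_for_instance_state` is a hypothetical helper name, and the 600-second timeout and 15-second interval are assumptions chosen to match the 10-minute wait mentioned above.

```shell
# Poll an instance until it reaches the wanted state or a timeout expires.
# Usage: wait_for_instance_state INSTANCE_ID TARGET_STATE [TIMEOUT_SECONDS] [INTERVAL_SECONDS]
# Returns 0 if the target state was reached, 1 if the timeout expired first.
wait_for_instance_state() {
  wait_instance_id=$1
  wait_target=$2
  wait_timeout=${3:-600}   # assumed default: 10 minutes
  wait_interval=${4:-15}   # assumed default: poll every 15 seconds
  wait_elapsed=0
  while [ "$wait_elapsed" -le "$wait_timeout" ]; do
    wait_state=$(aws ec2 describe-instances \
      --instance-ids "$wait_instance_id" \
      --query 'Reservations[0].Instances[0].State.Name' \
      --output text)
    echo "state after ${wait_elapsed}s: $wait_state"
    if [ "$wait_state" = "$wait_target" ]; then
      return 0
    fi
    sleep "$wait_interval"
    wait_elapsed=$((wait_elapsed + wait_interval))
  done
  return 1
}

# Example: escalate to terminate only if the force stop never completes.
#   aws ec2 stop-instances --instance-ids i-1234567890abcdef0 --force
#   wait_for_instance_state i-1234567890abcdef0 stopped ||
#     aws ec2 terminate-instances --instance-ids i-1234567890abcdef0
```

The CLI also ships built-in waiters such as `aws ec2 wait instance-stopped`, which may be enough if you don't need a custom timeout.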

If SSH access isn't available, you can try reaching the instance through AWS Systems Manager (SSM) instead, provided the SSM agent was installed and is still running:


# Check whether the SSM agent is registered and reporting (look at PingStatus in the output)
aws ssm describe-instance-information --filters "Key=InstanceIds,Values=i-1234567890abcdef0"

# Start a session if possible
aws ssm start-session --target i-1234567890abcdef0
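
To avoid attempting a session that cannot connect, the ping status can be checked first. This is a rough sketch under stated assumptions: `ssm_is_online` is a hypothetical helper name, and it treats only the "Online" ping status as reachable.

```shell
# Print the SSM ping status for an instance and succeed only if it is "Online".
# An instance that never registered with SSM yields "None" with --output text.
ssm_is_online() {
  ssm_status=$(aws ssm describe-instance-information \
    --filters "Key=InstanceIds,Values=$1" \
    --query 'InstanceInformationList[0].PingStatus' \
    --output text)
  echo "SSM ping status: $ssm_status"
  [ "$ssm_status" = "Online" ]
}

# Example: only open a session when the agent is reporting in.
#   ssm_is_online i-1234567890abcdef0 && aws ssm start-session --target i-1234567890abcdef0
```

Note that `start-session` also needs the Session Manager plugin installed on the machine running the CLI. If the instance failed early in boot, the agent will most likely report "ConnectionLost", in which case this route is closed as well.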

To avoid similar situations:

  • Always create AMIs during low-traffic periods
  • Consider using EBS-backed instances for easier recovery
  • Set up CloudWatch alarms for instance health checks
  • Maintain regular snapshots of critical volumes
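
The CloudWatch alarm suggestion in the list above could look roughly like the following. It is only a sketch: `create_status_check_alarm` is a hypothetical helper, the alarm name and the two-minute evaluation window are assumptions, and the SNS topic ARN passed as the second argument is something you would have to create yourself.

```shell
# Create an alarm that fires when an EC2 status check fails for two consecutive minutes.
# Usage: create_status_check_alarm INSTANCE_ID SNS_TOPIC_ARN
create_status_check_alarm() {
  alarm_instance_id=$1
  alarm_topic_arn=$2   # hypothetical SNS topic that receives the notification
  aws cloudwatch put-metric-alarm \
    --alarm-name "status-check-failed-$alarm_instance_id" \
    --namespace AWS/EC2 \
    --metric-name StatusCheckFailed \
    --dimensions "Name=InstanceId,Value=$alarm_instance_id" \
    --statistic Maximum \
    --period 60 \
    --evaluation-periods 2 \
    --threshold 1 \
    --comparison-operator GreaterThanOrEqualToThreshold \
    --alarm-actions "$alarm_topic_arn"
}

# Example:
#   create_status_check_alarm i-1234567890abcdef0 arn:aws:sns:us-east-1:111122223333:ops-alerts
```

An alarm like this would not have prevented the hang, but it would have flagged the failed reboot before I tried to create an AMI from a broken instance.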

If the instance remains stuck after all these attempts, you may need to:

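  1. Wait for AWS to resolve the issue on its own (this can take many hours, and there is no guaranteed time limit)
  2. Contact AWS support with detailed timestamps of all actions taken
  3. Consider the nuclear option of deleting and recreating the VPC (non-production environments only). Be aware that a VPC cannot be deleted while it still contains an instance or network interface, so this rarely helps with an instance that is itself stuck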

When an EC2 instance becomes unresponsive after a reboot while showing as "running" in the console, it typically indicates either:

  • Kernel panic during boot sequence
  • Filesystem corruption on root volume
  • Network configuration errors
  • Underlying hypervisor issues
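
The two EC2 status checks help narrow down which of these causes applies: a failing system check points at the underlying host, while a failing instance check points at the OS (boot, filesystem, or network configuration). Below is a minimal sketch of that triage; `diagnose_status_checks` is a hypothetical helper name and the wording of its messages is my own.

```shell
# Classify an unresponsive instance from its system and instance status checks.
# Usage: diagnose_status_checks INSTANCE_ID
diagnose_status_checks() {
  checks=$(aws ec2 describe-instance-status \
    --instance-ids "$1" \
    --include-all-instances \
    --query 'InstanceStatuses[0].[SystemStatus.Status,InstanceStatus.Status]' \
    --output text)
  # With --output text the two values arrive whitespace-separated on one line.
  read -r system_check instance_check <<EOF
$checks
EOF
  system_check=${system_check:-unknown}
  instance_check=${instance_check:-unknown}
  if [ "$system_check" = "impaired" ]; then
    echo "system check impaired: likely a problem with the underlying host"
  elif [ "$instance_check" = "impaired" ]; then
    echo "instance check impaired: likely an OS-level problem (boot, filesystem, network config)"
  else
    echo "no impaired check reported (system=$system_check instance=$instance_check)"
  fi
}

# Example:
#   diagnose_status_checks i-1234567890abcdef0
# For boot-time errors such as a kernel panic, the console log is also worth reading:
#   aws ec2 get-console-output --instance-id i-1234567890abcdef0 --output text
```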

The AMI creation process gets stuck because AWS attempts to create a consistent snapshot of each attached volume. When the instance is in a problematic state, those snapshots may never complete and the AMI stays in "pending". You can check its status with:


# Check AMI creation status via AWS CLI
aws ec2 describe-images --owners self --filters "Name=state,Values=pending" --query "Images[*].[ImageId,State,StateReason.Message]"

# Each row shows the image ID, its state, and the state reason message
# (the reason may be empty while the image is still pending)
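
Since a pending AMI is waiting on its underlying EBS snapshots, it can also help to look at those directly. A small sketch follows, assuming the snapshots are owned by your own account; `list_pending_snapshots` is a hypothetical helper name.

```shell
# List snapshots owned by this account that are still pending,
# with their source volume, reported progress, and start time.
list_pending_snapshots() {
  aws ec2 describe-snapshots \
    --owner-ids self \
    --filters "Name=status,Values=pending" \
    --query 'Snapshots[*].[SnapshotId,VolumeId,Progress,StartTime]' \
    --output text
}

# Example:
#   list_pending_snapshots
```

A snapshot whose progress has not moved for a long time is a reasonable sign that the AMI will not finish on its own.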

When standard stop/terminate operations fail, try this multi-step approach:

1. Forced Termination via API


# Force stop first; terminate only if the instance never reaches "stopped"
aws ec2 stop-instances --instance-ids i-1234567890abcdef0 --force
aws ec2 terminate-instances --instance-ids i-1234567890abcdef0 --force

2. Volume Detachment Workaround

If volumes won't detach normally:


aws ec2 detach-volume --volume-id vol-1234567890abcdef0 --force
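
Because a forced detach risks filesystem damage, it is safer to try a normal detach first and force it only if the volume is still attached after a grace period. The following sketch shows one way to do that; `detach_volume_with_fallback` is a hypothetical helper and the 120-second grace period is an assumption.

```shell
# Detach a volume gracefully, then force the detach only if it is still attached.
# Usage: detach_volume_with_fallback VOLUME_ID [GRACE_SECONDS]
detach_volume_with_fallback() {
  detach_volume_id=$1
  grace_seconds=${2:-120}   # assumed default grace period
  aws ec2 detach-volume --volume-id "$detach_volume_id" >/dev/null
  sleep "$grace_seconds"
  attach_state=$(aws ec2 describe-volumes \
    --volume-ids "$detach_volume_id" \
    --query 'Volumes[0].Attachments[0].State' \
    --output text)
  # "None" means the attachment list is empty, i.e. the volume is detached.
  if [ "$attach_state" = "None" ] || [ "$attach_state" = "detached" ]; then
    echo "detached"
    return 0
  fi
  echo "still $attach_state, forcing detach"
  aws ec2 detach-volume --volume-id "$detach_volume_id" --force >/dev/null
}

# Example:
#   detach_volume_with_fallback vol-1234567890abcdef0 180
```

Taking a snapshot of the volume before forcing the detach is worth considering if the data matters.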

3. AWS Support API Escalation

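If your account has a Business support plan or higher, you can also open a case through the AWS Support API (it is not available on the Basic or Developer plans):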


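# The service and category codes below are placeholders; list the valid values with:
#   aws support describe-services
aws support create-case \
--subject "Emergency: Instance stuck in stopping state" \
--service-code "AmazonEC2" \
--severity-code "low" \
--category-code "instance-stuck" \
--communication-body "Instance i-1234567890abcdef0 stuck..."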

To reduce the chance of a repeat:

  • Implement instance health checks with CloudWatch
  • Create AMIs during maintenance windows, not during issues
  • Use EBS-optimized instances for better volume performance

If the instance remains stuck for several hours, EC2 will typically treat it as a stuck instance and force-terminate it automatically. The exact timing is not guaranteed, so take the 6-hour figure as a rough expectation rather than a documented limit.