Automatically Attaching Persistent EBS Volumes to Replacement EC2 Spot Instances


When working with EC2 spot instances, interruption is inevitable once the Spot price rises above your maximum price or EC2 needs the capacity back. Shutdown scripts can help preserve data by pushing it to an EBS volume before termination, but the real challenge comes when launching replacement instances: they won't automatically have your initialization scripts, since those typically live on the terminated instance's root volume.
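
As a starting point, here is a minimal sketch of such a shutdown hook, assuming the data lives in a hypothetical scratch directory (/var/app/scratch) and the persistent volume is mounted at /data. It polls the instance metadata interruption-notice endpoint, which returns 404 until the two-minute warning is issued:

#!/bin/bash
# Poll the spot interruption notice endpoint; it returns 404 until EC2
# issues the two-minute interruption warning. Paths below are examples.
while true; do
    if curl -fs http://169.254.169.254/latest/meta-data/spot/instance-action > /dev/null; then
        # Flush application data to the EBS-backed mount before termination
        rsync -a /var/app/scratch/ /data/
        sync
        break
    fi
    sleep 5
done

The harder part, though, is making sure the next instance finds and mounts that volume again.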

There are several technical approaches to solve this problem:

1. Using a Custom AMI

The most straightforward method is to create a custom AMI with your initialization scripts baked in:

# Create an AMI from your configured instance
aws ec2 create-image \
    --instance-id i-1234567890abcdef0 \
    --name "MySpotInstanceAMI" \
    --description "AMI with EBS attachment scripts"

2. User Data Scripts

You can pass user data that runs when the instance launches:

#!/bin/bash
# Mount the EBS volume
DEVICE=/dev/xvdf
MOUNT_POINT=/data

# Check if device exists
if [ -b "$DEVICE" ]; then
    # Create a filesystem only if the device doesn't already have one
    blkid "$DEVICE" || mkfs -t ext4 "$DEVICE"
    mkdir -p "$MOUNT_POINT"
    mount "$DEVICE" "$MOUNT_POINT"
    echo "$DEVICE $MOUNT_POINT ext4 defaults,nofail 0 2" >> /etc/fstab
fi
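
Assuming the script above is saved as mount-ebs.sh, it can be supplied at launch time; the AMI ID and instance type below are placeholders:

# Launch a spot instance with the mount script as user data
aws ec2 run-instances \
    --image-id ami-0abcdef1234567890 \
    --instance-type t3.micro \
    --instance-market-options 'MarketType=spot' \
    --user-data file://mount-ebs.sh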

3. AWS Systems Manager Automation

For more complex scenarios, use SSM Automation documents:

{
  "description": "Attach EBS volume to new spot instance",
  "schemaVersion": "0.3",
  "parameters": {
    "InstanceId": { "type": "String" },
    "VolumeId": { "type": "String", "default": "vol-1234567890abcdef0" }
  },
  "mainSteps": [
    {
      "action": "aws:executeAwsApi",
      "name": "attachEBSVolume",
      "inputs": {
        "Service": "ec2",
        "Api": "AttachVolume",
        "VolumeId": "{{ VolumeId }}",
        "InstanceId": "{{ InstanceId }}",
        "Device": "/dev/sdf"
      }
    }
  ]
}
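
To use the document, register it and start an execution against the replacement instance; the document name, file name, and IDs below are illustrative:

# Register the automation document
aws ssm create-document \
    --name "AttachSpotVolume" \
    --document-type "Automation" \
    --content file://attach-volume.json

# Run it against the replacement instance
aws ssm start-automation-execution \
    --document-name "AttachSpotVolume" \
    --parameters '{"InstanceId":["i-1234567890abcdef0"],"VolumeId":["vol-1234567890abcdef0"]}'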

Implement a consistent tagging strategy to identify your EBS volumes:

aws ec2 create-tags \
    --resources vol-1234567890abcdef0 \
    --tags Key=Purpose,Value=SpotInstanceData Key=Environment,Value=Production

Here's a full example using AWS CLI and user data:

#!/bin/bash
# Find and attach our persistent volume. EBS volumes are tied to an
# Availability Zone, so restrict the search to this instance's AZ.
AZ=$(curl -s http://169.254.169.254/latest/meta-data/placement/availability-zone)
REGION=$(curl -s http://169.254.169.254/latest/meta-data/placement/region)
INSTANCE_ID=$(curl -s http://169.254.169.254/latest/meta-data/instance-id)

VOLUME_ID=$(aws ec2 describe-volumes \
    --region "$REGION" \
    --filters "Name=tag:Purpose,Values=SpotInstanceData" \
              "Name=availability-zone,Values=$AZ" \
    --query "Volumes[0].VolumeId" --output text)

aws ec2 attach-volume \
    --region "$REGION" \
    --volume-id "$VOLUME_ID" \
    --instance-id "$INSTANCE_ID" \
    --device /dev/sdf

# Wait for volume to attach
while [ ! -b /dev/xvdf ]; do sleep 1; done

# Mount the volume
mkdir -p /data
mount /dev/xvdf /data

Set up CloudWatch Events to detect spot instance terminations and trigger recovery workflows:

aws events put-rule \
    --name "SpotInstanceTermination" \
    --event-pattern '{
      "source": ["aws.ec2"],
      "detail-type": ["EC2 Spot Instance Interruption Warning"]
    }'
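
The rule then needs a target. A sketch that routes the warning to a Lambda function (the function ARN is a placeholder, and the function must also grant invoke permission to events.amazonaws.com):

aws events put-targets \
    --rule "SpotInstanceTermination" \
    --targets 'Id=detach-volume,Arn=arn:aws:lambda:us-east-1:123456789012:function:detach-spot-volume'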

The ephemeral nature of Spot Instances creates a significant challenge for persistent storage needs. The fundamental issue arises when a spot instance is terminated (whether due to price fluctuations or capacity reclamation) and the replacement instance needs to reconnect automatically to your persistent EBS volume.

While creating a custom AMI with your initialization scripts baked in is one approach, it introduces maintenance overhead:


#!/bin/bash
# Example of what you'd need to include in a custom AMI
DEVICE=/dev/xvdf
MOUNT_POINT=/mnt/data

# Check if volume is attached but not mounted
if [ -b "$DEVICE" ] && ! mountpoint -q "$MOUNT_POINT"; then
    # Format only if the volume doesn't already carry a filesystem;
    # re-formatting unconditionally would wipe the persisted data on every boot
    blkid "$DEVICE" || mkfs -t ext4 "$DEVICE"
    mkdir -p "$MOUNT_POINT"
    mount "$DEVICE" "$MOUNT_POINT"
    grep -q "$MOUNT_POINT" /etc/fstab || \
        echo "$DEVICE $MOUNT_POINT ext4 defaults,nofail 0 2" >> /etc/fstab
fi

The problem worsens when you need to update your initialization scripts, requiring you to create new AMI versions.

Instead of baking scripts into AMIs, leverage EC2's user data capability combined with tags for more flexibility:


#!/bin/bash
REGION=$(curl -s http://169.254.169.254/latest/meta-data/placement/region)
AZ=$(curl -s http://169.254.169.254/latest/meta-data/placement/availability-zone)
INSTANCE_ID=$(curl -s http://169.254.169.254/latest/meta-data/instance-id)

# Find the persistent volume by tag instead of hardcoding its ID
VOLUME_ID=$(aws ec2 describe-volumes \
    --region "$REGION" \
    --filters "Name=tag:Purpose,Values=SpotInstanceData" \
              "Name=availability-zone,Values=$AZ" \
    --query "Volumes[0].VolumeId" --output text)

# Attach volume
aws ec2 attach-volume \
    --volume-id "$VOLUME_ID" \
    --instance-id "$INSTANCE_ID" \
    --device /dev/sdf \
    --region "$REGION"

# Wait for volume to attach
while [ ! -e /dev/xvdf ]; do sleep 1; done

# Format if necessary and mount
if ! blkid /dev/xvdf; then
    mkfs -t ext4 /dev/xvdf
fi

mkdir -p /mnt/data
mount /dev/xvdf /mnt/data
grep -q "/mnt/data" /etc/fstab || \
    echo "/dev/xvdf /mnt/data ext4 defaults,nofail 0 2" >> /etc/fstab

For a complete solution, combine several AWS services:

  1. Use CloudWatch Events to detect spot instance termination
  2. Trigger a Lambda function to detach the EBS volume
  3. Configure your spot fleet/auto-scaling group to include the user data script (a launch template sketch follows the Lambda example below)

// Sample Lambda function for detaching the tagged data volume on termination
const AWS = require('aws-sdk');
const ec2 = new AWS.EC2();

exports.handler = async (event) => {
    // The interruption warning event carries the instance ID in its detail
    const instanceId = event.detail['instance-id'];

    // Only look at our tagged data volume, not the instance's root volume
    const volumes = await ec2.describeVolumes({
        Filters: [
            { Name: 'attachment.instance-id', Values: [instanceId] },
            { Name: 'tag:Purpose', Values: ['SpotInstanceData'] }
        ]
    }).promise();

    await Promise.all(volumes.Volumes.map(volume => {
        return ec2.detachVolume({
            VolumeId: volume.VolumeId,
            Force: true
        }).promise();
    }));

    return { status: 'success' };
};
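
For the third piece, a launch template can carry the user data script so every replacement instance runs it at boot. A minimal sketch, assuming the tag-based script above is saved as attach-and-mount.sh (all names and IDs are placeholders, and UserData must be base64-encoded):

# Encode the user data script for the launch template
USER_DATA=$(base64 -w0 attach-and-mount.sh)

aws ec2 create-launch-template \
    --launch-template-name spot-with-persistent-ebs \
    --launch-template-data '{
      "ImageId": "ami-0abcdef1234567890",
      "InstanceType": "t3.micro",
      "IamInstanceProfile": { "Name": "spot-ebs-attach-profile" },
      "InstanceMarketOptions": { "MarketType": "spot" },
      "UserData": "'"$USER_DATA"'"
    }'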

For production environments, consider these additional factors:

  • Implement retry logic in your scripts for transient AWS API failures
  • Add error handling for cases where the volume might already be attached to another instance
  • Consider using EBS multi-attach for certain workloads (io1/io2 volumes only)
  • Implement proper IAM permissions for the instance role (a minimal policy sketch follows)
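
On the last point, a minimal policy sketch for the instance role used by the user data scripts (region, account ID, and volume ID are placeholders; the Lambda's execution role needs an equivalent grant for ec2:DescribeVolumes and ec2:DetachVolume):

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "ec2:DescribeVolumes",
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": "ec2:AttachVolume",
      "Resource": [
        "arn:aws:ec2:us-east-1:123456789012:volume/vol-1234567890abcdef0",
        "arn:aws:ec2:us-east-1:123456789012:instance/*"
      ]
    }
  ]
}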