How to Export AWS EBS Snapshots to S3 as Raw Objects for Cost-Effective Archiving


While AWS EBS snapshots are technically stored in S3 infrastructure, they exist in a proprietary format managed by Amazon's backend systems. The standard aws ec2 copy-snapshot command only allows region-to-region transfers within the EBS ecosystem, not direct access to the underlying data as regular S3 objects.

The most reliable method involves creating an intermediate EC2 instance:

# Create volume from snapshot
aws ec2 create-volume --snapshot-id snap-1234567890abcdef0 --availability-zone us-east-1a

# Attach to running instance
aws ec2 attach-volume --volume-id vol-1234567890abcdef0 --instance-id i-1234567890abcdef0 --device /dev/sdf

# SSH into instance and dump the raw device to a file
# (the instance needs free local disk space at least the size of the volume)
sudo dd if=/dev/xvdf of=ebs_dump.img bs=1M status=progress

# Upload to S3
aws s3 cp ebs_dump.img s3://your-bucket/archives/
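Before uploading, it is worth recording a checksum and compressing the image: the raw dump is the full provisioned size of the volume, so gzip often shrinks mostly-empty volumes dramatically. A sketch, using the filenames from the example above:

```shell
# Record a checksum so the archive can be verified after a future download,
# then compress the raw image before upload.
sha256sum ebs_dump.img > ebs_dump.img.sha256
gzip -k ebs_dump.img                      # writes ebs_dump.img.gz, keeps the original
sha256sum -c ebs_dump.img.sha256          # sanity check: prints "ebs_dump.img: OK"
```

Upload the .gz and the .sha256 file together so the integrity check travels with the archive.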

For true cost optimization:

  • Transition to S3 Glacier Flexible Retrieval after 30 days
  • Use S3 Intelligent-Tiering for unpredictable access patterns
  • Enable S3 Lifecycle policies for automatic tier transitions
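The three points above can be combined into one lifecycle configuration attached to the bucket. A sketch, where the bucket name, rule ID, and archives/ prefix are placeholders matching the examples in this article ("GLACIER" is the storage-class value for Glacier Flexible Retrieval):

```shell
# Transition everything under archives/ to Glacier Flexible Retrieval
# 30 days after creation.
cat > lifecycle.json <<'EOF'
{
  "Rules": [{
    "ID": "archive-ebs-dumps",
    "Filter": { "Prefix": "archives/" },
    "Status": "Enabled",
    "Transitions": [{ "Days": 30, "StorageClass": "GLACIER" }]
  }]
}
EOF
aws s3api put-bucket-lifecycle-configuration \
  --bucket your-bucket --lifecycle-configuration file://lifecycle.json
```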

When you need to recover the archived snapshot:

# Download from S3
aws s3 cp s3://your-bucket/archives/ebs_dump.img .

# Create new volume (at least as large as the original volume)
aws ec2 create-volume --availability-zone us-east-1a --size 100 --volume-type gp3

# Attach to instance
aws ec2 attach-volume --volume-id vol-9876543210fedcba --instance-id i-1234567890abcdef0 --device /dev/sdg

# Write data back
sudo dd if=ebs_dump.img of=/dev/xvdg bs=1M status=progress

# Create new snapshot
aws ec2 create-snapshot --volume-id vol-9876543210fedcba --description "Restored from S3 archive"

Storage Type                     us-east-1 Cost/GB-month
EBS Snapshot                     $0.05
S3 Standard                      $0.023
S3 Glacier Flexible Retrieval    $0.0036
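At these rates the gap compounds quickly. A back-of-envelope calculation for a hypothetical 500 GB image held for a year (ignoring Glacier's retrieval fees and 90-day minimum storage duration):

```shell
# 500 GB for 12 months at the per-GB-month rates in the table above.
awk 'BEGIN { gb = 500; months = 12
  printf "EBS snapshot: $%.2f\n", gb * 0.05   * months
  printf "S3 Standard:  $%.2f\n", gb * 0.023  * months
  printf "Glacier FR:   $%.2f\n", gb * 0.0036 * months }'
```

Glacier Flexible Retrieval comes to about $21.60 for the year versus $300 for keeping the EBS snapshot.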

For enterprise-scale operations, consider AWS DataSync. Note that DataSync cannot read a snapshot ARN directly; it transfers between location ARNs created beforehand with the create-location-* commands (for example, an NFS share exposing the mounted volume, and the S3 bucket):

aws datasync create-task \
--source-location-arn arn:aws:datasync:us-east-1:123456789012:location/loc-1234567890abcdef0 \
--destination-location-arn arn:aws:datasync:us-east-1:123456789012:location/loc-abcdef01234567890 \
--options PreserveDeletedFiles=REMOVE,VerifyMode=POINT_IN_TIME_CONSISTENT

To recap the problem: EBS snapshots are stored on S3-backed infrastructure, but in a proprietary format that is not directly accessible as standard S3 objects. This creates two key pain points:

  • Higher storage costs compared to standard S3 (EBS snapshots cost $0.05/GB-month vs S3 Standard at $0.023/GB-month)
  • No direct download capability for on-premises use

Here's the complete workflow we'll implement:

EBS Snapshot → EC2 Instance → S3 Object → (Archive) → Restore to EC2

1. Prepare the Environment

First, ensure you have these AWS permissions:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ec2:CreateVolume",
        "ec2:AttachVolume",
        "ec2:DescribeVolumes",
        "ec2:CreateSnapshot",
        "s3:PutObject",
        "s3:GetObject"
      ],
      "Resource": "*"
    }
  ]
}

2. Create Volume from Snapshot

Using AWS CLI:

aws ec2 create-volume \
  --snapshot-id snap-1234567890abcdef0 \
  --availability-zone us-east-1a \
  --volume-type gp2
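create-volume returns while the volume is still in the "creating" state, so a script that immediately attaches it can race. The CLI's built-in waiter covers this (the volume ID is the placeholder from the command above):

```shell
# Block until the new volume reaches the "available" state before attaching it.
aws ec2 wait volume-available --volume-ids vol-1234567890abcdef0
```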

3. Attach Volume to EC2 Instance

aws ec2 attach-volume \
  --volume-id vol-1234567890abcdef0 \
  --instance-id i-1234567890abcdef0 \
  --device /dev/sdf

4. Copy Data to S3

On the EC2 instance, mount the volume and use AWS CLI to sync:

sudo mkdir /mnt/backup
lsblk                                # confirm the device and partition names first
sudo mount /dev/xvdf1 /mnt/backup    # use /dev/xvdf if the volume has no partition table
aws s3 sync /mnt/backup s3://your-bucket-name/archive/$(date +%Y-%m-%d)/
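A file-level sync like this drops the partition table and boot sector, so it suits data volumes rather than bootable ones. For a block-exact archive, the raw device can be streamed straight into S3 without a local staging file, since aws s3 cp accepts stdin; a sketch with placeholder device and bucket names:

```shell
# Stream the whole device through gzip directly into S3 (no intermediate image file).
sudo dd if=/dev/xvdf bs=1M status=progress | gzip | aws s3 cp - s3://your-bucket-name/archive/disk.img.gz
```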

For long-term archival, transition the S3 objects to cheaper storage classes. Object tagging does not change the storage class; either attach a lifecycle rule to the bucket or re-copy the objects in place with an explicit class:

aws s3 cp s3://your-bucket-name/archive/2023-11-15/ \
  s3://your-bucket-name/archive/2023-11-15/ \
  --recursive --storage-class DEEP_ARCHIVE

Keep in mind that Deep Archive retrievals take hours, so it only suits data you rarely need back.

When you need to restore:

# Download from S3
aws s3 sync s3://your-bucket-name/archive/2023-11-15/ /mnt/restore

# Create new volume
aws ec2 create-volume \
  --size 100 \
  --availability-zone us-east-1a \
  --volume-type gp2

# Attach the new volume, format it, and copy the files back
aws ec2 attach-volume [...]
sudo mkfs -t xfs /dev/xvdg                  # the fresh volume has no filesystem yet
sudo mkdir -p /mnt/new-volume
sudo mount /dev/xvdg /mnt/new-volume
sudo cp -a /mnt/restore/. /mnt/new-volume/

For larger volumes, DataSync generally offers better throughput and built-in verification than aws s3 sync; both location ARNs below are placeholders created in advance with the create-location-* commands:

aws datasync create-task \
  --source-location-arn arn:aws:datasync:us-east-1:123456789012:location/loc-1234567890abcdef0 \
  --destination-location-arn arn:aws:datasync:us-east-1:123456789012:location/loc-abcdef01234567890 \
  --options "VerifyMode=POINT_IN_TIME_CONSISTENT,TransferMode=CHANGED"
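Whichever transfer path you use, the savings only materialize once the intermediate volume and the original snapshot are deleted. A cleanup sketch, reusing the placeholder IDs from above (note the IAM policy in step 1 would also need ec2:DetachVolume, ec2:DeleteVolume, and ec2:DeleteSnapshot):

```shell
# Clean up: detach and delete the working volume, then drop the original snapshot.
aws ec2 detach-volume --volume-id vol-1234567890abcdef0
aws ec2 wait volume-available --volume-ids vol-1234567890abcdef0
aws ec2 delete-volume --volume-id vol-1234567890abcdef0
aws ec2 delete-snapshot --snapshot-id snap-1234567890abcdef0   # only after the S3 copy is verified
```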