While AWS bills customers based on the actual storage consumed by EBS snapshots (which only include changed blocks), the AWS console and most API responses show the original volume size. This discrepancy creates challenges for:
- Cost allocation and billing verification
- Storage optimization efforts
- Capacity planning for snapshot archiving
Many engineers first try to read the size straight from the API:
```bash
# AWS CLI method (using snapshot-ids.txt)
aws ec2 describe-snapshots --snapshot-ids $(cat snapshot-ids.txt) \
    --query "Snapshots[*].[SnapshotId,VolumeSize,StartTime]" \
    --output table
```
However, the VolumeSize field in the output refers to the source volume, not the snapshot's actual storage footprint. Here are three reliable methods to uncover the true size:
Method 1: CloudWatch Metrics

The AWS/EBS namespace contains the SnapshotStorageUsed metric, which reports the storage a snapshot actually consumes:

```bash
# Get snapshot storage metrics, one datapoint per day
aws cloudwatch get-metric-statistics \
    --namespace AWS/EBS \
    --metric-name SnapshotStorageUsed \
    --dimensions Name=SnapshotId,Value=snap-1234567890abcdef0 \
    --start-time $(date -d "1 day ago" +%F)T00:00:00Z \
    --end-time $(date +%F)T23:59:59Z \
    --period 86400 \
    --statistics Average \
    --output json
```
Method 2: Cost Explorer API

For billing verification across multiple snapshots, query Cost Explorer, which reports the snapshot storage charges you actually incur:
```python
import boto3
from datetime import datetime, timedelta

# Pull the last 30 days of EBS snapshot storage charges
client = boto3.client('ce')
response = client.get_cost_and_usage(
    TimePeriod={
        'Start': (datetime.now() - timedelta(days=30)).strftime('%Y-%m-%d'),
        'End': datetime.now().strftime('%Y-%m-%d')
    },
    Granularity='MONTHLY',
    Metrics=['UnblendedCost'],
    Filter={
        'Dimensions': {
            'Key': 'USAGE_TYPE',
            'Values': ['EBS:SnapshotUsage']
        }
    }
)

for period in response['ResultsByTime']:
    print(period['TimePeriod'], period['Total']['UnblendedCost']['Amount'])
```
If you prefer usage quantities (reported in GB-months) to dollars, the same filter works from the CLI:

```bash
aws ce get-cost-and-usage \
    --time-period Start=2023-01-01,End=2023-01-31 \
    --granularity MONTHLY \
    --metrics "UsageQuantity" \
    --filter '{"Dimensions": {"Key": "USAGE_TYPE", "Values": ["EBS:SnapshotUsage"]}}'
```
Method 3: Lambda-based Audit System

For continuous visibility, create a scheduled Lambda function that polls the metric for every snapshot you own and records the results:
```python
import boto3
from datetime import datetime, timedelta

def lambda_handler(event, context):
    ec2 = boto3.client('ec2')
    cloudwatch = boto3.client('cloudwatch')

    # For large fleets, use the describe_snapshots paginator instead
    snapshots = ec2.describe_snapshots(OwnerIds=['self'])['Snapshots']
    for snap in snapshots:
        response = cloudwatch.get_metric_statistics(
            Namespace='AWS/EBS',
            MetricName='SnapshotStorageUsed',
            Dimensions=[{'Name': 'SnapshotId', 'Value': snap['SnapshotId']}],
            StartTime=datetime.utcnow() - timedelta(days=1),
            EndTime=datetime.utcnow(),
            Period=3600,
            Statistics=['Average']
        )
        # Store results in DynamoDB for historical tracking
```
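The three methods above all lean on billing or monitoring data. As a cross-check, the EBS direct APIs can measure an increment directly: list-changed-blocks enumerates every block that differs between two snapshots of the same volume. Below is a minimal boto3 sketch; the snapshot IDs are placeholders, and the caller needs the ebs:ListChangedBlocks permission.

```python
import boto3

ebs = boto3.client('ebs')

def incremental_bytes(older_snap, newer_snap):
    """Approximate the newer snapshot's billable footprint as
    changed-block count x BlockSize (typically 524288 bytes)."""
    total_blocks = 0
    block_size = 0
    token = None
    while True:
        kwargs = {'FirstSnapshotId': older_snap, 'SecondSnapshotId': newer_snap}
        if token:
            kwargs['NextToken'] = token
        page = ebs.list_changed_blocks(**kwargs)
        total_blocks += len(page.get('ChangedBlocks', []))
        block_size = page['BlockSize']
        token = page.get('NextToken')
        if not token:
            return total_blocks * block_size

# Placeholder snapshot IDs; both must be snapshots of the same volume
print(incremental_bytes('snap-0aaaaaaaaaaaaaaaa', 'snap-0bbbbbbbbbbbbbbbb') / 1024**3, 'GiB')
```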
Armed with actual size data, consider these optimization strategies:
- Schedule snapshots during low-change periods
- Use fsfreeze on Linux instances before snapshotting
- Use separate volumes for static vs. dynamic data
- Implement tiered retention and lifecycle policies that archive older snapshots (see the sketch after this list)
- Consider alternative backup solutions for highly volatile data

Finally, remember how incremental snapshots interact with deletion: each snapshot stores only the blocks that changed since the previous one, and unchanged blocks are shared across snapshots. Deleting an old snapshot won't necessarily free its full reported size, because blocks it introduced may still be referenced by newer snapshots.
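For the lifecycle-policy item above, Amazon Data Lifecycle Manager (DLM) can automate snapshot creation and retention. A minimal sketch, assuming a pre-existing DLM execution role; the role ARN and the Backup=true tag below are placeholders:

```python
import boto3

dlm = boto3.client('dlm')

# Sketch: daily snapshots of volumes tagged Backup=true, keeping the last 7.
# The execution role ARN is a placeholder; it must allow DLM to manage snapshots.
dlm.create_lifecycle_policy(
    ExecutionRoleArn='arn:aws:iam::123456789012:role/AWSDataLifecycleManagerDefaultRole',
    Description='Daily EBS snapshots with 7-day retention',
    State='ENABLED',
    PolicyDetails={
        'ResourceTypes': ['VOLUME'],
        'TargetTags': [{'Key': 'Backup', 'Value': 'true'}],
        'Schedules': [{
            'Name': 'DailySnapshots',
            'CreateRule': {'Interval': 24, 'IntervalUnit': 'HOURS', 'Times': ['03:00']},
            'RetainRule': {'Count': 7},
            'CopyTags': True,
        }],
    },
)
```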