When managing production infrastructure with AWS Auto Scaling Groups (ASGs), one common pain point is updating the underlying Amazon Machine Images (AMIs) while maintaining availability. The current manual process of scaling up/down works but introduces operational overhead and potential downtime windows.
Here are proven approaches to automate AMI rotation:
# CloudFormation example using UpdatePolicy
"MyASG": {
  "Type": "AWS::AutoScaling::AutoScalingGroup",
  "UpdatePolicy": {
    "AutoScalingRollingUpdate": {
      "MaxBatchSize": "2",
      "MinInstancesInService": "1",
      "PauseTime": "PT5M",
      "WaitOnResourceSignals": "true"
    }
  }
}
For critical production systems, consider creating a parallel ASG with the new AMI:
- Create new launch template with updated AMI
- Stand up new ASG behind the same load balancer, registered to its own target group
- Gradually shift traffic using ALB weighted target groups (or Route 53 weighted records)
- Decommission old ASG after validation
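The gradual traffic shift in the steps above could be sketched roughly as follows, assuming an Application Load Balancer whose listener forwards to two target groups, one per ASG. The helper names `build_weighted_forward_action` and `shift_traffic`, the default weight steps, and the way the client is passed in are all illustrative assumptions; the returned structure follows the `DefaultActions` shape that the boto3 `elbv2` `modify_listener` call accepts.

```python
def build_weighted_forward_action(old_tg_arn, new_tg_arn, new_weight):
    """Return a forward action sending `new_weight` percent of traffic to the new target group."""
    if not 0 <= new_weight <= 100:
        raise ValueError("new_weight must be between 0 and 100")
    return {
        "Type": "forward",
        "ForwardConfig": {
            "TargetGroups": [
                {"TargetGroupArn": old_tg_arn, "Weight": 100 - new_weight},
                {"TargetGroupArn": new_tg_arn, "Weight": new_weight},
            ]
        },
    }


def shift_traffic(elbv2_client, listener_arn, old_tg_arn, new_tg_arn, steps=(10, 50, 100)):
    """Apply each weight step in order, yielding after each one.

    This is a generator: the caller iterates over it and validates health
    between steps (hypothetical workflow, not an AWS-provided mechanism).
    """
    for weight in steps:
        action = build_weighted_forward_action(old_tg_arn, new_tg_arn, weight)
        elbv2_client.modify_listener(ListenerArn=listener_arn, DefaultActions=[action])
        yield weight
```

In practice `elbv2_client` would be `boto3.client("elbv2")`, and the loop consuming `shift_traffic(...)` would stop iterating (and reset the weight to 0) if validation fails at any step.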
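SSM Automation runbooks can handle the AMI build step. The AWS-managed AWS-UpdateLinuxAmi runbook launches an instance from a source AMI, patches it, and registers a new AMI, which you then roll out to the ASG separately:
aws ssm start-automation-execution \
  --document-name "AWS-UpdateLinuxAmi" \
  --parameters "AutomationAssumeRole=arn:aws:iam::123456789012:role/AutomationServiceRole,SourceAmiId=ami-12345678,IamInstanceProfileName=MyInstanceProfile,TargetAmiName=web-app-{{global:DATE_TIME}}"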
For GitOps workflows, integrate AMI updates into your CI/CD pipeline:
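// Sample Jenkins pipeline stage (assumes the AWS CLI and credentials are available on the agent)
stage('Update ASG') {
    steps {
        script {
            // Create a new launch template version with the updated AMI and capture its number
            def newVersion = sh(
                returnStdout: true,
                script: """aws ec2 create-launch-template-version \
                    --launch-template-id lt-0123456789abcdef \
                    --source-version 1 \
                    --launch-template-data '{"ImageId":"${params.AMI_ID}"}' \
                    --query LaunchTemplateVersion.VersionNumber --output text"""
            ).trim()
            // Point the ASG at the new launch template version
            sh """aws autoscaling update-auto-scaling-group \
                --auto-scaling-group-name web-app-asg \
                --launch-template LaunchTemplateId=lt-0123456789abcdef,Version=${newVersion}"""
        }
    }
}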
- Always test new AMIs in staging first
- Monitor health checks during rotation
- Consider canary deployments for major changes
- Implement proper rollback procedures
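Monitoring health during a rotation can be as simple as counting usable instances between steps. The sketch below is one way to do that, assuming its input is a single entry from the `AutoScalingGroups` list returned by the boto3 `describe_auto_scaling_groups` call; the function names and the `min_healthy` threshold are hypothetical.

```python
def count_in_service_healthy(asg_description):
    """Count instances that are both InService and Healthy in one ASG description."""
    return sum(
        1
        for instance in asg_description.get("Instances", [])
        if instance.get("LifecycleState") == "InService"
        and instance.get("HealthStatus") == "Healthy"
    )


def rotation_is_safe(asg_description, min_healthy):
    """Return True while the group still has at least `min_healthy` usable instances."""
    return count_in_service_healthy(asg_description) >= min_healthy
```

A rotation script could call `rotation_is_safe` before replacing each batch and pause or abort when it returns False.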
When managing web applications on AWS, we often face the dilemma of updating Amazon Machine Images (AMIs) while maintaining continuous availability. The current approach of manually scaling up/down works but introduces operational overhead and potential service disruption.
Here are effective methods to automate AMI rotation in your Auto Scaling Groups (ASGs):
1. Using AWS Systems Manager (SSM) Automation
This native AWS solution provides the most integrated approach. Create an SSM Automation document that:
- Creates a new launch template version with the updated AMI
- Gradually replaces instances using rolling updates
- Verifies health checks before proceeding
# Sample AWS CLI command to start your custom automation document
# (the document name is a placeholder; parameter names must match those the document declares)
aws ssm start-automation-execution \
  --document-name "your-asg-update-runbook" \
  --parameters '{
    "AutoScalingGroupName":["your-asg-name"],
    "LaunchTemplateName":["your-launch-template"],
    "LaunchTemplateVersion":["$LATEST"],
    "MinHealthyPercentage":["90"],
    "WaitOnResourceSignals":["false"]
  }'
2. AWS CodePipeline Integration
For CI/CD pipelines, you can trigger AMI updates through CodePipeline:
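# CloudFormation snippet for Pipeline configuration
# (role, bucket, and function names are placeholders)
Resources:
  AMIUpdatePipeline:
    Type: AWS::CodePipeline::Pipeline
    Properties:
      RoleArn: arn:aws:iam::123456789012:role/your-pipeline-role
      ArtifactStore:
        Type: S3
        Location: your-artifact-bucket
      Stages:
        - Name: Source
          Actions:
            - Name: SourceAction
              ActionTypeId:
                Category: Source
                Owner: AWS
                Provider: CodeCommit
                Version: "1"
              Configuration:
                RepositoryName: your-repo
                BranchName: main
              OutputArtifacts:
                - Name: SourceOutput
        - Name: Build
          Actions:
            - Name: BuildAMIAction
              ActionTypeId:
                Category: Build
                Owner: AWS
                Provider: CodeBuild
                Version: "1"
              Configuration:
                ProjectName: your-build-project
              InputArtifacts:
                - Name: SourceOutput
        - Name: Deploy
          Actions:
            # CodePipeline has no AutoScaling action provider, so this stage
            # invokes a Lambda function that updates the ASG
            - Name: UpdateASG
              ActionTypeId:
                Category: Invoke
                Owner: AWS
                Provider: Lambda
                Version: "1"
              Configuration:
                FunctionName: your-asg-update-function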
3. Custom Lambda Function Solution
For maximum control, implement a Lambda function triggered by an Amazon EventBridge (formerly CloudWatch Events) rule:
import boto3

# Create clients once per execution environment so warm invocations reuse them
autoscaling = boto3.client('autoscaling')
ec2 = boto3.client('ec2')


def lambda_handler(event, context):
    # Create new launch template version with updated AMI
    new_launch_template = ec2.create_launch_template_version(
        LaunchTemplateName='your-template',
        SourceVersion='$LATEST',
        LaunchTemplateData={
            'ImageId': 'ami-1234567890abcdef0'
        }
    )
    new_version = str(new_launch_template['LaunchTemplateVersion']['VersionNumber'])

    # Update ASG with new launch template. Capacity settings are left untouched
    # so a concurrent scaling activity is not overwritten.
    autoscaling.update_auto_scaling_group(
        AutoScalingGroupName='your-asg-name',
        LaunchTemplate={
            'LaunchTemplateName': 'your-template',
            'Version': new_version
        }
    )

    # Start an instance refresh to replace running instances gradually
    refresh = autoscaling.start_instance_refresh(
        AutoScalingGroupName='your-asg-name',
        Preferences={
            'MinHealthyPercentage': 90,
            'InstanceWarmup': 300
        }
    )

    return {
        'statusCode': 200,
        'body': f"Instance refresh initiated: {refresh['InstanceRefreshId']}"
    }
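The handler above only starts the refresh; it does not wait for the outcome. A follow-up check might look like the sketch below, which polls `describe_instance_refreshes` until the refresh reaches a terminal status. The function name, the polling defaults, and the injected `sleep` parameter are assumptions for illustration, and the set of terminal statuses reflects my reading of the Auto Scaling API rather than an exhaustive guarantee.

```python
import time

# Statuses after which the refresh will not progress any further (assumed set)
TERMINAL_STATUSES = {
    "Successful", "Failed", "Cancelled", "RollbackSuccessful", "RollbackFailed",
}


def wait_for_instance_refresh(autoscaling, asg_name, refresh_id,
                              poll_seconds=30, max_polls=120, sleep=time.sleep):
    """Poll until the instance refresh reaches a terminal status and return it."""
    for _ in range(max_polls):
        response = autoscaling.describe_instance_refreshes(
            AutoScalingGroupName=asg_name,
            InstanceRefreshIds=[refresh_id],
        )
        status = response["InstanceRefreshes"][0]["Status"]
        if status in TERMINAL_STATUSES:
            return status
        sleep(poll_seconds)
    raise TimeoutError(f"instance refresh {refresh_id} did not finish in time")
```

Because a refresh of a large group can outlast Lambda's maximum execution time, a loop like this probably belongs in a CI job or a scheduled re-check rather than in the same function that starts the refresh.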
- Always test new AMIs in a staging environment first
- Implement health checks that accurately reflect application state
- Use canary deployments when possible (gradual rollout)
- Monitor CloudWatch metrics during rotation
- Set appropriate instance warm-up times
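The canary and warm-up recommendations above map onto instance refresh preferences. As a rough sketch, the helper below builds a `Preferences` dictionary for `start_instance_refresh` using checkpoints, so the refresh pauses after set percentages of the group have been replaced. The function name and the default values are hypothetical; `CheckpointPercentages` and `CheckpointDelay` are the preference keys I understand the Auto Scaling API to accept.

```python
def build_refresh_preferences(min_healthy_percentage=90, warmup_seconds=300,
                              checkpoints=(10, 50, 100), checkpoint_delay_seconds=600):
    """Build instance refresh preferences with canary-style checkpoints."""
    ordered = sorted(set(checkpoints))
    if not ordered or ordered[0] < 1 or ordered[-1] > 100:
        raise ValueError("checkpoints must be percentages between 1 and 100")
    # The last checkpoint must be 100 for the whole group to be replaced
    if ordered[-1] != 100:
        ordered.append(100)
    return {
        "MinHealthyPercentage": min_healthy_percentage,
        "InstanceWarmup": warmup_seconds,
        "CheckpointPercentages": ordered,
        "CheckpointDelay": checkpoint_delay_seconds,
    }
```

The result would be passed as `Preferences=build_refresh_preferences(...)` when starting the refresh; a small first checkpoint gives monitoring time to catch a bad AMI before most instances are replaced.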
Create a CloudWatch alarm to detect issues, for example on the ASG's GroupInServiceInstances metric (group metrics collection must be enabled on the ASG):
aws cloudwatch put-metric-alarm \
  --alarm-name "ASG-HealthCheck-Failures" \
  --metric-name "GroupInServiceInstances" \
  --namespace "AWS/AutoScaling" \
  --statistic "Average" \
  --period 60 \
  --threshold 2 \
  --comparison-operator "LessThanThreshold" \
  --dimensions "Name=AutoScalingGroupName,Value=your-asg-name" \
  --evaluation-periods 2 \
  --alarm-actions "arn:aws:sns:us-east-1:123456789012:your-sns-topic"
For rollback scenarios, maintain previous launch template versions and implement automation to revert if alarms trigger.
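One possible shape for that rollback automation is sketched below, assuming the alarm notifies an SNS topic that triggers a Lambda function. The helper names are hypothetical, and the caller is assumed to supply the list of existing launch template version numbers (for example from `describe_launch_template_versions`). The sketch only repoints the ASG at the previous version; cancelling an in-flight refresh and replacing instances is left to the caller.

```python
import json


def previous_version(version_numbers, current):
    """Return the highest launch template version below `current`, or None."""
    older = [v for v in version_numbers if v < current]
    return max(older) if older else None


def alarm_is_firing(sns_event):
    """Return True if any SNS record carries a CloudWatch alarm in the ALARM state."""
    for record in sns_event.get("Records", []):
        message = json.loads(record["Sns"]["Message"])
        if message.get("NewStateValue") == "ALARM":
            return True
    return False


def roll_back_launch_template(autoscaling, asg_name, template_name,
                              version_numbers, current):
    """Point the ASG at the previous launch template version and return it."""
    target = previous_version(version_numbers, current)
    if target is None:
        raise RuntimeError("no earlier launch template version to roll back to")
    autoscaling.update_auto_scaling_group(
        AutoScalingGroupName=asg_name,
        LaunchTemplate={"LaunchTemplateName": template_name, "Version": str(target)},
    )
    return target
```

A handler would call `alarm_is_firing(event)` first and only roll back when it returns True, so that OK-state notifications on the same topic do not trigger a revert.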