When working with Amazon ECS, one of the most common operational tasks is updating running services with new container images. The fundamental question is: how do we safely roll out new versions of our application containers without causing downtime or service disruption?
The standard approach recommended by AWS involves:
- Creating a new task definition with your updated Docker image
- Updating your service to use the new task definition
- Letting ECS handle the gradual replacement of tasks
Here's a basic example using the AWS CLI:
# Register new task definition
aws ecs register-task-definition --cli-input-json file://updated-task-definition.json
# Update service
aws ecs update-service --cluster my-cluster --service my-service \
--task-definition my-task-family:revision --desired-count 3
For production environments, you'll want more control over the deployment process:
Blue/Green Deployments
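This involves running two identical production environments (Blue and Green) and switching traffic between them. The task set commands below apply only to services created with the EXTERNAL deployment controller; with that controller, routing traffic to the new task set is typically left to your load balancer or deployment tooling: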
# Create new task set for Green environment
aws ecs create-task-set --cluster my-cluster --service my-service \
--task-definition my-new-task-definition --launch-type FARGATE \
--network-configuration "awsvpcConfiguration={subnets=[subnet-12345],securityGroups=[sg-12345]}"
# Update primary task set to shift traffic
aws ecs update-service-primary-task-set --cluster my-cluster \
--service my-service --primary-task-set new-task-set-id
Canary Deployments
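A traffic-weighted canary is typically done with the CodeDeploy deployment controller (for example, the CodeDeployDefault.ECSCanary10Percent5Minutes configuration). With the default rolling update controller, you can approximate a gradual rollout by limiting how many tasks are replaced at once and monitoring metrics as tasks cycle: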
# Update service with deployment configuration
aws ecs update-service --cluster my-cluster --service my-service \
--task-definition my-new-task-definition \
--deployment-configuration "maximumPercent=200,minimumHealthyPercent=50"
For teams using CI/CD pipelines, you can automate the entire process:
# Example CI/CD pipeline snippet
- name: Update ECS Service
  run: |
    NEW_REVISION=$(aws ecs register-task-definition \
      --cli-input-json file://task-definition.json | jq -r '.taskDefinition.revision')
    aws ecs update-service --cluster "$CLUSTER_NAME" \
      --service "$SERVICE_NAME" \
      --task-definition "$TASK_FAMILY:$NEW_REVISION"
While using the "latest" tag might seem convenient, it's generally discouraged in production environments because:
- It makes rollbacks difficult
- It breaks deployment reproducibility
- It can cause inconsistencies between environments
A better approach is to use semantic versioning or commit hashes:
# Good practice example
FROM my-registry/my-app:1.2.3
# Instead of
FROM my-registry/my-app:latest
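A pipeline can enforce this rule before anything is deployed. The following is a minimal sketch, not an official check: `is_pinned_image` is a hypothetical helper name, and it treats an image reference as pinned only if it carries a digest or an explicit tag other than `latest`.

```shell
# Hypothetical helper: succeed only if the image reference is pinned.
# A reference counts as pinned when it has a digest (@sha256:...) or an
# explicit tag that is not "latest". A missing tag implies "latest".
is_pinned_image() {
  local image="$1"
  case "$image" in
    *@sha256:*) return 0 ;;
  esac
  # Drop the registry/path part so a registry port (host:5000/...) is not
  # mistaken for a tag.
  local name="${image##*/}"
  case "$name" in
    *:latest) return 1 ;;
    *:*) return 0 ;;
    *) return 1 ;;
  esac
}
```

In a deploy script this could be used as `is_pinned_image "$IMAGE" || exit 1`, where `$IMAGE` is whatever variable holds the image reference in your pipeline.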
Always verify your deployments:
# Check deployment status
aws ecs describe-services --cluster my-cluster --services my-service \
--query 'services[0].deployments' --output table
# Check task health
aws ecs describe-tasks --cluster my-cluster --tasks task-id \
--query 'tasks[0].containers[].healthStatus' --output text
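For scripted checks, the AWS CLI also provides a waiter, `aws ecs wait services-stable`, which polls until the service reaches a steady state and exits non-zero if it gives up. A small wrapper, sketched here with the hypothetical name `wait_for_stable`, turns that into a pass/fail step:

```shell
# Hypothetical wrapper around the "services-stable" waiter.
# Prints a status line and returns non-zero if the waiter fails or times out.
wait_for_stable() {
  local cluster="$1"
  local service="$2"
  if aws ecs wait services-stable --cluster "$cluster" --services "$service"; then
    echo "deployment stable: ${service}"
  else
    echo "deployment did not stabilize: ${service}" >&2
    return 1
  fi
}
```

Calling `wait_for_stable my-cluster my-service` right after `update-service` lets a pipeline fail the build when the new tasks never become healthy.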
When working with Amazon ECS, container updates are normally rolled out by registering a new task definition revision. Re-pushing a Docker image under the same tag is not enough on its own, because ECS does not replace running tasks when the image behind a tag changes. A new revision is registered like this:
# Example of creating new task definition
aws ecs register-task-definition \
--family my-app-task \
--container-definitions '[{
"name": "my-app",
"image": "my-repo/my-app:1.2.0",
"cpu": 1024,
"memory": 2048,
"essential": true
}]'
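For production environments, tune the rolling update so that new tasks start before old ones stop, which minimizes downtime. With maximumPercent=200 and minimumHealthyPercent=50, ECS may run up to twice the desired count during the deployment while keeping at least half of it running: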
# Update service with new task definition
aws ecs update-service \
--cluster my-cluster \
--service my-service \
--task-definition my-app-task:2 \
--desired-count 2 \
--deployment-configuration "maximumPercent=200,minimumHealthyPercent=50"
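To see what these percentages mean in task counts, the arithmetic can be sketched as below. ECS documents the lower bound as rounded up and the upper bound as rounded down; `rolling_bounds` is a hypothetical helper name used only for illustration.

```shell
# Hypothetical helper: print "<min running tasks> <max total tasks>" for a
# rolling update, given desired count, minimumHealthyPercent, maximumPercent.
rolling_bounds() {
  local desired="$1"
  local min_pct="$2"
  local max_pct="$3"
  # Lower bound: percentage of desired count, rounded up.
  local min_tasks=$(( (desired * min_pct + 99) / 100 ))
  # Upper bound: percentage of desired count, rounded down.
  local max_tasks=$(( desired * max_pct / 100 ))
  echo "${min_tasks} ${max_tasks}"
}
```

For the command above (`--desired-count 2`, 50 and 200), `rolling_bounds 2 50 200` prints `1 4`: at least one task stays running, and up to four may exist while old and new tasks overlap.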
While using the "latest" tag might seem convenient, it's generally discouraged in ECS because:
- ECS doesn't replace running tasks when a new image is pushed under the same tag; nothing changes until a new deployment starts
- You lose version tracking and rollback capabilities
- It violates immutable infrastructure principles
Here's a sample Jenkins pipeline snippet for automated ECS updates:
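pipeline {
    agent any
    stages {
        stage('Build') {
            steps {
                sh 'docker build -t my-repo/my-app:${BUILD_NUMBER} .'
                sh 'docker push my-repo/my-app:${BUILD_NUMBER}'
            }
        }
        stage('Deploy') {
            steps {
                // Register and deploy in one shell step so the new revision's ARN
                // can be passed along. ECS assigns revision numbers itself, so
                // they do not match BUILD_NUMBER.
                // task-definition.json is assumed to reference the image tag built above.
                sh '''
                    TASK_DEF_ARN=$(aws ecs register-task-definition \
                        --cli-input-json file://task-definition.json \
                        --region us-west-2 \
                        --query 'taskDefinition.taskDefinitionArn' \
                        --output text)
                    aws ecs update-service \
                        --cluster production \
                        --service my-app \
                        --task-definition "$TASK_DEF_ARN" \
                        --region us-west-2
                '''
            }
        }
    }
}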
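The pipeline registers task-definition.json as-is, so that file has to point at the image that was just built. One way to arrange this, sketched here under assumptions, is to keep a template with a placeholder and substitute the image reference before registering. The `__IMAGE__` token, the template file name, and the `render_task_definition` helper are all hypothetical; they are not part of ECS or the AWS CLI.

```shell
# Hypothetical helper: print a task definition with the image placeholder
# filled in. Assumes the template contains the literal token __IMAGE__ and
# that the image reference contains no '|', '&' or '\' characters, which
# would be special in the sed expression.
render_task_definition() {
  local template="$1"
  local image="$2"
  sed "s|__IMAGE__|${image}|g" "$template"
}
```

A build step could then run `render_task_definition task-definition.template.json "my-repo/my-app:${BUILD_NUMBER}" > task-definition.json` before calling `register-task-definition`.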
Always prepare for rollback scenarios by maintaining previous task definitions. You can quickly revert using:
aws ecs update-service \
--cluster my-cluster \
--service my-service \
--task-definition my-app-task:1 \
--region us-west-2
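If a script only knows the currently deployed `family:revision`, the rollback target can be derived from it. This is a rough sketch with a hypothetical helper name, `previous_revision`; it assumes the immediately preceding revision is the one you want and that it has not been deregistered, which you should confirm before relying on it.

```shell
# Hypothetical helper: given "family:N" (or a task definition ARN ending in
# ":N"), print the same reference with revision N-1.
# Fails if there is no numeric revision or no earlier revision exists.
previous_revision() {
  local ref="$1"
  local family="${ref%:*}"
  local rev="${ref##*:}"
  case "$rev" in
    ''|*[!0-9]*) return 1 ;;
  esac
  if [ "$rev" -le 1 ]; then
    return 1
  fi
  echo "${family}:$((rev - 1))"
}
```

For example, `previous_revision my-app-task:2` prints `my-app-task:1`, which can be passed to `--task-definition` in the rollback command above.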
For complex scenarios, consider these parameters in your update-service command:
- --force-new-deployment: Forces a new deployment even if nothing has changed
- --health-check-grace-period-seconds: For slow-starting containers
- --enable-execute-command: For debugging during deployments
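As an illustration of the first flag, a redeploy that keeps the current task definition but restarts all tasks might be wrapped as follows. `redeploy_service` is a hypothetical helper name; the flags themselves are standard `update-service` options.

```shell
# Hypothetical wrapper: start a new deployment of the service without
# changing its task definition, so every task is replaced.
redeploy_service() {
  local cluster="$1"
  local service="$2"
  aws ecs update-service \
    --cluster "$cluster" \
    --service "$service" \
    --force-new-deployment
}
```

This is the usual way to pick up an image that was re-pushed under an existing tag, although pinned tags with a new task definition revision remain easier to track and roll back.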