Best Practices for Updating Docker Containers in Amazon ECS Services



When working with Amazon ECS, one of the most common operational tasks is updating running services with new container images. The fundamental question is: how do we safely roll out new versions of our application containers without causing downtime or service disruption?

The standard approach recommended by AWS involves:

  1. Creating a new task definition with your updated Docker image
  2. Updating your service to use the new task definition
  3. Letting ECS handle the gradual replacement of tasks

Here's a basic example using the AWS CLI:

# Register new task definition
aws ecs register-task-definition --cli-input-json file://updated-task-definition.json

# Update service (replace 2 with the revision number returned above)
aws ecs update-service --cluster my-cluster --service my-service \
    --task-definition my-task-family:2 --desired-count 3

For production environments, you'll want more control over the deployment process:

Blue/Green Deployments

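This involves running two identical production environments (Blue and Green) and switching traffic between them. The task-set commands below apply only to services created with the EXTERNAL deployment controller; services that use the CODE_DEPLOY controller have their task sets managed by AWS CodeDeploy instead: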

# Create new task set for Green environment
aws ecs create-task-set --cluster my-cluster --service my-service \
    --task-definition my-new-task-definition --launch-type FARGATE \
    --network-configuration "awsvpcConfiguration={subnets=[subnet-12345],securityGroups=[sg-12345]}"

# Update primary task set to shift traffic
aws ecs update-service-primary-task-set --cluster my-cluster \
    --service my-service --primary-task-set new-task-set-id

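Rolling Updates

The default ECS deployment type replaces tasks in batches rather than shifting a percentage of traffic, so it is a rolling update and not a true canary (canary-style traffic shifting needs a blue/green deployment managed by AWS CodeDeploy). You can still control how aggressively tasks are replaced while you monitor metrics: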

# Update service with deployment configuration
aws ecs update-service --cluster my-cluster --service my-service \
    --task-definition my-new-task-definition \
    --deployment-configuration "maximumPercent=200,minimumHealthyPercent=50"
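
To see what those two percentages mean in task counts, here is a minimal shell sketch. The function name rolling_bounds is made up for illustration. The rounding (minimum rounded up, maximum rounded down) follows the AWS documentation for services using the replica scheduling strategy, so treat the output as an estimate rather than a guarantee of scheduler behavior.

```shell
# rolling_bounds DESIRED MIN_HEALTHY_PCT MAX_PCT
# Prints "<lower> <upper>": the fewest running tasks ECS keeps during a
# rolling update and the most it allows. Hypothetical helper, not part
# of the AWS CLI.
rolling_bounds() {
    desired=$1
    min_pct=$2
    max_pct=$3
    # minimumHealthyPercent is rounded up to the nearest whole task
    lower=$(( ($desired * $min_pct + 99) / 100 ))
    # maximumPercent is rounded down to the nearest whole task
    upper=$(( $desired * $max_pct / 100 ))
    echo "$lower $upper"
}
```

For a desired count of 3, rolling_bounds 3 50 200 prints "2 6": the service may drop to 2 running tasks and grow to 6 while the update is in progress.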

For teams using CI/CD pipelines, you can automate the entire process:

# Example CI/CD pipeline snippet
- name: Update ECS Service
  run: |
    NEW_REVISION=$(aws ecs register-task-definition \
        --cli-input-json file://task-definition.json | jq -r '.taskDefinition.revision')
    aws ecs update-service --cluster $CLUSTER_NAME \
        --service $SERVICE_NAME \
        --task-definition $TASK_FAMILY:$NEW_REVISION
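
The snippet above assumes task-definition.json already points at the image you just built. One way to arrange that, sketched here under assumptions, is to keep a template and substitute the image reference at deploy time. The __IMAGE__ placeholder, the function name, and the template file name are all hypothetical conventions, not anything ECS or the AWS CLI defines.

```shell
# render_task_definition TEMPLATE IMAGE
# Writes TEMPLATE to stdout with every __IMAGE__ placeholder replaced by IMAGE.
# __IMAGE__ is a made-up marker used only for this example.
render_task_definition() {
    template=$1
    image=$2
    # "|" is the sed delimiter because image references contain "/" and ":"
    sed "s|__IMAGE__|$image|g" "$template"
}

# Example use in a pipeline step (file names and $GIT_SHA are assumptions):
#   render_task_definition task-definition.template.json \
#       "my-registry/my-app:$GIT_SHA" > task-definition.json
```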

While using the "latest" tag might seem convenient, it's generally discouraged in production environments because:

  • It makes rollbacks difficult
  • It breaks deployment reproducibility
  • It can cause inconsistencies between environments

A better approach is to use semantic versioning or commit hashes:

# Good practice example
FROM my-registry/my-app:1.2.3
# Instead of
FROM my-registry/my-app:latest
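
If you want CI to enforce this, a small guard can reject unpinned image references before they reach a task definition. This is a rough sketch: is_pinned_image is a hypothetical helper and only does simple string matching on the reference.

```shell
# is_pinned_image IMAGE_REF
# Succeeds (exit 0) when the reference carries a digest or an explicit tag
# other than "latest"; fails (exit 1) otherwise.
is_pinned_image() {
    # Look only at the last path component so a registry port such as
    # localhost:5000/my-app is not mistaken for a tag.
    name=${1##*/}
    case "$name" in
        *@sha256:*) return 0 ;;   # pinned by digest
        *:latest)   return 1 ;;   # explicit "latest"
        *:*)        return 0 ;;   # any other explicit tag
        *)          return 1 ;;   # no tag: Docker defaults to "latest"
    esac
}

# Example use (the image name is a placeholder):
#   is_pinned_image "my-registry/my-app:1.2.3" || { echo "unpinned image" >&2; exit 1; }
```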

Always verify your deployments:

# Check deployment status
aws ecs describe-services --cluster my-cluster --services my-service \
    --query 'services[0].deployments' --output table

# Check task health
aws ecs describe-tasks --cluster my-cluster --tasks task-id \
    --query 'tasks[0].containers[].healthStatus' --output text
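
To turn these checks into a pass/fail step, you can wait for the service to settle and then inspect the rollout state. A minimal sketch, assuming a service that uses the default rolling update (ECS) deployment controller, since rolloutState is only reported there; verify_rollout is a hypothetical name.

```shell
# verify_rollout CLUSTER SERVICE
# Blocks until the service is stable, then checks the primary deployment's
# rollout state. Returns 0 only if that state is COMPLETED.
verify_rollout() {
    cluster=$1
    service=$2
    # The waiter exits non-zero if the service does not stabilize in time
    aws ecs wait services-stable --cluster "$cluster" --services "$service" || return 1
    state=$(aws ecs describe-services --cluster "$cluster" --services "$service" \
        --query "services[0].deployments[?status=='PRIMARY'] | [0].rolloutState" \
        --output text)
    echo "rollout state: $state"
    [ "$state" = "COMPLETED" ]
}

# Example use:
#   verify_rollout my-cluster my-service || echo "deployment did not complete" >&2
```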

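Updating a container in ECS means registering a new task definition revision. Task definition revisions are immutable, so you cannot edit the image of an existing revision, and re-pushing a different image under the same tag does not replace tasks that are already running. Each new image version therefore gets its own revision, for example: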

# Example of creating new task definition
aws ecs register-task-definition \
  --family my-app-task \
  --container-definitions '[{
    "name": "my-app",
    "image": "my-repo/my-app:1.2.0",
    "cpu": 1024,
    "memory": 2048,
    "essential": true
  }]'

For production environments, tune the rolling update's deployment configuration so that enough healthy tasks stay in service while the new revision rolls out:

# Update service with new task definition
aws ecs update-service \
  --cluster my-cluster \
  --service my-service \
  --task-definition my-app-task:2 \
  --desired-count 2 \
  --deployment-configuration "maximumPercent=200,minimumHealthyPercent=50"

In ECS specifically, the "latest" tag causes additional problems:

  • Running tasks keep the image they started with; pushing a new image under the same tag changes nothing until those tasks are replaced
  • You lose version tracking and rollback capabilities
  • It violates immutable infrastructure principles

Here's a sample Jenkins pipeline snippet for automated ECS updates:

pipeline {
  agent any
  stages {
    stage('Build') {
      steps {
        sh 'docker build -t my-repo/my-app:${BUILD_NUMBER} .'
        sh 'docker push my-repo/my-app:${BUILD_NUMBER}'
      }
    }
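    stage('Deploy') {
      steps {
        sh '''
          # task-definition.json must already reference my-repo/my-app:${BUILD_NUMBER}
          TASK_DEF_ARN=$(aws ecs register-task-definition \
            --cli-input-json file://task-definition.json \
            --region us-west-2 \
            --query taskDefinition.taskDefinitionArn \
            --output text)
          # Deploy the revision that was just registered; the revision number
          # is assigned by ECS and does not match the Jenkins build number
          aws ecs update-service \
            --cluster production \
            --service my-app \
            --task-definition "$TASK_DEF_ARN" \
            --region us-west-2
        '''
      }
    }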
  }
}

Always prepare for rollback scenarios by maintaining previous task definitions. You can quickly revert using:

aws ecs update-service \
  --cluster my-cluster \
  --service my-service \
  --task-definition my-app-task:1 \
  --region us-west-2
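
When one revision is registered per deployment, the rollback target is often just the revision before the current one. The helper below is a sketch under that assumption: previous_revision is a hypothetical name, and it does not check that the earlier revision still exists or was ever deployed. Recording the service's task definition before each deploy is the more reliable source for a rollback target.

```shell
# previous_revision TASK_DEFINITION
# Accepts "family:N" or a full task definition ARN and prints "family:N-1".
# Fails (exit 1) if N is not a number greater than 1.
previous_revision() {
    ref=${1##*/}          # drop an "arn:...:task-definition/" prefix if present
    family=${ref%:*}
    revision=${ref##*:}
    [ "$revision" -gt 1 ] 2>/dev/null || return 1
    echo "$family:$(( $revision - 1 ))"
}

# Example use:
#   aws ecs update-service --cluster my-cluster --service my-service \
#       --task-definition "$(previous_revision my-app-task:2)"
```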

For complex scenarios, consider these parameters in your update-service command:

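  • --force-new-deployment: Starts a new deployment even when the service definition is unchanged, for example to make replacement tasks pull a freshly pushed image
  • --health-check-grace-period-seconds: Tells the scheduler to ignore failing load balancer health checks for this long after a task starts, which helps slow-starting containers
  • --enable-execute-command: Turns on ECS Exec so you can open a shell in newly started containers for debugging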