How to Auto-Scale AWS Fargate to Zero Tasks for Cost Optimization



When you run low-traffic applications on AWS Fargate, you pay for every running task, even when it sits idle. Unlike Aurora Serverless, which can scale down automatically when idle, Fargate requires explicit configuration to stop incurring compute cost during inactive periods.

Service auto scaling driven by Application Load Balancer (ALB) metrics runs into limits at zero tasks:

  • Standard scaling policies are effectively limited to a minimum of one task, because a policy that tracks ALB metrics has nothing to scale out on once the count reaches zero
  • ALB target metrics require at least one running task to collect data
  • ECS services have no direct "pause" functionality

Here's a serverless solution combining CloudWatch alarms and Lambda:

# Python Lambda function to scale Fargate service
import boto3

def lambda_handler(event, context):
    ecs = boto3.client('ecs')
    
    # Scale down to zero
    ecs.update_service(
        cluster='your-cluster',
        service='your-service',
        desiredCount=0
    )
    
    return {
        'statusCode': 200,
        'body': 'Scaled service to zero'
    }

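Create an alarm that fires when the ALB request count stays below 1 for two consecutive 5-minute periods. The alarm action publishes to an SNS topic, so the Lambda function must be subscribed to that topic: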

aws cloudwatch put-metric-alarm \
--alarm-name "Fargate-Scale-To-Zero" \
--metric-name "RequestCount" \
--namespace "AWS/ApplicationELB" \
--statistic "Sum" \
--period 300 \
--threshold 1 \
--comparison-operator "LessThanThreshold" \
--evaluation-periods 2 \
--alarm-actions "arn:aws:sns:us-east-1:123456789012:ScaleDownTopic" \
--dimensions Name=LoadBalancer,Value=app/your-alb/1234567890abcdef
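
Because the alarm action targets an SNS topic, the Lambda function receives the alarm wrapped in an SNS record rather than as a bare event. The following is a minimal sketch of a handler that unwraps the notification and scales down only when the alarm has entered the ALARM state. The `CLUSTER_NAME` and `SERVICE_NAME` environment variables and the helper names are assumptions for illustration, not part of the original setup.

```python
import json
import os


def alarm_states(event):
    """Yield (alarm_name, new_state) for each CloudWatch alarm in an SNS event."""
    for record in event.get("Records", []):
        message = json.loads(record["Sns"]["Message"])
        yield message["AlarmName"], message["NewStateValue"]


def scale_to_zero_on_alarm(event, ecs, cluster, service):
    """Set desiredCount to 0 if any alarm in the event entered ALARM."""
    for name, state in alarm_states(event):
        if state == "ALARM":
            ecs.update_service(cluster=cluster, service=service, desiredCount=0)
            return f"{name} in ALARM: scaled {service} to zero"
    return "No alarm in ALARM state: nothing to do"


def lambda_handler(event, context):
    # Imported here so the helpers above can be exercised without AWS access.
    import boto3

    body = scale_to_zero_on_alarm(
        event,
        boto3.client("ecs"),
        # Hypothetical environment variables; fall back to the placeholders
        # used in the function above.
        cluster=os.environ.get("CLUSTER_NAME", "your-cluster"),
        service=os.environ.get("SERVICE_NAME", "your-service"),
    )
    return {"statusCode": 200, "body": body}
```

Checking the state matters because the same topic may also deliver OK or INSUFFICIENT_DATA transitions if those actions are ever configured on the alarm.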

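To manage scale-up latency when requests resume:

  • Set the ECS service's health check grace period to 300 seconds so that ALB health checks don't replace a task while it is still starting
  • Keep the minimum healthy percentage at 100% so that deployments never drop below the desired task count
  • Use Route 53 weighted routing for zero-downtime deployments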
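
To see how long a scale-up actually takes, it can help to poll the service until the running count catches up with the desired count. The sketch below assumes a boto3 ECS client is passed in; the function name, timeout, and polling interval are illustrative choices, not values from the original article.

```python
import time


def wait_for_running(ecs, cluster, service, timeout=300, poll=10, sleep=time.sleep):
    """Poll describe_services until runningCount reaches desiredCount.

    Returns True once the service has caught up, False on timeout.
    Assumes the service exists; an unknown service raises IndexError.
    """
    waited = 0
    while True:
        response = ecs.describe_services(cluster=cluster, services=[service])
        svc = response["services"][0]
        if svc["desiredCount"] > 0 and svc["runningCount"] >= svc["desiredCount"]:
            return True
        if waited >= timeout:
            return False
        sleep(poll)
        waited += poll
```

A typical call would be `wait_for_running(boto3.client("ecs"), "your-cluster", "your-service")`. Note that a running task is not necessarily healthy in the target group yet, so this measures only the first part of the cold start.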

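For predictable usage patterns, use EventBridge scheduled rules. The rule below runs daily at 00:00 UTC, and EventBridge needs permission to invoke the function:

aws events put-rule \
--name "StopFargateNightly" \
--schedule-expression "cron(0 0 * * ? *)"

aws events put-targets \
--rule "StopFargateNightly" \
--targets "Id"="1","Arn"="arn:aws:lambda:us-east-1:123456789012:function:ScaleToZero"

aws lambda add-permission \
--function-name "ScaleToZero" \
--statement-id "StopFargateNightly" \
--action "lambda:InvokeFunction" \
--principal "events.amazonaws.com"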
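
A nightly stop rule usually needs a matching morning start rule. One way to avoid maintaining two functions is to let each rule pass the target count as constant JSON input (for example `{"desiredCount": 1}`) and have a single handler apply it. The sketch below assumes that input shape and the `CLUSTER_NAME` and `SERVICE_NAME` environment variables, none of which appear in the original commands.

```python
import os


def desired_count_from_event(event, default=0):
    """Read a non-negative integer 'desiredCount' from the rule's input.

    A plain scheduled event carries no such key, so it falls back to
    `default`, which keeps the original stop-at-night behavior.
    """
    count = int(event.get("desiredCount", default))
    if count < 0:
        raise ValueError("desiredCount must be >= 0")
    return count


def apply_schedule(event, ecs, cluster, service):
    """Set the service's desired count to the value carried by the event."""
    count = desired_count_from_event(event)
    ecs.update_service(cluster=cluster, service=service, desiredCount=count)
    return count


def lambda_handler(event, context):
    # Imported here so the helpers above can be exercised without AWS access.
    import boto3

    count = apply_schedule(
        event,
        boto3.client("ecs"),
        os.environ["CLUSTER_NAME"],  # hypothetical environment variables
        os.environ["SERVICE_NAME"],
    )
    return {"statusCode": 200, "body": f"Set desired count to {count}"}
```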

Sample monthly savings for a t3.medium equivalent:

Scenario            Monthly cost
24/7 single task    $29.52
Active 12 hr/day    $14.76
Scales to zero      $2.95 (90% savings)
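
The figures in the table can be reproduced with simple arithmetic. The sketch below derives an hourly rate from the 24/7 row (29.52 / 720 hours, about $0.041 per hour); that rate is inferred from the table and is not a quoted AWS price. The 2.4 active hours per day used for the scale-to-zero row is likewise an inference from the 90% savings figure.

```python
HOURS_PER_DAY = 24
DAYS_PER_MONTH = 30  # the 24/7 figure corresponds to a 720-hour month

# Hourly rate implied by the table, not an official AWS price.
HOURLY_RATE = 29.52 / (HOURS_PER_DAY * DAYS_PER_MONTH)


def monthly_cost(active_hours_per_day, hourly_rate=HOURLY_RATE, days=DAYS_PER_MONTH):
    """Monthly cost in dollars for a task active the given hours per day."""
    return round(active_hours_per_day * days * hourly_rate, 2)


def savings_percent(active_hours_per_day):
    """Savings relative to running one task around the clock, in percent."""
    always_on = monthly_cost(HOURS_PER_DAY)
    return round(100 * (1 - monthly_cost(active_hours_per_day) / always_on))


if __name__ == "__main__":
    scenarios = [
        ("24/7 single task", 24),
        ("Active 12 hr/day", 12),
        ("Scales to zero (assumed 2.4 hr/day active)", 2.4),
    ]
    for label, hours in scenarios:
        print(f"{label}: ${monthly_cost(hours):.2f} ({savings_percent(hours)}% savings)")
```

Plugging in your own task size and regional price per vCPU-hour and GB-hour gives a more realistic estimate than the table's single rate.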

When running low-traffic applications on AWS Fargate, keeping containers running 24/7 can lead to unnecessary costs. While Aurora Serverless handles database scaling automatically, Fargate requires explicit configuration to achieve true "scale to zero" behavior.

Fargate's built-in scaling policies are a poor fit for scaling to zero tasks and back because:

  • Minimum capacity is normally kept at ≥1 in standard configurations, since a policy has nothing to act on once no task is running
  • ALB request metrics offer no native way to trigger a scale-out from zero when traffic returns
  • Cold start times make on-demand scaling challenging

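Here's a sketch of a solution combining multiple AWS services:


# CloudFormation template snippet for scale-to-zero automation
Resources:
  FargateService:
    Type: AWS::ECS::Service
    Properties:
      DesiredCount: 0  # Start with zero tasks
      # ... other service configs

  ScaleUpLambda:
    Type: AWS::Lambda::Function
    Properties:
      Runtime: nodejs20.x
      Handler: index.handler
      # Assumed IAM role (defined elsewhere) that allows ecs:UpdateService
      Role: !GetAtt ScaleUpLambdaRole.Arn
      Code:
        ZipFile: |
          // Node.js 18+ runtimes bundle AWS SDK v3 instead of the v2 'aws-sdk' package
          const { ECSClient, UpdateServiceCommand } = require('@aws-sdk/client-ecs');
          const ecs = new ECSClient({});

          exports.handler = async (event) => {
            await ecs.send(new UpdateServiceCommand({
              cluster: process.env.CLUSTER_NAME,
              service: process.env.SERVICE_NAME,
              desiredCount: 1
            }));
          };
      Environment:
        Variables:
          CLUSTER_NAME: !Ref ECSCluster
          SERVICE_NAME: !Ref FargateService

  ALBRequestRule:
    Type: AWS::Events::Rule
    Properties:
      # Note: CloudTrail records control-plane API calls such as RegisterTargets,
      # not individual HTTP requests reaching the ALB.
      EventPattern:
        source: ["aws.elasticloadbalancing"]
        detail-type: ["AWS API Call via CloudTrail"]
        detail:
          eventSource: ["elasticloadbalancing.amazonaws.com"]
          eventName: ["RegisterTargets"]
          requestParameters:
            targetGroupArn: [!Ref TargetGroup]
      # The rule also needs an AWS::Lambda::Permission for events.amazonaws.com
      Targets:
        - Arn: !GetAtt ScaleUpLambda.Arn
          Id: ScaleUpLambda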

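This architecture is meant to work through several coordinated components:

  1. A trigger event reaches EventBridge. CloudTrail records ELB API calls such as RegisterTargets, not individual HTTP requests, so the rule above cannot by itself detect the first incoming request and needs a separate traffic signal
  2. EventBridge triggers the scale-up Lambda function
  3. Lambda sets the Fargate service's desired count to 1
  4. The new task registers with the target group (typically takes 30-60s); until then the ALB has no healthy target and returns errors
  5. Subsequent requests are served normally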
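
Whatever event ends up triggering the scale-up, it may fire many times while the first task is still starting, so the scale-up step benefits from being idempotent. The following sketch shows one way to do that in Python. The `ensure_running` helper and the environment variable names are assumptions for illustration and are not part of the template above.

```python
import os


def ensure_running(ecs, cluster, service, count=1):
    """Raise desiredCount to `count` only if the service is below it.

    Returns True when an update was sent, False when nothing had to change.
    Assumes the service exists; an unknown service raises IndexError.
    """
    response = ecs.describe_services(cluster=cluster, services=[service])
    svc = response["services"][0]
    if svc["desiredCount"] >= count:
        return False
    ecs.update_service(cluster=cluster, service=service, desiredCount=count)
    return True


def lambda_handler(event, context):
    # Imported here so ensure_running can be exercised without AWS access.
    import boto3

    changed = ensure_running(
        boto3.client("ecs"),
        os.environ["CLUSTER_NAME"],
        os.environ["SERVICE_NAME"],
    )
    return {"scaledUp": changed}
```

Skipping the redundant `update_service` call keeps repeated triggers from generating unnecessary API traffic during the startup window.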

To complete the cycle, implement scale-down logic with:


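# CloudWatch Alarm for scale-down
AWSTemplateFormatVersion: '2010-09-09'
Resources:
  ScaleDownAlarm:
    Type: AWS::CloudWatch::Alarm
    Properties:
      AlarmDescription: "Scale down when no requests for 15 minutes"
      MetricName: RequestCount
      Namespace: AWS/ApplicationELB
      Dimensions:
        - Name: LoadBalancer
          Value: !GetAtt ALB.LoadBalancerFullName
      ComparisonOperator: LessThanThreshold
      Threshold: 1
      EvaluationPeriods: 3  # 3 periods x 5 minutes = 15 minutes
      Period: 300  # 5 minutes
      Statistic: Sum
      # If idle periods show up as gaps rather than zeros, notBreaching keeps
      # the alarm from firing; consider breaching in that case
      TreatMissingData: notBreaching
      AlarmActions:
        - !Ref ScaleDownTopic  # assumed SNS topic that invokes the scale-down Lambda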
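
The interaction between `EvaluationPeriods`, `Threshold`, and `TreatMissingData` is easier to reason about with a toy model. The function below is a deliberately simplified sketch, not CloudWatch's actual evaluation algorithm: it treats the alarm as firing when each of the last N periods is breaching, and it represents a period with no datapoint as `None`.

```python
def alarm_fires(datapoints, threshold=1, evaluation_periods=3,
                treat_missing="notBreaching"):
    """Simplified model of a LessThanThreshold alarm.

    datapoints: request sums per period, oldest first; None means the
    period had no datapoint at all.
    """
    recent = datapoints[-evaluation_periods:]
    if len(recent) < evaluation_periods:
        return False  # not enough history to evaluate

    def breaching(value):
        if value is None:
            return treat_missing == "breaching"
        return value < threshold

    return all(breaching(value) for value in recent)


if __name__ == "__main__":
    idle_as_zeros = [5, 0, 0, 0]
    idle_as_gaps = [5, None, None, None]
    print(alarm_fires(idle_as_zeros))
    print(alarm_fires(idle_as_gaps))
    print(alarm_fires(idle_as_gaps, treat_missing="breaching"))
```

In this model, three idle periods reported as zeros trigger the alarm, while the same idle window reported as gaps triggers it only when missing data is treated as breaching. It is worth checking in the CloudWatch console how your load balancer's metric actually behaves during idle periods before choosing the setting.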

To limit cold start impact and cost:

  • Pre-warm during business hours using scheduled scaling
  • Use Fargate Spot for cost-sensitive workloads that tolerate interruption
  • Set a health check grace period on the service so slow-starting tasks aren't replaced prematurely
  • Consider keeping one task running during expected active periods
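
Pre-warming with scheduled scaling can be expressed through Application Auto Scaling scheduled actions. The sketch below only builds the keyword arguments for boto3's `put_scheduled_action`; the action names, cron expressions, and capacities are illustrative assumptions, and the service must already be registered as a scalable target for the call to succeed.

```python
def scheduled_action(cluster, service, name, cron, min_capacity, max_capacity):
    """Build kwargs for application-autoscaling put_scheduled_action."""
    return {
        "ServiceNamespace": "ecs",
        "ScheduledActionName": name,
        "ResourceId": f"service/{cluster}/{service}",
        "ScalableDimension": "ecs:service:DesiredCount",
        "Schedule": f"cron({cron})",  # evaluated in UTC unless a timezone is set
        "ScalableTargetAction": {
            "MinCapacity": min_capacity,
            "MaxCapacity": max_capacity,
        },
    }


# Hypothetical schedule: one task from 08:00 UTC on weekdays, zero after 18:00.
WARM_UP = scheduled_action("your-cluster", "your-service",
                           "warm-up-weekdays", "0 8 ? * MON-FRI *", 1, 1)
WIND_DOWN = scheduled_action("your-cluster", "your-service",
                             "wind-down-weekdays", "0 18 ? * MON-FRI *", 0, 0)

if __name__ == "__main__":
    import boto3  # imported here so the builder can be used without AWS access

    autoscaling = boto3.client("application-autoscaling")
    for action in (WARM_UP, WIND_DOWN):
        autoscaling.put_scheduled_action(**action)
```

Setting both minimum and maximum to the same value pins the task count for that window; widening the range leaves room for other scaling policies to act.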

For more complex scenarios:

  • Use Step Functions to orchestrate state transitions
  • Combine with Lambda@Edge for geographic scaling
  • Implement custom metrics via CloudWatch Agent