When running low-traffic applications on AWS Fargate, you're still charged for idle compute capacity. Unlike Aurora Serverless, which can scale down to zero automatically, Fargate requires explicit configuration to stop incurring charges during inactive periods.
Fargate's native scaling policies driven by Application Load Balancer (ALB) metrics can't take a service to zero on their own:
- Standard target-tracking policies effectively require a minimum of one running task
- ALB metrics need at least one running task before they report data, so a service sitting at zero has nothing to scale back out on
- There is no direct "pause" functionality for ECS services
Here's a serverless solution combining CloudWatch alarms and Lambda:
```python
# Python Lambda function to scale Fargate service
import boto3

def lambda_handler(event, context):
    ecs = boto3.client('ecs')
    # Scale down to zero
    ecs.update_service(
        cluster='your-cluster',
        service='your-service',
        desiredCount=0
    )
    return {
        'statusCode': 200,
        'body': 'Scaled service to zero'
    }
```
Create an alarm that triggers when the request count drops below the threshold:
```bash
aws cloudwatch put-metric-alarm \
  --alarm-name "Fargate-Scale-To-Zero" \
  --metric-name "RequestCount" \
  --namespace "AWS/ApplicationELB" \
  --statistic "Sum" \
  --period 300 \
  --threshold 1 \
  --comparison-operator "LessThanThreshold" \
  --evaluation-periods 2 \
  --alarm-actions "arn:aws:sns:us-east-1:123456789012:ScaleDownTopic" \
  --dimensions Name=LoadBalancer,Value=app/your-alb/1234567890abcdef
```
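The alarm only publishes to an SNS topic; the scale-down Lambda still has to be subscribed to that topic and allow SNS to invoke it. A minimal boto3 sketch of that wiring, reusing the placeholder account ID and the `ScaleToZero` function name from the examples in this post:

```python
# Subscribe the scale-down Lambda to the alarm's SNS topic and let SNS invoke it
# (ARNs are the same placeholders used elsewhere in this post)
import boto3

sns = boto3.client('sns')
lam = boto3.client('lambda')

topic_arn = 'arn:aws:sns:us-east-1:123456789012:ScaleDownTopic'
function_arn = 'arn:aws:lambda:us-east-1:123456789012:function:ScaleToZero'

# Deliver alarm notifications to the Lambda function
sns.subscribe(TopicArn=topic_arn, Protocol='lambda', Endpoint=function_arn)

# Grant SNS permission to invoke the function
lam.add_permission(
    FunctionName=function_arn,
    StatementId='AllowScaleDownTopic',
    Action='lambda:InvokeFunction',
    Principal='sns.amazonaws.com',
    SourceArn=topic_arn,
)
```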
To manage scale-up latency when requests resume:
- Set the ALB health check grace period to around 300 seconds (see the sketch after this list)
- Keep the service's minimum healthy percentage at 100% so rollouts never drop the only running task
- Use Route 53 weighted routing for zero-downtime deployments
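As an example of the first bullet, the scale-up counterpart to the earlier Lambda can set the grace period on the same `update_service` call. This is a sketch using the same placeholder cluster and service names:

```python
# Scale-up counterpart to the scale-down function above (placeholder names).
# The grace period keeps the ALB from failing the task while it is still starting.
import boto3

def lambda_handler(event, context):
    ecs = boto3.client('ecs')
    ecs.update_service(
        cluster='your-cluster',
        service='your-service',
        desiredCount=1,
        healthCheckGracePeriodSeconds=300,  # matches the value suggested above
    )
    return {'statusCode': 200, 'body': 'Scaled service back up'}
```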
For predictable usage patterns, use EventBridge rules:
```bash
aws events put-rule \
  --name "StopFargateNightly" \
  --schedule-expression "cron(0 0 * * ? *)"

aws events put-targets \
  --rule "StopFargateNightly" \
  --targets "Id"="1","Arn"="arn:aws:lambda:us-east-1:123456789012:function:ScaleToZero"
```
Sample monthly cost for a single task roughly equivalent to a t3.medium:

| Scenario | Monthly cost |
|---|---|
| 24/7 single task | $29.52 |
| Active 12 hr/day | $14.76 |
| Scales to zero | $2.95 (90% savings) |
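The figures follow from a simple duty-cycle calculation. This sketch just derives the hourly rate from the 24/7 row, so substitute the actual Fargate price for your task size and region:

```python
# Back-of-envelope math behind the table: the hourly rate here is illustrative,
# taken from the 24/7 row divided by 720 hours in a month.
hours_per_month = 720
hourly_rate = 29.52 / hours_per_month      # ~ $0.041/hour for the example task

always_on   = hourly_rate * hours_per_month          # ~ $29.52
half_day    = hourly_rate * hours_per_month * 0.5    # ~ $14.76 (12 hr/day)
scaled_zero = hourly_rate * hours_per_month * 0.10   # ~ $2.95 at ~10% duty cycle

print(f"Savings vs 24/7: {(1 - scaled_zero / always_on):.0%}")  # 90%
```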
The alarm-and-schedule approach above covers scale-down, but a complete setup also needs to scale back up when traffic returns. Fargate's built-in scaling policies can't close that loop on their own because:
- Once the service sits at zero tasks, there is no native integration with ALB request metrics that would detect new traffic
- Cold start times make purely on-demand scaling challenging

Here's a working solution combining multiple AWS services:
```yaml
# CloudFormation template snippet for scale-to-zero automation
Resources:
  FargateService:
    Type: AWS::ECS::Service
    Properties:
      DesiredCount: 0  # Start with zero tasks
      # ... other service configs

  ScaleUpLambda:
    Type: AWS::Lambda::Function
    Properties:
      Handler: index.handler
      # Role, Runtime and Timeout omitted for brevity
      Code:
        ZipFile: |
          const AWS = require('aws-sdk');
          const ecs = new AWS.ECS();
          exports.handler = async (event) => {
            await ecs.updateService({
              cluster: process.env.CLUSTER_NAME,
              service: process.env.SERVICE_NAME,
              desiredCount: 1
            }).promise();
          };
      Environment:
        Variables:
          CLUSTER_NAME: !Ref ECSCluster
          SERVICE_NAME: !Ref FargateService

  ALBRequestRule:
    Type: AWS::Events::Rule
    Properties:
      # Targets (the ScaleUpLambda) omitted for brevity
      EventPattern:
        source: ["aws.elasticloadbalancing"]
        detail-type: ["AWS API Call via CloudTrail"]
        detail:
          eventSource: ["elasticloadbalancing.amazonaws.com"]
          eventName: ["RegisterTargets"]
          requestParameters:
            targetGroupArn: [!Ref TargetGroup]
```
This architecture works through several coordinated components:
- An ELB API call recorded by CloudTrail signals activity on the target group (note that CloudTrail captures control-plane API calls, not individual HTTP requests, so many setups pair this with a CloudWatch alarm on ALB metrics or a lightweight "wake-up" endpoint instead)
- EventBridge matches the event and triggers the scale-up Lambda function
- The Lambda sets the Fargate service's desired count to 1 (see the sketch after this list)
- The new task starts and registers with the target group (typically 30-60 seconds)
- Subsequent requests are served normally
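The scale-up step can also block until the new task is actually serving. A boto3 sketch (placeholder cluster and service names) that raises the desired count and waits for the service to stabilize, which usually falls within the 30-60 second window above:

```python
# Raise the desired count, then wait until the service is stable, i.e. the new
# task has started and registered with its target group (placeholder names).
import boto3

ecs = boto3.client('ecs')
ecs.update_service(cluster='your-cluster', service='your-service', desiredCount=1)

waiter = ecs.get_waiter('services_stable')
waiter.wait(
    cluster='your-cluster',
    services=['your-service'],
    WaiterConfig={'Delay': 15, 'MaxAttempts': 20},  # poll every 15s, up to 5 minutes
)
```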
To complete the cycle, implement scale-down logic with:
```yaml
# CloudWatch Alarm for scale-down
AWSTemplateFormatVersion: '2010-09-09'
Resources:
  ScaleDownAlarm:
    Type: AWS::CloudWatch::Alarm
    Properties:
      AlarmDescription: "Scale down when no requests for 15 minutes"
      MetricName: RequestCount
      Namespace: AWS/ApplicationELB
      Dimensions:
        - Name: LoadBalancer
          Value: !GetAtt ALB.LoadBalancerFullName
      ComparisonOperator: LessThanThreshold
      Threshold: 1
      EvaluationPeriods: 3
      Period: 300  # 5 minutes x 3 evaluation periods = 15 minutes
      Statistic: Sum
      # ALB stops publishing RequestCount when no traffic flows, so missing
      # data must count as breaching for a scale-to-zero alarm to fire
      TreatMissingData: breaching
      # AlarmActions: wire to the scale-down Lambda (e.g. via the SNS topic above)
```
To minimize cold start impact:
- Pre-warm during business hours using scheduled scaling (see the sketch after this list)
- Use Fargate Spot for cost-sensitive workloads
- Set a health check grace period on the service so new tasks aren't killed while starting
- Consider keeping one task running during expected active periods
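The pre-warming bullet can be implemented with Application Auto Scaling scheduled actions rather than extra Lambdas. A sketch assuming an 08:00-20:00 UTC business window and placeholder resource names:

```python
# Keep at least one task warm during business hours and drop back to zero after,
# using Application Auto Scaling scheduled actions (resource IDs are placeholders).
import boto3

aas = boto3.client('application-autoscaling')
resource_id = 'service/your-cluster/your-service'

aas.register_scalable_target(
    ServiceNamespace='ecs',
    ResourceId=resource_id,
    ScalableDimension='ecs:service:DesiredCount',
    MinCapacity=0,
    MaxCapacity=2,
)

# Force at least one task from 08:00 UTC
aas.put_scheduled_action(
    ServiceNamespace='ecs',
    ResourceId=resource_id,
    ScalableDimension='ecs:service:DesiredCount',
    ScheduledActionName='business-hours-warm',
    Schedule='cron(0 8 * * ? *)',
    ScalableTargetAction={'MinCapacity': 1, 'MaxCapacity': 2},
)

# Allow the service to return to zero at 20:00 UTC
aas.put_scheduled_action(
    ServiceNamespace='ecs',
    ResourceId=resource_id,
    ScalableDimension='ecs:service:DesiredCount',
    ScheduledActionName='after-hours-zero',
    Schedule='cron(0 20 * * ? *)',
    ScalableTargetAction={'MinCapacity': 0, 'MaxCapacity': 0},
)
```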
For more complex scenarios:
- Use Step Functions to orchestrate state transitions
- Combine with Lambda@Edge for geographic scaling
- Implement custom metrics via the CloudWatch Agent or direct API calls (see the sketch below)
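For the custom-metrics option, publishing directly from the application with `PutMetricData` is a lighter-weight alternative to running the CloudWatch Agent; the namespace and metric name here are illustrative:

```python
# Publish an application-level activity metric that the scale-down alarm can
# watch instead of (or alongside) ALB RequestCount.
import boto3

cloudwatch = boto3.client('cloudwatch')

def record_activity(active_sessions: int) -> None:
    cloudwatch.put_metric_data(
        Namespace='MyApp/Fargate',          # illustrative namespace
        MetricData=[{
            'MetricName': 'ActiveSessions',  # illustrative metric name
            'Value': float(active_sessions),
            'Unit': 'Count',
        }],
    )
```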