Hybrid EC2 Auto-Scaling: Mixing On-Demand and Spot Instances for Cost Optimization

When working with AWS EC2 Auto Scaling groups (ASGs), many teams face a dilemma: how to maintain reliable baseline capacity while still benefiting from the cost savings of Spot Instances. An ASG built from a single launch template launches only one purchase option, and although a mixed instances policy can combine On-Demand and Spot Instances in one group, both share the same scaling policies, so the group cannot scale each purchase option independently.

Here are three practical ways to achieve this hybrid scaling pattern:

1. Multiple Auto Scaling Groups Strategy

Create two separate ASGs attached to the same load balancer:


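Resources:
  BaselineASG:
    Type: AWS::AutoScaling::AutoScalingGroup
    Properties:
      MinSize: 2
      MaxSize: 2
      VPCZoneIdentifier: !Ref SubnetIds    # assumed parameter: list of subnet IDs
      TargetGroupARNs:
        - !Ref AppTargetGroup              # assumed target group shared by both ASGs
      LaunchTemplate:
        LaunchTemplateId: !Ref OnDemandLaunchTemplate
        Version: !GetAtt OnDemandLaunchTemplate.LatestVersionNumber

  SpotASG:
    Type: AWS::AutoScaling::AutoScalingGroup
    Properties:
      MinSize: 0
      MaxSize: 8
      VPCZoneIdentifier: !Ref SubnetIds
      TargetGroupARNs:
        - !Ref AppTargetGroup
      # SpotLaunchTemplate is assumed to set InstanceMarketOptions.MarketType to spot
      LaunchTemplate:
        LaunchTemplateId: !Ref SpotLaunchTemplate
        Version: !GetAtt SpotLaunchTemplate.LatestVersionNumber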

2. Instance Weighting in Mixed Instances Policy

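Use a single ASG with a mixed instances policy. OnDemandBaseCapacity reserves the On-Demand baseline, OnDemandPercentageAboveBaseCapacity: 0 makes everything above it Spot, and WeightedCapacity sets how many capacity units each instance type counts for: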


MixedInstancesPolicy:
  InstancesDistribution:
    OnDemandBaseCapacity: 2
    OnDemandPercentageAboveBaseCapacity: 0
  LaunchTemplate:
    LaunchTemplateSpecification:
      LaunchTemplateId: !Ref MixedLaunchTemplate
      Version: !GetAtt MixedLaunchTemplate.LatestVersionNumber
    Overrides:
      - InstanceType: c5.large
        WeightedCapacity: 1
      - InstanceType: c5d.large
        WeightedCapacity: 1
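
To make the effect of these two settings concrete, the split can be modelled in a few lines of Python. This is an illustrative sketch, not AWS code: `split_capacity` is a hypothetical helper, and rounding the On-Demand share up is an assumption. With a percentage of 0, as in the policy above, no rounding occurs.

```python
import math


def split_capacity(desired, on_demand_base, on_demand_pct_above_base):
    """Return (on_demand, spot) capacity units for a desired capacity.

    Illustrative model only; rounding the On-Demand share up is an assumption.
    """
    # The base is filled with On-Demand first, capped by the desired capacity.
    base = min(desired, on_demand_base)
    above_base = desired - base
    # The percentage applies only to capacity beyond the base.
    extra_on_demand = math.ceil(above_base * on_demand_pct_above_base / 100)
    on_demand = base + extra_on_demand
    return on_demand, desired - on_demand


print(split_capacity(10, 2, 0))  # (2, 8)
print(split_capacity(1, 2, 0))   # (1, 0)
```

With a base of 2 and 0 percent above base, a group scaled to 10 units keeps 2 On-Demand and fills the other 8 with Spot.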

3. Custom Scaling with Lambda

Implement a Lambda function to manage the hybrid scaling logic:


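import boto3

asg = boto3.client('autoscaling')
ec2 = boto3.client('ec2')

BASELINE = 2  # On-Demand instances to keep running


def should_scale_out():
    # Placeholder: replace with your custom scaling logic
    return False


def lambda_handler(event, context):
    # Get the group's current instances
    response = asg.describe_auto_scaling_groups(AutoScalingGroupNames=['hybrid-asg'])
    instances = response['AutoScalingGroups'][0]['Instances']

    # Note: instances launched with run_instances stay outside the ASG until
    # they are running and attached with asg.attach_instances(...)

    # Check if we need to maintain baseline
    if len(instances) < BASELINE:
        # Launch an On-Demand instance
        ec2.run_instances(
            LaunchTemplate={'LaunchTemplateName': 'on-demand-template'},
            MinCount=1,
            MaxCount=1
        )
    elif should_scale_out():
        # Launch a Spot instance. request_spot_instances does not accept a
        # launch template, so use run_instances with InstanceMarketOptions.
        ec2.run_instances(
            LaunchTemplate={'LaunchTemplateName': 'spot-template'},
            InstanceMarketOptions={'MarketType': 'spot'},
            MinCount=1,
            MaxCount=1
        )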

When using Spot instances, implement these safeguards:

Set up EventBridge rules for EC2 Spot Instance interruption warnings (sent two minutes before the instance is reclaimed)
Configure ASG lifecycle hooks to handle Spot terminations gracefully
Use EC2 Fleet with On-Demand fallback capacity
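
As a rough illustration of handling an interruption warning, the sketch below shows a Lambda handler that could sit behind an EventBridge rule and start draining the instance from a target group early. The function names, the `elbv2` injection parameter, and the target group ARN are assumptions made for this example; only the event fields (`source`, `detail-type`, `detail.instance-id`) and the `deregister_targets` call come from AWS.

```python
# Hypothetical placeholder; substitute the ARN of your own target group.
TARGET_GROUP_ARN = "arn:aws:elasticloadbalancing:REGION:ACCOUNT:targetgroup/NAME/ID"


def interrupted_instance_id(event):
    """Return the instance ID from a Spot interruption warning event, else None."""
    if event.get("source") != "aws.ec2":
        return None
    if event.get("detail-type") != "EC2 Spot Instance Interruption Warning":
        return None
    return event.get("detail", {}).get("instance-id")


def lambda_handler(event, context, elbv2=None, target_group_arn=TARGET_GROUP_ARN):
    """Deregister an interrupted Spot instance so the load balancer drains it."""
    instance_id = interrupted_instance_id(event)
    if instance_id is None:
        return {"drained": None}  # Not an interruption warning; nothing to do
    if elbv2 is None:
        import boto3  # Imported lazily so the parsing logic is testable offline
        elbv2 = boto3.client("elbv2")
    elbv2.deregister_targets(
        TargetGroupArn=target_group_arn,
        Targets=[{"Id": instance_id}],
    )
    return {"drained": instance_id}
```

Passing the client in as a parameter is a design choice for testability; in a deployed function the default path creates the real client.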

Many AWS users face a dilemma when configuring Auto Scaling groups (ASGs): how to maintain reliable baseline capacity while leveraging cost-effective Spot Instances. An ASG built from a single launch template forces an either-or choice between purchase options, which doesn't match real-world operational needs.

The native mixed instances policy has several constraints:

The On-Demand vs Spot split is set by a base capacity and a percentage, not by exact instance counts for each purchase option
Spot instance interruptions can temporarily drop the group below its minimum capacity until replacements launch
On-Demand and Spot capacity share one set of scaling policies, so the baseline cannot scale independently

Option 1: Multiple ASGs with Load Balancer

Create separate ASGs for each purchase option and manage them through a common ALB:


# On-Demand ASG
resource "aws_autoscaling_group" "od_asg" {
  min_size            = 2 # Your minimum baseline
  max_size            = 4
  vpc_zone_identifier = var.subnet_ids                # assumed variable
  target_group_arns   = [aws_lb_target_group.app.arn] # assumed shared ALB target group

  launch_template {
    id      = aws_launch_template.on_demand.id
    version = "$Latest"
  }
}

# Spot ASG
resource "aws_autoscaling_group" "spot_asg" {
  min_size            = 0
  max_size            = 10
  vpc_zone_identifier = var.subnet_ids
  target_group_arns   = [aws_lb_target_group.app.arn]

  mixed_instances_policy {
    instances_distribution {
      on_demand_base_capacity                  = 0
      on_demand_percentage_above_base_capacity = 0
      spot_allocation_strategy                 = "capacity-optimized"
    }
    launch_template {
      launch_template_specification {
        launch_template_id = aws_launch_template.spot.id
        version            = "$Latest"
      }
    }
  }
}

Option 2: Instance Weighting with Mixed Policy

AWS does allow some mixing through the MixedInstancesPolicy, though with limitations:


mixed_instances_policy {
  instances_distribution {
    on_demand_base_capacity = 2  # Minimum On-Demand
    on_demand_percentage_above_base_capacity = 0  # Everything else Spot
  }
  launch_template {
    launch_template_specification {
      launch_template_id = aws_launch_template.base.id
      version = "$Latest"
    }
    override {
      instance_type     = "m5.large"
      weighted_capacity = 1 # once any override has a weight, every override needs one
    }
    override {
      instance_type     = "m5.xlarge"
      weighted_capacity = 2
    }
  }
}
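
With weights in play, capacity is counted in units rather than instances. The arithmetic can be sketched as follows, assuming a hypothetical helper `instances_needed` and that whole instances are launched until the requested units are covered:

```python
import math

# Capacity units per instance type; an override without a weight counts as 1.
WEIGHTS = {"m5.large": 1, "m5.xlarge": 2}


def instances_needed(units, instance_type, weights=WEIGHTS):
    """Instances of one type needed to cover `units` of capacity."""
    return math.ceil(units / weights[instance_type])


print(instances_needed(5, "m5.large"))   # 5
print(instances_needed(5, "m5.xlarge"))  # 3 (covers 6 units, one more than requested)
```

The second call shows why weighted groups can overshoot: an odd number of units cannot be met exactly by instances that each count for 2.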

Option 3: Custom Scaling with Lambda

For precise control, implement custom scaling logic through Lambda:


import boto3

def lambda_handler(event, context):
    # Create AWS clients
    ec2 = boto3.client('ec2')
    cloudwatch = boto3.client('cloudwatch')

    # Calculate needed capacity
    # (get_cloudwatch_metric, launch_spot_instances, ensure_on_demand_baseline
    # and baseline_threshold are placeholders you must define)
    current_load = get_cloudwatch_metric(...)

    # Determine instance mix
    if current_load > baseline_threshold:
        launch_spot_instances(...)
    else:
        ensure_on_demand_baseline(...)
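
One way to approach the "determine instance mix" step is a pure function that turns a load figure into target counts. Everything in this sketch is hypothetical: `plan_instance_mix`, the per-instance load figure, and the baseline of 2 are illustration values, not anything defined by AWS or by the handler above.

```python
import math


def plan_instance_mix(current_load, load_per_instance, baseline=2):
    """Return target counts: On-Demand covers the baseline, Spot covers the rest."""
    if load_per_instance <= 0:
        raise ValueError("load_per_instance must be positive")
    # Never plan for fewer instances than the On-Demand baseline.
    total = max(baseline, math.ceil(current_load / load_per_instance))
    return {"on_demand": baseline, "spot": total - baseline}


print(plan_instance_mix(current_load=950, load_per_instance=200))  # {'on_demand': 2, 'spot': 3}
print(plan_instance_mix(current_load=100, load_per_instance=200))  # {'on_demand': 2, 'spot': 0}
```

Keeping the decision separate from the AWS calls makes it easy to unit test before wiring it into a Lambda function.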

When implementing hybrid scaling:

Enable scale-in protection for On-Demand instances
Configure distinct termination policies for each type
Monitor Spot interruption notices proactively
Use capacity-optimized allocation strategy for Spot

Custom CloudWatch metrics worth publishing and tracking (none of these are built in; the names are suggestions):

SpotInstanceInterruptionRate
OnDemandInstanceCount
HybridCostSavings
CapacityGap (desired vs actual)
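
Since these are custom metrics, something has to publish them. A minimal sketch for CapacityGap follows, assuming an unweighted group (one instance per capacity unit) and a made-up namespace `Custom/HybridScaling`. The helper names are hypothetical; the input is one entry from the `AutoScalingGroups` list returned by `describe_auto_scaling_groups`.

```python
def capacity_gap_metric(group):
    """Build put_metric_data entries for one Auto Scaling group description."""
    in_service = sum(
        1 for inst in group.get("Instances", [])
        if inst.get("LifecycleState") == "InService"
    )
    # Gap between what the group wants and what is actually serving.
    gap = group["DesiredCapacity"] - in_service
    return [{
        "MetricName": "CapacityGap",
        "Dimensions": [
            {"Name": "AutoScalingGroupName", "Value": group["AutoScalingGroupName"]}
        ],
        "Value": gap,
        "Unit": "Count",
    }]


def publish_capacity_gap(group, cloudwatch=None, namespace="Custom/HybridScaling"):
    """Publish the CapacityGap metric; the namespace is an assumed example."""
    if cloudwatch is None:
        import boto3  # Imported lazily so the metric logic is testable offline
        cloudwatch = boto3.client("cloudwatch")
    cloudwatch.put_metric_data(Namespace=namespace, MetricData=capacity_gap_metric(group))
```

A CloudWatch alarm on a sustained positive CapacityGap would then flag periods when Spot capacity could not be replaced.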
