When an AWS Elastic Load Balancer (ELB) fails, it doesn't automatically take down your EC2 instances. The instances continue running, but new client traffic can no longer reach them through that load balancer, since it is the entry point for incoming requests.
In AWS architecture, ELBs are highly available by design. They're actually distributed systems themselves, consisting of multiple nodes across Availability Zones. A complete failure is extremely rare, but let's examine the scenario:
// Example of checking instance health despite LB failure
const AWS = require('aws-sdk');
const ec2 = new AWS.EC2();

async function checkInstanceHealth(instanceIds) {
  const params = {
    InstanceIds: instanceIds,
    IncludeAllInstances: true // also return instances that aren't in the "running" state
  };
  const data = await ec2.describeInstanceStatus(params).promise();
  return data.InstanceStatuses.map(status => ({
    InstanceId: status.InstanceId,
    State: status.InstanceState.Name,
    Status: status.InstanceStatus.Status
  }));
}
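A quick usage sketch (the instance ID is a placeholder):
// Call it with the instance IDs registered behind the load balancer
checkInstanceHealth(['i-0123456789abcdef0'])
  .then(statuses => console.log(statuses))
  .catch(console.error);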
Modern AWS load balancers (ALB/NLB) have automatic redundancy (and you can verify target health yourself, as sketched after this list):
- Multi-AZ deployment by default
- Continuous health checks
- Automatic failover between nodes
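A minimal sketch of that direct check, assuming the same aws-sdk v2 setup as above; the target group ARN in the usage comment is a placeholder:
// Sketch: query target health directly via the ELBv2 API
const AWS = require('aws-sdk');
const elbv2 = new AWS.ELBv2();

async function checkTargetHealth(targetGroupArn) {
  const data = await elbv2.describeTargetHealth({ TargetGroupArn: targetGroupArn }).promise();
  return data.TargetHealthDescriptions.map(d => ({
    Target: d.Target.Id,
    Port: d.Target.Port,
    State: d.TargetHealth.State, // e.g. "healthy", "unhealthy", "draining"
    Reason: d.TargetHealth.Reason
  }));
}

// checkTargetHealth('arn:aws:elasticloadbalancing:us-west-2:123456789012:targetgroup/my-targets/73e2d6bc24d8a067')
//   .then(console.log);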
For mission-critical applications, consider patterns such as multi-region failover:
# CloudFormation snippet for multi-region failover
Resources:
  PrimaryLoadBalancer:
    Type: AWS::ElasticLoadBalancingV2::LoadBalancer
    Properties:
      Scheme: internet-facing
      Subnets: !Ref PublicSubnets
      SecurityGroups: [!Ref LoadBalancerSecurityGroup]
  FailoverDNS:
    Type: AWS::Route53::RecordSet
    Properties:
      HostedZoneId: !Ref HostedZoneId  # parameter assumed: the zone that owns ApplicationDomain
      Name: !Sub "${ApplicationDomain}."
      Type: A
      SetIdentifier: Primary           # required whenever Failover is set
      Failover: PRIMARY
      AliasTarget:
        HostedZoneId: !GetAtt PrimaryLoadBalancer.CanonicalHostedZoneID
        DNSName: !GetAtt PrimaryLoadBalancer.DNSName
        EvaluateTargetHealth: true
# A matching SECONDARY record pointing at the standby region's load balancer completes the pair
Implement comprehensive monitoring:
// CloudWatch alarm for LB health (UnHealthyHostCount is published per target group,
// so the alarm needs both the LoadBalancer and TargetGroup dimensions)
{
  "AlarmName": "High-Unhealthy-Hosts",
  "MetricName": "UnHealthyHostCount",
  "Namespace": "AWS/ApplicationELB",
  "Statistic": "Average",
  "Dimensions": [
    {
      "Name": "LoadBalancer",
      "Value": "app/my-load-balancer/50dc6c495c0c9188"
    },
    {
      "Name": "TargetGroup",
      "Value": "targetgroup/my-targets/73e2d6bc24d8a067"
    }
  ],
  "Period": 60,
  "EvaluationPeriods": 2,
  "Threshold": 1,
  "ComparisonOperator": "GreaterThanThreshold"
}
Consider these advanced patterns:
- Active-active deployment across regions
- DNS-based failover with Route53
- Service mesh with retry logic
- Circuit breakers in application code
// Example circuit breaker implementation
class CircuitBreaker {
  constructor(request, options = {}) {
    this.request = request;                 // async function that performs the protected call
    this.state = "CLOSED";
    this.failureThreshold = options.failureThreshold || 5;
    this.successThreshold = options.successThreshold || 2;
    this.timeout = options.timeout || 5000; // how long to stay OPEN before probing again
    this.failureCount = 0;
    this.successCount = 0;
  }

  async fire() {
    if (this.state === "OPEN") {
      throw new Error("Circuit breaker is OPEN");
    }
    try {
      const response = await this.request();
      return this.success(response);
    } catch (err) {
      return this.fail(err);
    }
  }

  success(response) {
    if (this.state === "HALF") {
      this.successCount++;
      if (this.successCount >= this.successThreshold) {
        this.close();
      }
    } else {
      this.failureCount = 0; // a success while CLOSED clears accumulated failures
    }
    return response;
  }

  fail(err) {
    this.failureCount++;
    if (this.failureCount >= this.failureThreshold) {
      this.open();
    }
    throw err;
  }

  open() {
    this.state = "OPEN";
    setTimeout(() => this.half(), this.timeout);
  }

  half() {
    this.state = "HALF";
    this.successCount = 0; // count probe successes from zero each time
  }

  close() {
    this.state = "CLOSED";
    this.failureCount = 0;
    this.successCount = 0;
  }
}
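One way to use the breaker is to wrap the health-check call from earlier; the thresholds below are arbitrary example values:
// Usage sketch: protect an upstream call with the breaker
const breaker = new CircuitBreaker(
  () => checkInstanceHealth(['i-0123456789abcdef0']), // placeholder instance ID
  { failureThreshold: 3, successThreshold: 2, timeout: 10000 }
);

async function callWithBreaker() {
  try {
    return await breaker.fire();
  } catch (err) {
    // While the circuit is open, fall back to cached data or a degraded response
    console.error('Upstream unavailable:', err.message);
    return [];
  }
}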
When an AWS Elastic Load Balancer (ELB) fails, the behavior depends on the type of failure and your architecture configuration. The key thing to understand is that ELB itself is a managed service with built-in redundancy.
In AWS architecture, ELB failures are extremely rare because:
- ELBs are distributed across multiple Availability Zones by default
- AWS automatically replaces unhealthy ELB nodes
- The service has multiple redundant components
However, in the extremely unlikely event of a complete ELB failure:
# Example of checking instance health directly
import boto3

ec2 = boto3.client('ec2')
# IncludeAllInstances=True also returns instances that aren't in the "running" state
response = ec2.describe_instance_status(
    InstanceIds=['i-1234567890abcdef0'],
    IncludeAllInstances=True
)
print(response['InstanceStatuses'][0]['InstanceState']['Name'])
The critical point is that your EC2 instances continue running normally. They don't fail just because the load balancer fails. The impact is:
- Existing connections to instances remain active
- New connections can't be established through the failed ELB
- Health checks from the ELB stop reaching your instances
Here's how to implement DNS failover as a backup:
// Route 53 failover configuration example
{
  "Comment": "Failover configuration",
  "Changes": [{
    "Action": "CREATE",
    "ResourceRecordSet": {
      "Name": "example.com",
      "Type": "A",
      "SetIdentifier": "Primary",
      "Failover": "PRIMARY",
      "AliasTarget": {
        "HostedZoneId": "Z3DZXE0EXAMPLE",
        "DNSName": "dualstack.primary-elb-123456789.us-west-2.elb.amazonaws.com",
        "EvaluateTargetHealth": true
      }
    }
  }]
}
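To apply a change batch like this programmatically rather than through the console, one option is the changeResourceRecordSets API; a sketch using the same aws-sdk as above, with a placeholder hosted zone ID:
// Sketch: submit the failover record change to Route 53
const AWS = require('aws-sdk');
const route53 = new AWS.Route53();

const changeBatch = { /* the change batch JSON shown above */ };

route53.changeResourceRecordSets({
  HostedZoneId: 'Z1234567890EXAMPLE', // placeholder: the zone that owns example.com
  ChangeBatch: changeBatch
}).promise()
  .then(res => console.log('Change status:', res.ChangeInfo.Status))
  .catch(console.error);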
To minimize impact from any potential ELB issues:
- Enable cross-zone load balancing
- Distribute instances across multiple AZs
- Implement health checks at both ELB and application level (see the sketch after this list)
- Consider using multiple ELBs in different regions
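For the application-level side of that, a common pattern is a lightweight HTTP endpoint the load balancer can probe. A minimal sketch using Express, which is an assumption here rather than something the original setup specifies:
// Sketch: application-level health endpoint for the ELB health check to hit
const express = require('express');
const app = express();

app.get('/health', (req, res) => {
  // Return 200 only when the app's own dependencies look healthy;
  // the checks behind this are illustrative and app-specific
  res.status(200).json({ status: 'ok' });
});

app.listen(8080, () => console.log('Health endpoint listening on 8080'));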
Set up CloudWatch alarms to detect ELB issues (substitute your own load balancer name and SNS topic):
aws cloudwatch put-metric-alarm \
  --alarm-name "ELB-Unhealthy-Hosts" \
  --metric-name "UnHealthyHostCount" \
  --namespace "AWS/ELB" \
  --statistic "Maximum" \
  --dimensions Name=LoadBalancerName,Value=my-load-balancer \
  --period 60 \
  --threshold 0 \
  --comparison-operator "GreaterThanThreshold" \
  --evaluation-periods 1 \
  --alarm-actions "arn:aws:sns:us-west-2:123456789012:my-sns-topic"
For critical applications, consider implementing these backup access methods:
- Direct instance access via SSH/RDP (with proper security groups; see the sketch after this list)
- Secondary ELB in another region
- API Gateway direct integration
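For the direct-access option, you might keep an emergency ingress rule ready to apply. A sketch with a placeholder security group ID and CIDR; keep the range as narrow as possible and revoke it afterwards:
// Sketch: temporarily allow SSH from an admin CIDR for direct instance access
const AWS = require('aws-sdk');
const ec2 = new AWS.EC2();

ec2.authorizeSecurityGroupIngress({
  GroupId: 'sg-0123456789abcdef0', // placeholder security group ID
  IpPermissions: [{
    IpProtocol: 'tcp',
    FromPort: 22,
    ToPort: 22,
    IpRanges: [{ CidrIp: '203.0.113.0/24', Description: 'Emergency admin access' }]
  }]
}).promise()
  .then(() => console.log('SSH ingress added - revoke it when finished'))
  .catch(console.error);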