When ELB's limited L7 capabilities don't meet your needs, a custom HAProxy cluster on EC2 offers superior flexibility. The challenge lies in creating a self-healing infrastructure that:
- Automatically detects node failures
- Distributes traffic evenly across AZs
- Handles dynamic cluster membership changes
Here's the stack I've successfully deployed for high-traffic services:
HAProxy Cluster (3+ nodes) → Keepalived (VRRP) → AWS Route53 (DNS failover)
│
├── Auto Scaling Group (cross-AZ)
├── EC2 User Data for bootstrap
└── CloudWatch for health checks
Keepalived configuration for VIP management, shown for the MASTER node (backups use state BACKUP and a lower priority). Note that a VPC provides no multicast, so VRRP has to run with unicast peers, and actually moving the VIP between instances requires an EC2 API call, typically made from a keepalived notify script (a sketch of that call follows the config):
vrrp_script chk_haproxy {
    script "killall -0 haproxy"    # succeeds only while an haproxy process exists
    interval 2
    weight 2                       # raise priority by 2 while the check passes
}

vrrp_instance VI_1 {
    interface eth0
    state MASTER
    virtual_router_id 51
    priority 101
    virtual_ipaddress {
        192.168.1.100/24 dev eth0
    }
    track_script {
        chk_haproxy
    }
}
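If the VIP is carried as a secondary private IP on the node's ENI, a notify_master script can claim it through the EC2 API. A minimal sketch, with the interface ID and address as placeholders:

import boto3

# Placeholders: this node's ENI and the VIP managed by keepalived.
ENI_ID = 'eni-0123456789abcdef0'
VIP = '192.168.1.100'

def take_over_vip():
    """Attach the VIP to this node's ENI; AllowReassignment pulls it off the old MASTER."""
    boto3.client('ec2').assign_private_ip_addresses(
        NetworkInterfaceId=ENI_ID,
        PrivateIpAddresses=[VIP],
        AllowReassignment=True,
    )

if __name__ == '__main__':
    take_over_vip()

keepalived would invoke this via notify_master whenever the node wins the VRRP election.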
For auto-discovery of backend servers, use this HAProxy template with Consul. It relies on a resolvers consul section in haproxy.cfg that points at the local Consul DNS endpoint (usually 127.0.0.1:8600); if you would rather render the server list from Consul's HTTP API instead, see the sketch after this block:
backend web_servers
balance roundrobin
option httpchk GET /health
server-template web 1-10 _web._tcp.service.consul resolvers consul resolve-opts allow-dup-ip check
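As an alternative to DNS-based discovery, the server lines can be generated from Consul's health API. This is only a sketch: the service name web and the local agent address are assumptions.

import json
import urllib.request

# Ask the local Consul agent for healthy instances of the (assumed) 'web' service.
CONSUL_URL = 'http://127.0.0.1:8500/v1/health/service/web?passing=1'

def render_server_lines():
    """Return HAProxy 'server' lines for every passing service instance."""
    entries = json.load(urllib.request.urlopen(CONSUL_URL, timeout=2))
    lines = []
    for entry in entries:
        svc = entry['Service']
        addr = svc['Address'] or entry['Node']['Address']  # service address falls back to node address
        lines.append(f"server {svc['ID']} {addr}:{svc['Port']} check")
    return lines

The rendered lines would be templated into the backend, followed by a graceful reload of HAProxy.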
CloudFormation snippet for the Launch Template:
UserData:
  Fn::Base64: |
    #!/bin/bash
    yum install -y haproxy keepalived
    # Pulling the config requires an instance profile with read access to the bucket
    aws s3 cp s3://my-config-bucket/haproxy.cfg /etc/haproxy/
    systemctl enable --now haproxy keepalived
Essential CloudWatch metrics to alert on (a sample alarm for the last of these follows the list):
- HAProxy queue depth > 10
- Active connections > 80% of maxconn
- 5xx error rate > 1% for 5 minutes
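None of these are metrics CloudWatch collects by itself; they have to be published as custom metrics, for example from HAProxy's stats socket via a small exporter or the CloudWatch agent. A sketch of the 5xx alarm, with the namespace, metric name, and SNS topic as placeholders:

import boto3

cloudwatch = boto3.client('cloudwatch')

# 'HAProxy/Custom' and 'Backend5xxRate' are placeholder names for a metric you publish yourself.
cloudwatch.put_metric_alarm(
    AlarmName='haproxy-backend-5xx-rate',
    Namespace='HAProxy/Custom',
    MetricName='Backend5xxRate',
    Statistic='Average',
    Period=60,
    EvaluationPeriods=5,              # five consecutive one-minute periods
    Threshold=1.0,                    # percent of requests
    ComparisonOperator='GreaterThanThreshold',
    TreatMissingData='breaching',     # a node that stops reporting is treated as unhealthy
    AlarmActions=['arn:aws:sns:us-east-1:123456789012:ops-alerts'],  # placeholder SNS topic
)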
Critical findings from our 6-node cluster handling 50K RPS:
- Always set maxconn 20% below instance limits
- Use TCP health checks for L4 backends, HTTP health checks for L7
- Enable option log-health-checks for debugging
Building a highly available HAProxy cluster on AWS EC2 requires careful consideration of several components. The core challenge lies in maintaining continuous traffic flow while handling node failures and scaling events.
Infrastructure components needed:
1. HAProxy nodes (min 3 for quorum)
2. IPVS load balancer layer
3. Route 53 for DNS failover
4. Auto Scaling Groups for node management
5. AWS Lambda for configuration updates (a handler skeleton follows this list)
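Item 5 is the glue: a Lambda function subscribed to EC2 instance state-change events can trigger a configuration refresh. This is only a skeleton, since the actual regeneration depends on the pieces described below:

import json

def handler(event, context):
    """EventBridge-triggered skeleton for configuration updates.

    Assumes the function is subscribed to EC2 instance state-change events;
    a real implementation would re-run node discovery (see get_haproxy_nodes
    below) and push fresh HAProxy/IPVS configuration to the cluster."""
    detail = event.get('detail', {})
    print(json.dumps({'instance': detail.get('instance-id'),
                      'state': detail.get('state')}))
    # Configuration regeneration would happen here.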
IPVS (IP Virtual Server) acts as our frontend load balancer, directing traffic to healthy HAProxy nodes. Here's a sample configuration:
# Sample IPVS configuration
# Create the virtual service for the VIP with round-robin scheduling
ipvsadm -A -t 192.168.0.1:80 -s rr
# Register each HAProxy node as a real server in direct-routing (-g) mode
ipvsadm -a -t 192.168.0.1:80 -r 10.0.1.10:80 -g
ipvsadm -a -t 192.168.0.1:80 -r 10.0.2.10:80 -g
ipvsadm -a -t 192.168.0.1:80 -r 10.0.3.10:80 -g
For the HAProxy layer, we recommend using keepalived with a floating VIP. This example shows the MASTER node of a 3-node configuration; the other two nodes use state BACKUP with lower priorities (for example 100 and 99). Adding a vrrp_script/track_script block, as in the earlier example, also lets keepalived fail over when the haproxy process dies rather than only when the node does:
# /etc/keepalived/keepalived.conf (node1)
vrrp_instance VI_1 {
    state MASTER
    interface eth0
    virtual_router_id 51
    priority 101
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass secret
    }
    virtual_ipaddress {
        192.168.0.1/24 dev eth0
    }
}
To handle dynamic EC2 environments, implement automatic node discovery using AWS APIs:
import boto3

def get_haproxy_nodes():
    """Return the private IPs of running EC2 instances tagged Role=haproxy."""
    ec2 = boto3.client('ec2')
    response = ec2.describe_instances(
        Filters=[
            {'Name': 'tag:Role', 'Values': ['haproxy']},
            {'Name': 'instance-state-name', 'Values': ['running']}
        ]
    )
    return [i['PrivateIpAddress']
            for r in response['Reservations']
            for i in r['Instances']]
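The returned addresses can then be used to keep the IPVS real-server table in sync, for example from a cron job on the director node. A sketch that only adds nodes and ignores removal of stale entries; the VIP and port match the ipvsadm example above:

import subprocess

VIP = '192.168.0.1:80'   # the virtual service created in the ipvsadm example

def sync_ipvs(ips):
    """Register each discovered HAProxy node as an IPVS real server."""
    for ip in ips:
        # -a fails if the real server is already registered, so don't treat that as fatal
        subprocess.run(
            ['ipvsadm', '-a', '-t', VIP, '-r', f'{ip}:80', '-g'],
            check=False,
        )

sync_ipvs(get_haproxy_nodes())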
Distribute nodes across availability zones using AWS Auto Scaling Groups. Note that "HealthCheckType": "ELB" only takes effect when the group is attached to a load balancer or target group; with the IPVS front end shown here, "EC2" plus your own health checks is the usual choice. The same document can be passed straight to the API, as sketched after the JSON:
{
"AutoScalingGroupName": "haproxy-cluster",
"AvailabilityZones": ["us-east-1a", "us-east-1b", "us-east-1c"],
"LaunchConfigurationName": "haproxy-launch-config",
"MinSize": 3,
"MaxSize": 6,
"DesiredCapacity": 3,
"HealthCheckType": "ELB",
"Tags": [
{
"Key": "Role",
"Value": "haproxy",
"PropagateAtLaunch": true
}
]
}
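The keys above line up with the CreateAutoScalingGroup API parameters, so the definition can be fed to boto3 directly; the filename below is a placeholder:

import json
import boto3

# 'haproxy-asg.json' is a placeholder for wherever the definition above is stored.
with open('haproxy-asg.json') as f:
    asg_definition = json.load(f)

boto3.client('autoscaling').create_auto_scaling_group(**asg_definition)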
Implement comprehensive health checks combining EC2 status checks and application-level monitoring. The script below assumes a status endpoint on localhost:8080 whose response body contains "healthy"; adjust the URL and match string to whatever you actually expose. A sketch of feeding the result back to the Auto Scaling group follows the script:
#!/bin/bash
# Health check script for HAProxy
if curl -s http://localhost:8080/health | grep -q "healthy"; then
exit 0
else
exit 1
fi
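To have the Auto Scaling group act on this, the node can report its own result, for example from cron. A sketch that assumes the script above is installed at a placeholder path and that the instance profile may call the Auto Scaling API:

import subprocess
import urllib.request
import boto3

# Placeholder path for the health check script shown above.
CHECK_SCRIPT = '/usr/local/bin/haproxy-health.sh'

def report_health():
    """On a failed check, ask the Auto Scaling group to replace this instance."""
    if subprocess.run([CHECK_SCRIPT]).returncode == 0:
        return
    # Instance ID from the metadata service (IMDSv1 shown; IMDSv2 needs a session token).
    instance_id = urllib.request.urlopen(
        'http://169.254.169.254/latest/meta-data/instance-id', timeout=2
    ).read().decode()
    boto3.client('autoscaling').set_instance_health(
        InstanceId=instance_id,
        HealthStatus='Unhealthy',
        ShouldRespectGracePeriod=True,
    )

if __name__ == '__main__':
    report_health()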
While SSL can be handled separately, here's how to configure HAProxy for SSL termination. The crt file should contain the certificate, any intermediate chain, and the private key concatenated into a single PEM:
frontend https-in
bind *:443 ssl crt /etc/ssl/certs/mydomain.pem
mode http
default_backend servers
backend servers
balance roundrobin
server web1 10.0.1.20:80 check
server web2 10.0.2.20:80 check