When serving high volumes of static content (primarily images in my case), a single 100Mbps network interface quickly becomes the limiting factor. The current HAProxy + Lighttpd setup works well for request routing but creates bandwidth concentration at the load balancer when scaling horizontally.
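To put a number on that limit, a rough request-rate ceiling for one 100 Mbps NIC can be estimated; the 50 KB average object size below is an assumption for illustration, not a measured figure:

```shell
# Rough req/s ceiling for a single 100 Mbps NIC serving images.
# AVG_KB is an assumed average object size, not a measured value.
NIC_MBPS=100
AVG_KB=50
# 100 Mbps = 12500 KB/s; divide by object size for requests/sec
echo "$(( NIC_MBPS * 1000 / 8 / AVG_KB )) req/s"
```

At 50 KB per image that is roughly 250 req/s before the wire, not the servers, becomes the bottleneck.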
Option 1: HAProxy with Layer 7 Routing
The existing configuration could be extended with multiple backend servers:
frontend http-in
    bind *:80
    acl is_static path_end -i .jpg .jpeg .png .gif .css .js
    use_backend static_servers if is_static
    default_backend dynamic_servers

backend static_servers
    balance roundrobin
    server static1 192.168.1.101:80 check
    server static2 192.168.1.102:80 check

backend dynamic_servers
    server dynamic1 192.168.1.100:8080 check
Bandwidth Impact: All responses flow through HAProxy, creating a bandwidth bottleneck at the load balancer.
Option 2: DNS Round Robin with Health Checks
Implementing a basic DNS solution with improved reliability:
; DNS Zone File
@ IN A 192.168.1.101
@ IN A 192.168.1.102
@ IN A 192.168.1.103
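Round robin only fails over as fast as cached records expire, so a short TTL is essential. A sketch of the corresponding zone header (the SOA values here are illustrative assumptions):

```
$TTL 60    ; 60s TTL bounds how long clients keep resolving a dead server
@ IN SOA ns1.example.com. admin.example.com. (
        2024010101  ; serial - must increase on every edit
        3600        ; refresh
        1800        ; retry
        604800      ; expire
        60 )        ; negative-cache TTL
```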
With a monitoring script to update DNS when servers fail:
#!/bin/bash
# -f makes curl fail on HTTP error codes; a bare "curl -I" exits 0 even on a 500
if ! curl -sfI --max-time 5 -o /dev/null http://192.168.1.101/healthcheck; then
    sed -i '/192\.168\.1\.101/d' /etc/bind/db.example.com
    rndc reload   # note: the zone serial should also be bumped here
fi
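The check only helps if it runs continuously; a crontab entry is the simplest scheduler (the script path and log location below are assumptions):

```
# m h dom mon dow  command -- run the DNS health check every minute
* * * * *  /usr/local/bin/dns-healthcheck.sh >> /var/log/dns-health.log 2>&1
```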
Option 3: Direct Server Return Implementation
A more advanced setup using LVS (Linux Virtual Server):
# Configure Director
ipvsadm -A -t 192.168.1.100:80 -s wlc
ipvsadm -a -t 192.168.1.100:80 -r 192.168.1.101 -g
ipvsadm -a -t 192.168.1.100:80 -r 192.168.1.102 -g
# On Real Servers
ifconfig lo:0 192.168.1.100 netmask 255.255.255.255 up
# Keep the real servers from answering ARP for the shared VIP
echo "1" > /proc/sys/net/ipv4/conf/lo/arp_ignore
echo "2" > /proc/sys/net/ipv4/conf/lo/arp_announce
echo "1" > /proc/sys/net/ipv4/conf/all/arp_ignore
echo "2" > /proc/sys/net/ipv4/conf/all/arp_announce
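The echo commands above do not survive a reboot. The ARP settings can be persisted with a sysctl drop-in (the filename is an assumption):

```
# /etc/sysctl.d/90-dsr.conf -- persist DSR ARP settings across reboots
net.ipv4.conf.lo.arp_ignore = 1
net.ipv4.conf.lo.arp_announce = 2
net.ipv4.conf.all.arp_ignore = 1
net.ipv4.conf.all.arp_announce = 2
```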
Combining DNS round robin with HAProxy health checks:
# HAProxy config for DNS-based health (server-template requires HAProxy 1.8+)
resolvers mydns
    nameserver dns1 192.168.1.53:53
    resolve_retries 3
    timeout resolve 1s
    hold valid 10s

backend static_servers
    server-template static 1-3 example.com:80 check resolvers mydns
Testing Methodology:
- Benchmark with 10,000 concurrent requests using wrk
- Measure bandwidth utilization per server
- Track request distribution consistency
ulimit -n 16384   # wrk needs one fd per connection; default limit is often 1024
wrk -t4 -c10000 -d60s http://example.com/test.jpg
Results:
- DSR provided 92% bandwidth utilization per server
- HAProxy solution limited to 85% of single NIC capacity
- DNS round robin showed uneven distribution (60/40 split)
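Under those measured utilizations, the aggregate capacity difference is easy to estimate (assuming three static servers, each with a single 100 Mbps NIC):

```shell
# Aggregate throughput estimate from the measurements above:
# DSR lets every server's NIC carry response traffic in parallel,
# while the proxied setup is capped by the balancer's single NIC.
SERVERS=3; NIC_MBPS=100
echo "DSR aggregate: $(( SERVERS * NIC_MBPS * 92 / 100 )) Mbps"
echo "Proxied cap:   $(( NIC_MBPS * 85 / 100 )) Mbps"
```

Roughly 276 Mbps of usable capacity versus 85 Mbps, from the same hardware.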
For cost-conscious projects:
- Start with DNS round robin + health checks
- Implement LVS-DSR when hitting 200Mbps+ traffic
- Consider Anycast routing if geographic distribution becomes important
Cache synchronization script example:
#!/bin/bash
# Mirror image changes to the second server as they happen.
# -e delete is included so rsync's --delete has events to react to;
# read -r keeps backslashes in filenames intact.
inotifywait -m -r -e create -e modify -e delete /var/www/images |
while read -r path action file; do
    rsync -az --delete /var/www/images/ user@server2:/var/www/images/
done
In my current setup, I'm using HAProxy as a frontend to route requests:
- Dynamic content (PHP/POST) → Apache backend
- Static files (images) → Lighttpd backend
The bottleneck emerges when the 100Mbps NIC becomes saturated during peak traffic. The key requirements are:
1. Horizontal scaling for bandwidth (not vertical)
2. Maintain short URLs (no subdomains)
3. Preferably geographic awareness
Option 1: HAProxy with Standard Configuration
The current approach has a critical limitation:
frontend http-in
    bind *:80
    acl is_static path_end -i .jpg .jpeg .png .gif .css .js
    use_backend static if is_static
    default_backend dynamic

backend static
    server static1 192.168.1.10:8080 check
    server static2 192.168.1.11:8080 check

backend dynamic
    server dynamic1 192.168.1.12:80 check
This creates a bandwidth bottleneck as all responses flow back through HAProxy.
Option 2: DNS Round Robin
While simple to implement, this approach has significant drawbacks:
; DNS zone file example
@ IN A 192.168.1.10
@ IN A 192.168.1.11
@ IN A 192.168.1.12
Issues include:
- No automatic failover
- Uneven distribution due to DNS caching
- Requires full server replication
Option 3: Direct Server Return (DSR)
The most promising solution involves:
# On load balancer:
ipvsadm -A -t 192.168.1.1:80 -s wlc
ipvsadm -a -t 192.168.1.1:80 -r 192.168.1.10 -g -w 1
ipvsadm -a -t 192.168.1.1:80 -r 192.168.1.11 -g -w 1
# On backend servers:
ifconfig lo:0 192.168.1.1 netmask 255.255.255.255 up
echo 1 > /proc/sys/net/ipv4/conf/lo/arp_ignore
echo 2 > /proc/sys/net/ipv4/conf/lo/arp_announce
Combining HAProxy's routing with LVS-DSR provides the best balance, with one caveat: HAProxy is a full proxy and cannot itself do DSR, so the DSR leg must be handled by LVS sitting in front of the static pool, while HAProxy keeps its Layer 7 routing:

frontend http-in
    bind *:80
    acl is_static path_end -i .jpg .jpeg .png .gif
    use_backend static_dsr if is_static
    default_backend dynamic

backend static_dsr
    balance source
    server static1 192.168.1.10:8080 check
    server static2 192.168.1.11:8080 check

backend dynamic
    server dynamic1 192.168.1.12:80 check
Key advantages:
- Static response traffic leaves each server's NIC directly via the LVS-DSR path instead of flowing back through the proxy
- Maintains URL structure without subdomains
- Preserves HAProxy's intelligent routing for dynamic content
Essential metrics to track:
# Bandwidth per server
vnstat -l -i eth0
# HAProxy stats (requires a "stats socket" line in HAProxy's global section)
echo "show stat" | socat stdio /var/run/haproxy.sock
# IPVS connection tracking
ipvsadm -ln --stats
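The CSV that `show stat` returns is easiest to digest with awk. A sketch with a simplified CSV inlined for illustration (the real output has many more columns); in production, pipe the socat output above into the same awk program:

```shell
# Print per-backend current session counts from HAProxy-style CSV stats.
# The sample below is inlined and simplified for illustration only.
stats='# pxname,svname,qcur,scur
static_servers,BACKEND,0,42
dynamic_servers,BACKEND,0,7'
echo "$stats" | awk -F, '$2 == "BACKEND" { print $1, $4 }'
```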
For geographic distribution:
# GeoIP-based routing example; unmatched sources need a fallback rule
acl is_us src -f /etc/haproxy/us_ips.lst
acl is_eu src -f /etc/haproxy/eu_ips.lst
use_backend static_us if is_static is_us
use_backend static_eu if is_static is_eu
default_backend static_us   # fallback; the choice of region here is arbitrary